WorldWideScience

Sample records for meiosis-driven genome variation

  1. HGVA: the Human Genome Variation Archive

    OpenAIRE

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-01-01

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic...

  2. Genome Variation Map: a data repository of genome variations in BIG Data Center

    OpenAIRE

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2017-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research a...

  3. From genomic variation to personalized medicine

    DEFF Research Database (Denmark)

    Wesolowska, Agata; Schmiegelow, Kjeld

    Genomic variation is the basis of interindividual differences in observable traits and disease susceptibility. Genetic studies are the driving force of personalized medicine, as many of the differences in treatment efficacy can be attributed to our genomic background. The rapid development...... a considerable amount of the phenotype variability, hence the major difficulty of interpretation lies in the complexity of molecular interactions. This PhD thesis describes the state of the art of functional human variation research (Chapter 1) and introduces childhood acute lymphoblastic leukaemia (ALL...... the thesis and includes some final remarks on the perspectives of genomic variation research and personalized medicine. In summary, this thesis demonstrates the feasibility of integrative analyses of genomic variations and introduces large-scale hypothesis-driven SNP exploration studies as an emerging...

  4. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Genomic resources for grapevine have expanded greatly in recent years, supporting growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross-breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced four table grape cultivars and compared them with the genome of the Pinot noir inbred line PN40024 as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  5. HGVA: the Human Genome Variation Archive.

    Science.gov (United States)

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-07-03

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web interface for easy data mining, a comprehensive RESTful API and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof of concept of the genome analysis developments being carried out by the University of Cambridge together with the UK's 100 000 Genomes Project and the National Institute for Health Research BioResource Rare-Diseases, in particular, deploying the open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most SVs, such as inversions, deletions and translocations, have been studied largely in the context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have a profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into the organization of bacterial genomes at a much better resolution. SVs can confer changes in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much finer resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss their potential applicability in the emerging fields of synthetic biology and genome engineering, where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Genomic Sequence Variation Markup Language (GSVML).

    Science.gov (United States)

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) focuses on genomic sequence variation data and human health applications, such as gene-based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By limiting the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For the design, we examined 6 cases against three criteria for human health applications and 15 data elements against three criteria for data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. GSVML was developed as...

  8. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was made public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analyzing each of the three data sets separately. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.
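
    The combined cohort size quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary keys and the `combined_size` helper are hypothetical names, and the simple sum assumes that no individual appears in more than one data set, which the abstract does not state.

```python
# Cohort sizes as quoted in the abstract (number of individuals).
cohorts = {"HapMap": 1417, "HGDP": 940, "1000 Genomes": 2504}


def combined_size(sizes):
    """Sum the cohort sizes.

    Assumption: the cohorts are disjoint, so the sum is an upper bound
    on the number of distinct individuals after integration.
    """
    return sum(sizes.values())


total = combined_size(cohorts)
print(f"{total:,} individuals")
```

    If the data sets shared samples, the number of distinct individuals would be smaller than this sum, so the figure is best read as an upper bound.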

  9. Copy number variation in the bovine genome

    DEFF Research Database (Denmark)

    Fadista, João; Thomsen, Bo; Holm, Lars-Erik

    2010-01-01

    to genetic variation in cattle. Results We designed and used a set of NimbleGen CGH arrays that tile across the assayable portion of the cattle genome with approximately 6.3 million probes, at a median probe spacing of 301 bp. This study reports the highest resolution map of copy number variation...... in the cattle genome, with 304 CNV regions (CNVRs) being identified among the genomes of 20 bovine samples from 4 dairy and beef breeds. The CNVRs identified covered 0.68% (22 Mb) of the genome, and ranged in size from 1.7 to 2,031 kb (median size 16.7 kb). About 20% of the CNVs co-localized with segmental...... duplications, while 30% encompass genes, of which the majority is involved in environmental response. About 10% of the human orthologues of these genes are associated with human disease susceptibility and, hence, may have important phenotypic consequences. Conclusions Together, this analysis provides a useful...

  10. GFVO: the Genomic Feature and Variation Ontology

    KAUST Repository

    Baran, Joachim

    2015-05-05

    Falling costs in genomic laboratory experiments have led to a steady increase of genomic feature and variation data. Multiple genomic data formats exist for sharing these data, and whilst they are similar, they address slightly different data viewpoints and are consequently not fully compatible with each other. The fragmentation of data format specifications makes it hard to integrate and interpret data for further analysis with information from multiple data providers. As a solution, a new ontology is presented here for annotating and representing genomic feature and variation dataset contents. The Genomic Feature and Variation Ontology (GFVO) specifically addresses genomic data as it is regularly shared using the GFF3 (incl. FASTA), GTF, GVF and VCF file formats. GFVO simplifies data integration and enables linking of genomic annotations across datasets through common semantics of genomic types and relations. Availability and implementation. The latest stable release of the ontology is available via its base URI; previous and development versions are available at the ontology’s GitHub repository: https://github.com/BioInterchange/Ontologies; versions of the ontology are indexed through BioPortal (without external class-/property-equivalences due to BioPortal release 4.10 limitations); examples and reference documentation are provided on a separate web page: http://www.biointerchange.org/ontologies.html. GFVO version 1.0.2 is licensed under the CC0 1.0 Universal license (https://creativecommons.org/publicdomain/zero/1.0) and therefore de facto within the public domain; the ontology can be appropriated without attribution for commercial and non-commercial use.

  11. Genome Variation Map: a data repository of genome variations in BIG Data Center.

    Science.gov (United States)

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2018-01-04

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Genome Variation Map: a data repository of genome variations in BIG Data Center

    Science.gov (United States)

    Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

    2018-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473

  13. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the role such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with IS. Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507
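
    The insertion and deletion shares reported above follow directly from the top-level counts. A minimal Python sketch, using only the totals quoted in the abstract (the `share` helper is a hypothetical name introduced for illustration):

```python
# Top-level CNV counts as quoted in the abstract.
insertions = 187
deletions = 60
total_cnvs = insertions + deletions


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print(total_cnvs)                     # total number of CNVs
print(share(insertions, total_cnvs))  # insertion share, in percent
print(share(deletions, total_cnvs))   # deletion share, in percent
```

    The sketch checks only the totals and their percentages; it does not attempt to reconcile the per-category breakdown given in parentheses.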

  14. Copy Number Variations in Tilapia Genomes.

    Science.gov (United States)

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental to understanding phenotypic diversity among populations. Although several million single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions, has not yet been carried out. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the Nile tilapia reference genome used. A total of 1100 predicted CNVs were found to overlap with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated with population types (R² > 0.9 and P > 0.001). Our study provides the first insights into genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.

  15. Structural genomic variations and Parkinson's disease.

    Science.gov (United States)

    Bandrés-Ciga, Sara; Ruz, Clara; Barrero, Francisco J; Escamilla-Sevilla, Francisco; Pelegrina, Javier; Vives, Francisco; Duran, Raquel

    2017-10-01

    Parkinson's disease (PD) is the second most common neurodegenerative disease, whose prevalence is projected to be between 8.7 and 9.3 million by 2030. Until about 20 years ago, PD was considered to be the textbook example of a "non-genetic" disorder. Nowadays, PD is generally considered a multifactorial disorder that arises from the combination and complex interaction of genes and environmental factors. To date, a total of 7 genes, including SNCA, LRRK2, PARK2, DJ-1, PINK1, VPS35 and ATP13A2, have been shown unequivocally to cause Mendelian PD. Also, variants with incomplete penetrance in the genes LRRK2 and GBA are considered to be strong risk factors for PD worldwide. Although genetic studies have provided valuable insights into the pathogenic mechanisms underlying PD, the role of structural variation in PD has been understudied in comparison with other genomic variations. Structural genomic variations might substantially account for such genetic substrates yet to be discovered. The present review aims to provide an overview of the structural genomic variants implicated in the pathogenesis of PD.

  16. Genomic variation in Salmonella enterica core genes for epidemiological typing

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis

    2012-01-01

    Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...

  17. Genome size variation in the genus Avena.

    Science.gov (United States)

    Yan, Honghai; Martin, Sara L; Bekele, Wubishet A; Latta, Robert G; Diederichsen, Axel; Peng, Yuanying; Tinker, Nicholas A

    2016-03-01

    Genome size is an indicator of evolutionary distance and a metric for genome characterization. Here, we report accurate estimates of genome size in 99 accessions from 26 species of Avena. We demonstrate that the average genome size of C genome diploid species (2C = 10.26 pg) is 15% larger than that of A genome species (2C = 8.95 pg), and that this difference likely accounts for a progression of size among tetraploid species, where AB genome configuration had similar genome sizes (average 2C = 25.74 pg). Genome size was mostly consistent within species and in general agreement with current information about evolutionary distance among species. Results also suggest that most of the polyploid species in Avena have experienced genome downsizing in relation to their diploid progenitors. Genome size measurements could provide additional quality control for species identification in germplasm collections, especially in cases where diploid and polyploid species have similar morphology.
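
    The "15% larger" figure can be reproduced from the two average 2C values quoted above. A short Python sketch (the `percent_larger` helper is a hypothetical name used only for this illustration):

```python
# Average 2C genome sizes in picograms, as quoted in the abstract.
c_genome_2c = 10.26  # C genome diploid species
a_genome_2c = 8.95   # A genome diploid species


def percent_larger(larger, smaller):
    """Relative difference of `larger` over `smaller`, in percent."""
    return 100 * (larger - smaller) / smaller


difference = percent_larger(c_genome_2c, a_genome_2c)
print(f"{difference:.1f}% larger")
```

    The unrounded value is a little under 15%, which matches the rounded figure reported in the abstract.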

  18. Intrapopulation genome size variation in D. melanogaster reflects life history variation and plasticity.

    Directory of Open Access Journals (Sweden)

    Lisa L Ellis

    2014-07-01

    We determined female genome sizes using flow cytometry for 211 Drosophila melanogaster sequenced inbred strains from the Drosophila Genetic Reference Panel, and found significant conspecific and intrapopulation variation in genome size. We also compared several life history traits for 25 lines with large and 25 lines with small genomes in three thermal environments, and found that genome size as well as genome size by temperature interactions significantly correlated with survival to pupation and adulthood, time to pupation, female pupal mass, and female eclosion rates. Genome size accounted for up to 23% of the variation in developmental phenotypes, but the contribution of genome size to variation in life history traits was plastic and varied according to the thermal environment. Expression data implicate differences in metabolism that correspond to genome size variation. These results indicate that significant genome size variation exists within D. melanogaster and this variation may impact the evolutionary ecology of the species. Genome size variation accounts for a significant portion of life history variation in an environmentally dependent manner, suggesting that potential fitness effects associated with genome size variation also depend on environmental conditions.

  19. Intrapopulation Genome Size Variation in D. melanogaster Reflects Life History Variation and Plasticity

    Science.gov (United States)

    Ellis, Lisa L.; Huang, Wen; Quinn, Andrew M.; Ahuja, Astha; Alfrejd, Ben; Gomez, Francisco E.; Hjelmen, Carl E.; Moore, Kristi L.; Mackay, Trudy F. C.; Johnston, J. Spencer; Tarone, Aaron M.

    2014-01-01

    We determined female genome sizes using flow cytometry for 211 Drosophila melanogaster sequenced inbred strains from the Drosophila Genetic Reference Panel, and found significant conspecific and intrapopulation variation in genome size. We also compared several life history traits for 25 lines with large and 25 lines with small genomes in three thermal environments, and found that genome size as well as genome size by temperature interactions significantly correlated with survival to pupation and adulthood, time to pupation, female pupal mass, and female eclosion rates. Genome size accounted for up to 23% of the variation in developmental phenotypes, but the contribution of genome size to variation in life history traits was plastic and varied according to the thermal environment. Expression data implicate differences in metabolism that correspond to genome size variation. These results indicate that significant genome size variation exists within D. melanogaster and this variation may impact the evolutionary ecology of the species. Genome size variation accounts for a significant portion of life history variation in an environmentally dependent manner, suggesting that potential fitness effects associated with genome size variation also depend on environmental conditions. PMID:25057905

  20. Copy Number Variation in the Horse Genome

    Science.gov (United States)

    Ghosh, Sharmila; Qu, Zhipeng; Das, Pranab J.; Fang, Erica; Juras, Rytis; Cothran, E. Gus; McDonell, Sue; Kenney, Daniel G.; Lear, Teri L.; Adelson, David L.; Chowdhary, Bhanu P.; Raudsepp, Terje

    2014-01-01

    We constructed a 400K WG tiling oligoarray for the horse and applied it for the discovery of copy number variations (CNVs) in 38 normal horses of 16 diverse breeds, and the Przewalski horse. Probes on the array represented 18,763 autosomal and X-linked genes, and intergenic, sub-telomeric and chrY sequences. We identified 258 CNV regions (CNVRs) across all autosomes, chrX and chrUn, but not in chrY. CNVs comprised 1.3% of the horse genome with chr12 being most enriched. American Miniature horses had the highest and American Quarter Horses the lowest number of CNVs in relation to Thoroughbred reference. The Przewalski horse was similar to native ponies and draft breeds. The majority of CNVRs involved genes, while 20% were located in intergenic regions. Similar to previous studies in horses and other mammals, molecular functions of CNV-associated genes were predominantly in sensory perception, immunity and reproduction. The findings were integrated with previous studies to generate a composite genome-wide dataset of 1476 CNVRs. Of these, 301 CNVRs were shared between studies, while 1174 were novel and require further validation. Integrated data revealed that to date, 41 out of over 400 breeds of the domestic horse have been analyzed for CNVs, of which 11 new breeds were added in this study. Finally, the composite CNV dataset was applied in a pilot study for the discovery of CNVs in 6 horses with XY disorders of sexual development. A homozygous deletion involving AKR1C gene cluster in chr29 in two affected horses was considered possibly causative because of the known role of AKR1C genes in testicular androgen synthesis and sexual development. While the findings improve and integrate the knowledge of CNVs in horses, they also show that for effective discovery of variants of biomedical importance, more breeds and individuals need to be analyzed using comparable methodological approaches. PMID:25340504

  1. Copy number variation in the horse genome.

    Directory of Open Access Journals (Sweden)

    Sharmila Ghosh

    2014-10-01

    We constructed a 400K WG tiling oligoarray for the horse and applied it for the discovery of copy number variations (CNVs) in 38 normal horses of 16 diverse breeds, and the Przewalski horse. Probes on the array represented 18,763 autosomal and X-linked genes, and intergenic, sub-telomeric and chrY sequences. We identified 258 CNV regions (CNVRs) across all autosomes, chrX and chrUn, but not in chrY. CNVs comprised 1.3% of the horse genome with chr12 being most enriched. American Miniature horses had the highest and American Quarter Horses the lowest number of CNVs in relation to Thoroughbred reference. The Przewalski horse was similar to native ponies and draft breeds. The majority of CNVRs involved genes, while 20% were located in intergenic regions. Similar to previous studies in horses and other mammals, molecular functions of CNV-associated genes were predominantly in sensory perception, immunity and reproduction. The findings were integrated with previous studies to generate a composite genome-wide dataset of 1476 CNVRs. Of these, 301 CNVRs were shared between studies, while 1174 were novel and require further validation. Integrated data revealed that to date, 41 out of over 400 breeds of the domestic horse have been analyzed for CNVs, of which 11 new breeds were added in this study. Finally, the composite CNV dataset was applied in a pilot study for the discovery of CNVs in 6 horses with XY disorders of sexual development. A homozygous deletion involving AKR1C gene cluster in chr29 in two affected horses was considered possibly causative because of the known role of AKR1C genes in testicular androgen synthesis and sexual development. While the findings improve and integrate the knowledge of CNVs in horses, they also show that for effective discovery of variants of biomedical importance, more breeds and individuals need to be analyzed using comparable methodological approaches.

  2. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  3. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-01

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human

  4. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    DEFF Research Database (Denmark)

    Zhan, Bujie; Fadista, João; Thomsen, Bo

    2011-01-01

    Background: Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results: We report the integration of the whole genome...... of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions: Our results provide high resolution mapping of diverse classes of genomic variation...

  5. Genome size, morphological and palynological variations, and ...

    African Journals Online (AJOL)

    The present study compares the morphological, palynological and genome size (C-value content) characteristics in the long-styled and short-styled plants in three Linum species, that is, ... The analysis of variance (ANOVA) test performed among the three Linum species showed a significant difference in 2C-value content.

  6. Genome Architecture and Its Roles in Human Copy Number Variation

    Directory of Open Access Journals (Sweden)

    Lu Chen

    2014-12-01

    Besides single-nucleotide variants in the human genome, large-scale genomic variants, such as copy number variations (CNVs), are being increasingly discovered as a genetic source of human diversity and the pathogenic factors of diseases. Recent experimental findings have shed light on the links between different genome architectures and CNV mutagenesis. In this review, we summarize various genomic features and discuss their contributions to CNV formation. Genomic repeats, including both low-copy and high-copy repeats, play important roles in CNV instability, which was initially known as DNA recombination events. Furthermore, it has been found that human genomic repeats can also induce DNA replication errors and consequently result in CNV mutations. Some recent studies showed that DNA replication timing, which reflects the high-order information of genomic organization, is involved in human CNV mutations. Our review highlights that genome architecture, from DNA sequence to high-order genomic organization, is an important molecular factor in CNV mutagenesis and human genomic instability.

  7. Salmon and steelhead genetics and genomics - Epigenetic and genomic variation in salmon and steelhead

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Conduct analyses of epigenetic and genomic variation in Chinook salmon and steelhead to determine influence on phenotypic expression of life history traits. Genetic,...

  8. Bonobos fall within the genomic variation of chimpanzees.

    Directory of Open Access Journals (Sweden)

    Anne Fischer

    To gain insight into the patterns of genetic variation and evolutionary relationships within and between bonobos and chimpanzees, we sequenced 150,000 base pairs of nuclear DNA divided among 15 autosomal regions as well as the complete mitochondrial genomes from 20 bonobos and 58 chimpanzees. Except for western chimpanzees, we found poor genetic separation of chimpanzees based on sample locality. In contrast, bonobos consistently cluster together but fall as a group within the variation of chimpanzees for many of the regions. Thus, while chimpanzees retain genomic variation that predates bonobo-chimpanzee speciation, extensive lineage sorting has occurred within bonobos such that much of their genome traces its ancestry back to a single common ancestor that postdates their origin as a group separate from chimpanzees.

  9. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is......, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  10. Genomic variation landscape of the human gut microbiome

    DEFF Research Database (Denmark)

    Schloissnig, Siegfried; Arumugam, Manimozhiyan; Sunagawa, Shinichi

    2013-01-01

    Whereas large-scale efforts have rapidly advanced the understanding and practical impact of human genomic variation, the practical impact of variation is largely unexplored in the human microbiome. We therefore developed a framework for metagenomic variation analysis and applied it to 252 faecal...... polymorphism rates of 0.11 was more variable between gut microbial species than across human hosts. Subjects sampled at varying time intervals exhibited individuality and temporal stability of SNP variation patterns, despite considerable composition changes of their gut microbiota. This indicates...

  11. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of which were synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  12. The African Genome Variation Project shapes medical genetics in Africa.

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O; Choudhury, Ananyo; Ritchie, Graham R S; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N; Young, Elizabeth H; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S

    2015-01-15

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  14. The African Genome Variation Project shapes medical genetics in Africa

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2014-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterisation of African genetic diversity is needed. The African Genome Variation Project (AGVP) provides a resource to help design, implement and interpret genomic studies in sub-Saharan Africa (SSA) and worldwide. The AGVP represents dense genotypes from 1,481 and whole genome sequences (WGS) from 320 individuals across SSA. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across SSA. We identify new loci under selection, including for malaria and hypertension. We show that modern imputation panels can identify association signals at highly differentiated loci across populations in SSA. Using WGS, we show further improvement in imputation accuracy supporting efforts for large-scale sequencing of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa, showing for the first time that such designs are feasible. PMID:25470054

  15. Transformation of natural genetic variation into Haemophilus influenzae genomes.

    Directory of Open Access Journals (Sweden)

    Joshua Chang Mell

    2011-07-01

    Many bacteria are able to efficiently bind and take up double-stranded DNA fragments, and the resulting natural transformation shapes bacterial genomes, transmits antibiotic resistance, and allows escape from immune surveillance. The genomes of many competent pathogens show evidence of extensive historical recombination between lineages, but the actual recombination events have not been well characterized. We used DNA from a clinical isolate of Haemophilus influenzae to transform competent cells of a laboratory strain. To identify which of the ~40,000 polymorphic differences had recombined into the genomes of four transformed clones, their genomes and their donor and recipient parents were deep sequenced to high coverage. Each clone was found to contain ~1000 donor polymorphisms in 3-6 contiguous runs (8.1±4.5 kb in length) that collectively comprised ~1-3% of each transformed chromosome. Seven donor-specific insertions and deletions were also acquired as parts of larger donor segments, but the presence of other structural variation flanking 12 of 32 recombination breakpoints suggested that these often disrupt the progress of recombination events. This is the first genome-wide analysis of chromosomes directly transformed with DNA from a divergent genotype, connecting experimental studies of transformation with the high levels of natural genetic variation found in isolates of the same species.

  16. Gene copy number variation throughout the Plasmodium falciparum genome

    Directory of Open Access Journals (Sweden)

    Stewart Lindsay B

    2009-08-01

    Background: Gene copy number variation (CNV) is responsible for several important phenotypes of the malaria parasite Plasmodium falciparum, including drug resistance, loss of infected erythrocyte cytoadherence and alteration of receptor usage for erythrocyte invasion. Despite the known effects of CNV, little is known about its extent throughout the genome. Results: We performed a whole-genome survey of CNV genes in P. falciparum using comparative genome hybridisation of a diverse set of 16 laboratory culture-adapted isolates to a custom designed high density Affymetrix GeneChip array. Overall, 186 genes showed hybridisation signals consistent with deletion or amplification in one or more isolates. There is a strong association of CNV with gene length, genomic location, and low orthology to genes in other Plasmodium species. Sub-telomeric regions of all chromosomes are strongly associated with CNV genes independent from members of previously described multigene families. However, ~40% of CNV genes were located in more central regions of the chromosomes. Among the previously undescribed CNV genes, several that are of potential phenotypic relevance are identified. Conclusion: CNV represents a major form of genetic variation within the P. falciparum genome; the distribution of gene features indicates the involvement of highly non-random mutational and selective processes. Additional studies should be directed at examining CNV in natural parasite populations to extend conclusions to clinical settings.

  17. Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

    Science.gov (United States)

    Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

    2014-12-01

    Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.

  18. Genome-wide variation in recombination rate in Eucalyptus.

    Science.gov (United States)

    Gion, Jean-Marc; Hudson, Corey J; Lesur, Isabelle; Vaillancourt, René E; Potts, Brad M; Freeman, Jules S

    2016-08-09

    Meiotic recombination is a fundamental evolutionary process. It not only generates diversity, but influences the efficacy of natural selection and genome evolution. There can be significant heterogeneity in recombination rates within and between species, however this variation is not well understood outside of a few model taxa, particularly in forest trees. Eucalypts are forest trees of global economic importance, and dominate many Australian ecosystems. We studied recombination rate in Eucalyptus globulus using genetic linkage maps constructed in 10 unrelated individuals, and markers anchored to the Eucalyptus reference genome. This experimental design provided the replication to study whether recombination rate varied between individuals and chromosomes, and allowed us to study the genomic attributes and population genetic parameters correlated with this variation. Recombination rate varied significantly between individuals (range = 2.71 to 3.51 centimorgans/megabase [cM/Mb]), but was not significantly influenced by sex or cross type (F1 vs. F2). Significant differences in recombination rate between chromosomes were also evident (range = 1.98 to 3.81 cM/Mb), beyond those which were due to variation in chromosome size. Variation in chromosomal recombination rate was significantly correlated with gene density (r = 0.94), GC content (r = 0.90), and the number of tandem duplicated genes (r = -0.72) per chromosome. Notably, chromosome level recombination rate was also negatively correlated with the average genetic diversity across six species from an independent set of samples (r = -0.75). The correlations with genomic attributes are consistent with findings in other taxa, however, the direction of the correlation between diversity and recombination rate is opposite to that commonly observed. We argue this is likely to reflect the interaction of selection and specific genome architecture of Eucalyptus. Interestingly, the differences amongst
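
    The recombination rates quoted in this abstract are expressed in centimorgans per megabase (cM/Mb), and the chromosome-level relationships are reported as correlation coefficients. As a minimal sketch of both quantities, assuming a genetic map length in cM and a physical length in base pairs are already known (the function names and the numbers below are hypothetical and are not taken from the study):

```python
import math

def recombination_rate_cm_per_mb(map_length_cm, physical_length_bp):
    """Genetic map length (cM) divided by physical length (Mb)."""
    return map_length_cm / (physical_length_bp / 1_000_000)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical chromosome: 100 cM of map length over 40 Mb of sequence.
rate = recombination_rate_cm_per_mb(100.0, 40_000_000)
print(rate)  # 2.5 cM/Mb
```

    With per-chromosome rates and a per-chromosome attribute such as gene density in two lists, `pearson_r` gives the kind of coefficient the abstract reports; the study's own statistical procedure is not described in this record.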

  19. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
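
    Two of the multiple-test corrections named in this abstract, Bonferroni and false discovery rate, can be illustrated on a short list of p-values. The sketch below is a generic textbook version (Benjamini-Hochberg is assumed for the false discovery rate step, which the abstract does not specify), the p-values are made up, and the permutation approach is not shown:

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number_of_tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every test ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for five SNP-expression association tests.
pvals = [0.001, 0.012, 0.025, 0.039, 0.6]
print(sum(bonferroni(pvals)))          # 1 test passes
print(sum(benjamini_hochberg(pvals)))  # 4 tests pass
```

    On this toy input the Bonferroni threshold keeps one association while the false discovery rate procedure keeps four, which shows why results can differ across correction methods.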

  1. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

    Science.gov (United States)

    Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

    2012-10-05

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
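
    The 1.75-fold range quoted in this abstract follows directly from the two heterozygosity figures it gives (one site per kilobase versus 0.57 sites per kilobase). A small sketch of that arithmetic, with hypothetical function names and a made-up site count used only to show the per-kilobase conversion:

```python
def heterozygosity_per_kb(heterozygous_sites, callable_bases):
    """Heterozygous sites per 1,000 callable bases."""
    return heterozygous_sites / callable_bases * 1000

def fold_range(high, low):
    """Ratio of the highest to the lowest value."""
    return high / low

# Figures stated in the abstract: 1.0 and 0.57 heterozygous sites per kilobase.
print(round(fold_range(1.0, 0.57), 2))  # 1.75

# Hypothetical counts: 570 heterozygous sites in 1,000,000 callable bases.
print(round(heterozygosity_per_kb(570, 1_000_000), 2))  # 0.57
```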

  2. Discrepancy variation of dinucleotide microsatellite repeats in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    Huan Gao

    2009-01-01

    To address whether there are differences of variation among repeat motif types and among taxonomic groups, we present here an analysis of variation and correlation of dinucleotide microsatellite repeats in eukaryotic genomes. Ten taxonomic groups were compared, those being primates, mammalia (excluding primates and rodentia), rodentia, birds, fish, amphibians and reptiles, insects, molluscs, plants and fungi, respectively. The data used in the analysis are from the literature published in the Journal of Molecular Ecology Notes. Analysis of variation reveals that there are no significant differences between AC and AG repeat motif types. Moreover, the number of alleles correlates positively with the copy number in both AG and AC repeats. Similar conclusions can be obtained from each taxonomic group. These results strongly suggest that the increase of SSR variation is almost linear with the increase of the copy number of each repeat motif. As well, the results suggest that the variability of SSR in the genomes of low-ranking species seems to be greater than that of high-ranking species, excluding primates and fungi.

  3. Potential Value of Genomic Copy Number Variations in Schizophrenia

    Directory of Open Access Journals (Sweden)

    Chuanjun Zhuo

    2017-06-01

    Schizophrenia is a devastating neuropsychiatric disorder affecting approximately 1% of the global population, and the disease has imposed a considerable burden on families and society. Although the exact cause of schizophrenia remains unknown, several lines of scientific evidence have revealed that genetic variants are strongly correlated with the development and early onset of the disease. In fact, the heritability among patients suffering from schizophrenia is as high as 80%. Genomic copy number variations (CNVs) are one of the main forms of genomic variations, ubiquitously occurring in the human genome. An increasing number of studies have shown that CNVs account for population diversity and genetically related diseases, including schizophrenia. The last decade has witnessed rapid advances in the development of novel genomic technologies, which have led to the identification of schizophrenia-associated CNVs, insight into the roles of the affected genes in their intervals in schizophrenia, and successful manipulation of the target CNVs. In this review, we focus on the recent discoveries of important CNVs that are associated with schizophrenia and outline the potential values that the study of CNVs will bring to the areas of schizophrenia research, diagnosis, and therapy. Furthermore, with the help of the novel genetic tool known as the Clustered Regularly Interspaced Short Palindromic Repeats-associated nuclease 9 (CRISPR/Cas9) system, the pathogenic CNVs as genomic defects could be corrected. In conclusion, the recent novel findings of schizophrenia-associated CNVs offer an exciting opportunity for schizophrenia research to decipher the pathological mechanisms underlying the onset and development of schizophrenia as well as to provide potential clinical applications in genetic counseling, diagnosis, and therapy for this complex mental disease.

  4. Genomic Variation in Natural Populations of Drosophila melanogaster

    Science.gov (United States)

    Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

    2012-01-01

    This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804

  5. The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

    Science.gov (United States)

    Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

    2017-01-04

    The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data (previously only browseable through our FTP site) by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Regulatory hotspots in the malaria parasite genome dictate transcriptional variation.

    Directory of Open Access Journals (Sweden)

    Joseph M Gonzales

    2008-09-01

    The determinants of transcriptional regulation in malaria parasites remain elusive. The presence of a well-characterized gene expression cascade shared by different Plasmodium falciparum strains could imply that transcriptional regulation and its natural variation do not contribute significantly to the evolution of parasite drug resistance. To clarify the role of transcriptional variation as a source of strain-specific diversity in the most deadly malaria species and to find genetic loci that dictate variations in gene expression, we examined genome-wide expression level polymorphisms (ELPs) in a genetic cross between phenotypically distinct parasite clones. Significant variation in gene expression is observed through direct co-hybridizations of RNA from different P. falciparum clones. Nearly 18% of genes were regulated by a significant expression quantitative trait locus. The genetic determinants of most of these ELPs resided in hotspots that are physically distant from their targets. The most prominent regulatory locus, influencing 269 transcripts, coincided with a Chromosome 5 amplification event carrying the drug resistance gene, pfmdr1, and 13 other genes. Drug selection pressure in the Dd2 parental clone lineage led not only to a copy number change in the pfmdr1 gene but also to an increased copy number of putative neighboring regulatory factors that, in turn, broadly influence the transcriptional network. Previously unrecognized transcriptional variation, controlled by polymorphic regulatory genes and possibly master regulators within large copy number variants, contributes to sweeping phenotypic evolution in drug-resistant malaria parasites.

  7. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  8. Theories of Population Variation in Genes and Genomes

    DEFF Research Database (Denmark)

    Christiansen, Freddy

    This textbook provides an authoritative introduction to both classical and coalescent approaches to population genetics. Written for graduate students and advanced undergraduates by one of the world’s leading authorities in the field, the book focuses on the theoretical background of population...... genetics, while emphasizing the close interplay between theory and empiricism. Traditional topics such as genetic and phenotypic variation, mutation, migration, and linkage are covered and advanced by contemporary coalescent theory, which describes the genealogy of genes in a population, ultimately...... connecting them to a single common ancestor. Effects of selection, particularly genomic effects, are discussed with reference to molecular genetic variation. The book is designed for students of population genetics, bioinformatics, evolutionary biology, molecular evolution, and theoretical biology—as well...

  9. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, enabling the initiation of large scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  10. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process and to compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. © 2014 Elsevier Ltd. All rights reserved.

  11. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    NARCIS (Netherlands)

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D.

    2012-01-01

    Background: Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of

  12. Genomic and gene variation in Mycoplasma hominis strains

    DEFF Research Database (Denmark)

    Christiansen, Gunna; Andersen, H; Birkelund, Svend

    1987-01-01

    DNAs from 14 strains of Mycoplasma hominis isolated from various habitats, including strain PG21, were analyzed for genomic heterogeneity. DNA-DNA filter hybridization values were from 51 to 91%. Restriction endonuclease digestion patterns, analyzed by agarose gel electrophoresis, revealed...... no identity or cluster formation between strains. Variation within M. hominis rRNA genes was analyzed by Southern hybridization of EcoRI-cleaved DNA hybridized with a cloned fragment of the rRNA gene from the mycoplasma strain PG50. Five of the M. hominis strains showed identical hybridization patterns....... These hybridization patterns were compared with those of 12 other mycoplasma species, which showed a much more complex band pattern. Cloned nonribosomal RNA gene fragments of M. hominis PG21 DNA were analyzed, and the fragments were used to demonstrate heterogeneity among the strains. A monoclonal antibody against...

  13. Genomic copy number variations in three Southeast Asian populations.

    Science.gov (United States)

    Ku, Chee-Seng; Pawitan, Yudi; Sim, Xueling; Ong, Rick T H; Seielstad, Mark; Lee, Edmund J D; Teo, Yik-Ying; Chia, Kee-Seng; Salim, Agus

    2010-07-01

    Research on the role of copy number variations (CNVs) in the genetic risk of diseases in Asian populations has been hampered by a relative lack of reference CNV maps for Asian populations outside the East Asians. In this article, we report the population characteristics of CNVs in Chinese, Malay, and Asian Indian populations in Singapore. Using the Illumina Human 1M Beadchip array, we identify 1,174 CNV loci in these populations that were corroborated by findings when the same samples were typed on the Affymetrix 6.0 platform. We identify 441 novel loci not previously reported in the Database of Genomic Variations (DGV). We observe a considerable number of loci that span all three populations and were previously unreported, as well as population-specific loci that are quite common in the respective populations. From this we observe the distribution of CNVs in the Asian Indian population to be considerably different from the Chinese and Malay populations. About half of the deletion loci and three-quarters of duplication loci overlap UCSC genes. Tens of loci show population differentiation and overlap with genes previously known to be associated with genetic risk of diseases. One of these loci is the CYP2A6 deletion, previously linked to reduced susceptibility to lung cancer. © 2010 Wiley-Liss, Inc.

  14. Variation in heterozygosity predicts variation in human substitution rates between populations, individuals and genomic regions.

    Directory of Open Access Journals (Sweden)

    William Amos

    The "heterozygote instability" (HI) hypothesis suggests that gene conversion events focused on heterozygous sites during meiosis locally increase the mutation rate, but this hypothesis remains largely untested. As humans left Africa they lost variability, which, if HI operates, should have reduced the mutation rate in non-Africans. Relative substitution rates were quantified in diverse humans using aligned whole genome sequences from the 1000 Genomes Project. Substitution rate is consistently greater in Africans than in non-Africans, but only in diploid regions of the genome, consistent with a role for heterozygosity. Analysing the same data partitioned into a series of non-overlapping 2 Mb windows reveals a strong, non-linear correlation between the amount of heterozygosity lost "out of Africa" and the difference in substitution rate between Africans and non-Africans. Putative recent mutations, derived variants that occur only once among the 80 human chromosomes sampled, occur preferentially at the centre of 2 Kb windows that have elevated heterozygosity compared both with the same region in a closely related population and with an immediately adjacent region in the same population. More than half of all substitutions appear attributable to variation in heterozygosity. This observation provides strong support for HI with implications for many branches of evolutionary biology.

  15. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Van Dijk, Freerk; Palamara, Pier Francesco; Elbers, Clara C.; Neerincx, Pieter B. T.; Ye, Kai; Guryev, Victor; Kloosterman, Wigard P.; Deelen, Patrick; Abdellaoui, Abdel; Van Leeuwen, Elisabeth M.; Van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F. J.; Karssen, Lennart C.; Kanterakis, Alexandros; Amin, Najaf; Hottenga, Jouke Jan; Lameijer, Eric-Wubbo; Kattenberg, Mathijs; Dijkstra, Martijn; Byelas, Heorhiy; Van Setten, Jessica; Van Schaik, Barbera D. C.; Bot, Jan; Nijman, Isaac J.; Renkens, Ivo; Marschall, Tobias; Schonhuth, Alexander; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Polak, Paz; Sohail, Mashaal; Vuzman, Dana; Hormozdiari, Fereydoun; Van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Moed, Matthijs H.; Van der Velde, K. Joeri; Rivadeneira, Fernando; Estrada, Karol; Medina-Gomez, Carolina; Isaacs, Aaron; Platteel, Mathieu; Swertz, Morris A.; Wijmenga, Cisca

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  16. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    The Genome of the Netherlands Consortium; T. Marschall (Tobias); A. Schönhuth (Alexander)

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch

  17. No evidence that sex and transposable elements drive genome size variation in evening primroses.

    Science.gov (United States)

    Ågren, J Arvid; Greiner, Stephan; Johnson, Marc T J; Wright, Stephen I

    2015-04-01

    Genome size varies dramatically across species, but despite an abundance of attention there is little agreement on the relative contributions of selective and neutral processes in governing this variation. The rate of sex can potentially play an important role in genome size evolution because of its effect on the efficacy of selection and transmission of transposable elements (TEs). Here, we used a phylogenetic comparative approach and whole genome sequencing to investigate the contribution of sex and TE content to genome size variation in the evening primrose (Oenothera) genus. We determined genome size using flow cytometry for 30 species that vary in genetic system and find that variation in sexual/asexual reproduction cannot explain the almost twofold variation in genome size. Moreover, using whole genome sequences of three species of varying genome sizes and reproductive system, we found that genome size was not associated with TE abundance; instead the larger genomes had a higher abundance of simple sequence repeats. Although it has long been clear that sexual reproduction may affect various aspects of genome evolution in general and TE evolution in particular, it does not appear to have played a major role in genome size evolution in the evening primroses. © 2015 The Author(s).

  18. Karyotype diversity and genome size variation in Neotropical Maxillariinae orchids.

    Science.gov (United States)

    Moraes, A P; Koehler, S; Cabral, J S; Gomes, S S L; Viccini, L F; Barros, F; Felix, L P; Guerra, M; Forni-Martins, E R

    2017-03-01

    Orchidaceae is a widely distributed plant family with very diverse vegetative and floral morphology, and such variability is also reflected in their karyotypes. However, since only a low proportion of Orchidaceae has been analysed for chromosome data, greater diversity may yet be unveiled. Here we analyse both genome size (GS) and karyotype in two subtribes recently included in the broadened Maxillariinae to detect how much chromosome and GS variation there is in these groups and to evaluate which genome rearrangements are involved in the species evolution. To do so, the GS (14 species) and the karyotype - based on chromosome number, heterochromatic banding and 5S and 45S rDNA localisation (18 species) - were characterised and analysed along with published data using phylogenetic approaches. The GS presented a high phylogenetic correlation and it was related to morphological groups in Bifrenaria (larger plants - higher GS). The two largest GS found among genera were caused by different mechanisms: polyploidy in Bifrenaria tyrianthina and accumulation of repetitive DNA in Scuticaria hadwenii. The chromosome number variability was caused mainly through descending dysploidy, and x=20 was estimated as the base chromosome number. Combining GS and karyotype data with molecular phylogeny, our data provide a more complete scenario of the karyotype evolution in Maxillariinae orchids, allowing us to suggest, besides dysploidy, inversions and transposable elements as two mechanisms involved in the karyotype evolution. Such karyotype modifications could be associated with niche changes that occurred during species evolution. © 2016 German Botanical Society and The Royal Botanical Society of the Netherlands.

  19. Somatic genomic variations in extra-embryonic tissues

    Energy Technology Data Exchange (ETDEWEB)

    Weier, Jingly F.; Ferlatte, Christy; Weier, Heinz-Ulli G.

    2010-05-21

    In the mature chorion, one of the membranes that exist during pregnancy between the developing fetus and mother, human placental cells form highly specialized tissues composed of mesenchyme and floating or anchoring villi. Using fluorescence in situ hybridization, we found that human invasive cytotrophoblasts isolated from anchoring villi or the uterine wall had gained individual chromosomes; however, chromosome losses were detected infrequently. With chromosomes gained in what appeared to be a chromosome-specific manner, more than half of the invasive cytotrophoblasts in normal pregnancies were found to be hyperdiploid. Interestingly, the rates of hyperdiploid cells depended not only on gestational age, but were strongly associated with the extraembryonic compartment at the fetal-maternal interface from which they were isolated. Since hyperdiploid cells showed drastically reduced DNA replication as measured by bromodeoxyuridine incorporation, we conclude that aneuploidy is a part of the normal process of placentation potentially limiting the proliferative capabilities of invasive cytotrophoblasts. Thus, under the special circumstances of human reproduction, somatic genomic variations may exert a beneficial, anti-neoplastic effect on the organism.

  20. Genetic variation architecture of mitochondrial genome reveals the differentiation in Korean landrace and weedy rice

    OpenAIRE

    Wei Tong; Qiang He; Yong-Jin Park

    2017-01-01

    Mitochondrial genome variations have been detected despite the overall conservation of this gene content, which has been valuable for plant population genetics and evolutionary studies. Here, we describe mitochondrial variation architecture and our performance of a phylogenetic dissection of Korean landrace and weedy rice. A total of 4,717 variations across the mitochondrial genome were identified adjunct with 10 wild rice. Genetic diversity assessment revealed that wild rice has higher nucle...

  1. Analysis of the genetic variation in Mycobacterium tuberculosis strains by multiple genome alignments

    Directory of Open Access Journals (Sweden)

    Morales Juan

    2008-11-01

    Background: The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB) genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C), along with the genomes of laboratory strains (H37Rv and H37Ra), provides new insights on the mechanisms of adaptation of this bacterium to the human host. Findings: The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using the Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicates that the different strains contain a limited number of unique polymorphisms. Conclusion: The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives into the evolution and the understanding of the pathogenesis of this bacterium.

  2. Genomic and karyotypic variation in Drosophila parasitoids (Hymenoptera, Cynipoidea, Figitidae)

    Directory of Open Access Journals (Sweden)

    Vladimir Gokhman

    2011-08-01

    Drosophila melanogaster Meigen, 1830 has served as a model insect for over a century. Sequencing of the 11 additional Drosophila Fallen, 1823 species marks substantial progress in comparative genomics of this genus. By comparison, practically nothing is known about the genome size or genome sequences of parasitic wasps of Drosophila. Here, we present the first comparative analysis of genome size and karyotype structures of Drosophila parasitoids of the Leptopilina Förster, 1869 and Ganaspis Förster, 1869 species. The gametic genome size of Ganaspis xanthopoda (Ashmead, 1896) is larger than those of the three Leptopilina species studied. The genome sizes of all parasitic wasps studied here are also larger than those known for all Drosophila species. Surprisingly, genome sizes of these Drosophila parasitoids exceed the average value known for all previously studied Hymenoptera. The haploid chromosome number of both Leptopilina heterotoma (Thomson, 1862) and L. victoriae Nordlander, 1980 is ten. A chromosomal fusion appears to have produced a distinct karyotype for L. boulardi (Barbotin, Carton et Keiner-Pillault, 1979) (n = 9), whose genome size is smaller than that of wasps of the L. heterotoma clade. Like L. boulardi, the haploid chromosome number for G. xanthopoda is also nine. Our studies reveal a positive, but non-linear, correlation between the genome size and total chromosome length in Drosophila parasitoids. These Drosophila parasitoids differ widely in their host range, and utilize different infection strategies to overcome host defense. Their comparative genomics, in relation to their exceptionally well-characterized hosts, will prove to be valuable for understanding the molecular basis of the host-parasite arms race and how such mechanisms shape the genetic structures of insect communities.

  3. Overview of the creative genome: effects of genome structure and sequence on the generation of variation and evolution.

    Science.gov (United States)

    Caporale, Lynn Helena

    2012-09-01

    This overview of a special issue of Annals of the New York Academy of Sciences discusses uneven distribution of distinct types of variation across the genome, the dependence of specific types of variation upon distinct classes of DNA sequences and/or the induction of specific proteins, the circumstances in which distinct variation-generating systems are activated, and the implications of this work for our understanding of evolution and of cancer. Also discussed is the value of non text-based computational methods for analyzing information carried by DNA, early insights into organizational frameworks that affect genome behavior, and implications of this work for comparative genomics. © 2012 New York Academy of Sciences.

  4. Are we Genomic Mosaics? Variations of the Genome of Somatic Cells can Contribute to Diversify our Phenotypes.

    Science.gov (United States)

    Astolfi, P A; Salamini, F; Sgaramella, V

    2010-09-01

    Theoretical and experimental evidence supports the hypothesis that the genomes and the epigenomes may be different in the somatic cells of complex organisms. In the genome, the differences range from single base substitutions to chromosome number; in the epigenome, they entail multiple postsynthetic modifications of the chromatin. Somatic genome variations (SGV) may accumulate during development in response both to genetic programs, which may differ from tissue to tissue, and to environmental stimuli, which are often undetected and generally irreproducible. SGV may jeopardize physiological cellular functions, but also create novel coding and regulatory sequences, to be exposed to intraorganismal Darwinian selection. Genomes acknowledged as comparatively poor in genes, such as humans', could thus increase their pristine informational endowment. A better understanding of SGV will contribute to basic issues such as the "nature vs nurture" dualism and the inheritance of acquired characters. On the applied side, they may explain the low yield of cloning via somatic cell nuclear transfer, provide clues to some of the problems associated with transdifferentiation, and interfere with individual DNA analysis. SGV may be unique in the different cell types and in the different developmental stages, and thus explain the several hundred gaps persisting in the human genomes "completed" so far. They may compound the variations associated with our epigenomes and make of each of us an "(epi)genomic" mosaic. An ensuing paradigm is the possibility that a single genome (the ephemeral one assembled at fertilization) has the capacity to generate several different brains in response to different environments.

  5. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    Background Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful...... for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  6. Transposable element distribution, abundance and role in genome size variation in the genus Oryza.

    Science.gov (United States)

    Zuccolo, Andrea; Sebastian, Aswathy; Talag, Jayson; Yu, Yeisoo; Kim, HyeRan; Collura, Kristi; Kudrna, Dave; Wing, Rod A

    2007-08-29

    The genus Oryza is composed of 10 distinct genome types, 6 diploid and 4 polyploid, and includes the world's most important food crop - rice (Oryza sativa [AA]). Genome size variation in the Oryza is more than 3-fold and ranges from 357 Mbp in Oryza glaberrima [AA] to 1283 Mbp in the polyploid Oryza ridleyi [HHJJ]. Because repetitive elements are known to play a significant role in genome size variation, we constructed random sheared small insert genomic libraries from 12 representative Oryza species and conducted a comprehensive study of the repetitive element composition, distribution and phylogeny in this genus. Particular attention was paid to the role played by the most important classes of transposable elements (Long Terminal Repeat Retrotransposons, Long Interspersed Nuclear Elements, helitrons, DNA transposable elements) in shaping these genomes and in contributing to genome size variation. We identified the elements primarily responsible for the most striking genome size variation in Oryza. We demonstrated how Long Terminal Repeat retrotransposons belonging to the same families have proliferated to very different extents in various species. We also showed that the pool of Long Terminal Repeat Retrotransposons is substantially conserved and ubiquitous throughout the Oryza and so its origin is ancient and its existence predates the speciation events that originated the genus. Finally we described the peculiar behavior of repeats in the species Oryza coarctata [HHKK] whose placement in the Oryza genus is controversial. Long Terminal Repeat retrotransposons are the major component of the Oryza genomes analyzed and, along with polyploidization, are the most important contributors to the genome size variation across the Oryza genus. Two families of Ty3-gypsy elements (RIRE2 and Atlantys) account for a significant portion of the genome size variations present in the Oryza genus.

  7. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome

    NARCIS (Netherlands)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E.; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T.; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A.; Lucente, Diane; Levy, Brynn; Sanders, Jan-Stephan; Wapner, Ronald J.; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E.

    2017-01-01

    Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other

  8. Host genome variations and risk of infections during induction treatment for childhood acute lymphoblastic leukaemia

    DEFF Research Database (Denmark)

    Lund, Bendik; Wesolowska-Andersen, Agata; Lausen, Birgitte

    2014-01-01

    Objectives: To investigate association of host genomic variation and risk of infections during treatment for childhood acute lymphoblastic leukaemia (ALL). Methods: We explored association of 34 000 single-nucleotide polymorphisms (SNPs) related primarily to pharmacogenomics and immune function...

  9. New Regions of the Human Genome Linked to Skin Color Variation in Some African Populations

    Science.gov (United States)

    In the first study of its kind, an international team of genomics researchers has identified new regions of the human genome that are associated with skin color variation in some African populations, opening new avenues for research on skin diseases and cancer in all populations.

  10. An integrated map of genetic variation from 1,092 human genomes

    DEFF Research Database (Denmark)

    Abecasis, Goncalo R.; Auton, Adam; Brooks, Lisa D.

    2012-01-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination ...

  11. The Organelle Genomes of Hassawi Rice (Oryza sativa L.) and Its Hybrid in Saudi Arabia: Genome Variation, Rearrangement, and Origins

    Science.gov (United States)

    Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun

    2012-01-01

    Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNPs (single nucleotide polymorphisms) between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV), in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93-11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicates ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta since genetic diversity between the two Hassawi cultivars is very low, although the historical origin of the wild-type Hassawi rice is unknown. PMID:22870184

  12. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  13. Draft genome sequence of an elite Dura palm and whole-genome patterns of DNA variation in oil palm.

    Science.gov (United States)

    Jin, Jingjing; Lee, May; Bai, Bin; Sun, Yanwei; Qu, Jing; Rahmadsyah; Alfiko, Yuzer; Lim, Chin Huat; Suwanto, Antonius; Sugiharti, Maria; Wong, Limsoon; Ye, Jian; Chua, Nam-Hai; Yue, Gen Hua

    2016-12-01

    Oil palm is the world's leading source of vegetable oil and fat. Dura, Pisifera and Tenera are three forms of oil palm. The genome sequence of Pisifera is available whereas the Dura form has not been sequenced yet. We sequenced the genome of one elite Dura palm, and re-sequenced 17 palm genomes. The assembled genome sequence of the elite Dura tree contained 10,971 scaffolds and was 1.701 Gb in length, covering 94.49% of the oil palm genome. 36,105 genes were predicted. Re-sequencing of 17 additional palm trees identified 18.1 million SNPs. We found high genetic variation among palms from different geographical regions, but lower variation among Southeast Asian Dura and Pisifera palms. We mapped 10,000 SNPs on the linkage map of oil palm. In addition, high linkage disequilibrium (LD) was detected in the oil palms used in breeding populations of Southeast Asia, suggesting that LD mapping is likely to be practical in this important oil crop. Our data provide a valuable resource for accelerating genetic improvement and studying the mechanism underlying phenotypic variations of important oil palm traits. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  14. Detecting microsatellites within genomes: significant variation among algorithms

    Directory of Open Access Journals (Sweden)

    Rivals Eric

    2007-04-01

    Background: Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results: Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion: Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
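The parameter sensitivity described in this abstract can be illustrated with a minimal sketch. It is not one of the five algorithms compared in the paper: it is a plain regular-expression scan for perfect repeats only, and the function name, the toy sequence and the `min_repeats` threshold are all hypothetical choices made for illustration.

```python
import re

def find_perfect_microsatellites(seq, motif_len, min_repeats):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Illustrative only: real tools (TRF, Mreps, etc.) also handle
    imperfect repeats and use their own scoring schemes.
    """
    # A motif of fixed length followed by at least (min_repeats - 1) copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        hits.append((m.start(), m.group(1), len(m.group(0)) // motif_len))
    return hits

# Hypothetical toy sequence: (AC)x5, (AT)x3 and (AG)x6 separated by spacers.
seq = "TTACACACACACGGATATATCCAGAGAGAGAGAGTT"

strict = find_perfect_microsatellites(seq, motif_len=2, min_repeats=5)
loose = find_perfect_microsatellites(seq, motif_len=2, min_repeats=3)

print(len(strict), len(loose))
```

Lowering the user-defined threshold from five to three repeats changes the number of reported loci on the same sequence, which is the kind of parameter dependence the abstract warns about for short microsatellites.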

  15. Pan-Genome Analysis Links the Hereditary Variation of Leptospirillum ferriphilum With Its Evolutionary Adaptation

    Directory of Open Access Journals (Sweden)

    Xian Zhang

    2018-03-01

    Niche adaptation has long been recognized to drive intra-species differentiation and speciation, yet knowledge about its relatedness with hereditary variation of microbial genomes is relatively limited. Using Leptospirillum ferriphilum species as a case study, we present a detailed analysis of genomic features of five recognized strains. Genome-to-genome distance calculation preliminarily determined the roles of spatial distance and environmental heterogeneity that potentially contribute to intra-species variation within L. ferriphilum species at the genome level. Mathematical models were further constructed to extrapolate the expansion of L. ferriphilum genomes (an ‘open’ pan-genome), indicating the emergence of novel genes with new sequenced genomes. The identification of diverse mobile genetic elements (MGEs), such as transposases, integrases, and phage-associated genes, revealed the prevalence of horizontal gene transfer events, which is an important evolutionary mechanism that provides avenues for the recruitment of novel functionalities and further for the genetic divergence of microbial genomes. Comprehensive analysis also demonstrated that the genome reduction by gene loss in a broad sense might contribute to the observed diversification. We thus inferred a plausible explanation to address this observation: the community-dependent adaptation that potentially economizes the limiting resources of the entire community. Given that the introduction of new genes is accompanied by a parallel abandonment of some other ones, our results provide snapshots on the biological fitness cost of environmental adaptation within the L. ferriphilum genomes. In short, our genome-wide analyses bridge the relation between genetic variation of L. ferriphilum and its evolutionary adaptation.

  16. ChickVD: a sequence variation database for the chicken genome

    DEFF Research Database (Denmark)

    Wang, Jing; He, Ximiao; Ruan, Jue

    2005-01-01

    Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DN...... on quantitative trait loci using data from collaborating institutions and public resources. Our data can be queried by search engine and homology-based BLAST searches. ChickVD is publicly accessible at http://chicken.genomics.org.cn. Publication date: 2005-Jan-1...

  17. Structural genomic variation as risk factor for idiopathic recurrent miscarriage

    DEFF Research Database (Denmark)

    Nagirnaja, Liina; Palta, Priit; Kasak, Laura

    2014-01-01

    Recurrent miscarriage (RM) is a multifactorial disorder with acknowledged genetic heritability that affects ∼3% of couples aiming at childbirth. As copy number variants (CNVs) have been shown to contribute to reproductive disease susceptibility, we aimed to describe genome-wide profile of CNVs an...

  18. Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

    Science.gov (United States)

    Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

    2014-11-07

    Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
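As a quick consistency check on the summary figures quoted in this abstract, the arithmetic can be sketched as follows. Only the numbers stated in the abstract are used; the variable names are hypothetical, and the implied genome size is a back-calculation from the rounded percentages, not a value reported by the authors.

```python
# Figures quoted in the abstract.
n_cnvr = 8840            # CNV regions identified
mean_len_kb = 11.1       # average CNVR length (kb)
total_mb = 98.2          # total sequence covered by CNVRs (Mb)
genome_fraction = 0.094  # 9.4% of the chicken genome
cnvr_with_genes = 2214   # CNVRs spanning RefSeq genes

# Count times mean length should roughly reproduce the reported total.
total_from_mean_mb = n_cnvr * mean_len_kb / 1000.0

# Share of CNVRs that overlap RefSeq genes (reported as 25.0%).
pct_with_genes = 100.0 * cnvr_with_genes / n_cnvr

# Genome size implied by "98.2 Mb = 9.4% of the genome" (back-calculated).
implied_genome_mb = total_mb / genome_fraction

print(round(total_from_mean_mb, 1), round(pct_with_genes, 1), round(implied_genome_mb))
```

The product of count and mean length comes out within rounding of the reported 98.2 Mb, and the gene-overlap share matches the stated 25.0%, so the quoted figures are mutually consistent.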

  19. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole

  20. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc; Preston, Mark; Guerra-Assunção, José Afonso; Hill-Cawthorn, Grant; Harris, David; Perdigão, João; Viveiros, Miguel; Portugal, Isabel; Drobniewski, Francis; Gagneux, Sebastien; Glynn, Judith R.; Pain, Arnab; Parkhill, Julian; McNerney, Ruth; Martin, Nigel; Clark, Taane G.

    2014-01-01

    ://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest

  1. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Human endogenous retrovirus (HERV) sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphic test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into the human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  2. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins. Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find ~2...

  3. Genetic Variation in the Nuclear and Organellar Genomes Modulates Stochastic Variation in the Metabolome, Growth, and Defense

    Science.gov (United States)

    Joseph, Bindu; Corwin, Jason A.; Kliebenstein, Daniel J.

    2015-01-01

    Recent studies are starting to show that genetic control over stochastic variation is a key evolutionary solution of single celled organisms in the face of unpredictable environments. This has been expanded to show that genetic variation can alter stochastic variation in transcriptional processes within multi-cellular eukaryotes. However, little is known about how genetic diversity can control stochastic variation within more non-cell autonomous phenotypes. Using an Arabidopsis reciprocal RIL population, we showed that there is significant genetic diversity influencing stochastic variation in the plant metabolome, defense chemistry, and growth. This genetic diversity included loci specific for the stochastic variation of each phenotypic class that did not affect the other phenotypic classes or the average phenotype. This suggests that the organism's networks are established so that noise can exist in one phenotypic level like metabolism and not permeate up or down to different phenotypic levels. Further, the genomic variation within the plastid and mitochondria also had significant effects on the stochastic variation of all phenotypic classes. The genetic influence over stochastic variation within the metabolome was highly metabolite specific, with neighboring metabolites in the same metabolic pathway frequently showing different levels of noise. As expected from bet-hedging theory, there was more genetic diversity and a wider range of stochastic variation for defense chemistry than found for primary metabolism. Thus, it is possible to begin dissecting the stochastic variation of whole organismal phenotypes in multi-cellular organisms. Further, there are loci that modulate stochastic variation at different phenotypic levels. Finding the identity of these genes will be key to developing complete models linking genotype to phenotype. PMID:25569687

  4. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

    Science.gov (United States)

    Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

    2017-12-19

    While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.

  5. Within-Host Variations of Human Papillomavirus Reveal APOBEC-Signature Mutagenesis in the Viral Genome.

    Science.gov (United States)

    Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2018-03-28

    Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations in the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here we explored within-host genetic diversity of HPV by performing deep sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52 and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC), and were deep-sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with >0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with varying numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. Here, by conducting deep sequencing analyses, we show for the first time a comprehensive snapshot of the "within

  6. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    Science.gov (United States)

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploid strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characteristics of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Genome size variation and incidence of polyploidy in Scrophulariaceae sensu lato from the Iberian Peninsula.

    Science.gov (United States)

    Castro, Mariana; Castro, Sílvia; Loureiro, João

    2012-01-01

    In the last decade, genomic studies using DNA markers have strongly influenced the current phylogeny of angiosperms. Genome size and ploidy level have contributed to this discussion, being considered important characters in biosystematics, ecology and population biology. Despite the recent increase in studies related to genome size evolution and polyploidy incidence, only a few are available for Scrophulariaceae. In this context, we assessed the value of genome size, mostly as a taxonomic marker, and the role of polyploidy as a process of genesis and maintenance of plant diversity in Scrophulariaceae sensu lato in the Iberian Peninsula. Large-scale analyses of genome size and ploidy-level variation across the Iberian Peninsula were performed using flow cytometry. One hundred and sixty-two populations of 59 distinct taxa were analysed. A bibliographic review on chromosome counts was also performed. From the 59 sampled taxa, 51 represent first estimates of genome size. The majority of the Scrophulariaceae species presented very small to small genome sizes (2C ≤ 7.0 pg). Furthermore, in most of the analysed genera it was possible to use this character to separate several taxa, regardless of whether these genera were homoploid or heteroploid. Also, some genome-related phenomena were detected, such as intraspecific variation of genome size in some genera and the possible occurrence of dysploidy in Verbascum spp. With respect to polyploidy, despite a few new DNA ploidy levels having been detected in Veronica, no multiple cytotypes have been found in any taxa. This work contributes basic scientific knowledge on genome size and polyploid incidence in the Scrophulariaceae, providing important background information and opening several perspectives for future studies.

  8. Insights into the genome structure and copy-number variation of Eimeria tenella

    Directory of Open Access Journals (Sweden)

    Lim Lik-Sin

    2012-08-01

    Background: Eimeria is a genus of parasites in the same phylum (Apicomplexa) as human parasites such as Toxoplasma, Cryptosporidium and the malaria parasite Plasmodium. As an apicomplexan whose life-cycle involves a single host, Eimeria is a convenient model for understanding this group of organisms. Although the genomes of the Apicomplexa are diverse, that of Eimeria is unique in being composed of large alternating blocks of sequence with very different characteristics - an arrangement seen in no other organism. This arrangement has impeded efforts to fully sequence the genome of Eimeria, which remains the last of the major apicomplexans to be fully analyzed. In order to increase the value of the genome sequence data and aid in the effort to gain a better understanding of the Eimeria tenella genome, we constructed a whole genome map for the parasite. Results: A total of 1245 contigs representing 70.0% of the whole genome assembly sequences (Wellcome Trust Sanger Institute) were selected and subjected to marker selection. Subsequently, 2482 HAPPY markers were developed and typed. Of these, 795 were considered as usable markers, and utilized in the construction of a HAPPY map. Markers developed from chromosomally-assigned genes were then integrated into the HAPPY map and this aided the assignment of a number of linkage groups to their respective chromosomes. BAC-end sequences and contigs from whole genome sequencing were also integrated to improve and validate the HAPPY map. This resulted in an integrated HAPPY map consisting of 60 linkage groups that covers approximately half of the estimated 60 Mb genome. Further analysis suggests that the segmental organization first seen in Chromosome 1 is present throughout the genome, with repeat-poor (P) regions alternating with repeat-rich (R) regions. Evidence of copy-number variation between strains was also uncovered. Conclusions: This paper describes the application of a whole genome mapping

  9. Genomic variation in recently collected maize landraces from Mexico

    Directory of Open Access Journals (Sweden)

    María Clara Arteaga

    2016-03-01

    The present dataset comprises 36,931 SNPs genotyped in 46 maize landraces native to Mexico as well as the teosinte subspecies Zea mays ssp. parviglumis and ssp. mexicana. These landraces were collected directly from farmers mostly between 2006 and 2010. We accompany these data with a short description of the variation within each landrace, as well as maps, principal component analyses and neighbor joining trees showing the distribution of the genetic diversity relative to landrace, geographical features and maize biogeography. High levels of genetic variation were detected for the maize landraces (HE = 0.234 to 0.318 (mean 0.311)), while slightly lower levels were detected in Zea m. mexicana and Zea m. parviglumis (HE = 0.262 and 0.234, respectively). The distribution of genetic variation was better explained by environmental variables given by the interaction of altitude and latitude than by landrace identity. This dataset is a follow up product of the Global Native Maize Project, an initiative to update the data on Mexican maize landraces and their wild relatives, and to generate information that is necessary for implementing the Mexican Biosafety Law. Keywords: Maize, Teosinte, Maize SNP50K BeadChip, Mexican landraces, Proyecto Global de Maíces Nativos

  10. Genomic variation in recently collected maize landraces from Mexico

    Science.gov (United States)

    Arteaga, María Clara; Moreno-Letelier, Alejandra; Mastretta-Yanes, Alicia; Vázquez-Lobo, Alejandra; Breña-Ochoa, Alejandra; Moreno-Estrada, Andrés; Eguiarte, Luis E.; Piñero, Daniel

    2015-01-01

    The present dataset comprises 36,931 SNPs genotyped in 46 maize landraces native to Mexico as well as the teosinte subspecies Zea mays ssp. parviglumis and ssp. mexicana. These landraces were collected directly from farmers mostly between 2006 and 2010. We accompany these data with a short description of the variation within each landrace, as well as maps, principal component analyses and neighbor joining trees showing the distribution of the genetic diversity relative to landrace, geographical features and maize biogeography. High levels of genetic variation were detected for the maize landraces (HE = 0.234 to 0.318 (mean 0.311)), while slightly lower levels were detected in Zea m. mexicana and Zea m. parviglumis (HE = 0.262 and 0.234, respectively). The distribution of genetic variation was better explained by environmental variables given by the interaction of altitude and latitude than by landrace identity. This dataset is a follow up product of the Global Native Maize Project, an initiative to update the data on Mexican maize landraces and their wild relatives, and to generate information that is necessary for implementing the Mexican Biosafety Law. PMID:26981357

  11. Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

    DEFF Research Database (Denmark)

    Hou, Yong; Wu, Kui; Shi, Xulian

    2015-01-01

    methods, focusing particularly on variations detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly...... performance using SCRS amplified by different WGA methods. It will guide researchers to determine which WGA method is best suited to individual experimental needs at single-cell level....

  12. Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome.

    Science.gov (United States)

    Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O'Connor, Timothy D; Abecasis, Gonçalo R; Wojcik, Genevieve L; Gignoux, Christopher R; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E; Bustamante, Carlos; Beaty, Terri H; Mathias, Rasika A; Barnes, Kathleen C; Qin, Zhaohui S

    2017-04-21

    A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.

  13. Relationship between Deleterious Variation, Genomic Autozygosity, and Disease Risk: Insights from The 1000 Genomes Project.

    Science.gov (United States)

    Pemberton, Trevor J; Szpiech, Zachary A

    2018-04-05

    Genomic regions of autozygosity (ROAs) represent segments of individual genomes that are homozygous for haplotypes inherited identical-by-descent (IBD) from a common ancestor. ROAs are nonuniformly distributed across the genome, and increased ROA levels are a reported risk factor for numerous complex diseases. Previously, we hypothesized that long ROAs are enriched for deleterious homozygotes as a result of young haplotypes with recent deleterious mutations (relatively untouched by purifying selection) being paired IBD as a consequence of recent parental relatedness, a pattern supported by ROA and whole-exome sequence data on 27 individuals. Here, we significantly bolster support for our hypothesis and expand upon our original analyses using ROA and whole-genome sequence data on 2,436 individuals from The 1000 Genomes Project. Considering CADD deleteriousness scores, we reaffirm our previous observation that long ROAs are enriched for damaging homozygotes worldwide. We show that strongly damaging homozygotes experience greater enrichment than weaker damaging homozygotes, while overall enrichment varies appreciably among populations. Mendelian disease genes and those encoding FDA-approved drug targets have significantly increased rates of gain in damaging homozygotes with increasing ROA coverage relative to all other genes. In genes implicated in eight complex phenotypes for which ROA levels have been identified as a risk factor, rates of gain in damaging homozygotes vary across phenotypes and populations but frequently differ significantly from non-disease genes. These findings highlight the potential confounding effects of population background in the assessment of associations between ROA levels and complex disease risk, which might underlie reported inconsistencies in ROA-phenotype associations. Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  14. Genomic variation among populations of threatened coral: Acropora cervicornis.

    Science.gov (United States)

    Drury, C; Dale, K E; Panlilio, J M; Miller, S V; Lirman, D; Larson, E A; Bartels, E; Crawford, D L; Oleksiak, M F

    2016-04-13

    Acropora cervicornis, a threatened, keystone reef-building coral has undergone severe declines (>90 %) throughout the Caribbean. These declines could reduce genetic variation and thus hamper the species' ability to adapt. Active restoration strategies are a common conservation approach to mitigate species' declines and require genetic data on surviving populations to efficiently respond to declines while maintaining the genetic diversity needed to adapt to changing conditions. To evaluate active restoration strategies for the staghorn coral, the genetic diversity of A. cervicornis within and among populations was assessed in 77 individuals collected from 68 locations along the Florida Reef Tract (FRT) and in the Dominican Republic. Genotyping by Sequencing (GBS) identified 4,764 single nucleotide polymorphisms (SNPs). Pairwise nucleotide differences (π) within a population are large (~37 %) and similar to π across all individuals. This high level of genetic diversity along the FRT is similar to the diversity within a small, isolated reef. Much of the genetic diversity (>90 %) exists within a population, yet GBS analysis shows significant variation along the FRT, including 300 SNPs with significant FST values and significant divergence relative to distance. There are also significant differences in SNP allele frequencies over small spatial scales, exemplified by the large FST values among corals collected within Miami-Dade county. Large standing diversity was found within each population even after recent declines in abundance, including significant, potentially adaptive divergence over short distances. The data here inform conservation and management actions by uncovering population structure and high levels of diversity maintained within coral collections among sites previously shown to have little genetic divergence. More broadly, this approach demonstrates the power of GBS to resolve differences among individuals and identify subtle genetic structure
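
    The FST values mentioned above measure how much of the allele-frequency variation lies between populations. The abstract does not state which estimator was used, so the sketch below assumes the simplest textbook form for one biallelic SNP and two equally weighted populations, FST = (HT - HS) / HT. The function name and the example frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    # Mean expected heterozygosity within the two populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if both populations were pooled.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # site is monomorphic overall, so there is no differentiation
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one SNP in two sites.
print(fst_biallelic(0.2, 0.8))
```

    Identical frequencies give a value of 0 and fixed differences give 1; published analyses typically use estimators that also correct for sample size, which this sketch omits.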

  15. Phylogeny, rate variation, and genome size evolution of Pelargonium (Geraniaceae).

    Science.gov (United States)

    Weng, Mao-Lun; Ruhlman, Tracey A; Gibby, Mary; Jansen, Robert K

    2012-09-01

    The phylogeny of 58 Pelargonium species was estimated using five plastid markers (rbcL, matK, ndhF, rpoC1, trnL-F) and one mitochondrial gene (nad5). The results confirmed the monophyly of three major clades and four subclades within Pelargonium but also indicate the need to revise some sectional classifications. This phylogeny was used to examine karyotype evolution in the genus: plotting chromosome sizes, numbers and 2C-values indicates that genome size is significantly correlated with chromosome size but not number. Accelerated rates of nucleotide substitution have been previously detected in both plastid and mitochondrial genes in Pelargonium, but sparse taxon sampling did not enable identification of the phylogenetic distribution of these elevated rates. Using the multigene phylogeny as a constraint, we investigated lineage- and locus-specific heterogeneity of substitution rates in Pelargonium for an expanded number of taxa and demonstrated that both plastid and mitochondrial genes have had accelerated substitution rates but with markedly disparate patterns. In the plastid, the exons of rpoC1 have significantly accelerated substitution rates compared to its intron and the acceleration was mainly due to nonsynonymous substitutions. In contrast, the mitochondrial gene, nad5, experienced substantial acceleration of synonymous substitution rates in three internal branches of Pelargonium, but this acceleration ceased in all terminal branches. Several lineages also have dN/dS ratios significantly greater than one for rpoC1, indicating that positive selection is acting on this gene, whereas the accelerated synonymous substitutions in the mitochondrial gene are the result of elevated mutation rates. Published by Elsevier Inc.

  16. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.
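
    The link between coverage and the 0.025% detection threshold can be made concrete with simple arithmetic: at a given depth, a variant at frequency f is expected to appear in roughly depth x f reads. The sketch below uses only the two figures quoted in the abstract; the function name is hypothetical, and treating the average coverage as the depth at every site is a simplifying assumption.

```python
def expected_variant_reads(coverage, variant_frequency):
    """Expected number of reads carrying a variant at one site."""
    return coverage * variant_frequency


mean_coverage = 124_221          # average genome coverage from the abstract
min_variant_frequency = 0.00025  # 0.025% expressed as a fraction

supporting_reads = expected_variant_reads(mean_coverage, min_variant_frequency)
print(round(supporting_reads, 1))
```

    Under these assumptions a mutation at the detection limit is supported by about 31 reads, which illustrates why such rare variants are only observable at very high depth.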

  17. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome

    DEFF Research Database (Denmark)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui

    2011-01-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries...... these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination...... and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations...

  18. Extreme Recombination Frequencies Shape Genome Variation and Evolution in the Honeybee, Apis mellifera

    Science.gov (United States)

    Wallberg, Andreas; Glémin, Sylvain; Webster, Matthew T.

    2015-01-01

    Meiotic recombination is a fundamental cellular process, with important consequences for evolution and genome integrity. However, we know little about how recombination rates vary across the genomes of most species and the molecular and evolutionary determinants of this variation. The honeybee, Apis mellifera, has extremely high rates of meiotic recombination, although the evolutionary causes and consequences of this are unclear. Here we use patterns of linkage disequilibrium in whole genome resequencing data from 30 diploid honeybees to construct a fine-scale map of rates of crossing over in the genome. We find that, in contrast to vertebrate genomes, the recombination landscape is not strongly punctate. Crossover rates strongly correlate with levels of genetic variation, but not divergence, which indicates a pervasive impact of selection on the genome. Germ-line methylated genes have reduced crossover rate, which could indicate a role of methylation in suppressing recombination. Controlling for the effects of methylation, we do not infer a strong association between gene expression patterns and recombination. The site frequency spectrum is strongly skewed from neutral expectations in honeybees: rare variants are dominated by AT-biased mutations, whereas GC-biased mutations are found at higher frequencies, indicative of a major influence of GC-biased gene conversion (gBGC), which we infer to generate an allele fixation bias 5 – 50 times the genomic average estimated in humans. We uncover further evidence that this repair bias specifically affects transitions and favours fixation of CpG sites. Recombination, via gBGC, therefore appears to have profound consequences on genome evolution in honeybees and interferes with the process of natural selection. These findings have important implications for our understanding of the forces driving molecular evolution. PMID:25902173

  19. Genome size variation among and within Camellia species by using flow cytometric analysis.

    Directory of Open Access Journals (Sweden)

    Hui Huang

    Full Text Available BACKGROUND: The genus Camellia, belonging to the family Theaceae, is an economically important group of flowering plants. Frequent interspecific hybridization together with polyploidization has made them taxonomically "difficult taxa". The DNA content is often used to measure genome size variation and has largely advanced our understanding of plant evolution and genome variation. The goals of this study were to investigate patterns of interspecific and intraspecific variation of DNA contents and further explore genome size evolution in a phylogenetic context of the genus. METHODOLOGY/PRINCIPAL FINDINGS: The DNA amount in the genus was determined by using propidium iodide flow cytometry analysis for a total of 139 individual plants representing almost all sections of the two subgenera, Camellia and Thea. An improved WPB buffer was proven to be suitable for the Camellia species; it was able to counteract the negative effects of secondary metabolites and generated high-quality results with low coefficient of variation values (CV <5%). Our results showed that different tissues (flowers, leaves and buds) and cytosolic compounds had only trivial effects on the estimation of DNA amount. The DNA content of C. sinensis var. assamica was estimated to be 1C = 3.01 pg by flow cytometric analysis, which is equal to a genome size of about 2940 Mb. CONCLUSION: Intraspecific and interspecific variations were observed in the genus Camellia, and as expected, the latter was larger than the former. Our study suggests a directional trend of increasing genome size in the genus Camellia, probably owing to frequent polyploidization events.
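
    The step from 1C = 3.01 pg to about 2940 Mb is a unit conversion. The abstract does not name the factor it used; the sketch below assumes the widely used value of 978 Mb per picogram of DNA, and the constant and function names are hypothetical.

```python
MB_PER_PG = 978.0  # assumed conversion factor: 1 pg of DNA is about 978 Mb


def picograms_to_megabases(c_value_pg):
    """Convert a 1C DNA amount in picograms to genome size in megabases."""
    return c_value_pg * MB_PER_PG


genome_size_mb = picograms_to_megabases(3.01)  # C. sinensis var. assamica
print(round(genome_size_mb))
```

    With this factor the result is roughly 2944 Mb, consistent with the "about 2940 Mb" reported in the abstract.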

  20. Genic intolerance to functional variation and the interpretation of personal genomes.

    Directory of Open Access Journals (Sweden)

    Slavé Petrovski

    Full Text Available A central challenge in interpreting personal genomes is determining which mutations most likely influence disease. Although progress has been made in scoring the functional impact of individual mutations, the characteristics of the genes in which those mutations are found remain largely unexplored. For example, genes known to carry few common functional variants in healthy individuals may be judged more likely to cause certain kinds of disease than genes known to carry many such variants. Until now, however, it has not been possible to develop a quantitative assessment of how well genes tolerate functional genetic variation on a genome-wide scale. Here we describe an effort that uses sequence data from 6503 whole exome sequences made available by the NHLBI Exome Sequencing Project (ESP). Specifically, we develop an intolerance scoring system that assesses whether genes have relatively more or less functional genetic variation than expected based on the apparently neutral variation found in the gene. To illustrate the utility of this intolerance score, we show that genes responsible for Mendelian diseases are significantly more intolerant to functional genetic variation than genes that do not cause any known disease, but with striking variation in intolerance among genes causing different classes of genetic disease. We conclude by showing that use of an intolerance ranking system can aid in interpreting personal genomes and identifying pathogenic mutations.

  1. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Science.gov (United States)

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  2. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Directory of Open Access Journals (Sweden)

    Jiří Macas

    Full Text Available The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  3. Genomic Diversity Using Copy Number Variations in Worldwide Chicken Populations

    Directory of Open Access Journals (Sweden)

    Erica Gorla

    2018-06-01

    Full Text Available Recently, many studies in livestock have focused on the identification of Copy Number Variants (CNVs) using high-density Single Nucleotide Polymorphism (SNP) arrays, but few have focused on studying chicken ecotypes coming from many locations. CNVs are polymorphisms, which may influence phenotype and are an important source of genetic variation in populations. The aim of this study was to explore the genetic difference and structure, using a high density SNP chip in 936 individuals from seven different countries (Brazil, Italy, Egypt, Mexico, Rwanda, Sri Lanka and Uganda). The DNA was genotyped with the Affymetrix Axiom® 600k Chicken Genotyping Array and processed with stringent quality controls to obtain 559,201 SNPs in 915 individuals. The Log R Ratio (LRR) and the B Allele Frequency of SNPs were used to perform the CNV calling with PennCNV software based on a Hidden Markov Model analysis and the LRR was used to perform CNV detection with SVS Golden Helix software. After filtering, a total of 19,027 CNVs were detected with the SVS software, while 9,065 CNVs were identified with the PennCNV software. The CNVs were summarized in 7,001 Copy Number Variant Regions (CNVRs) and 4,414 CNVRs, using the software BedTools. The consensus analysis across the CNVRs allowed the identification of 2,820 consensus CNVRs, of which 1,721 were gains, 637 losses and 462 complex, for a total length of 53 Mb corresponding to 5% of the GalGal5 chicken autosomes. Only the consensus CNV regions obtained from both detections were considered for further analysis. The intersection analysis performed between the chicken gene database (Gallus_gallus-5.0) and the 1,927 consensus CNVRs allowed the identification (within or partial overlap) of a total of 2,354 unique genes with an official gene ID. The CNVRs identified here represent the first comprehensive mapping in several worldwide populations, using a high-density SNP chip.
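
    Summarizing individual CNV calls into CNVRs amounts to merging calls that overlap on the same chromosome. The abstract names the software but not the exact procedure or parameters, so the following is only an illustrative sketch of interval merging; the function name, coordinate convention and example calls are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping or touching CNV calls into regions (CNVRs).

    cnvs: iterable of (chrom, start, end) tuples with start <= end.
    Returns a list of merged (chrom, start, end) tuples sorted by position.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the current region: extend its end if needed.
            last_chrom, last_start, last_end = regions[-1]
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            # Different chromosome or a gap: start a new region.
            regions.append((chrom, start, end))
    return regions


# Hypothetical CNV calls from several individuals.
calls = [("chr1", 100, 200), ("chr1", 150, 300), ("chr1", 400, 500), ("chr2", 120, 180)]
cnvrs = merge_cnvs(calls)
print(cnvrs)
```

    The reported consensus categories are internally consistent: 1,721 gains, 637 losses and 462 complex regions sum to the 2,820 consensus CNVRs.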

  4. A Perfect Match Genomic Landscape Provides a Unified Framework for the Precise Detection of Variation in Natural and Synthetic Haploid Genomes.

    Science.gov (United States)

    Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo

    2018-04-01

    We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. Copyright © 2018 by the Genetics Society of America.

  5. Limits of variation, specific infectivity, and genome packaging of massively recoded poliovirus genomes.

    Science.gov (United States)

    Song, Yutong; Gorbatsevych, Oleksandr; Liu, Ying; Mugavero, JoAnn; Shen, Sam H; Ward, Charles B; Asare, Emmanuel; Jiang, Ping; Paul, Aniko V; Mueller, Steffen; Wimmer, Eckard

    2017-10-10

    Computer design and chemical synthesis generated viable variants of poliovirus type 1 (PV1), whose ORF (6,189 nucleotides) carried up to 1,297 "Max" mutations (excess of overrepresented synonymous codon pairs) or up to 2,104 "SD" mutations (randomly scrambled synonymous codons). "Min" variants (excess of underrepresented synonymous codon pairs) are nonviable except for P2 Min, a variant temperature-sensitive at 33 and 39.5 °C. Compared with WT PV1, P2 Min displayed a vastly reduced specific infectivity (si) (WT, 1 PFU/118 particles vs. P2 Min, 1 PFU/35,000 particles), a phenotype that will be discussed broadly. Si of haploid PV presents cellular infectivity of a single genotype. We performed a comprehensive analysis of sequence and structures of the PV genome to determine if evolutionary conserved cis-acting packaging signal(s) were preserved after recoding. We showed that conserved synonymous sites and/or local secondary structures that might play a role in determining packaging specificity do not survive codon pair recoding. This makes it unlikely that numerous "cryptic, sequence-degenerate, dispersed RNA packaging signals mapping along the entire viral genome" [Patel N, et al. (2017) Nat Microbiol 2:17098] play the critical role in poliovirus packaging specificity. Considering all available evidence, we propose a two-step assembly strategy for +ssRNA viruses: step I, acquisition of packaging specificity, either (a) by specific recognition between capsid protein(s) and replication proteins (poliovirus), or (b) by the high affinity interaction of a single RNA packaging signal (PS) with capsid protein(s) (most +ssRNA viruses so far studied); step II, cocondensation of genome/capsid precursors in which an array of hairpin structures plays a role in virion formation.
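
    The size of the drop in specific infectivity follows directly from the two particle-to-PFU ratios quoted above. The sketch below only restates that arithmetic; the function and variable names are hypothetical.

```python
def fold_reduction(particles_per_pfu_reference, particles_per_pfu_variant):
    """How many times more particles the variant needs per PFU than the reference."""
    return particles_per_pfu_variant / particles_per_pfu_reference


wt_particles_per_pfu = 118         # WT PV1: 1 PFU per 118 particles
p2min_particles_per_pfu = 35_000   # P2 Min: 1 PFU per 35,000 particles

reduction = fold_reduction(wt_particles_per_pfu, p2min_particles_per_pfu)
print(round(reduction))
```

    On these figures the variant needs roughly 300 times as many particles as wild type to produce one plaque.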

  6. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Full Text Available Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  7. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Science.gov (United States)

    Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
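
    Contrasting allele frequencies between a light pool and a dark pool can be pictured as a test on a 2x2 table of allele read counts at each SNP. The abstract does not specify the statistical test used, so the sketch below substitutes a plain Pearson chi-square statistic for a single replicate; a real analysis would need to combine replicate pools and account for sequencing noise. The function name and the read counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction).

    Table layout: light pool has a reference reads and b alternate reads,
    dark pool has c reference reads and d alternate reads.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # an empty row or column carries no information
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical read counts at one SNP: light pool 60/40, dark pool 30/70.
statistic = chi_square_2x2(60, 40, 30, 70)
print(round(statistic, 2))
```

    A large statistic flags a SNP whose allele frequency differs between the pools; equal frequencies in both pools give a statistic of zero.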

  8. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome.

    Science.gov (United States)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui; Kim, Su Yeon; Korneliussen, Thorfinn; Vinckenbosch, Nicolas; Tian, Geng; Huerta-Sanchez, Emilia; Feder, Alison F; Grarup, Niels; Jørgensen, Torben; Jiang, Tao; Witte, Daniel R; Sandbæk, Annelli; Hellmann, Ines; Lauritzen, Torsten; Hansen, Torben; Pedersen, Oluf; Wang, Jun; Nielsen, Rasmus

    2011-10-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.

  9. VarB Plus: An Integrated Tool for Visualization of Genome Variation Datasets

    KAUST Repository

    Hidayah, Lailatul

    2012-07-01

    Research on genomic sequences has been improving significantly as more advanced technology for sequencing has been developed. This opens enormous opportunities for sequence analysis. Various analytical tools have been built for purposes such as sequence assembly, read alignments, genome browsing, comparative genomics, and visualization. From the visualization perspective, there is an increasing trend towards use of large-scale computation. However, more than power is required to produce an informative image. This is a challenge that we address by providing several ways of representing biological data in order to advance the inference endeavors of biologists. This thesis focuses on visualization of variations found in genomic sequences. We develop several visualization functions and embed them in an existing variation visualization tool as extensions. The tool we improved is named VarB, hence the nomenclature for our enhancement is VarB Plus. To the best of our knowledge, besides VarB, there is no tool that provides the capability of dynamic visualization of genome variation datasets as well as statistical analysis. Dynamic visualization allows users to toggle different parameters on and off and see the results on the fly. The statistical analysis includes Fixation Index, Relative Variant Density, and Tajima’s D. Hence we focused our efforts on this tool. The scope of our work includes plots of per-base genome coverage, Principal Coordinate Analysis (PCoA), integration with a read alignment viewer named LookSeq, and visualization of geo-biological data. In addition to description of embedded functionalities, significance, and limitations, future improvements are discussed. The result is four extensions embedded successfully in the original tool, which is built on the Qt framework in C++. Hence it is portable to numerous platforms. Our extensions have shown acceptable execution time in a beta testing with various high-volume published datasets, as well as positive

  10. Background selection as baseline for nucleotide variation across the Drosophila genome.

    Directory of Open Access Journals (Sweden)

    Josep M Comeron

    2014-06-01

    The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms at 100-kb scale, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and, thus, local BGS effects that may differ between extant and past phases. Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and

  11. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  12. Genomic analysis of natural selection and phenotypic variation in high-altitude mongolians.

    Directory of Open Access Journals (Sweden)

    Jinchuan Xing

    Deedu (DU) Mongolians, who migrated from the Mongolian steppes to the Qinghai-Tibetan Plateau approximately 500 years ago, are challenged by environmental conditions similar to native Tibetan highlanders. Identification of adaptive genetic factors in this population could provide insight into coordinated physiological responses to this environment. Here we examine genomic and phenotypic variation in this unique population and present the first complete analysis of a Mongolian whole-genome sequence. High-density SNP array data demonstrate that DU Mongolians share genetic ancestry with other Mongolian as well as Tibetan populations, specifically in genomic regions related to adaptation to high altitude. Several selection candidate genes identified in DU Mongolians are shared with other Asian groups (e.g., EDAR), neighboring Tibetan populations (including high-altitude candidates EPAS1, PKLR, and CYP2E1), as well as genes previously hypothesized to be associated with metabolic adaptation (e.g., PPARG). Hemoglobin concentration, a trait associated with high-altitude adaptation in Tibetans, is at an intermediate level in DU Mongolians compared to Tibetans and Han Chinese at comparable altitude. Whole-genome sequence from a DU Mongolian (Tianjiao1) shows that about 2% of the genomic variants, including more than 300 protein-coding changes, are specific to this individual. Our analyses of DU Mongolians and the first Mongolian genome provide valuable insight into genetic adaptation to extreme environments.

  13. Genome-wide detection of copy number variations among diverse horse breeds by array CGH.

    Directory of Open Access Journals (Sweden)

    Wei Wang

    Recent studies have found that copy number variations (CNVs) are widespread in human and animal genomes. CNVs are a significant source of genetic variation, and have been shown to be associated with phenotypic diversity. However, the effect of CNVs on genetic variation in horses is not well understood. In the present study, CNVs in 6 different breeds of mare horses, Mongolia horse, Abaga horse, Hequ horse and Kazakh horse (all plateau breeds) and Debao pony and Thoroughbred, were determined using aCGH. In total, seven hundred CNVs were identified ranging in size from 6.1 Kb to 0.57 Mb across all autosomes, with an average size of 43.08 Kb and a median size of 15.11 Kb. By merging overlapping CNVs, we found a total of three hundred and fifty-three CNV regions (CNVRs). The length of the CNVRs ranged from 6.1 Kb to 1.45 Mb with average and median sizes of 38.49 Kb and 13.1 Kb. Collectively, 13.59 Mb of copy number variation was identified among the horses investigated and accounted for approximately 0.61% of the horse genome sequence. Five hundred and eighteen annotated genes were affected by CNVs, which corresponded to about 2.26% of all horse genes. Through gene ontology (GO) and genetic pathway analysis and comparison of CNV genes among different breeds, we found evidence that CNVs involving 7 genes may be related to the adaptation of these plateau horses to a severe environment. This study is the first report of copy number variations in Chinese horses, which indicates that CNVs are ubiquitous in the horse genome and influence many biological processes of the horse. These results will be helpful not only in mapping the horse whole-genome CNVs, but also for further research on the adaptation of plateau horses to the severe high-altitude environment.
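
    Merging overlapping CNV calls into CNV regions, as the abstract describes, is an interval-merging step. The following is a minimal sketch under assumed conventions (tuples of chromosome, start, end in bp; touching or overlapping intervals are merged); the coordinates are made up and the function name is hypothetical, not taken from the study.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end) with start <= end, in bp.
    Returns a list of (chrom, start, end) sorted by chromosome and start.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Overlaps the previous region: extend it if needed.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions.extend((chrom, s, e) for s, e in merged)
    return regions


# Hypothetical CNV calls pooled across individuals.
calls = [("chr1", 100, 500), ("chr1", 400, 900),
         ("chr1", 2000, 2500), ("chr2", 100, 300)]
cnvrs = merge_cnvs(calls)
total_bp = sum(end - start for _, start, end in cnvrs)
print(cnvrs, total_bp)
```

    Summing region lengths after merging avoids counting the same bases twice, which is why a total such as the 13.59 Mb reported above is computed on CNVRs rather than on raw calls.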

  14. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Directory of Open Access Journals (Sweden)

    McGuire Patrick E

    2010-12-01

    chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.

  15. Variations and classification of toxic epitopes related to celiac disease among α-gliadin genes from four Aegilops genomes.

    Science.gov (United States)

    Li, Jie; Wang, Shunli; Li, Shanshan; Ge, Pei; Li, Xiaohui; Ma, Wujun; Zeller, F J; Hsam, Sai L K; Yan, Yueming

    2012-07-01

    The α-gliadins are associated with human celiac disease. A total of 23 noninterrupted full open reading frame α-gliadin genes and 19 pseudogenes were cloned and sequenced from C, M, N, and U genomes of four diploid Aegilops species. Sequence comparison of α-gliadin genes from Aegilops and Triticum species demonstrated an existence of extensive allelic variations in Gli-2 loci of the four Aegilops genomes. Specific structural features were found including the compositions and variations of two polyglutamine domains (QI and QII) and four T cell stimulatory toxic epitopes. The mean numbers of glutamine residues in the QI domain in C and N genomes and the QII domain in C, N, and U genomes were much higher than those in Triticum genomes, and the QI domain in C and N genomes and the QII domain in C, M, N, and U genomes displayed greater length variations. Interestingly, the types and numbers of four T cell stimulatory toxic epitopes in α-gliadins from the four Aegilops genomes were significantly less than those from Triticum A, B, D, and their progenitor genomes. Relationships between the structural variations of the two polyglutamine domains and the distributions of four T cell stimulatory toxic epitopes were found, resulting in the α-gliadin genes from the Aegilops and Triticum genomes to be classified into three groups.

  16. Meiotic gene-conversion rate and tract length variation in the human genome.

    Science.gov (United States)

    Padhukasahasram, Badri; Rannala, Bruce

    2013-02-27

    Meiotic recombination occurs in the form of two different mechanisms called crossing-over and gene-conversion and both processes have an important role in shaping genetic variation in populations. Although variation in crossing-over rates has been studied extensively using sperm-typing experiments, pedigree studies and population genetic approaches, our knowledge of variation in gene-conversion parameters (i.e., rates and mean tract lengths) remains far from complete. To explore variability in population gene-conversion rates and its relationship to crossing-over rate variation patterns, we have developed and validated using coalescent simulations a comprehensive Bayesian full-likelihood method that can jointly infer crossing-over and gene-conversion rates as well as tract lengths from population genomic data under general variable rate models with recombination hotspots. Here, we apply this new method to SNP data from multiple human populations and attempt to characterize for the first time the fine-scale variation in gene-conversion parameters along the human genome. We find that the estimated ratio of gene-conversion to crossing-over rates varies considerably across genomic regions as well as between populations. However, there is a great degree of uncertainty associated with such estimates. We also find substantial evidence for variation in the mean conversion tract length. The estimated tract lengths did not show any negative relationship with the local heterozygosity levels in our analysis. European Journal of Human Genetics advance online publication, 27 February 2013; doi:10.1038/ejhg.2013.30.

  17. Distribution and diversity of cytotypes in Dianthus broteri as evidenced by genome size variations.

    Science.gov (United States)

    Balao, Francisco; Casimiro-Soriguer, Ramón; Talavera, María; Herrera, Javier; Talavera, Salvador

    2009-10-01

    Studying the spatial distribution of cytotypes and genome size in plants can provide valuable information about the evolution of polyploid complexes. Here, the spatial distribution of cytological races and the amount of DNA in Dianthus broteri, an Iberian carnation with several ploidy levels, is investigated. Sample chromosome counts and flow cytometry (using propidium iodide) were used to determine overall genome size (2C value) and ploidy level in 244 individuals of 25 populations. Both fresh and dried samples were investigated. Differences in 2C and 1Cx values among ploidy levels within biogeographical provinces were tested using ANOVA. Geographical correlations of genome size were also explored. Extensive variation in chromosome numbers (2n = 2x = 30, 2n = 4x = 60, 2n = 6x = 90 and 2n = 12x = 180) was detected, and the dodecaploid cytotype is reported for the first time in this genus. As regards cytotype distribution, six populations were diploid, 11 were tetraploid, three were hexaploid and five were dodecaploid. Except for one diploid population containing some triploid plants (2n = 45), the remaining populations showed a single cytotype. Diploids appeared in two disjunct areas (south-east and south-west), and so did tetraploids (although with a considerably wider geographic range). Dehydrated leaf samples provided reliable measurements of DNA content. Genome size varied significantly among some cytotypes, and also extensively within diploid (up to 1.17-fold) and tetraploid (1.22-fold) populations. Nevertheless, variations were not straightforwardly congruent with ecology and geographical distribution. Dianthus broteri shows the highest diversity of cytotypes known to date in the genus Dianthus. Moreover, some cytotypes present remarkable internal genome size variation. The evolution of the complex is discussed in terms of autopolyploidy, with primary and secondary contact zones.

  18. Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

    Science.gov (United States)

    Song, Yun S.

    2012-01-01

    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and

  19. Chromosome Numbers and Genome Size Variation in Indian Species of Curcuma (Zingiberaceae)

    Science.gov (United States)

    Leong-Škorničková, Jana; Šída, Otakar; Jarolímová, Vlasta; Sabu, Mamyil; Fér, Tomáš; Trávníček, Pavel; Suda, Jan

    2007-01-01

    Background and Aims: Genome size and chromosome numbers are important cytological characters that significantly influence various organismal traits. However, geographical representation of these data is seriously unbalanced, with tropical and subtropical regions being largely neglected. In the present study, an investigation was made of chromosomal and genome size variation in the majority of Curcuma species from the Indian subcontinent, and an assessment was made of the value of these data for taxonomic purposes. Methods: Genome size of 161 homogeneously cultivated plant samples classified into 51 taxonomic entities was determined by propidium iodide flow cytometry. Chromosome numbers were counted in actively growing root tips using conventional rapid squash techniques. Key Results: Six different chromosome counts (2n = 22, 42, 63, >70, 77 and 105) were found, the last two representing new generic records. The 2C-values varied from 1·66 pg in C. vamana to 4·76 pg in C. oligantha, representing a 2·87-fold range. Three groups of taxa with significantly different homoploid genome sizes (Cx-values) and distinct geographical distribution were identified. Five species exhibited intraspecific variation in nuclear DNA content, reaching up to 15·1 % in cultivated C. longa. Chromosome counts and genome sizes of three Curcuma-like species (Hitchenia caulina, Kaempferia scaposa and Paracautleya bhatii) corresponded well with typical hexaploid (2n = 6x = 42) Curcuma spp. Conclusions: The basic chromosome number in the majority of Indian taxa (belonging to subgenus Curcuma) is x = 7; published counts correspond to 6x, 9x, 11x, 12x and 15x ploidy levels. Only a few species-specific C-values were found, but karyological and/or flow cytometric data may support taxonomic decisions in some species alliances with morphological similarities. Close evolutionary relationships among some cytotypes are suggested based on the similarity in homoploid genome sizes and geographical grouping

  20. [Analysis of genomic copy number variations in two sisters with primary amenorrhea and hyperandrogenism].

    Science.gov (United States)

    Zhang, Yanliang; Xu, Qiuyue; Cai, Xuemei; Li, Yixun; Song, Guibo; Wang, Juan; Zhang, Rongchen; Dai, Yong; Duan, Yong

    2015-12-01

    To analyze genomic copy number variations (CNVs) in two sisters with primary amenorrhea and hyperandrogenism. G-banding was performed for karyotype analysis. The whole genomes of the two sisters were scanned and analyzed by array-based comparative genomic hybridization (array-CGH). The results were confirmed with real-time quantitative PCR (RT-qPCR). No abnormality was found by conventional G-banded chromosome analysis. Array-CGH identified 11 identical CNVs in the sisters which, however, overlapped with CNVs reported by the Database of Genomic Variants (http://projects.tcag.ca/variation/). Therefore, they are likely to be benign. In addition, a ~8.44 Mb 9p11.1-p13.1 duplication (38,561,587-47,002,387 bp, hg18) and a ~80.9 kb 4q13.2 deletion (70,183,990-70,264,889 bp, hg18) were also detected in the elder and younger sister, respectively. The relationship between such CNVs and primary amenorrhea and hyperandrogenism was, however, uncertain. RT-qPCR results were in accordance with array-CGH. Two CNVs were detected in two sisters by array-CGH, for which further studies are needed to clarify their correlation with primary amenorrhea and hyperandrogenism.

  1. Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster.

    Science.gov (United States)

    Machado, Heather E; Bergland, Alan O; O'Brien, Katherine R; Behrman, Emily L; Schmidt, Paul S; Petrov, Dmitri A

    2016-02-01

    Examples of clinal variation in phenotypes and genotypes across latitudinal transects have served as important models for understanding how spatially varying selection and demographic forces shape variation within species. Here, we examine the selective and demographic contributions to latitudinal variation through the largest comparative genomic study to date of Drosophila simulans and Drosophila melanogaster, with genomic sequence data from 382 individual fruit flies, collected across a spatial transect of 19 degrees latitude and at multiple time points over 2 years. Consistent with phenotypic studies, we find less clinal variation in D. simulans than D. melanogaster, particularly for the autosomes. Moreover, we find that clinally varying loci in D. simulans are less stable over multiple years than comparable clines in D. melanogaster. D. simulans shows a significantly weaker pattern of isolation by distance than D. melanogaster and we find evidence for a stronger contribution of migration to D. simulans population genetic structure. While population bottlenecks and migration can plausibly explain the differences in stability of clinal variation between the two species, we also observe a significant enrichment of shared clinal genes, suggesting that the selective forces associated with climate are acting on the same genes and phenotypes in D. simulans and D. melanogaster. © 2015 John Wiley & Sons Ltd.

  2. Genome-wide association study identified CNP12587 region underlying height variation in Chinese females.

    Directory of Open Access Journals (Sweden)

    Yin-Ping Zhang

    Human height is a highly heritable trait considered as an important factor for health. There has been limited success in identifying the genetic factors underlying height variation. We aim to identify sequence variants associated with adult height by a genome-wide association study of copy number variants (CNVs) in Chinese. Genome-wide CNV association analyses for height variation were conducted in 1,625 unrelated Chinese adults and in sex-specific subgroups. Height was measured with a stadiometer. Affymetrix SNP6.0 genotyping platform was used to identify copy number polymorphisms (CNPs). We constructed a genomic map containing 1,009 CNPs in Chinese individuals and performed a genome-wide association study of CNPs with height. We detected 10 significant association signals for height (p<0.05) in the whole population, 9 and 11 association signals for Chinese female and male population, respectively. A copy number polymorphism (CNP12587, chr18:54081842-54086942, p = 2.41 × 10^-4) was found to be significantly associated with height variation in Chinese females even after strict Bonferroni correction (p = 0.048). Confirmatory real time PCR experiments lent further support for CNV validation. Compared to female subjects with two copies of the CNP, carriers of three copies had an average of 8.1% decrease in height. An important candidate gene, ubiquitin-protein ligase NEDD4-like (NEDD4L), was detected at this region, which plays important roles in bone metabolism by binding to bone formation regulators. Our findings suggest the important genetic variants underlying height variation in Chinese.

  3. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    Directory of Open Access Journals (Sweden)

    Amaury Vaysse

    2011-10-01

    The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.
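
    The idea of scanning for blocks of homozygosity longer than one megabase can be illustrated with a simplified single-sample scan. This is only a sketch under stated assumptions, not the study's procedure, which used multiple test statistics at the breed level: genotypes are coded as alternate-allele counts (0, 1, 2), positions are sorted within one chromosome, and the function name and data are hypothetical.

```python
def homozygous_blocks(positions, genotypes, min_length_bp=1_000_000):
    """Return (start, end) spans of consecutive homozygous SNPs.

    positions: sorted SNP coordinates on one chromosome (bp).
    genotypes: alternate-allele count per SNP (0 or 2 = homozygous,
               1 = heterozygous).
    Only spans at least min_length_bp long are reported.
    """
    blocks = []
    run_start = None
    run_end = None
    for pos, gt in zip(positions, genotypes):
        if gt in (0, 2):
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            # A heterozygous call closes the current run.
            if run_start is not None and run_end - run_start >= min_length_bp:
                blocks.append((run_start, run_end))
            run_start = None
    # Close a run that reaches the end of the chromosome.
    if run_start is not None and run_end - run_start >= min_length_bp:
        blocks.append((run_start, run_end))
    return blocks


# Hypothetical data: ten SNPs spaced 250 kb apart.
positions = [i * 250_000 for i in range(10)]
genotypes = [0, 0, 2, 0, 2, 0, 1, 2, 2, 0]
blocks = homozygous_blocks(positions, genotypes)
print(blocks)
```

    Real scans usually tolerate occasional heterozygous calls from genotyping error and work on population-level heterozygosity in sliding windows; this sketch omits both.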

  4. A genomic overview of short genetic variations in a basal chordate, Ciona intestinalis

    Directory of Open Access Journals (Sweden)

    Satou Yutaka

    2012-05-01

    Background: Although the Ciona intestinalis genome contains many allelic polymorphisms, there is only limited data analyzed systematically. Establishing a dense map of genetic variations in C. intestinalis is necessary not only for linkage analysis, but also for other experimental biology including molecular developmental and evolutionary studies, because animals from natural populations are typically used for experiments. Results: Here, we identified over three million candidate short genomic variations within a 110 Mb euchromatin region among five C. intestinalis individuals. The average nucleotide diversity was approximately 1.1%. Genetic variations were found at a similar density in intergenic and gene regions. Non-synonymous and nonsense nucleotide substitutions were found in 12,493 and 1,214 genes accounting for 81.9% and 8.0% of the entire gene set, respectively, and over 60% of genes in the single animal encode non-identical proteins between maternal and paternal alleles. Conclusions: Our results provide a framework for studying evolution of the animal genome, as well as a useful resource for a wide range of C. intestinalis researchers.

  5. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae).

    Science.gov (United States)

    Walker, Joseph F; Zanis, Michael J; Emery, Nancy C

    2014-04-01

    Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.

  6. Genome variations associated with viral susceptibility and calcification in Emiliania huxleyi.

    Science.gov (United States)

    Kegel, Jessica U; John, Uwe; Valentin, Klaus; Frickenhaus, Stephan

    2013-01-01

    Emiliania huxleyi, a key player in the global carbon cycle, is one of the best studied coccolithophores with respect to biogeochemical cycles, climatology, and host-virus interactions. Strains of E. huxleyi show phenotypic plasticity regarding growth behaviour, light-response, calcification, acidification, and virus susceptibility. This phenomenon is likely a consequence of genomic differences, or transcriptomic responses, to environmental conditions or threats such as viral infections. We used an E. huxleyi genome microarray based on the sequenced strain CCMP1516 (reference strain) to perform comparative genomic hybridizations (CGH) of 16 E. huxleyi strains of different geographic origin. We investigated the genomic diversity and plasticity and focused on the identification of genes related to virus susceptibility and coccolith production (calcification). Among the 31940 gene models tested, a core genome of 14628 genes was identified by hybridization among 16 E. huxleyi strains. 224 probes were characterized as specific for the reference strain CCMP1516. Compared to the sequenced E. huxleyi strain CCMP1516, variation in gene content of up to 30 percent among strains was observed. Comparison of core and non-core transcript sets in terms of annotated functions reveals a broad, almost equal functional coverage over all KOG-categories of both transcript sets within the whole annotated genome. Within the variable (non-core) genome we identified genes associated with virus susceptibility and calcification. Genes associated with virus susceptibility include a Bax inhibitor-1 protein, three LRR receptor-like protein kinases, and mitogen-activated protein kinase. Our list of transcripts associated with coccolith production will stimulate further research, e.g. by genetic manipulation. In particular, the V-type proton ATPase 16 kDa proteolipid subunit is proposed to be a plausible target gene for further calcification studies.

  7. Genome-wide variation and demographic history of small cats with a focus on Felis species

    Directory of Open Access Journals (Sweden)

    Anubhab Khan

    2017-10-01

    Full Text Available The majority of the 38 known cat species are classified as small and they inhabit five of the seven continents. They survive in a vast range of habitats but still 12 out of the 18 threatened felids are small cats. However, there has not been enough progress in the field of small cat research as they generally get overshadowed by the charismatic big cats. Here we attempt to create a resource for small cat research, especially for the genus Felis, which has six species, of which two are classified as vulnerable by IUCN and at least one more is at risk. We collected tissue samples of four Felis chaus (Jungle cat) from central India and used available whole genome sequences of nine individuals from four other Felis species, two individuals of Prionailurus bengalensis and an Otocolobus manul. These whole genome sequences were filtered and aligned with the already published domestic cat (Felis catus) genome assembly. Felids are closely related species and reads from all species in our study aligned with the domestic cat genome with a rate of at least 93%. We estimated the existing genomic variation by calculating heterozygous SNP encounter rate. So far, it seems that all wild cats have more genetic variation than Felis catus species. This can be attributed to the inbreeding in these cats. Among the wild cats, Felis silvestris seems to have the highest level of genetic variation. To understand the reasons behind the distribution of genetic variation in small cats, we estimated the demographic histories of each of the species using PSMC. This method can only detect demographic changes more than 1000 generations ago. We observe that roughly all species share a parallel history in terms of population increase. The most interesting and important feature might be that all wild small cat population sizes increased exponentially around twenty thousand years ago as opposed to domestic cat and big cats which declined around this time.
Another interesting feature of

  8. Epigenetic Variation in Monozygotic Twins: A Genome-Wide Analysis of DNA Methylation in Buccal Cells

    Directory of Open Access Journals (Sweden)

    Jenny van Dongen

    2014-05-01

    Full Text Available DNA methylation is one of the most extensively studied epigenetic marks in humans. Yet, it is largely unknown what causes variation in DNA methylation between individuals. The comparison of DNA methylation profiles of monozygotic (MZ) twins offers a unique experimental design to examine the extent to which such variation is related to individual-specific environmental influences and stochastic events or to familial factors (DNA sequence and shared environment). We measured genome-wide DNA methylation in buccal samples from ten MZ pairs (age 8–19) using the Illumina 450k array and examined twin correlations for methylation level at 420,921 CpGs after QC. After selecting CpGs showing the most variation in the methylation level between subjects, the mean genome-wide correlation (rho) was 0.54. The correlation was higher, on average, for CpGs within CpG islands (CGIs), compared to CGI shores, shelves and non-CGI regions, particularly at hypomethylated CpGs. This finding suggests that individual-specific environmental and stochastic influences account for more variation in DNA methylation in CpG-poor regions. Our findings also indicate that it is worthwhile to examine heritable and shared environmental influences on buccal DNA methylation in larger studies that also include dizygotic twins.

  9. Common genetic variation and susceptibility to partial epilepsies: a genome-wide association study.

    Science.gov (United States)

    Kasperaviciūte, Dalia; Catarino, Claudia B; Heinzen, Erin L; Depondt, Chantal; Cavalleri, Gianpiero L; Caboclo, Luis O; Tate, Sarah K; Jamnadas-Khoda, Jenny; Chinthapalli, Krishna; Clayton, Lisa M S; Shianna, Kevin V; Radtke, Rodney A; Mikati, Mohamad A; Gallentine, William B; Husain, Aatif M; Alhusaini, Saud; Leppert, David; Middleton, Lefkos T; Gibson, Rachel A; Johnson, Michael R; Matthews, Paul M; Hosford, David; Heuser, Kjell; Amos, Leslie; Ortega, Marcos; Zumsteg, Dominik; Wieser, Heinz-Gregor; Steinhoff, Bernhard J; Krämer, Günter; Hansen, Jörg; Dorn, Thomas; Kantanen, Anne-Mari; Gjerstad, Leif; Peuralinna, Terhi; Hernandez, Dena G; Eriksson, Kai J; Kälviäinen, Reetta K; Doherty, Colin P; Wood, Nicholas W; Pandolfo, Massimo; Duncan, John S; Sander, Josemir W; Delanty, Norman; Goldstein, David B; Sisodiya, Sanjay M

    2010-07-01

    Partial epilepsies have a substantial heritability. However, the actual genetic causes are largely unknown. In contrast to many other common diseases for which genetic association-studies have successfully revealed common variants associated with disease risk, the role of common variation in partial epilepsies has not yet been explored in a well-powered study. We undertook a genome-wide association-study to identify common variants which influence risk for epilepsy shared amongst partial epilepsy syndromes, in 3445 patients and 6935 controls of European ancestry. We did not identify any genome-wide significant association. A few single nucleotide polymorphisms may warrant further investigation. We exclude common genetic variants with effect sizes above a modest 1.3 odds ratio for a single variant as contributors to genetic susceptibility shared across the partial epilepsies. We show that, at best, common genetic variation can only have a modest role in predisposition to the partial epilepsies when considered across syndromes in Europeans. The genetic architecture of the partial epilepsies is likely to be very complex, reflecting genotypic and phenotypic heterogeneity. Larger meta-analyses are required to identify variants of smaller effect sizes (odds ratio<1.3) or syndrome-specific variants. Further, our results suggest research efforts should also be directed towards identifying the multiple rare variants likely to account for at least part of the heritability of the partial epilepsies. Data emerging from genome-wide association-studies will be valuable during the next serious challenge of interpreting all the genetic variation emerging from whole-genome sequencing studies.
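
    The effect-size bound in this abstract is expressed as an odds ratio. As a reminder of how that quantity is derived from allele counts in cases and controls, here is a minimal sketch using invented counts. The function name and the numbers are illustrative assumptions, not data from the study, and the confidence interval uses the standard log-odds (Woolf) approximation rather than whatever test the authors applied.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio and approximate 95% CI from a 2x2 table of allele counts."""
    if min(case_alt, case_ref, ctrl_alt, ctrl_ref) <= 0:
        raise ValueError("all counts must be positive")
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


# Invented allele counts, for illustration only.
odds_ratio, ci_lower, ci_upper = allelic_odds_ratio(300, 700, 250, 750)
```

    With these made-up counts the odds ratio is 9/7, just under the 1.3 threshold discussed in the abstract.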

  10. A map of human genome variation from population-scale sequencing.

    Science.gov (United States)

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
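
    To give a feel for the de novo mutation rate quoted in this abstract, the sketch below multiplies a per-base rate by a genome size. The haploid genome size of 3 × 10^9 bp is a round-number assumption added here and does not come from the abstract; the function name is hypothetical.

```python
def expected_de_novo_substitutions(rate_per_bp, haploid_genome_bp, ploidy=2):
    """Expected new base substitutions per individual per generation."""
    return rate_per_bp * haploid_genome_bp * ploidy


# Rate from the abstract; genome size is an assumed round number.
expected = expected_de_novo_substitutions(1e-8, 3e9)
```

    Under these assumptions the expectation comes out at roughly 60 new substitutions per diploid genome per generation.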

  11. An initial comparative map of copy number variations in the goat (Capra hircus) genome

    Directory of Open Access Journals (Sweden)

    Casadio Rita

    2010-11-01

    Full Text Available Abstract Background: The goat (Capra hircus) represents one of the most important farm animal species. It is reared in all continents with an estimated world population of about 800 million animals. Despite its importance, studies on the goat genome are still in their infancy compared to those in other farm animal species. Comparative mapping between cattle and goat showed only a few rearrangements in agreement with the similarity of chromosome banding. We carried out a cross species cattle-goat array comparative genome hybridization (aCGH) experiment in order to identify copy number variations (CNVs) in the goat genome, analysing animals of different breeds (Saanen, Camosciata delle Alpi, Girgentana, and Murciano-Granadina) using a tiling oligonucleotide array with ~385,000 probes designed on the bovine genome. Results: We identified a total of 161 CNVs (an average of 17.9 CNVs per goat), with the largest number in the Saanen breed and the lowest in the Camosciata delle Alpi goat. By aggregating overlapping CNVs identified in different animals we determined CNV regions (CNVRs): on the whole, we identified 127 CNVRs covering about 11.47 Mb of the virtual goat genome referred to the bovine genome (0.435% of the latter genome). These 127 CNVRs included 86 losses and 41 gains and ranged from about 24 kb to about 1.07 Mb with a mean and median equal to 90,292 bp and 49,530 bp, respectively. To evaluate whether the identified goat CNVRs overlap with those reported in the cattle genome, we compared our results with those obtained in four independent cattle experiments. Overlapping between goat and cattle CNVRs was highly significant (P ...). Conclusions: We describe a first map of goat CNVRs. This provides information on a comparative basis with the cattle genome by identifying putative recurrent interspecies CNVs between these two ruminant species. Several goat CNVs affect genes with important biological functions. Further studies are needed to evaluate the

  12. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Full Text Available Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09) and dw3 (SBI-07) had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated; however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.

  13. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. 
PGen workflow has been optimized for the most

  14. Illumina based whole mitochondrial genome of Junonia iphita reveals minor intraspecific variation

    Directory of Open Access Journals (Sweden)

    Catherine Vanlalruati

    2015-12-01

    Full Text Available In the present study, the near complete mitochondrial genome (mitogenome) of Junonia iphita (Lepidoptera: Nymphalidae: Nymphalinae) was determined to be 14,892 bp. The gene order and orientation are identical to those in other butterfly species. The phylogenetic tree constructed from the whole mitogenomes using the 13 protein coding genes (PCGs) defines the genetic relatedness of the two J. iphita species collected from two different regions. All the Junonia species clustered together, and were further subdivided into clade one consisting of J. almana and J. orithya and clade two comprising the two J. iphita which were collected from Indo and Indochinese subregions separated by a river barrier. Comparison between the two J. iphita sequences revealed minor variations and Single Nucleotide Polymorphisms were identified at 51 sites amounting to 0.4% of the entire mitochondrial genome.

  15. Copy number variation is a fundamental aspect of the placental genome.

    Directory of Open Access Journals (Sweden)

    Roberta L Hannibal

    2014-05-01

    Full Text Available Discovery of lineage-specific somatic copy number variation (CNV) in mammals has led to debate over whether CNVs are mutations that propagate disease or whether they are a normal, and even essential, aspect of cell biology. We show that 1,000 N polyploid trophoblast giant cells (TGCs) of the mouse placenta contain 47 regions, totaling 138 Megabases, where genomic copies are underrepresented (UR). UR domains originate from a subset of late-replicating heterochromatic regions containing gene deserts and genes involved in cell adhesion and neurogenesis. While lineage-specific CNVs have been identified in mammalian cells, classically in the immune system where V(D)J recombination occurs, we demonstrate that CNVs form during gestation in the placenta by an underreplication mechanism, not by recombination nor deletion. Our results reveal that large scale CNVs are a normal feature of the mammalian placental genome, which are regulated systematically during embryogenesis and are propagated by a mechanism of underreplication.

  16. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, were generated for the 35 chickpea genotypes, with an average sequencing depth of ~10X for each line. On average, 92.18 % of reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  17. Sequence variation of the feline immunodeficiency virus genome and its clinical relevance.

    Science.gov (United States)

    Stickney, A L; Dunowska, M; Cave, N J

    2013-06-08

    The ongoing evolution of feline immunodeficiency virus (FIV) has resulted in the existence of a diverse continuum of viruses. FIV isolates differ with regard to their mutation and replication rates, plasma viral loads, cell tropism and the ability to induce apoptosis. Clinical disease in FIV-infected cats is also inconsistent. Genomic sequence variation of FIV is likely to be responsible for some of the variation in viral behaviour. The specific genetic sequences that influence these key viral properties remain to be determined. With knowledge of the specific key determinants of pathogenicity, there is the potential for veterinarians in the future to apply this information for prognostic purposes. Genomic sequence variation of FIV also presents an obstacle to effective vaccine development. Most challenge studies demonstrate acceptable efficacy of a dual-subtype FIV vaccine (Fel-O-Vax FIV) against FIV infection under experimental settings; however, vaccine efficacy in the field still remains to be proven. It is important that we discover the key determinants of immunity induced by this vaccine; such data would complement vaccine field efficacy studies and provide the basis to make informed recommendations on its use.

  18. Genome Wide Distributions and Functional Characterization of Copy Number Variations between Chinese and Western Pigs.

    Directory of Open Access Journals (Sweden)

    Hongyang Wang

    Full Text Available Copy number variations (CNVs) refer to large insertions, deletions and duplications in the genomic structure ranging from one thousand to several million bases in size. Since the development of next generation sequencing technology, several methods have been well built for detection of copy number variations with high credibility and accuracy. Evidence has shown that CNV occurring in gene region could lead to phenotypic changes due to the alteration in gene structure and dosage. However, it still remains unexplored whether CNVs underlie the phenotypic differences between Chinese and Western domestic pigs. Based on the read-depth methods, we investigated copy number variations using 49 individuals derived from both Chinese and Western pig breeds. A total of 3,131 copy number variation regions (CNVRs) were identified with an average size of 13.4 Kb in all individuals during domestication, harboring 1,363 genes. Among them, 129 and 147 CNVRs were Chinese and Western pig specific, respectively. Gene functional enrichments revealed that these CNVRs contribute to strong disease resistance and high prolificacy in Chinese domestic pigs, but strong muscle tissue development in Western domestic pigs. This finding is strongly consistent with the morphologic characteristics of Chinese and Western pigs, indicating that these group-specific CNVRs might have been preserved by artificial selection for the favored phenotypes during independent domestication of Chinese and Western pigs. In this study, we built high-resolution CNV maps in several domestic pig breeds and discovered the group specific CNVs by comparing Chinese and Western pigs, which could provide new insight into genomic variations during pigs' independent domestication, and facilitate further functional studies of CNV-associated genes.
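
    CNV regions (CNVRs) such as those counted in this abstract are commonly formed by merging CNV calls from different individuals that overlap on the same chromosome. The sketch below shows that merging step only. The any-overlap rule, the function name and the coordinates are assumptions made for illustration and are not taken from the study, which may have applied different overlap criteria.

```python
from collections import defaultdict


def merge_cnvs_into_regions(cnvs):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions."""
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        if start > end:
            raise ValueError(f"start > end for {chrom}:{start}-{end}")
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps the current region: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current region and start a new one.
                regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        regions.append((chrom, cur_start, cur_end))
    return regions


# Hypothetical calls pooled from several animals.
calls = [("chr1", 100, 500), ("chr1", 400, 900), ("chr1", 2000, 2500), ("chr2", 50, 80)]
cnvrs = merge_cnvs_into_regions(calls)
```

    Sorting by start position first means each chromosome is processed in a single pass, so the merge stays cheap even for thousands of calls.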

  19. Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

    Science.gov (United States)

    Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

    2009-01-01

    Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. 
These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary

  20. Genomic regulation of natural variation in cortical and noncortical brain volume

    Directory of Open Access Journals (Sweden)

    Laughlin Rick E

    2006-02-01

    Full Text Available Abstract Background: The relative growth of the neocortex parallels the emergence of complex cognitive functions across species. To determine the regions of the mammalian genome responsible for natural variations in cortical volume, we conducted a complex trait analysis using 34 strains of recombinant inbred (RI) strains of mice (BXD), as well as their two parental strains (C57BL/6J and DBA/2J). We measured both neocortical volume and total brain volume in 155 coronally sectioned mouse brains that were Nissl stained and embedded in celloidin. After correction for shrinkage, the measured cortical and noncortical brain volumes were entered into a multiple regression analysis, which removed the effects of body size and age from the measurements. Marker regression and interval mapping were computed using WebQTL. Results: An ANOVA revealed that more than half of the variance of these regressed phenotypes is genetically determined. We then identified the regions of the genome regulating this heritability. We located genomic regions in which a linkage disequilibrium was present using WebQTL as both a mapping engine and genomic database. For neocortex, we found a genome-wide significant quantitative trait locus (QTL) on chromosome 11 (marker D11Mit19), as well as a suggestive QTL on chromosome 16 (marker D16Mit100). In contrast, for noncortex the effect of chromosome 11 was markedly reduced, and a significant QTL appeared on chromosome 19 (D19Mit22). Conclusion: This classic pattern of double dissociation argues strongly for different genetic factors regulating relative cortical size, as opposed to brain volume more generally. It is likely, however, that the effects of proximal chromosome 11 extend beyond the neocortex strictly defined. An analysis of single nucleotide polymorphisms in these regions indicated that ciliary neurotrophic factor (Cntf) is quite possibly the gene underlying the noncortical QTL. Evidence for a candidate gene modulating neocortical

  1. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.

    Science.gov (United States)

    Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander

    2014-10-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.

  2. Assessing genome-wide copy number variation in the Han Chinese population.

    Science.gov (United States)

    Lu, Jianqi; Lou, Haiyi; Fu, Ruiqing; Lu, Dongsheng; Zhang, Feng; Wu, Zhendong; Zhang, Xi; Li, Changhua; Fang, Baijun; Pu, Fangfang; Wei, Jingning; Wei, Qian; Zhang, Chao; Wang, Xiaoji; Lu, Yan; Yan, Shi; Yang, Yajun; Jin, Li; Xu, Shuhua

    2017-10-01

    Copy number variation (CNV) is a valuable source of genetic diversity in the human genome and a well-recognised cause of various genetic diseases. However, CNVs have been considerably under-represented in population-based studies, particularly in the Han Chinese, the largest ethnic group in the world. The objective was to build a representative CNV map for the Han Chinese population. We conducted a genome-wide CNV study involving 451 male Han Chinese samples from 11 geographical regions encompassing 28 dialect groups, representing a less-biased panel compared with the currently available data. We detected CNVs by using a 4.2M NimbleGen comparative genomic hybridisation array and whole-genome deep sequencing of 51 samples to optimise the filtering conditions in CNV discovery. A comprehensive Han Chinese CNV map was built based on a set of high-quality variants (positive predictive value >0.8, with sizes ranging from 369 bp to 4.16 Mb and a median of 5907 bp). The map consists of 4012 CNV regions (CNVRs), and more than half are novel to the 30 East Asian CNV Project and the 1000 Genomes Project Phase 3. We further identified 81 CNVRs specific to regional groups, which was indicative of the subpopulation structure within the Han Chinese population. Our data are complementary to public data sources, and the CNV map may facilitate the identification of pathogenic CNVs and further biomedical research studies involving the Han Chinese population. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  3. Genomic variation in CYP3A4: type, frequencies and potential implications for pharmacogenetic understanding.

    OpenAIRE

    Creemer, O.

    2012-01-01

    The human cytochrome P450 3A subfamily metabolises endogenous substances and approximately half of all currently available drugs. There is marked inter-individual variation in hepatic expression of the major adult isoform, CYP3A4; the genetic component of this variability is estimated at 60-90% and, as yet, remains largely uncharacterised. Elucidation of genetic factors determining CYP3A4 activity would permit personalised dose-adjustment in therapies with CYP3A4 drug substrates. CYP3A4 genom...

  4. SV2: accurate structural variation genotyping and de novo mutation detection from whole genomes.

    Science.gov (United States)

    Antaki, Danny; Brandler, William M; Sebat, Jonathan

    2018-05-15

    Structural variation (SV) detection from short-read whole genome sequencing is error prone, presenting significant challenges for population or family-based studies of disease. Here, we describe SV2, a machine-learning algorithm for genotyping deletions and duplications from paired-end sequencing data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified call set with high genotyping accuracy and capability to detect de novo mutations. SV2 is freely available on GitHub (https://github.com/dantaki/SV2). Contact: jsebat@ucsd.edu. Supplementary data are available at Bioinformatics online.

  5. Genomic dissection of variation in clutch size and egg mass in a wild great tit (Parus major) population.

    Science.gov (United States)

    Santure, Anna W; De Cauwer, Isabelle; Robinson, Matthew R; Poissant, Jocelyn; Sheldon, Ben C; Slate, Jon

    2013-08-01

    Clutch size and egg mass are life history traits that have been extensively studied in wild bird populations, as life history theory predicts a negative trade-off between them, either at the phenotypic or at the genetic level. Here, we analyse the genomic architecture of these heritable traits in a wild great tit (Parus major) population, using three marker-based approaches - chromosome partitioning, quantitative trait locus (QTL) mapping and a genome-wide association study (GWAS). The variance explained by each great tit chromosome scales with predicted chromosome size, no location in the genome contains genome-wide significant QTL, and no individual SNPs are associated with a large proportion of phenotypic variation, all of which may suggest that variation in both traits is due to many loci of small effect, located across the genome. There is no evidence that any regions of the genome contribute significantly to both traits, which combined with a small, nonsignificant negative genetic covariance between the traits, suggests the absence of genetic constraints on the independent evolution of these traits. Our findings support the hypothesis that variation in life history traits in natural populations is likely to be determined by many loci of small effect spread throughout the genome, which are subject to continued input of variation by mutation and migration, although we cannot exclude the possibility of an additional input of major effect genes influencing either trait. © 2013 John Wiley & Sons Ltd.

  6. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics.

    Science.gov (United States)

    Gussow, Ayal B; Copeland, Brett R; Dhindsa, Ryan S; Wang, Quanli; Petrovski, Slavé; Majoros, William H; Allen, Andrew S; Goldstein, David B

    2017-01-01

    There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage. We show that Orion is highly correlated with known intolerant regions as well as regions that harbor putatively pathogenic variation. This approach provides a mechanism to identify pathogenic variation in the human non-coding genome and will have immediate utility in the diagnostic interpretation of patient genomes and in large case control studies using whole-genome sequences.

  7. Complete chloroplast genomes from apomictic Taraxacum (Asteraceae): Identity and variation between three microspecies

    Science.gov (United States)

    Majeský, Ľuboš; Schwarzacher, Trude; Gornall, Richard; Heslop-Harrison, Pat

    2017-01-01

    Chloroplast DNA sequences show substantial variation between higher plant species, and less variation within species, so are typically excellent markers to investigate evolutionary, population and genetic relationships and phylogenies. We sequenced the plastomes of Taraxacum obtusifrons Markl. (O978); T. stridulum Trávniček ined. (S3); and T. amplum Markl. (A978), three apomictic triploid (2n = 3x = 24) dandelions from the T. officinale agg. We aimed to characterize the variation in plastomes, define relationships and correlations with the apomictic microspecies status, and refine placement of the microspecies in the evolutionary or phylogenetic context of the Asteraceae. The chloroplast genomes of accessions O978 and S3 were identical and 151,322 bp long (where the nuclear genes are known to show variation), while A978 was 151,349 bp long. All three genomes contained 135 unique genes, with an additional copy of the trnF-GGA gene in the LSC region and 20 duplicated genes in the IR region, along with short repeats, the typical major Inverted Repeats (IR1 and IR2, 24,431 bp long), and Large and Small Single Copy regions (LSC 83,889 bp and SSC 18,571 bp in O978). Between the two Taraxacum plastome types, we identified 28 SNPs. The distribution of polymorphisms suggests some parts of the Taraxacum plastome are evolving at a slower rate. There was a hemi-nested inversion in the LSC region that is common to Asteraceae, and an SSC inversion from ndhF to rps15 found only in some Asteraceae lineages. A comparative repeat analysis showed variation between Taraxacum and the phylogenetically close genus Lactuca, with many more direct repeats of 40 bp or more in Lactuca (1% larger plastome than Taraxacum). When individual genes and non-coding regions were used for Asteraceae phylogeny reconstruction, not all showed the same evolutionary scenario, suggesting care is needed for interpretation of relationships if a limited number of markers are used. Studying genotypic diversity in
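
    The region lengths quoted in this abstract can be checked against the total plastome length, since a circular plastome of this type is made up of the LSC, the SSC and two copies of the inverted repeat. The following is a minimal sketch using only the figures given for accession O978; the variable names are illustrative assumptions, not part of the study.

```python
# Region lengths in base pairs, as quoted in the abstract for accession O978.
LSC = 83_889   # Large Single Copy region
SSC = 18_571   # Small Single Copy region
IR = 24_431    # length of each inverted repeat (IR1 and IR2)

# The plastome contains one LSC, one SSC and two inverted repeats.
total = LSC + SSC + 2 * IR
print(total)  # 151322
```

    The sum agrees with the 151,322 bp reported for O978 and S3.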

  8. EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics

    Directory of Open Access Journals (Sweden)

    Coutant Sophie

    2012-09-01

    Background: Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in medical genetics history, impacting both fundamental research and diagnostic methods leading to personalized medicine. A plethora of efficient algorithms has been developed to ensure the variant discovery. They generally lead to ~20,000 variations that have to be narrowed down to find the potential pathogenic allelic variant(s) and the affected gene(s). For this purpose, commonly adopted procedures which implicate various filtering strategies have emerged: exclusion of common variations, type of the allelic variants, pathogenicity effect prediction, modes of inheritance and multiple individuals for exome comparison. To deal with the expansion of WES in individual medical genomics laboratories, new convivial and versatile software tools have to implement these filtering steps. Non-programmer biologists need to be able to combine different filtering criteria themselves and pursue a personal strategy depending on their assumptions and study design. Results: We describe EVA (Exome Variation Analyzer), a user-friendly web-interfaced software dedicated to the filtering strategies for medical WES. Thanks to different modules, EVA (i) integrates and stores annotated exome variation data as strictly confidential to the project owner, (ii) allows the main filters dealing with common variations, molecular types, inheritance mode and multiple samples to be combined, (iii) offers the browsing of annotated data and filtered results in various interactive tables, graphical visualizations and statistical charts, and (iv) finally offers export files and cross-links to useful external databases and software for further prioritization of the small subset of sorted candidate variations and genes. We report a demonstrative case study that allowed the identification of a new candidate gene

  9. A genome wide survey of SNP variation reveals the genetic structure of sheep breeds.

    Directory of Open Access Journals (Sweden)

    James W Kijas

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.

  10. A Bayesian method and its variational approximation for prediction of genomic breeding values in multiple traits

    Directory of Open Access Journals (Sweden)

    Hayashi Takeshi

    2013-01-01

    Background: Genomic selection is an effective tool for animal and plant breeding, allowing effective individual selection without phenotypic records through the prediction of genomic breeding value (GBV). To date, genomic selection has focused on a single trait. However, actual breeding often targets multiple correlated traits, and, therefore, joint analysis taking into consideration the correlation between traits, which might result in more accurate GBV prediction than analyzing each trait separately, is suitable for multi-trait genomic selection. This would require an extension of the prediction model for single-trait GBV to the multi-trait case. As the computational burden of multi-trait analysis is even higher than that of single-trait analysis, an effective computational method for constructing a multi-trait prediction model is also needed. Results: We described a Bayesian regression model incorporating variable selection for jointly predicting GBVs of multiple traits and devised both an MCMC iteration and variational approximation for Bayesian estimation of parameters in this multi-trait model. The proposed Bayesian procedures with MCMC iteration and variational approximation were referred to as MCBayes and varBayes, respectively. Using simulated datasets of SNP genotypes and phenotypes for three traits with high and low heritabilities, we compared the accuracy in predicting GBVs between multi-trait and single-trait analyses as well as between MCBayes and varBayes. The results showed that, compared to single-trait analysis, multi-trait analysis enabled much more accurate GBV prediction for low-heritability traits correlated with high-heritability traits, by utilizing the correlation structure between traits, while the prediction accuracy for uncorrelated low-heritability traits was comparable or less with multi-trait analysis in comparison with single-trait analysis depending on the setting for prior probability that a SNP has zero

  11. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.

    Science.gov (United States)

    Peng, Ting; Wang, Li; Li, Guisen

    2017-08-11

    APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races using data from the 1000 Genomes Project. Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory region or coding region were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and admixed populations. Tag SNPs were selected to evaluate the haplotype diversities in the four populations using HaploStats software. The APOL1 gene was surrounded by some of the most polymorphic genes in the human genome, and variation in the APOL1 gene was common, with up to 613 SNPs (reported by the 1000 Genomes Project), 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR. A total of 12 SNPs in the URR and 24 SNPs in the 3'UTR were considered common variants with MAF ≥ 1%. It is worth noting that URR-1 was present at lower frequencies in European populations, while the other three haplotypes showed the opposite pattern; the 3'UTR presented several high-frequency variation sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 were present at much higher frequencies in the African population, while UTR-2, UTR-3 and UTR-4 were much lower. In the APOL1 coding region, the two SNPs of G1 with higher frequency actually pull down the frequency of haplotype H-1 when all populations are pooled, and the diversity among the four populations is widened by the two G1 mutations (P1 = 3.33E-4 vs P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes were significantly different among the different populations, in either regulatory or coding regions. It could provide

  12. Genome-wide recombination dynamics are associated with phenotypic variation in maize.

    Science.gov (United States)

    Pan, Qingchun; Li, Lin; Yang, Xiaohong; Tong, Hao; Xu, Shutu; Li, Zhigang; Li, Weiya; Muehlbauer, Gary J; Li, Jiansheng; Yan, Jianbing

    2016-05-01

    Meiotic recombination is a major driver of genetic diversity, species evolution, and agricultural improvement. Thus, an understanding of the genetic recombination landscape across the maize (Zea mays) genome will provide insight and tools for further study of maize evolution and improvement. Here, we used c. 50 000 single nucleotide polymorphisms to precisely map recombination events in 12 artificial maize segregating populations. We observed substantial variation in the recombination frequency and distribution along the ten maize chromosomes among the 12 populations and identified 143 recombination hot regions. Recombination breakpoints were partitioned into intragenic and intergenic events. Interestingly, an increase in the number of genes containing recombination events was accompanied by a decrease in the number of recombination events per gene. This kept the overall number of intragenic recombination events nearly invariable in a given population, suggesting that the recombination variation observed among populations was largely attributed to intergenic recombination. However, significant associations between intragenic recombination events and variation in gene expression and agronomic traits were observed, suggesting potential roles for intragenic recombination in plant phenotypic diversity. Our results provide a comprehensive view of the maize recombination landscape, and show an association between recombination, gene expression and phenotypic variation, which may enhance crop genetic improvement. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  13. Genome-size Variation in Switchgrass (Panicum virgatum): Flow Cytometry and Cytology Reveal Rampant Aneuploidy

    Directory of Open Access Journals (Sweden)

    Denise E. Costich

    2010-11-01

    Switchgrass (Panicum virgatum L.), a native perennial dominant of the prairies of North America, has been targeted as a model herbaceous species for biofeedstock development. A flow-cytometric survey of a core set of 11 primarily upland polyploid switchgrass accessions indicated that there was considerable variation in genome size within each accession, particularly at the octoploid (2n = 8x = 72 chromosome) ploidy level. Highly variable chromosome counts in mitotic cell preparations indicated that aneuploidy was more common in octoploids (86.3%) than tetraploids (23.2%). Furthermore, the incidence of hyper- versus hypoaneuploidy is equivalent in tetraploids. This is clearly not the case in octoploids, where close to 90% of the aneuploid counts are lower than the euploid number. Cytogenetic investigation using fluorescent in situ hybridization (FISH) revealed an unexpected degree of variation in chromosome structure underlying the apparent genomic instability of this species. These results indicate that rapid advances in the breeding of polyploid biofuel feedstocks, based on the molecular-genetic dissection of biomass characteristics and yield, will be predicated on the continual improvement of our understanding of the cytogenetics of these species.
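
    The distinction between hypo- and hyperaneuploidy drawn in this abstract amounts to comparing a chromosome count with the euploid number for the ploidy level. The sketch below assumes a base chromosome number of x = 9, which is implied by the octoploid count of 72; the function names and the example counts are hypothetical and are not data from the study.

```python
BASE_NUMBER = 9  # x, implied by 72 chromosomes at the octoploid (8x) level


def euploid_number(ploidy: int, base: int = BASE_NUMBER) -> int:
    """Expected chromosome count for a given ploidy level (e.g. 8 -> 72)."""
    return ploidy * base


def classify_count(count: int, ploidy: int) -> str:
    """Label a chromosome count relative to the euploid number."""
    expected = euploid_number(ploidy)
    if count == expected:
        return "euploid"
    return "hypoaneuploid" if count < expected else "hyperaneuploid"


# Hypothetical counts, invented for illustration only.
example_counts = [72, 70, 68, 73]
labels = [classify_count(c, ploidy=8) for c in example_counts]
print(labels)
# ['euploid', 'hypoaneuploid', 'hypoaneuploid', 'hyperaneuploid']
```

    The same function covers tetraploids by passing ploidy=4, for which the euploid number under this assumption is 36.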

  14. A genome-wide analysis of putative functional and exonic variation associated with extremely high intelligence.

    Science.gov (United States)

    Spain, S L; Pedroso, I; Kadeva, N; Miller, M B; Iacono, W G; McGue, M; Stergiakouli, E; Davey Smith, G; Putallaz, M; Lubinski, D; Meaburn, E L; Plomin, R; Simpson, M A

    2016-08-01

    Although individual differences in intelligence (general cognitive ability) are highly heritable, molecular genetic analyses to date have had limited success in identifying specific loci responsible for its heritability. This study is the first to investigate exome variation in individuals of extremely high intelligence. Under the quantitative genetic model, sampling from the high extreme of the distribution should provide increased power to detect associations. We therefore performed a case-control association analysis with 1409 individuals drawn from the top 0.0003 (IQ >170) of the population distribution of intelligence and 3253 unselected population-based controls. Our analysis focused on putative functional exonic variants assayed on the Illumina HumanExome BeadChip. We did not observe any individual protein-altering variants that are reproducibly associated with extremely high intelligence and within the entire distribution of intelligence. Moreover, no significant associations were found for multiple rare alleles within individual genes. However, analyses using genome-wide similarity between unrelated individuals (genome-wide complex trait analysis) indicate that the genotyped functional protein-altering variation yields a heritability estimate of 17.4% (s.e. 1.7%) based on a liability model. In addition, investigation of nominally significant associations revealed fewer rare alleles associated with extremely high intelligence than would be expected under the null hypothesis. This observation is consistent with the hypothesis that rare functional alleles are more frequently detrimental than beneficial to intelligence.

  15. Functional conservation of nucleosome formation selectively biases presumably neutral molecular variation in yeast genomes.

    Science.gov (United States)

    Babbitt, Gregory A; Cotter, C R

    2011-01-01

    One prominent pattern of mutational frequency, long appreciated in comparative genomics, is the bias of purine/pyrimidine conserving substitutions (transitions) over purine/pyrimidine altering substitutions (transversions). Traditionally, this transitional bias has been thought to be driven by the underlying rates of DNA mutation and/or repair. However, recent sequencing studies of mutation accumulation lines in model organisms demonstrate that substitutions generally do not accumulate at rates that would indicate a transitional bias. These observations have called into question a very basic assumption of molecular evolution; that naturally occurring patterns of molecular variation in noncoding regions accurately reflect the underlying processes of randomly accumulating neutral mutation in nuclear genomes. Here, in Saccharomyces yeasts, we report a very strong inverse association (r = -0.951, P < 0.004) between the genome-wide frequency of substitutions and their average energetic effect on nucleosome formation, as predicted by a structurally based energy model of DNA deformation around the nucleosome core. We find that transitions occurring at sites positioned nearest the nucleosome surface, which are believed to function most importantly in nucleosome formation, alter the deformation energy of DNA to the nucleosome core by only a fraction of the energy changes typical of most transversions. When we examined the same substitutions set against random background sequences as well as an existing study reporting substitutions arising in mutation accumulation lines of Saccharomyces cerevisiae, we failed to find a similar relationship. These results support the idea that natural selection acting to functionally conserve chromatin organization may contribute significantly to genome-wide transitional bias, even in noncoding regions. Because nucleosome core structure is highly conserved across eukaryotes, our observations may also help to further explain locally elevated
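
    The transition/transversion distinction used in this abstract can be stated compactly: a substitution is a transition when both bases are purines (A, G) or both are pyrimidines (C, T), and a transversion otherwise. The following is a minimal sketch of that classification; the function name is an illustrative assumption and the code is not taken from the study.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # A transition keeps the base class (purine or pyrimidine) unchanged.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Count the two classes over all 12 possible substitutions.
counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[substitution_class(ref, alt)] += 1
print(counts)  # {'transition': 4, 'transversion': 8}
```

    Because only 4 of the 12 possible substitutions are transitions, equal rates for every substitution would give half as many transitions as transversions; an observed excess of transitions is what is meant by transitional bias.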

  16. Genome-Wide Association Study Reveals Natural Variations Contributing to Drought Resistance in Crops

    Directory of Open Access Journals (Sweden)

    Hongwei Wang

    2017-06-01

    Crops are often cultivated in regions where they will face environmental adversities, resulting in substantial yield loss which can ultimately lead to food and societal problems. Thus, significant efforts have been made to breed stress tolerant cultivars in an attempt to minimize these problems and to produce more stability with respect to crop yields across broad geographies. Since stress tolerance is a complex and multi-genic trait, advancements with classical breeding approaches have been challenging. On the other hand, molecular breeding, which is based on transgenics, marker-assisted selection and genome editing technologies, holds great promise to enable farmers to better cope with these challenges. However, identification of the key genetic components underlying the trait is critical and will serve as the foundation for future crop genetic improvement. Recently, genome-wide association studies have made significant contributions to facilitate the discovery of natural variation contributing to stress tolerance in crops. From these studies, the identified loci can serve as targets for genomic selection or editing to enable the molecular design of new cultivars. Here, we summarize research progress on this issue and focus on the genetic basis of drought tolerance as revealed by genome-wide association studies and quantitative trait loci mapping. Although many favorable loci have been identified, elucidation of their molecular mechanisms contributing to increased stress tolerance still remains a challenge. Thus, continuous efforts are still required to functionally dissect this complex trait through comprehensive approaches, such as systems biology studies. It is expected that proper application of the acquired knowledge will enable the development of stress tolerant cultivars, allowing agricultural production to become more sustainable under dynamic environmental conditions.

  17. Trait variation and genetic diversity in a banana genomic selection training population.

    Directory of Open Access Journals (Sweden)

    Moses Nyine

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  18. Trait variation and genetic diversity in a banana genomic selection training population.

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim; Doležel, Jaroslav

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  19. Trait variation and genetic diversity in a banana genomic selection training population

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31–35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents. PMID:28586365

  20. Population genomics of Pacific lamprey: adaptive variation in a highly dispersive species.

    Science.gov (United States)

    Hess, Jon E; Campbell, Nathan R; Close, David A; Docker, Margaret F; Narum, Shawn R

    2013-06-01

    Unlike most anadromous fishes that have evolved strict homing behaviour, Pacific lamprey (Entosphenus tridentatus) seem to lack philopatry as evidenced by minimal population structure across the species range. Yet unexplained findings of within-region population genetic heterogeneity coupled with the morphological and behavioural diversity described for the species suggest that adaptive genetic variation underlying fitness traits may be responsible. We employed restriction site-associated DNA sequencing to genotype 4439 quality filtered single nucleotide polymorphism (SNP) loci for 518 individuals collected across a broad geographical area including British Columbia, Washington, Oregon and California. A subset of putatively neutral markers (N = 4068) identified a significant amount of variation among three broad populations: northern British Columbia, Columbia River/southern coast and 'dwarf' adults (F(CT) = 0.02, P ≪ 0.001). Additionally, 162 SNPs were identified as adaptive through outlier tests, and inclusion of these markers revealed a signal of adaptive variation related to geography and life history. The majority of the 162 adaptive SNPs were not independent and formed four groups of linked loci. Analyses with matsam software found that 42 of these outlier SNPs were significantly associated with geography, run timing and dwarf life history, and 27 of these 42 SNPs aligned with known genes or highly conserved genomic regions using the genome browser available for sea lamprey. This study provides both neutral and adaptive context for observed genetic divergence among collections and thus reconciles previous findings of population genetic heterogeneity within a species that displays extensive gene flow. © 2012 John Wiley & Sons Ltd.

  1. Genomic variation and its impact on gene expression in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Andreas Massouras

    Understanding the relationship between genetic and phenotypic variation is one of the great outstanding challenges in biology. To meet this challenge, comprehensive genomic variation maps of human as well as of model organism populations are required. Here, we present a nucleotide resolution catalog of single-nucleotide, multi-nucleotide, and structural variants in 39 Drosophila melanogaster Genetic Reference Panel inbred lines. Using an integrative, local assembly-based approach for variant discovery, we identify more than 3.6 million distinct variants, among which were more than 800,000 unique insertions, deletions (indels), and complex variants (1 to 6,000 bp). While the SNP density is higher near other variants, we find that variants themselves are not mutagenic, nor are regions with high variant density particularly mutation-prone. Rather, our data suggest that the elevated SNP density around variants is mainly due to population-level processes. We also provide insights into the regulatory architecture of gene expression variation in adult flies by mapping cis-expression quantitative trait loci (cis-eQTLs) for more than 2,000 genes. Indels comprise around 10% of all cis-eQTLs and show larger effects than SNP cis-eQTLs. In addition, we identified two-fold more gene associations in males as compared to females and found that most cis-eQTLs are sex-specific, revealing a partial decoupling of the genomic architecture between the sexes as well as the importance of genetic factors in mediating sex-biased gene expression. Finally, we performed RNA-seq-based allelic expression imbalance analyses in the offspring of crosses between sequenced lines, which revealed that the majority of strong cis-eQTLs can be validated in heterozygous individuals.

  2. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    International Nuclear Information System (INIS)

    Yuhki, Naoya; O'Brien, S.J.

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  3. Genetic and epigenetic variation in 5S ribosomal RNA genes reveals genome dynamics in Arabidopsis thaliana.

    Science.gov (United States)

    Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe; Probst, Aline V

    2018-04-06

    Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization.

  4. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    Energy Technology Data Exchange (ETDEWEB)

    Yuhki, Naoya; O'Brien, S.J. (National Cancer Institute, Frederick, MD (USA))

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  5. Identification of genomic copy number variations associated with specific clinical features of head and neck cancer.

    Science.gov (United States)

    Zagradišnik, Boris; Krgović, Danijela; Herodež, Špela Stangler; Zagorac, Andreja; Ćižmarević, Bogdan; Vokač, Nadja Kokalj

    2018-01-01

    Copy number variations (CNVs) of large genomic regions are an important mechanism implicated in the development of head and neck cancer; however, for most changes their exact role is not well understood. The aim of this study was to find possible associations between gains/losses of genomic regions and clinically distinct subgroups of head and neck cancer patients. Array comparative genomic hybridization (aCGH) analysis was performed on DNA samples in 64 patients with cancer in the oral cavity, oropharynx or hypopharynx. Overlapping genomic regions created from gains and losses were used for statistical analysis. The following regions were overrepresented: in tumors with stage I or II, a gain of 2.98 Mb on 6p21.2-p11 and a gain of 7.4 Mb on 8q11.1-q11.23; in tumors with grade I histology, a gain of 1.1 Mb on 8q24.13, a loss of a large part of the p arm of chromosome 3, a loss of 1.24 Mb on 6q14.3, and a loss of the terminal 32 Mb region of 8p23.3; in cases with affected lymph nodes, a gain of 0.75 Mb on 3q24 and a gain of 0.9 Mb on 3q26.32-q26.33; in cases with unaffected lymph nodes, a gain of 1.1 Mb on 8q23.3; in patients not treated with surgery, a gain of 12.2 Mb on 7q21.3-q22.3 and a gain of 0.33 Mb on 20q11.22. Our study identified several genomic regions of interest which appear to be associated with various clinically distinct subgroups of head and neck cancer. They represent a potentially important source of biomarkers useful for the clinical management of head and neck cancer. In particular, the PIK3CA and AGTR1 genes could be singled out to predict the lymph node involvement.

  6. Genomic Analysis of Hepatitis B Virus Reveals Antigen State and Genotype as Sources of Evolutionary Rate Variation

    Science.gov (United States)

    Harrison, Abby; Lemey, Philippe; Hurles, Matthew; Moyes, Chris; Horn, Susanne; Pryor, Jan; Malani, Joji; Supuri, Mathias; Masta, Andrew; Teriboriki, Burentau; Toatu, Tebuka; Penny, David; Rambaut, Andrew; Shapiro, Beth

    2011-01-01

    Hepatitis B virus (HBV) genomes are small, semi-double-stranded DNA circular genomes that contain alternating overlapping reading frames and replicate through an RNA intermediary phase. This complex biology has presented a challenge to estimating an evolutionary rate for HBV, leading to difficulties resolving the evolutionary and epidemiological history of the virus. Here, we re-examine rates of HBV evolution using a novel data set of 112 within-host, transmission history (pedigree) and among-host genomes isolated over 20 years from the indigenous peoples of the South Pacific, combined with 313 previously published HBV genomes. We employ Bayesian phylogenetic approaches to examine several potential causes and consequences of evolutionary rate variation in HBV. Our results reveal rate variation both between genotypes and across the genome, as well as strikingly slower rates when genomes are sampled in the Hepatitis B e antigen positive state, compared to the e antigen negative state. This Hepatitis B e antigen rate variation was found to be largely attributable to changes during the course of infection in the preCore and Core genes and their regulatory elements. PMID:21765983

  7. Plasticity of the Leishmania genome leading to gene copy number variations and drug resistance [version 1; referees: 5 approved]

    Directory of Open Access Journals (Sweden)

    Marie-Claude N. Laffitte

    2016-09-01

    Leishmania has a plastic genome, and drug pressure can select for gene copy number variation (CNV). CNVs can apply either to whole chromosomes, leading to aneuploidy, or to specific genomic regions. For the latter, the amplification of chromosomal regions occurs at the level of homologous direct or inverted repeated sequences leading to extrachromosomal circular or linear amplified DNAs. This ability of Leishmania to respond to drug pressure by CNVs has led to the development of genomic screens such as Cos-Seq, which has the potential of expediting the discovery of drug targets for novel promising drug candidates.

  8. Variation in the OC locus of Acinetobacter baumannii genomes predicts extensive structural diversity in the lipooligosaccharide.

    Directory of Open Access Journals (Sweden)

    Johanna J Kenyon

    Lipooligosaccharide (LOS) is a complex surface structure that is linked to many pathogenic properties of Acinetobacter baumannii. In A. baumannii, the genes responsible for the synthesis of the outer core (OC) component of the LOS are located between ilvE and aspS. The content of the OC locus is usually variable within a species, and examination of 6 complete and 227 draft A. baumannii genome sequences available in GenBank non-redundant and Whole Genome Shotgun databases revealed nine distinct new types, OCL4-OCL12, in addition to the three known ones. The twelve gene clusters fell into two distinct groups, designated Group A and Group B, based on similarities in the genes present. OCL6 (Group B) was unique in that it included genes for the synthesis of L-Rhamnosep. Genetic exchange of the different configurations between strains has occurred as some OC forms were found in several different sequence types (STs). OCL1 (Group A) was the most widely distributed being present in 18 STs, and OCL6 was found in 16 STs. Variation within clones was also observed, with more than one OC locus type found in the two globally disseminated clones, GC1 and GC2, that include the majority of multiply antibiotic resistant isolates. OCL1 was the most abundant gene cluster in both GC1 and GC2 genomes but GC1 isolates also carried OCL2, OCL3 or OCL5, and OCL3 was also present in GC2. As replacement of the OC locus in the major global clones indicates the presence of sub-lineages, a PCR typing scheme was developed to rapidly distinguish Group A and Group B types, and to distinguish the specific forms found in GC1 and GC2 isolates.

  9. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
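
    As an illustration of the statistic this abstract describes (not code from the paper), the following minimal Python sketch tabulates the unfolded SFS and the "folded" SFS from a small matrix of 0/1 haplotypes. The function names and the toy data are hypothetical, and 1 is assumed to mark the derived allele.

```python
def sample_frequency_spectrum(haplotypes):
    """Return the unfolded SFS of a sample.

    haplotypes: list of n equal-length sequences of 0/1,
    where 1 is assumed to be the derived allele.
    Entry i-1 of the result counts sites carrying exactly
    i derived copies, for i = 1 .. n-1 (segregating sites only).
    """
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    sfs = [0] * (n - 1)
    for site in range(num_sites):
        derived = sum(h[site] for h in haplotypes)
        if 0 < derived < n:  # skip sites that are fixed in the sample
            sfs[derived - 1] += 1
    return sfs


def fold(sfs):
    """Fold an unfolded SFS by minor-allele count.

    Used when the ancestral state is unknown: a site with i
    derived copies is counted under min(i, n - i).
    """
    n = len(sfs) + 1
    folded = [0] * (n // 2)
    for i, count in enumerate(sfs, start=1):
        folded[min(i, n - i) - 1] += count
    return folded


# Toy sample: 4 haplotypes (rows) by 4 sites (columns).
toy = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]
unfolded = sample_frequency_spectrum(toy)
folded = fold(unfolded)
```

    In this toy sample two sites carry three derived copies and one site carries a single derived copy, so folding places all three sites in the minor-allele-count-1 class.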

  10. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011

  11. Genome-wide copy number variation (CNV) in patients with autoimmune Addison's disease

    Directory of Open Access Journals (Sweden)

    Brønstad Ingeborg

    2011-08-01

    Background: Addison's disease (AD) is caused by an autoimmune destruction of the adrenal cortex. The pathogenesis is multi-factorial, involving genetic components and hitherto unknown environmental factors. The aim of the present study was to investigate if gene dosage in the form of copy number variation (CNV) could add to the repertoire of genetic susceptibility to autoimmune AD. Methods: A genome-wide study using the Affymetrix GeneChip® Genome-Wide Human SNP Array 6.0 was conducted in 26 patients with AD. CNVs in selected genes were further investigated in a larger material of patients with autoimmune AD (n = 352) and healthy controls (n = 353) by duplex Taqman real-time polymerase chain reaction assays. Results: We found that low copy number of UGT2B28 was significantly more frequent in AD patients compared to controls; conversely high copy number of ADAM3A was associated with AD. Conclusions: We have identified two novel CNV associations to ADAM3A and UGT2B28 in AD. The mechanism by which this susceptibility is conferred is at present unclear, but may involve steroid inactivation (UGT2B28) and T cell maturation (ADAM3A). Characterization of these proteins may unravel novel information on the pathogenesis of autoimmunity.

  12. Distinct Contributions of Replication and Transcription to Mutation Rate Variation of Human Genomes

    KAUST Repository

    Cui, Peng; Ding, Feng; Lin, Qiang; Zhang, Lingfang; Li, Ang; Zhang, Zhang; Hu, Songnian; Yu, Jun

    2012-01-01

    Here, we evaluate the contribution of two major biological processes, DNA replication and transcription, to mutation rate variation in human genomes. Based on analysis of the public human tissue transcriptomics data, high-resolution replicating map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to locate in late-replicating genomic regions and genes in such regions have a higher SNP density compared to those in early-replication regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes.

  13. Genome-wide copy number variation (CNV) in patients with autoimmune Addison's disease

    Science.gov (United States)

    2011-01-01

    Background: Addison's disease (AD) is caused by an autoimmune destruction of the adrenal cortex. The pathogenesis is multi-factorial, involving genetic components and hitherto unknown environmental factors. The aim of the present study was to investigate if gene dosage in the form of copy number variation (CNV) could add to the repertoire of genetic susceptibility to autoimmune AD. Methods: A genome-wide study using the Affymetrix GeneChip® Genome-Wide Human SNP Array 6.0 was conducted in 26 patients with AD. CNVs in selected genes were further investigated in a larger material of patients with autoimmune AD (n = 352) and healthy controls (n = 353) by duplex Taqman real-time polymerase chain reaction assays. Results: We found that low copy number of UGT2B28 was significantly more frequent in AD patients compared to controls; conversely high copy number of ADAM3A was associated with AD. Conclusions: We have identified two novel CNV associations to ADAM3A and UGT2B28 in AD. The mechanism by which this susceptibility is conferred is at present unclear, but may involve steroid inactivation (UGT2B28) and T cell maturation (ADAM3A). Characterization of these proteins may unravel novel information on the pathogenesis of autoimmunity. PMID:21851588

  14. A genome-wide association study of copy number variations with umbilical hernia in swine.

    Science.gov (United States)

    Long, Yi; Su, Ying; Ai, Huashui; Zhang, Zhiyan; Yang, Bin; Ruan, Guorong; Xiao, Shijun; Liao, Xinjun; Ren, Jun; Huang, Lusheng; Ding, Nengshui

    2016-06-01

    Umbilical hernia (UH) is one of the most common congenital defects in pigs, leading to considerable economic loss and serious animal welfare problems. To test whether copy number variations (CNVs) contribute to pig UH, we performed a case-control genome-wide CNV association study on 905 pigs from the Duroc, Landrace and Yorkshire breeds using the Porcine SNP60 BeadChip and the PennCNV algorithm. We first constructed a genomic map comprising 6193 CNVs that pertain to 737 CNV regions. Then, we identified eight CNVs significantly associated with the risk for UH in the three pig breeds. Six of seven significantly associated CNVs were validated using quantitative real-time PCR. Notably, a rare CNV (CNV14:13030843-13059455) encompassing the NUGGC gene was strongly associated with UH (permutation-corrected P = 0.0015) in Duroc pigs. This CNV occurred exclusively in seven Duroc UH-affected individuals. SNPs surrounding the CNV did not show association signals, indicating that rare CNVs may play an important role in complex pig diseases such as UH. The NUGGC gene has been implicated in human omphalocele and inguinal hernia. Our findings support the view that CNVs, including the NUGGC CNV, contribute to the pathogenesis of pig UH. © 2016 Stichting International Foundation for Animal Genetics.
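
    The step from individual CNV calls to CNV regions is commonly done by merging overlapping calls on the same chromosome. The Python sketch below shows that general idea only; it is not the authors' pipeline, and the function name, the (chromosome, start, end) tuple layout and the coordinates are made up for illustration.

```python
def merge_into_regions(cnv_calls):
    """Merge overlapping CNV calls into CNV regions.

    cnv_calls: iterable of (chromosome, start, end) tuples with
    start <= end. Calls on the same chromosome whose coordinates
    overlap (or touch) are collapsed into one region spanning them.
    Returns the regions sorted by chromosome and start position.
    """
    regions = []
    for chrom, start, end in sorted(cnv_calls):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the current region: extend its end if needed.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # No overlap: start a new region.
            regions.append((chrom, start, end))
    return regions


# Hypothetical calls (illustrative coordinates, not data from the study).
calls = [
    (14, 100, 200),
    (14, 150, 300),
    (3, 10, 20),
    (14, 400, 500),
]
regions = merge_into_regions(calls)
```

    Sorting first means each call only has to be compared with the most recently opened region, so the merge is a single pass after the sort.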

  15. Analysis of Genetic Variation across the Encapsidated Genome of Microplitis demolitor Bracovirus in Parasitoid Wasps.

    Directory of Open Access Journals (Sweden)

    Gaelen R Burke

    Insect parasitoids must complete part of their life cycle within or on another insect, ultimately resulting in the death of the host insect. One group of parasitoid wasps, the 'microgastroid complex' (Hymenoptera: Braconidae), engage in an association with beneficial symbiotic viruses that are essential for successful parasitism of hosts. These viruses, known as Bracoviruses, persist in an integrated form in the wasp genome, and activate to replicate in wasp ovaries during development to ultimately be delivered into host insects during parasitism. The lethal nature of host-parasitoid interactions, combined with the involvement of viruses in mediating these interactions, has led to the hypothesis that Bracoviruses are engaged in an arms race with hosts, resulting in recurrent adaptation in viral (and host) genes. Deep sequencing was employed to characterize sequence variation across the encapsidated Bracovirus genome within laboratory and field populations of the parasitoid wasp species Microplitis demolitor. Contrary to expectations, there was a paucity of evidence for positive directional selection among virulence genes, which generally exhibited signatures of purifying selection. These data suggest that the dynamics of host-parasite interactions may not result in recurrent rounds of adaptation, and that adaptation may be more variable in time than previously expected.

  16. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    Science.gov (United States)

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  17. Distinct Contributions of Replication and Transcription to Mutation Rate Variation of Human Genomes

    KAUST Repository

    Cui, Peng

    2012-03-23

    Here, we evaluate the contribution of two major biological processes, DNA replication and transcription, to mutation rate variation in human genomes. Based on analysis of the public human tissue transcriptomics data, high-resolution replicating map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to locate in late-replicating genomic regions and genes in such regions have a higher SNP density compared to those in early-replication regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes.

  18. A refined model of the genomic basis for phenotypic variation in vertebrate hemostasis.

    Science.gov (United States)

    Ribeiro, Ângela M; Zepeda-Mendoza, M Lisandra; Bertelsen, Mads F; Kristensen, Annemarie T; Jarvis, Erich D; Gilbert, M Thomas P; da Fonseca, Rute R

    2015-06-30

    Hemostasis is a defense mechanism that enhances an organism's survival by minimizing blood loss upon vascular injury. In vertebrates, hemostasis has been evolving with the cardio-vascular and hemodynamic systems over the last 450 million years. Birds and mammals have very similar vascular and hemodynamic systems, thus the mechanism that blocks ruptures in the vasculature is expected to be the same. However, the speed of the process varies across vertebrates, and is particularly slow for birds. Understanding the differences in the hemostasis pathway between birds and mammals, and placing them in perspective to other vertebrates may provide clues to the genetic contribution to variation in blood clotting phenotype in vertebrates. We compiled genomic data corresponding to key elements involved in hemostasis across vertebrates to investigate its genetic basis and understand how it affects fitness. We found that: i) fewer genes are involved in hemostasis in birds compared to mammals; and ii) the largest differences concern platelet membrane receptors and components from the kallikrein-kinin system. We propose that lack of the cytoplasmic domain of the GPIb receptor subunit alpha could be a strong contributor to the prolonged bleeding phenotype in birds. Combined analysis of laboratory assessments of avian hemostasis with the first avian phylogeny based on genomic-scale data revealed that differences in hemostasis within birds are not explained by phylogenetic relationships, but more so by genetic variation underlying components of the hemostatic process, suggestive of natural selection. This work adds to our understanding of the evolution of hemostasis in vertebrates. The overlap with the inflammation, complement and renin-angiotensin (blood pressure regulation) pathways is a potential driver of rapid molecular evolution in the hemostasis network. 
    Comparisons between avian species and mammals allowed us to hypothesize that the observed mammalian innovations might have...

  19. A genome-wide association study demonstrates significant genetic variation for fracture risk in Thoroughbred racehorses

    Science.gov (United States)

    2014-01-01

    Background: Thoroughbred racehorses are subject to non-traumatic distal limb bone fractures that occur during racing and exercise. Susceptibility to fracture may be due to underlying disturbances in bone metabolism which have a genetic cause. Fracture risk has been shown to be heritable in several species but this study is the first genetic analysis of fracture risk in the horse. Results: Fracture cases (n = 269) were horses that sustained catastrophic distal limb fractures while racing on UK racecourses, necessitating euthanasia. Control horses (n = 253) were over 4 years of age, were racing during the same time period as the cases, and had no history of fracture at the time the study was carried out. The horses sampled were bred for both flat and National Hunt (NH) jump racing. 43,417 SNPs were employed to perform a genome-wide association analysis and to estimate the proportion of genetic variance attributable to the SNPs on each chromosome using restricted maximum likelihood (REML). Significant genetic variation associated with fracture risk was found on chromosomes 9, 18, 22 and 31. Three SNPs on chromosome 18 (62.05 Mb – 62.15 Mb) and one SNP on chromosome 1 (14.17 Mb) reached genome-wide significance (p [...] fracture than cases, p = 1 × 10⁻⁴), while a second haplotype increases fracture risk (cases at 3.39 times higher risk of fracture than controls, p = 0.042). Conclusions: Fracture risk in the Thoroughbred horse is a complex condition with an underlying genetic basis. Multiple genomic regions contribute to susceptibility to fracture risk. This suggests there is the potential to develop SNP-based estimators for genetic risk of fracture in the Thoroughbred racehorse, using methods pioneered in livestock genetics such as genomic selection. This information would be useful to racehorse breeders and owners, enabling them to reduce the risk of injury in their horses. PMID:24559379

  20. Copy number variation identification and analysis of the chicken genome using a 60K SNP BeadChip.

    Science.gov (United States)

    Rao, Y S; Li, J; Zhang, R; Lin, X R; Xu, J G; Xie, L; Xu, Z Q; Wang, L; Gan, J K; Xie, X J; He, J; Zhang, X Q

    2016-08-01

    Copy number variation (CNV) is an important source of genetic variation in organisms and a main factor that affects phenotypic variation. A comprehensive study of chicken CNV can provide valuable information on genetic diversity and facilitate future analyses of associations between CNV and economically important traits in chickens. In the present study, an F2 full-sib chicken population (554 individuals), established from a cross between Xinghua and White Recessive Rock chickens, was used to explore CNV in the chicken genome. Genotyping was performed using a chicken 60K SNP BeadChip. A total of 1,875 CNV were detected with the PennCNV algorithm, and the average number of CNV was 3.42 per individual. The CNV were distributed across 383 independent CNV regions (CNVR) and covered 41 megabases (3.97%) of the chicken genome. Seven CNVR in 108 individuals were validated by quantitative real-time PCR, and 81 of these individuals (75%) also were detected with the PennCNV algorithm. In total, 274 CNVR (71.54%) identified in the current study were previously reported. Of these, 147 (38.38%) were reported in at least 2 studies. Additionally, 109 of the CNVR (28.46%) discovered here are novel. A total of 709 genes within or overlapping with the CNVR was retrieved. Out of the 2,742 quantitative trait loci (QTL) collected in the chicken QTL database, 43 QTL had confidence intervals overlapping with the CNVR, and 32 CNVR encompassed one or more functional genes. The functional genes located in the CNVR are likely to be the QTG that are associated with underlying economic traits. This study considerably expands our insight into the structural variation in the genome of chickens and provides an important resource for genomic variation, especially for genomic structural variation related to economic traits in chickens. © 2016 Poultry Science Association Inc.
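    The CNVR proportions quoted in the abstract above can be recomputed from the stated counts. The following is a minimal sketch that uses only the numbers given in the abstract; the variable and function names are illustrative and do not come from the study.

```python
# Counts quoted in the abstract; names are illustrative only.
TOTAL_CNVR = 383
counts = {
    "previously reported": 274,
    "reported in at least 2 studies": 147,
    "novel": 109,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of all CNVR falling into each category.
shares = {label: percent(n, TOTAL_CNVR) for label, n in counts.items()}
for label, share in shares.items():
    print(f"{label}: {share}%")

# qPCR validation: 81 of the 108 tested individuals were also called by PennCNV.
validation_rate = percent(81, 108)
print(f"validated by qPCR and PennCNV: {validation_rate}%")
```

    The previously reported and novel CNVR (274 + 109) add up to the 383 regions, so the two headline percentages sum to 100%.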

  1. Detecting single DNA copy number variations in complex genomes using one nanogram of starting DNA and BAC-array CGH.

    Science.gov (United States)

    Guillaud-Bataille, Marine; Valent, Alexander; Soularue, Pascal; Perot, Christine; Inda, Maria Mar; Receveur, Aline; Smaïli, Sadek; Roest Crollius, Hugues; Bénard, Jean; Bernheim, Alain; Gidrol, Xavier; Danglot, Gisèle

    2004-07-29

    Comparative genomic hybridization to bacterial artificial chromosome (BAC)-arrays (array-CGH) is a highly efficient technique, allowing the simultaneous measurement of genomic DNA copy number at hundreds or thousands of loci, and the reliable detection of local one-copy-level variations. We report a genome-wide amplification method allowing the same measurement sensitivity, using 1 ng of starting genomic DNA, instead of the classical 1 µg usually necessary. Using a discrete series of DNA fragments, we defined the parameters adapted to the most faithful ligation-mediated PCR amplification and the limits of the technique. The optimized protocol allows a 3000-fold DNA amplification, retaining the quantitative characteristics of the initial genome. Validation of the amplification procedure, using DNA from 10 tumour cell lines hybridized to BAC-arrays of 1500 spots, showed almost perfectly superimposed ratios for the non-amplified and amplified DNAs. Correlation coefficients of 0.96 and 0.99 were observed for regions of low-copy-level variations and all regions, respectively (including in vivo amplified oncogenes). Finally, labelling DNA using two nucleotides bearing the same fluorophore led to a significant increase in reproducibility and to the correct detection of one-copy gain or loss in >90% of the analysed data, even for pseudotriploid tumour genomes.

  2. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  3. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    Directory of Open Access Journals (Sweden)

    Walker M Andrew

    2006-09-01

    Background: The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results: There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On average, 60.33% of the SNPs were synonymous while 39.67% were non-synonymous. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A sub-set of the strain specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage related functions. Conclusion: INDELs and strain specific genes

  4. Overlap in genomic variation associated with milk fat composition in Holstein Friesian and Dutch native dual-purpose breeds

    NARCIS (Netherlands)

    Maurice - Van Eijndhoven, M.H.T.; Bovenhuis, H.; Veerkamp, R.F.; Calus, M.P.L.

    2015-01-01

    The aim of this study was to identify if genomic variations associated with fatty acid (FA) composition are similar between the Holstein-Friesian (HF) and native dual-purpose breeds used in the Dutch dairy industry. Phenotypic and genotypic information were available for the breeds Meuse-Rhine-Yssel

  5. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep

  6. Genome-wide recombination rate variation in a recombination map of cotton.

    Science.gov (United States)

    Shen, Chao; Li, Ximei; Zhang, Ruiting; Lin, Zhongxu

    2017-01-01

    Recombination is crucial for genetic evolution, which not only provides new allele combinations but also influences the biological evolution and efficacy of natural selection. However, recombination variation is not well understood outside of the complex species' genomes, and it is particularly unclear in Gossypium. Cotton is the most important natural fibre crop and the second largest oil-seed crop. Here, we found that the genetic and physical map distances did not have a simple linear relationship. Recombination rates were unevenly distributed throughout the cotton genome, which showed marked changes along the chromosome lengths, and recombination was completely suppressed in the centromeric regions. Recombination rates significantly varied between A-subgenome (At) (range = 1.60 to 3.26 centimorgan/megabase [cM/Mb]) and D-subgenome (Dt) (range = 2.17 to 4.97 cM/Mb), which explained why the genetic maps of At and Dt are similar but the physical map of Dt is only half that of At. The translocation regions between A02 and A03 and between A04 and A05, and the inversion regions on A10, D10, A07 and D07 indicated relatively high recombination rates in the distal regions of the chromosomes. Recombination rates were positively correlated with the densities of genes, markers and the distance from the centromere, and negatively correlated with transposable elements (TEs). The gene ontology (GO) categories showed that genes in high recombination regions may tend to respond to environmental stimuli, and genes in low recombination regions are related to mitosis and meiosis, which suggested that they may provide the primary driving force in adaptive evolution and assure the stability of basic cell cycle in a rapidly changing environment. Global knowledge of recombination rates will facilitate genetics and breeding in cotton.
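    The cM/Mb unit used in the abstract above is the genetic distance between two markers divided by the physical distance between them. The sketch below illustrates that ratio for adjacent marker pairs. It is not the authors' pipeline, and the marker coordinates are hypothetical values chosen only for the example.

```python
from typing import List, Tuple

# (physical position in Mb, genetic position in cM); purely illustrative type alias.
Marker = Tuple[float, float]


def recombination_rates(markers: List[Marker]) -> List[Tuple[float, float, float]]:
    """Return (start_mb, end_mb, rate_cM_per_Mb) for each adjacent marker interval."""
    ordered = sorted(markers)  # order markers by physical position
    rates = []
    for (mb0, cm0), (mb1, cm1) in zip(ordered, ordered[1:]):
        span_mb = mb1 - mb0
        if span_mb <= 0:
            continue  # skip markers sharing a physical position
        rates.append((mb0, mb1, (cm1 - cm0) / span_mb))
    return rates


# Hypothetical markers: the flat middle interval mimics a region where
# recombination is suppressed, such as a centromere.
example = [(0.0, 0.0), (2.0, 8.0), (10.0, 8.0), (12.0, 14.0)]
rates = recombination_rates(example)
for start, end, rate in rates:
    print(f"{start:5.1f}-{end:5.1f} Mb: {rate:.2f} cM/Mb")
```

    An interval where the genetic position does not change yields a rate of 0 cM/Mb, which is how suppressed recombination would appear in such a per-interval calculation.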

  7. Autism genome-wide copy number variation reveals ubiquitin and neuronal genes.

    Science.gov (United States)

    Glessner, Joseph T; Wang, Kai; Cai, Guiqing; Korvatska, Olena; Kim, Cecilia E; Wood, Shawn; Zhang, Haitao; Estes, Annette; Brune, Camille W; Bradfield, Jonathan P; Imielinski, Marcin; Frackelton, Edward C; Reichert, Jennifer; Crawford, Emily L; Munson, Jeffrey; Sleiman, Patrick M A; Chiavacci, Rosetta; Annaiah, Kiran; Thomas, Kelly; Hou, Cuiping; Glaberson, Wendy; Flory, James; Otieno, Frederick; Garris, Maria; Soorya, Latha; Klei, Lambertus; Piven, Joseph; Meyer, Kacie J; Anagnostou, Evdokia; Sakurai, Takeshi; Game, Rachel M; Rudd, Danielle S; Zurawiecki, Danielle; McDougle, Christopher J; Davis, Lea K; Miller, Judith; Posey, David J; Michaels, Shana; Kolevzon, Alexander; Silverman, Jeremy M; Bernier, Raphael; Levy, Susan E; Schultz, Robert T; Dawson, Geraldine; Owley, Thomas; McMahon, William M; Wassink, Thomas H; Sweeney, John A; Nurnberger, John I; Coon, Hilary; Sutcliffe, James S; Minshew, Nancy J; Grant, Struan F A; Bucan, Maja; Cook, Edwin H; Buxbaum, Joseph D; Devlin, Bernie; Schellenberg, Gerard D; Hakonarson, Hakon

    2009-05-28

    Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with approximately 550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10⁻³). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10⁻³). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10⁻⁶). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.

  8. Overlap in genomic variation associated with milk fat composition in Holstein Friesian and Dutch native dual-purpose breeds.

    Science.gov (United States)

    Maurice-Van Eijndhoven, M H T; Bovenhuis, H; Veerkamp, R F; Calus, M P L

    2015-09-01

    The aim of this study was to identify if genomic variations associated with fatty acid (FA) composition are similar between the Holstein-Friesian (HF) and native dual-purpose breeds used in the Dutch dairy industry. Phenotypic and genotypic information were available for the breeds Meuse-Rhine-Yssel (MRY), Dutch Friesian (DF), Groningen White Headed (GWH), and HF. First, the reliability of genomic breeding values of the native Dutch dual-purpose cattle breeds MRY, DF, and GWH was evaluated using single nucleotide polymorphism (SNP) effects estimated in HF, including all SNP or subsets with stronger associations in HF. Second, the genomic variation of the regions associated with FA composition in HF (regions on Bos taurus autosome 5, 14, and 26), were studied in the different breeds. Finally, similarities in genotype and allele frequencies between MRY, DF, GWH, and HF breeds were assessed for specific regions associated with FA composition. On average across the traits, the highest reliabilities of genomic prediction were estimated for GWH (0.158) and DF (0.116) when the 8 to 22 SNP with the strongest association in HF were included. With the same set of SNP, GEBV for MRY were the least reliable (0.022). This indicates that on average only 2 (MRY) to 16% (GWH) of the genomic variation in HF is shared with the native Dutch dual-purpose breeds. The comparison of predicted variances of different regions associated with milk and milk fat composition showed that breeds clearly differed in genomic variation within these regions. Finally, the correlations of allele frequencies between breeds across the 8 to 22 SNP with the strongest association in HF were around 0.8 between the Dutch native dual-purpose breeds, whereas the correlations between the native breeds and HF were clearly lower and around 0.5. There was no consistent relationship between the reliabilities of genomic prediction for a specific breed and the correlation between the allele frequencies of this breed

  9. Population-genetic nature of copy number variations in the human genome.

    Science.gov (United States)

    Kato, Mamoru; Kawaguchi, Takahisa; Ishikawa, Shumpei; Umeda, Takayoshi; Nakamichi, Reiichiro; Shapero, Michael H; Jones, Keith W; Nakamura, Yusuke; Aburatani, Hiroyuki; Tsunoda, Tatsuhiko

    2010-03-01

    Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000-4000 CNVs (4-6% of the genomic sequence) per population that included CNVs previously missed because of smaller sizes and residing in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We utilized the probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV-SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV-SNP linkage disequilibrium (LD) for 500-900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP-SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies can be applicable for common deletion CNVs, but direct interrogation is needed for other types of CNVs.

  10. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region) is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  11. A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

    Science.gov (United States)

    Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin

    2013-08-01

    Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.

  12. De novo Genome Assembly and Single Nucleotide Variations for Soybean Mosaic Virus Using Soybean Seed Transcriptome Data

    Directory of Open Access Journals (Sweden)

    Yeonhwa Jo

    2017-10-01

    Soybean is the most important legume crop in the world. Several diseases in soybean lead to serious yield losses in major soybean-producing countries. Moreover, soybean can be infected by diverse viruses. Recently, we carried out a large-scale screening to identify viruses infecting soybean using available soybean transcriptome data. Of the screened transcriptomes, a soybean transcriptome for soybean seed development analysis contains several virus-associated sequences. In this study, we identified five viruses, including soybean mosaic virus (SMV), infecting soybean by de novo transcriptome assembly followed by blast search. We assembled a nearly complete consensus genome sequence of SMV China using transcriptome data. Based on phylogenetic analysis, the consensus genome sequence of SMV China was closely related to SMV isolates from South Korea. We examined single nucleotide variations (SNVs) for SMVs in the soybean seed transcriptome, revealing 780 SNVs, which were evenly distributed on the SMV genome. Four SNVs, C-U, U-C, A-G, and G-A, were frequently identified. This result demonstrated the quasispecies variation of the SMV genome. Taken together, this study carried out bioinformatics analyses to identify viruses using soybean transcriptome data. In addition, we demonstrated the application of soybean transcriptome data for virus genome assembly and SNV analysis.

  13. Patterns of genomic variation in the poplar rust fungus Melampsora larici-populina identify pathogenesis-related factors

    Directory of Open Access Journals (Sweden)

    Antoine Persoons

    2014-09-01

    Melampsora larici-populina is a fungal pathogen responsible for foliar rust disease on poplar trees, which causes damage to forest plantations worldwide, particularly in Northern Europe. The reference genome of the isolate 98AG31 was previously sequenced using a whole genome shotgun strategy, revealing a large genome of 101 megabases containing 16,399 predicted genes, which included secreted protein genes representing poplar rust candidate effectors. In the present study, the genomes of 15 isolates collected over the past 20 years throughout the French territory, representing distinct virulence profiles, were characterized by massively parallel sequencing to assess genetic variation in the poplar rust fungus. Comparison to the reference genome revealed striking structural variations. Analysis of coverage and sequencing depth identified large missing regions between isolates related to the mating type loci. More than 611,824 single-nucleotide polymorphism (SNP) positions were uncovered overall, indicating a remarkable level of polymorphism. Based on the accumulation of non-synonymous substitutions in coding sequences and the relative frequencies of synonymous and non-synonymous polymorphisms (i.e. PN/PS), we identify candidate genes that may be involved in fungal pathogenesis. Correlation between non-synonymous SNPs in genes encoding secreted proteins and pathotypes of the studied isolates revealed candidate genes potentially related to virulences 1, 6 and 8 of the poplar rust fungus.

  14. Circadian pathway genetic variation and cancer risk: evidence from genome-wide association studies.

    Science.gov (United States)

    Mocellin, Simone; Tropea, Saveria; Benna, Clara; Rossi, Carlo Riccardo

    2018-02-19

    Dysfunction of the circadian clock and single polymorphisms of some circadian genes have been linked to cancer susceptibility, although data are scarce and findings inconsistent. We aimed to investigate the association between circadian pathway genetic variation and risk of developing common cancers based on the findings of genome-wide association studies (GWASs). Single nucleotide polymorphisms (SNPs) of 17 circadian genes reported by three GWAS meta-analyses dedicated to breast (Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) Consortium; cases, n = 15,748; controls, n = 18,084), prostate (Elucidating Loci Involved in Prostate Cancer Susceptibility (ELLIPSE) Consortium; cases, n = 14,160; controls, n = 12,724) and lung carcinoma (Transdisciplinary Research In Cancer of the Lung (TRICL) Consortium; cases, n = 12,160; controls, n = 16,838) in patients of European ancestry were utilized to perform pathway analysis by means of the adaptive rank truncated product (ARTP) method. Data were also available for the following subgroups: estrogen receptor negative breast cancer, aggressive prostate cancer, squamous lung carcinoma and lung adenocarcinoma. We found a highly significant statistical association between circadian pathway genetic variation and the risk of breast (pathway P value = 1.9 × 10⁻⁶; top gene RORA, gene P value = 0.0003), prostate (pathway P value = 4.1 × 10⁻⁶; top gene ARNTL, gene P value = 0.0002) and lung cancer (pathway P value = 6.9 × 10⁻⁷; top gene RORA, gene P value = 2.0 × 10⁻⁶), as well as all their subgroups. Out of 17 genes investigated, 15 were found to be significantly associated with the risk of cancer: four genes were shared by all three malignancies (ARNTL, CLOCK, RORA and RORB), two by breast and lung cancer (CRY1 and CRY2) and three by prostate and lung cancer (NPAS2, NR1D1 and PER3), whereas four genes were specific for lung cancer

  15. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum.

    Science.gov (United States)

    Amaradasa, B Sajeewa; Everhart, Sydney E

    2016-01-01

    when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms.

  16. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum.

    Directory of Open Access Journals (Sweden)

    B Sajeewa Amaradasa

    experiment, and when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms.

  17. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum

    Science.gov (United States)

    Amaradasa, B. Sajeewa

    2016-01-01

    , and when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms. PMID:27959950

  18. IW-Scoring: an Integrative Weighted Scoring framework for annotating and prioritizing genetic variations in the noncoding genome.

    Science.gov (United States)

    Wang, Jun; Dayem Ullah, Abu Z; Chelala, Claude

    2018-01-30

    The vast majority of germline and somatic variations occur in the noncoding part of the genome, only a small fraction of which are believed to be functional. From the tens of thousands of noncoding variations detectable in each genome, identifying and prioritizing driver candidates with putative functional significance is challenging. To address this, we implemented IW-Scoring, a new Integrative Weighted Scoring model to annotate and prioritise functionally relevant noncoding variations. We evaluate 11 scoring methods, and apply an unsupervised spectral approach for subsequent selective integration into two linear weighted functional scoring schemas for known and novel variations. IW-Scoring produces stable high-quality performance as the best predictors for three independent data sets. We demonstrate the robustness of IW-Scoring in identifying recurrent functional mutations in the TERT promoter, as well as disease SNPs in proximity to consensus motifs and with gene regulatory effects. Using follicular lymphoma as a paradigmatic cancer model, we apply IW-Scoring to locate 11 recurrently mutated noncoding regions in 14 follicular lymphoma genomes, and validate 9 of these regions in an extension cohort, including the promoter and enhancer regions of PAX5. Overall, IW-Scoring demonstrates greater versatility in identifying trait- and disease-associated noncoding variants. Scores from IW-Scoring as well as other methods are freely available from http://www.snp-nexus.org/IW-Scoring/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. aCNViewer: Comprehensive genome-wide visualization of absolute copy number and copy neutral variations.

    Directory of Open Access Journals (Sweden)

    Victor Renault

    Full Text Available Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases and particularly in cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations leading to the loss of potentially important information. To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publically available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B). aCNViewer is implemented in python and R and is available with a GNU GPLv3 license on GitHub https://github.com/FJD-CEPH/aCNViewer and Docker https://hub.docker.com/r/fjdceph/acnviewer/. Contact: aCNViewer@cephb.fr.

  20. aCNViewer: Comprehensive genome-wide visualization of absolute copy number and copy neutral variations.

    Science.gov (United States)

    Renault, Victor; Tost, Jörg; Pichon, Fabien; Wang-Renault, Shu-Fang; Letouzé, Eric; Imbeaud, Sandrine; Zucman-Rossi, Jessica; Deleuze, Jean-François; How-Kit, Alexandre

    2017-01-01

    Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases and particularly in cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations leading to the loss of potentially important information. To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publically available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B). aCNViewer is implemented in python and R and is available with a GNU GPLv3 license on GitHub https://github.com/FJD-CEPH/aCNViewer and Docker https

  1. Whole-genome copy number variation analysis in anophthalmia and microphthalmia.

    Science.gov (United States)

    Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V

    2013-11-01

    Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. A genome-wide investigation of copy number variation in patients with sporadic brain arteriovenous malformation.

    Directory of Open Access Journals (Sweden)

    Nasrine Bendjilali

    Full Text Available Brain arteriovenous malformations (BAVM) are clusters of abnormal blood vessels, with shunting of blood from the arterial to venous circulation and a high risk of rupture and intracranial hemorrhage. Most BAVMs are sporadic, but also occur in patients with Hereditary Hemorrhagic Telangiectasia, a Mendelian disorder caused by mutations in genes in the transforming growth factor beta (TGFβ) signaling pathway. To investigate whether copy number variations (CNVs) contribute to risk of sporadic BAVM, we performed a genome-wide association study in 371 sporadic BAVM cases and 563 healthy controls, all Caucasian. Cases and controls were genotyped using the Affymetrix 6.0 array. CNVs were called using the PennCNV and Birdsuite algorithms and analyzed via segment-based and gene-based approaches. Common and rare CNVs were evaluated for association with BAVM. A CNV region on 1p36.13, containing the neuroblastoma breakpoint family, member 1 gene (NBPF1), was significantly enriched with duplications in BAVM cases compared to controls (P = 2.2×10^-9); NBPF1 was also significantly associated with BAVM in gene-based analysis using both PennCNV and Birdsuite. We experimentally validated the 1p36.13 duplication; however, the association did not replicate in an independent cohort of 184 sporadic BAVM cases and 182 controls (OR = 0.81, P = 0.8). Rare CNV analysis did not identify genes significantly associated with BAVM. We did not identify common CNVs associated with sporadic BAVM that replicated in an independent cohort. Replication in larger cohorts is required to elucidate the possible role of common or rare CNVs in BAVM pathogenesis.

  3. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Full Text Available Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  4. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  5. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome.

    Science.gov (United States)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E

    2017-03-06

    Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

  6. Defining the role of common variation in the genomic and biological architecture of adult human height

    NARCIS (Netherlands)

    A.R. Wood (Andrew); T. Esko (Tõnu); J. Yang (Jian); S. Vedantam (Sailaja); T.H. Pers (Tune); S. Gustafsson (Stefan); A.Y. Chu (Audrey Y); K. Estrada Gil (Karol); J. Luan; Z. Kutalik; N. Amin (Najaf); M.L. Buchkovich (Martin); D.C. Croteau-Chonka (Damien); F.R. Day (Felix); Y. Duan (Yanan); M. Fall (Magnus); R.S.N. Fehrmann (Rudolf); T. Ferreira (Teresa); A.U. Jackson (Anne); J. Karjalainen (Juha); K.S. Lo (Ken Sin); A. Locke (Adam); R. Mägi (Reedik); E. Mihailov (Evelin); E. Porcu (Eleonora); J.C. Randall (Joshua); A. Scherag (Andre); A.A.E. Vinkhuyzen (Anna A.); H.J. Westra (Harm-Jan); T.W. Winkler (Thomas W.); T. Workalemahu (Tsegaselassie); J.H. Zhao (Jing Hua); D. Absher (Devin); E. Albrecht (Eva); J. Baron (Jeffrey); M. Beekman (Marian); A. Demirkan (Ayşe); G.B. Ehret (Georg); B. Feenstra; M.F. Feitosa (Mary Furlan); K. Fischer (Krista); R.M. Fraser (Ross); A. Goel (Anuj); J. Gong (Jian); A.E. Justice (Anne); S. Kanoni (Stavroula); M.E. Kleber (Marcus); K. Kristiansson (Kati); U. Lim (Unhee); V. Lotay (Vaneet); J.C. Lui (Julian C); M. Mangino (Massimo); I.M. Leach (Irene Mateo); M.C. Medina-Gomez (Carolina); M.A. Nalls (Michael); A.S. Dimas (Antigone); C. Palmer (Cameron); D. Pasko (Dorota); S. Pechlivanis (Sonali); I. Prokopenko (Inga); J.S. Ried (Janina); S. Ripke (Stephan); D. Shungin (Dmitry); A. Stancáková (Alena); R.J. Strawbridge (Rona); Y.J. Sung (Yun Ju); T. Tanaka (Toshiko); A. Teumer (Alexander); S. Trompet (Stella); S.W. Van Der Laan (Sander W.); J. van Setten (Jessica); J.V. van Vliet-Ostaptchouk (Jana); Z. Wang (Zhaoming); L. Yengo (Loic); W. Zhang (Weihua); U. Afzal (Uzma); J. Ärnlöv (Johan); G.M. Arscott (Gillian M.); S. Bandinelli (Stefania); A. Barrett (Angela); C. Bellis (Claire); A.J. Bennett (Amanda); C. Berne (Christian); M. Blüher (Matthias); J.L. Bolton (Jennifer); Y. Böttcher (Yvonne); H.A. Boyd; M. Bruinenberg (M.); B.M. Buckley (Brendan M.); S. Buyske (Steven); I.H. Caspersen (Ida H.); P.S. Chines (Peter); R. Clarke (Robert); S. 
Claudi-Boehm (Simone); M.N. Cooper (Matthew); E.W. Daw (E Warwick); P.A. De Jong (Pim A); J. Deelen (Joris); G. Delgado; J.C. Denny (Josh C); R.A.M. Dhonukshe-Rutten (Rosalie); M. Dimitriou (Maria); A.S.F. Doney (Alex); M. Dörr (Marcus); N. Eklund (Niina); E. Eury (Elodie); L. Folkersen (Lasse); M. Garcia (Melissa); F. Geller (Frank); V. Giedraitis (Vilmantas); A. Go (Attie); H. Grallert (Harald); T.B. Grammer (Tanja B); J. Gräßler (Jürgen); H. Grönberg (Henrik); L.C.P.G.M. de Groot (Lisette); C.J. Groves (Christopher J.); J. Haessler (Jeff); P. Hall (Per); T. Haller (Toomas); G. Hallmans (Göran); M. Hannemann (Mario); C.A. Hartman (Catharina); M. Hassinen (Maija); C. Hayward (Caroline); N.L. Heard-Costa (Nancy); Q. Helmer (Quinta); G. Hemani; A.K. Henders (Anjali); H.L. Hillege (Hans); M.A. Hlatky (Mark); W. Hoffmann (Wolfgang); P. Hoffmann (Per); O.L. Holmen (Oddgeir); J.J. Houwing-Duistermaat (Jeanine); T. Illig (Thomas); A. Isaacs (Aaron); A.L. James (Alan); J. Jeff (Janina); B. Johansen (Berit); A. Johansson (Åsa); G.J. Jolley (Jason); T. Juliusdottir (Thorhildur); M.J. Junttila (Juhani); M.M.L. Kho (Marcia); L. Kinnunen (Leena); N. Klopp (Norman); T. Kocher; W. Kratzer (Wolfgang); P. Lichtner (Peter); L. Lind (Lars); J. Lindström (Jaana); S. Lobbens (Stéphane); M. Lorentzon (Mattias); Y. Lu (Yingchang); V. Lyssenko (Valeriya); P.K. Magnusson (Patrik); A. Mahajan (Anubha); M. Maillard (Marc); W.L. McArdle (Wendy); C.A. McKenzie (Colin A.); S. McLachlan (Stela); P.J. McLaren (Paul J); C. Menni (Cristina); S. Merger (Sigrun); L. Milani (Lili); A. Moayyeri (Alireza); K.L. Monda (Keri); M.A. Morken (Mario); G. Müller (Gabriele); M. Müller-Nurasyid (Martina); A.W. Musk (Arthur); N. Narisu (Narisu); M. Nauck (Matthias); I.M. Nolte (Ilja M.); M.M. Nöthen (Markus); L. Oozageer (Laticia); S. Pilz (Stefan); N.W. Rayner (Nigel William); F. Renström (Frida); N.R. Robertson (Neil R.); L.M. Rose (Lynda M.); R. Roussel (Ronan); S. Sanna (Serena); H. Scharnagl (Hubert); S. 
Scholtens (Salome); F.R. Schumacher (Fredrick R); H. Schunkert (Heribert); R.A. Scott (Robert); J.S. Sehmi (Joban); T. Seufferlein (Thomas); J. Shi (Jianxin); K. Silventoinen (Karri); J.H. Smit (Johannes); G.D. Smith; J. Smolonska (Joanna); A. Stanton (Alice); K. Stirrups (Kathy); D.J. Stott (David J); H.M. Stringham (Heather); J. Sundstrom (Johan); M. Swertz (Morris); A.C. Syvanen; B. Tayo (Bamidele); G. Thorleifsson (Gudmar); J.P. Tyrer (Jonathan); S. Van Dijk (Suzanne); N.M. van Schoor (Natasja); N. van der Velde (Nathalie); D. van Heemst (Diana); F.V.A. Van Oort (Floor V A); S.H.H.M. Vermeulen (Sita); N. Verweij (Niek); J.M. Vonk (Judith M); L. Waite (Lindsay); M. Waldenberger (Melanie); R. Wennauer (Roman); L.R. Wilkens (Lynne R.); C. Willenborg (Christina); T. Wilsgaard (Tom); M.K. Wojczynski (Mary ); A. Wong (Andrew); A. Wright (Alan); Q. Zhang (Qunyuan); D. Arveiler (Dominique); S.J.L. Bakker (Stephan); J. Beilby (John); R.N. Bergman (Richard); S.M. Bergmann (Sven); R. Biffar; J. Blangero (John); D.I. Boomsma (Dorret); S.R. Bornstein (Stefan R.); P. Bovet (Pascal); P. Brambilla (Paolo); M.J. Brown (Morris); H. Campbell (Harry); M. Caulfield (Mark); A. Chakravarti (Aravinda); F.S. Collins (Francis); D.C. Crawford (Dana); L.A. Cupples (Adrienne); J. Danesh (John); U. de Faire (Ulf); H.M. den Ruijter (Hester ); R. Erbel (Raimund); J. Erdmann (Jeanette); J. Eriksson; M. Farrall (Martin); E. Ferrannini (Ele); J. Ferrieres (Jean); I. Ford; N.G. Forouhi (Nita); T. Forrester (Terrence); R.T. Gansevoort (Ron); P.V. Gejman (Pablo); C. Gieger (Christian); A. Golay (Alain); R.F. Gottesman (Rebecca); V. Gudnason (Vilmundur); U. Gyllensten (Ulf); D.W. Haas (David W); A.S. Hall (Alistair); T.B. Harris (Tamara); A.T. Hattersley (Andrew); A.C. Heath (Andrew C); C. Hengstenberg (Christian); A.A. Hicks (Andrew); L.A. Hindorff (Lucia A); A. Hingorani (Aroon); A. Hofman (Albert); G.K. Hovingh (Kees); S.E. Humphries (Steve E.); S.C. Hunt (Steven); E. Hypponen (Elina); K.B. 
Jacobs (Kevin); M.-R. Jarvelin (Marjo-Riitta); P. Jousilahti (Pekka); A. Jula (Antti); J. Kaprio (Jaakko); J.J.P. Kastelein (John); M.H. Kayser (Manfred); F. Kee (Frank); S. Keinanen-Kiukaanniemi (Sirkka); L.A.L.M. Kiemeney (Bart); J.S. Kooner (Jaspal S.); C. Kooperberg (Charles); S. Koskinen (Seppo); P. Kovacs (Peter); A. Kraja (Aldi); M. Kumari (Meena); J. Kuusisto (Johanna); T.A. Lakka (Timo); C. Langenberg (Claudia); L. Le Marchand (Loic); T. Lehtimäki (Terho); S. Lupoli (Sara); P.A. Madden; S. Männistö (Satu); P. Manunta (Paolo); A. Marette (Andre'); T.C. Matise (Tara C.); B. McKnight (Barbara); T. Meitinger (Thomas); F.L. Moll (Frans); G.W. Montgomery (Grant W.); A.D. Morris (Andrew); A.P. Morris (Andrew); J.C. Murray (Jeffrey); M. Nelis (Mari); C. Ohlsson (Claes); A.J. Oldehinkel (Albertine); K.K. Ong (Ken K.); W.H. Ouwehand (Willem); G. Pasterkamp (Gerard); A. Peters (Annette); P.P. Pramstaller (Peter Paul); J.F. Price (Jackie F.); L. Qi (Lu); O. Raitakari (Olli); T. Rankinen (Tuomo); D.C. Rao (Dabeeru C.); T.K. Rice (Treva K.); M.D. Ritchie (Marylyn D.); I. Rudan (Igor); V. Salomaa (Veikko); N.J. Samani (Nilesh); J. Saramies (Jouko); M.A. Sarzynski (Mark A.); P.E.H. Schwarz (Peter E. H.); S. Sebert (Sylvain); P. Sever (Peter); A.R. Shuldiner (Alan); J. Sinisalo (Juha); V. Steinthorsdottir (Valgerdur); R.P. Stolk; J.-C. Tardif (Jean-Claude); A. Tönjes (Anke); A. Tremblay (Angelo); E. Tremoli (Elena); J. Virtamo (Jarmo); M.-C. Vohl (Marie-Claude); P. Amouyel (Philippe); F.W. Asselbergs (Folkert W.); T.L. Assimes (Themistocles); M. Bochud (Murielle); B.O. Boehm (Bernhard); E.A. Boerwinkle (Eric); E.P. Bottinger (Erwin P.); C. Bouchard (Claude); S. Cauchi (Stéphane); J.C. Chambers (John C.); S.J. Chanock (Stephen); R.S. Cooper (Richard S.); P.I.W. de Bakker (Paul); G.V. Dedoussis (George); L. Ferrucci (Luigi); P.W. Franks; P. Froguel (Philippe); L. Groop (Leif); C.A. Haiman (Christopher); A. Hamsten (Anders); M.G. Hayes (M. Geoffrey); J. Hui (Jennie); D. 
Hunter (David); K. Hveem (Kristian); J.W. Jukema (Jan Wouter); R.C. Kaplan (Robert); M. Kivimaki (Mika); D. Kuh (Diana); M. Laakso (Markku); Y. Liu (YongMei); N.G. Martin (Nicholas); W. März (Winfried); M. Melbye (Mads); S. Moebus (Susanne); P. Munroe (Patricia); I. Njølstad (Inger); B.A. Oostra (Ben); C.N.A. Palmer (Colin); N.L. Pedersen (Nancy L.); M. Perola (Markus); L. Perusse (Louis); U. Peters (Ulrike); J.E. Powell (Joseph); C. Power (Christine); T. Quertermous (Thomas); R. Rauramaa (Rainer); E. Reinmaa (Eva); P.M. Ridker (Paul); F. Rivadeneira Ramirez (Fernando); J.I. Rotter (Jerome I.); T. Saaristo (Timo); D. Saleheen; D. Schlessinger (David); P.E. Slagboom (P Eline); H. Snieder (Harold); T.D. Spector (Timothy); K. Strauch (Konstantin); M. Stumvoll (Michael); J. Tuomilehto (Jaakko); M. Uusitupa (Matti); P. van der Harst (Pim); H. Völzke (Henry); M. Walker (Mark); N.J. Wareham (Nick); H. Watkins (Hugh); H.E. Wichmann (Heinz Erich); J.F. Wilson (James F); P. Zanen (Pieter); P. Deloukas (Panagiotis); I.M. Heid (Iris); C.M. Lindgren (Cecilia); K.L. Mohlke (Karen); E.K. Speliotes (Elizabeth); U. Thorsteinsdottir (Unnur); I.E. Barroso (Inês); C.S. Fox (Caroline S.); K.E. North (Kari); D.P. Strachan (David P.); J.S. Beckmann (Jacques); S.I. Berndt (Sonja); M. Boehnke (Michael); I.B. Borecki (Ingrid); M.I. McCarthy (Mark); A. Metspalu (Andres); J-A. Zwart (John-Anker); A.G. Uitterlinden (André); C.M. van Duijn (Cornelia); L. Franke (Lude); C.J. Willer (Cristen); A. Price (Alkes); G. Lettre (Guillaume); R.J.F. Loos (Ruth); M.N. Weedon (Michael); E. Ingelsson (Erik); J.R. O´Connell; G.R. Abecasis (Gonçalo); D.I. Chasman (Daniel); D. Anderson (Denise); M.E. Goddard (Michael); P.M. Visscher (Peter); J.N. Hirschhorn (Joel); T.M. Frayling (Timothy)

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated

  7. Defining the role of common variation in the genomic and biological architecture of adult human height

    NARCIS (Netherlands)

    Wood, Andrew R.; Esko, Tonu; Yang, Jian; Vedantam, Sailaja; Pers, Tune H.; Gustafsson, Stefan; Chu, Audrey Y.; Estrada, Karol; Luan, Jian'an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L.; Croteau-Chonka, Damien C.; Day, Felix R.; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U.; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E.; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C.; Scherag, André; Vinkhuyzen, Anna A. E.; Westra, Harm-Jan; Winkler, Thomas W.; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B.; Feenstra, Bjarke; Feitosa, Mary F.; Fischer, Krista; Fraser, Ross M.; Goel, Anuj; Gong, Jian; Justice, Anne E.; Kanoni, Stavroula; Kleber, Marcus E.; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C.; Mangino, Massimo; Mateo Leach, Irene; Medina-Gomez, Carolina; Nalls, Michael A.; Nyholt, Dale R.; Palmer, Cameron D.; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S.; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J.; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W.; van Setten, Jessica; van Vliet-Ostaptchouk, Jana V.; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Arnlöv, Johan; Arscott, Gillian M.; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J.; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L.; Böttcher, Yvonne; Boyd, Heather A.; Bruinenberg, Marcel; Buckley, Brendan M.; Buyske, Steven; Caspersen, Ida H.; Chines, Peter S.; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E. Warwick; de Jong, Pim A.; Deelen, Joris; Delgado, Graciela; Denny, Josh C.; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex S. 
F.; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E.; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S.; Grallert, Harald; Grammer, Tanja B.; Gräßler, Jürgen; Grönberg, Henrik; de Groot, Lisette C. P. G. M.; Groves, Christopher J.; Haessler, Jeffrey; Hall, Per; Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A.; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L.; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K.; Hillege, Hans L.; Hlatky, Mark A.; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J.; Illig, Thomas; Isaacs, Aaron; James, Alan L.; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N.; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik K. E.; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L.; McKenzie, Colin A.; McLachlan, Stela; McLaren, Paul J.; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L.; Morken, Mario A.; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W.; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M.; Nöthen, Markus M.; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W.; Renstrom, Frida; Robertson, Neil R.; Rose, Lynda M.; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R.; Schunkert, Heribert; Scott, Robert A.; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H.; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V.; Stirrups, Kathleen; Stott, David J.; Stringham, Heather M.; Sundström, Johan; Swertz, Morris A.; Syvänen, Ann-Christine; Tayo, Bamidele O.; Thorleifsson, Gudmar; Tyrer, Jonathan P.; van Dijk, Suzanne; van Schoor, Natasja M.; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor V. 
A.; Vermeulen, Sita H.; Verweij, Niek; Vonk, Judith M.; Waite, Lindsay L.; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R.; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K.; Wong, Andrew; Wright, Alan F.; Zhang, Qunyuan; Arveiler, Dominique; Bakker, Stephan J. L.; Beilby, John; Bergman, Richard N.; Bergmann, Sven; Biffar, Reiner; Blangero, John; Boomsma, Dorret I.; Bornstein, Stefan R.; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J.; Campbell, Harry; Caulfield, Mark J.; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S.; Crawford, Dana C.; Cupples, L. Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M.; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G.; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G.; Forrester, Terrence; Gansevoort, Ron T.; Gejman, Pablo V.; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W.; Hall, Alistair S.; Harris, Tamara B.; Hattersley, Andrew T.; Heath, Andrew C.; Hengstenberg, Christian; Hicks, Andrew A.; Hindorff, Lucia A.; Hingorani, Aroon D.; Hofman, Albert; Hovingh, G. Kees; Humphries, Steve E.; Hunt, Steven C.; Hypponen, Elina; Jacobs, Kevin B.; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M.; Kaprio, Jaakko; Kastelein, John J. P.; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M.; Kiemeney, Lambertus A.; Kooner, Jaspal S.; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T.; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A.; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela A. 
F.; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C.; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L.; Montgomery, Grant W.; Morris, Andrew D.; Morris, Andrew P.; Murray, Jeffrey C.; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J.; Ong, Ken K.; Ouwehand, Willem H.; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P.; Price, Jackie F.; Qi, Lu; Raitakari, Olli T.; Rankinen, Tuomo; Rao, D. C.; Rice, Treva K.; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J.; Saramies, Jouko; Sarzynski, Mark A.; Schwarz, Peter E. H.; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R.; Sinisalo, Juha; Steinthorsdottir, Valgerdur; Stolk, Ronald P.; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W.; Assimes, Themistocles L.; Bochud, Murielle; Boehm, Bernhard O.; Boerwinkle, Eric; Bottinger, Erwin P.; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C.; Chanock, Stephen J.; Cooper, Richard S.; de Bakker, Paul I. W.; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W.; Froguel, Philippe; Groop, Leif C.; Haiman, Christopher A.; Hamsten, Anders; Hayes, M. Geoffrey; Hui, Jennie; Hunter, David J.; Hveem, Kristian; Jukema, J. Wouter; Kaplan, Robert C.; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G.; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B.; Njølstad, Inger; Oostra, Ben A.; Palmer, Colin N. A.; Pedersen, Nancy L.; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E.; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M.; Rivadeneira, Fernando; Rotter, Jerome I.; Saaristo, Timo E.; Saleheen, Danish; Schlessinger, David; Slagboom, P. 
Eline; Snieder, Harold; Spector, Tim D.; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J.; Watkins, Hugh; Wichmann, H.-Erich; Wilson, James F.; Zanen, Pieter; Deloukas, Panos; Heid, Iris M.; Lindgren, Cecilia M.; Mohlke, Karen L.; Speliotes, Elizabeth K.; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S.; North, Kari E.; Strachan, David P.; Beckmann, Jacques S.; Berndt, Sonja I.; Boehnke, Michael; Borecki, Ingrid B.; McCarthy, Mark I.; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G.; van Duijn, Cornelia M.; Franke, Lude; Willer, Cristen J.; Price, Alkes L.; Lettre, Guillaume; Loos, Ruth J. F.; Weedon, Michael N.; Ingelsson, Erik; O'Connell, Jeffrey R.; Abecasis, Goncalo R.; Chasman, Daniel I.; Goddard, Michael E.; Visscher, Peter M.; Hirschhorn, Joel N.; Frayling, Timothy M.; McCarty, Catherine A.; Starren, Justin; Peissig, Peggy; Berg, Richard; Rasmussen, Luke; Linneman, James; Miller, Aaron; Choudary, Vidhu; Chen, Lin; Waudby, Carol; Kitchner, Terrie; Reeser, Jonathan; Fost, Norman; Wilke, Russell A.; Chisholm, Rex L.; Avila, Pedro C.; Greenland, Philip; Hayes, M. 
Geoff; Kho, Abel; Kibbe, Warren A.; Lemke, Amy A.; Lowe, William L.; Smith, Maureen E.; Wolf, Wendy A.; Pacheco, Jennifer A.; Thompson, William K.; Humowiecki, Joel; Law, May; Chute, Christopher; Kullo, Iftikar; Koenig, Barbara; de Andrade, Mariza; Bielinski, Suzette; Pathak, Jyotishman; Savova, Guergana; Wu, Joel; Henriksen, Joan; Ding, Keyue; Hart, Lacey; Palbicki, Jeremy; Larson, Eric B.; Newton, Katherine; Ludman, Evette; Spangler, Leslie; Hart, Gene; Carrell, David; Jarvik, Gail; Crane, Paul; Burke, Wylie; Fullerton, Stephanie Malia; Trinidad, Susan Brown; Carlson, Chris; Hutchinson, Fred; McDavid, Andrew; Roden, Dan M.; Clayton, Ellen; Haines, Jonathan L.; Masys, Daniel R.; Churchill, Larry R.; Cornfield, Daniel; Crawford, Dana; Darbar, Dawood; Denny, Joshua C.; Malin, Bradley A.; Ritchie, Marylyn D.; Schildcrout, Jonathan S.; Xu, Hua; Ramirez, Andrea Havens; Basford, Melissa; Pulley, Jill; Alizadeh, Behrooz Z.; de Boer, Rudolf A.; Boezen, H. Marike; van der Klauw, Melanie M.; Navis, Gerjan; Ormel, Johan; Postma, Dirkje S.; Rosmalen, Judith G. M.; Slaets, Joris P.; Wolffenbuttel, Bruce H. 
R.; Wijmenga, Cisca; Kathiresan, Sekar; Voight, Benjamin F.; Purcell, Shaun; Musunuru, Kiran; Ardissino, Diego; Mannucci, Pier M.; Anand, Sonia; Engert, James C.; Reilly, Muredach P.; Rader, Daniel J.; Morgan, Thomas; Spertus, John A.; Stoll, Monika; Girelli, Domenico; McKeown, Pascal P.; Patterson, Chris C.; Siscovick, David S.; O'Donnell, Christopher J.; Elosua, Roberto; Peltonen, Leena; Schwartz, Stephen M.; Melander, Olle; Altshuler, David; Merlini, Pier Angelica; Berzuini, Carlo; Bernardinelli, Luisa; Peyvandi, Flora; Tubaro, Marco; Celli, Patrizia; Ferrario, Maurizio; Fetiveau, Raffaela; Marziliano, Nicola; Casari, Giorgio; Galli, Michele; Ribichini, Flavio; Rossi, Marco; Bernardi, Francesco; Zonzin, Pietro; Piazza, Alberto; Yee, Jean; Friedlander, Yechiel; Marrugat, Jaume; Lucas, Gavin; Subirana, Isaac; Sala, Joan; Ramos, Rafael; Meigs, James B.; Williams, Gordon; Nathan, David M.; MacRae, Calum A.; Havulinna, Aki S.; Berglund, Goran; Asselta, Rosanna; Duga, Stefano; Spreafico, Marta; Daly, Mark J.; Nemesh, James; Korn, Joshua M.; McCarroll, Steven A.; Surti, Aarti; Guiducci, Candace; Gianniny, Lauren; Mirel, Daniel; Parkin, Melissa; Burtt, Noel; Gabriel, Stacey B.; Thompson, John R.; Braund, Peter S.; Wright, Benjamin J.; Balmforth, Anthony J.; Ball, Stephen G.; Schunkert, I. 
Heribert; Linsel-Nitschke, Patrick; Lieb, Wolfgang; Ziegler, Andreas; König, Inke R.; Fischer, Marcus; Stark, Klaus; Grosshennig, Anika; Preuss, Michael; Schreiber, Stefan; Ouwehand, Willem; Scholz, Michael; Cambien, Francois; Goodall, Alison; Li, Mingyao; Chen, Zhen; Wilensky, Robert; Matthai, William; Qasim, Atif; Hakonarson, Hakon H.; Devaney, Joe; Burnett, Mary-Susan; Pichard, Augusto D.; Kent, Kenneth M.; Satler, Lowell; Lindsay, Joseph M.; Waksman, Ron; Knouff, Christopher W.; Waterworth, Dawn M.; Walker, Max C.; Mooser, Vincent; Epstein, Stephen E.; Scheffold, Thomas; Berger, Klaus; Huge, Andreas; Martinelli, Nicola; Olivieri, Oliviero; Corrocher, Roberto; Hólm, Hilma; Do, Ron; Xie, Changchun; Siscovick, David; Matise, Tara; Buyske, Steve; Higashio, Julia; Williams, Rasheeda; Nato, Andrew; Ambite, Jose Luis; Deelman, Ewa; Manolio, Teri; Hindorff, Lucia; Heiss, Gerardo; Taylor, Kira; Franceschini, Nora; Avery, Christy; Graff, Misa; Lin, Danyu; Quibrera, Miguel; Cochran, Barbara; Kao, Linda; Umans, Jason; Cole, Shelley; MacCluer, Jean; Person, Sharina; Pankow, James; Gross, Myron; Fornage, Myriam; Durda, Peter; Jenny, Nancy; Patsy, Bruce; Arnold, Alice; Buzkova, Petra; Haines, Jonathan; Murdock, Deborah; Glenn, Kim; Brown-Gentry, Kristin; Thornton-Wells, Tricia; Dumitrescu, Logan; Bush, William S.; Mitchell, Sabrina L.; Goodloe, Robert; Wilson, Sarah; Boston, Jonathan; Malinowski, Jennifer; Restrepo, Nicole; Oetjens, Matthew; Fowke, Jay; Zheng, Wei; Spencer, Kylee; Pendergrass, Sarah; Le Marchand, Loïc; Wilkens, Lynne; Park, Lani; Tiirikainen, Maarit; Kolonel, Laurence; Cheng, Iona; Wang, Hansong; Shohet, Ralph; Haiman, Christopher; Stram, Daniel; Henderson, Brian; Monroe, Kristine; Schumacher, Fredrick; Anderson, Garnet; Prentice, Ross; LaCroix, Andrea; Wu, Chunyuan; Carty, Cara; Rosse, Stephanie; Young, Alicia; Haessler, Jeff; Kocarnik, Jonathan; Lin, Yi; Jackson, Rebecca; Duggan, David; Kuller, Lew

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700

  8. GENOMICS SYMPOSIUM: Using genomic approaches to uncover sources of variation in age at puberty and reproductive longevity in sows

    Science.gov (United States)

    Genetic variants associated with traits such as age at puberty and litter size could provide insight into the underlying genetic sources of variation impacting sow reproductive longevity and productivity. Genomewide characterization and gene expression profiling were performed using gilts from the Univer...

  9. Variation, Evolution, and Correlation Analysis of C+G Content and Genome or Chromosome Size in Different Kingdoms and Phyla

    Science.gov (United States)

    Li, Xiu-Qing; Du, Donglei

    2014-01-01

    C+G content (GC content or G+C content) is known to be correlated with genome/chromosome size in bacteria but the relationship for other kingdoms remains unclear. This study analyzed genome size, chromosome size, and base composition in most of the available sequenced genomes in various kingdoms. Genome size tends to increase during evolution in plants and animals, and the same is likely true for bacteria. The genomic C+G contents were found to vary greatly in microorganisms but were quite similar within each animal or plant subkingdom. In animals and plants, the C+G contents are ranked as follows: monocot plants>mammals>non-mammalian animals>dicot plants. The variation in C+G content between chromosomes within species is greater in animals than in plants. The correlation between average chromosome C+G content and chromosome length was found to be positive in Proteobacteria, Actinobacteria (but not in other analyzed bacterial phyla), Ascomycota fungi, and likely also in some plants; negative in some animals, insignificant in two protist phyla, and likely very weak in Archaea. Clearly, correlations between C+G content and chromosome size can be positive, negative, or not significant depending on the kingdoms/groups or species. Different phyla or species exhibit different patterns of correlation between chromosome-size and C+G content. Most chromosomes within a species have a similar pattern of variation in C+G content but outliers are common. The data presented in this study suggest that the C+G content is under genetic control by both trans- and cis- factors and that the correlation between C+G content and chromosome length can be positive, negative, or not significant in different phyla. PMID:24551092

  10. Chromosomal Copy Number Variation in Saccharomyces pastorianus Is Evidence for Extensive Genome Dynamics in Industrial Lager Brewing Strains.

    Science.gov (United States)

    van den Broek, M; Bolat, I; Nijkamp, J F; Ramos, E; Luttik, M A H; Koopman, F; Geertman, J M; de Ridder, D; Pronk, J T; Daran, J-M

    2015-09-01

    Lager brewing strains of Saccharomyces pastorianus are natural interspecific hybrids originating from the spontaneous hybridization of Saccharomyces cerevisiae and Saccharomyces eubayanus. Over the past 500 years, S. pastorianus has been domesticated to become one of the most important industrial microorganisms. Production of lager-type beers requires a set of essential phenotypes, including the ability to ferment maltose and maltotriose at low temperature, the production of flavors and aromas, and the ability to flocculate. Understanding of the molecular basis of complex brewing-related phenotypic traits is a prerequisite for rational strain improvement. While genome sequences have been reported, the variability and dynamics of S. pastorianus genomes have not been investigated in detail. Here, using deep sequencing and chromosome copy number analysis, we showed that S. pastorianus strain CBS1483 exhibited extensive aneuploidy. This was confirmed by quantitative PCR and by flow cytometry. As a direct consequence of this aneuploidy, a massive number of sequence variants was identified, leading to at least 1,800 additional protein variants in S. pastorianus CBS1483. Analysis of eight additional S. pastorianus strains revealed that the previously defined group I strains showed comparable karyotypes, while group II strains showed large interstrain karyotypic variability. Comparison of three strains with nearly identical genome sequences revealed substantial chromosome copy number variation, which may contribute to strain-specific phenotypic traits. The observed variability of lager yeast genomes demonstrates that systematic linking of genotype to phenotype requires a three-dimensional genome analysis encompassing physical chromosomal structures, the copy number of individual chromosomes or chromosomal regions, and the allelic variation of copies of individual genes. Copyright © 2015, van den Broek et al.

  11. Spatial variation in the parasite communities and genomic structure of urban rats in New York City.

    Science.gov (United States)

    Angley, L P; Combs, M; Firth, C; Frye, M J; Lipkin, I; Richardson, J L; Munshi-South, J

    2018-02-01

    Brown rats (Rattus norvegicus) are a globally distributed pest. Urban habitats can support large infestations of rats, posing a potential risk to public health from the parasites and pathogens they carry. Despite the potential influence of rodent-borne zoonotic diseases on human health, it is unclear how urban habitats affect the structure and transmission dynamics of ectoparasite and microbial communities (all referred to as "parasites" hereafter) among rat colonies. In this study, we use ecological data on parasites and genomic sequencing of their rat hosts to examine associations between spatial proximity, genetic relatedness and the parasite communities associated with 133 rats at five sites in sections of New York City with persistent rat infestations. We build on previous work showing that rats in New York carry a wide variety of parasites and report that these communities differ significantly among sites, even across small geographical distances. Ectoparasite community similarity was positively associated with geographical proximity; however, there was no general association between distance and microbial communities of rats. Sites with greater overall parasite diversity also had rats with greater infection levels and parasite species richness. Parasite community similarity among sites was not linked to genetic relatedness of rats, suggesting that these communities are not associated with genetic similarity among host individuals or host dispersal among sites. Discriminant analysis identified site-specific associations of several parasite species, suggesting that the presence of some species within parasite communities may allow researchers to determine the sites of origin for newly sampled rats. The results of our study help clarify the roles that colony structure and geographical proximity play in determining the ecology of R. norvegicus as a significant urban reservoir of zoonotic diseases. 
Our study also highlights the spatial variation present in urban

  12. Genetic basis for spontaneous hybrid genome doubling during allopolyploid speciation of common wheat shown by natural variation analyses of the paternal species.

    Directory of Open Access Journals (Sweden)

    Yoshihiro Matsuoka

    The complex process of allopolyploid speciation includes various mechanisms ranging from species crosses and hybrid genome doubling to genome alterations and the establishment of new allopolyploids as persisting natural entities. Currently, little is known about the genetic mechanisms that underlie hybrid genome doubling, despite the fact that natural allopolyploid formation is highly dependent on this phenomenon. We examined the genetic basis for the spontaneous genome doubling of triploid F1 hybrids between the direct ancestors of allohexaploid common wheat (Triticum aestivum L., AABBDD genome), namely Triticum turgidum L. (AABB genome) and Aegilops tauschii Coss. (DD genome). An Ae. tauschii intraspecific lineage that is closely related to the D genome of common wheat was identified by population-based analysis. Two representative accessions, one that produces a high-genome-doubling-frequency hybrid when crossed with a T. turgidum cultivar and the other that produces a low-genome-doubling-frequency hybrid with the same cultivar, were chosen from that lineage for further analyses. A series of investigations including fertility analysis, immunostaining, and quantitative trait locus (QTL) analysis showed that (1) production of functional unreduced gametes through nonreductional meiosis is an early step key to successful hybrid genome doubling, (2) first division restitution is one of the cytological mechanisms that cause meiotic nonreduction during the production of functional male unreduced gametes, and (3) six QTLs in the Ae. tauschii genome, most of which likely regulate nonreductional meiosis and its subsequent gamete production processes, are involved in hybrid genome doubling. Interlineage comparisons of Ae. tauschii's ability to cause hybrid genome doubling suggested an evolutionary model for the natural variation pattern of the trait in which non-deleterious mutations in six QTLs may have important roles. 
The findings of this study demonstrated

  13. Genomic Heterogeneity of Methicillin Resistant Staphylococcus aureus Associated with Variation in Severity of Illness among Children with Acute Hematogenous Osteomyelitis.

    Directory of Open Access Journals (Sweden)

    Claudia Gaviria-Agudelo

    The association between severity of illness of children with osteomyelitis caused by Methicillin-resistant Staphylococcus aureus (MRSA) and genomic variation of the causative organism has not been previously investigated. The purpose of this study is to assess genomic heterogeneity among MRSA isolates from children with osteomyelitis who have diverse severity of illness. Children with osteomyelitis were prospectively studied between 2010 and 2011. Severity of illness of the affected children was determined from clinical and laboratory parameters. MRSA isolates were analyzed with next generation sequencing (NGS) and optical mapping. Sequence data was used for multi-locus sequence typing (MLST), phylogenetic analysis by maximum likelihood (PAML), and identification of virulence genes and single nucleotide polymorphisms (SNP) relative to reference strains. The twelve children studied demonstrated severity of illness scores ranging from 0 (mild) to 9 (severe). All isolates were USA300, ST 8, SCC mec IVa MRSA by MLST. The isolates differed from reference strains by 2 insertions (40 Kb each) and 2 deletions (10 and 25 Kb) but had no rearrangements or copy number variations. There was a higher occurrence of virulence genes among study isolates when compared to the reference strains (p = 0.0124). There were an average of 11 nonsynonymous SNPs per strain. PAML demonstrated heterogeneity of study isolates from each other and from the reference strains. Genomic heterogeneity exists among MRSA isolates causing osteomyelitis among children in a single community. These variations may play a role in the pathogenesis of variation in clinical severity among these children.

  14. Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: Evidence for differences and commonalities in size distributions and size restrictions

    NARCIS (Netherlands)

    M. Schaap (Michiel); R.J.L.F. Lemmers (Richard); R. Maassen (Roel); P.J. van der Vliet (Patrick); L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman); N. Basturk (Nalan); P. de Knijff (Peter); S.M. van der Maarel (Silvère)

    2013-01-01

    Background: Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and

  15. Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: evidence for differences and commonalities in size distributions and size restrictions

    NARCIS (Netherlands)

    Schaap, M.; Lemmers, R.J.L.F.; Maassen, R.; van der Vliet, P.J.; Hoogerheide, L.F.; van Dijk, H.K.; Basturk, N.; de Knijff, P.; van der Maarel, S.M.

    2013-01-01

    Background: Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and function is largely

  16. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  17. Dynamics of chromosome number and genome size variation in a cytogenetically variable sedge (Carex scoparia var. scoparia, Cyperaceae).

    Science.gov (United States)

    Chung, Kyong-Sook; Weber, Jaime A; Hipp, Andrew L

    2011-01-01

    High intraspecific cytogenetic variation in the sedge genus Carex (Cyperaceae) is hypothesized to be due to the "diffuse" or non-localized centromeres, which facilitate chromosome fission and fusion. If chromosome number changes are dominated by fission and fusion, then chromosome evolution will result primarily in changes in the potential for recombination among populations. Chromosome duplications, on the other hand, entail consequent opportunities for divergent evolution of paralogs. In this study, we evaluate whether genome size and chromosome number covary within species. We used flow cytometry to estimate genome sizes in Carex scoparia var. scoparia, sampling 99 plants (23 populations) in the Chicago region, and we used meiotic chromosome observations to document chromosome numbers and chromosome pairing relations. Chromosome numbers range from 2n = 62 to 2n = 68, and nuclear DNA 1C content from 0.342 to 0.361 pg DNA. Regressions of DNA content on chromosome number are nonsignificant for data analyzed by individual or population, and a regression model that excludes slope is favored over a model in which chromosome number predicts genome size. Chromosome rearrangements within cytogenetically variable Carex species are more likely a consequence of fission and fusion than of duplication and deletion. Moreover, neither genome size nor chromosome number is spatially autocorrelated, which suggests the potential for rapid chromosome evolution by fission and fusion at a relatively fine geographic scale (<350 km). These findings have important implications for ecological restoration and speciation within the largest angiosperm genus of the temperate zone.

  18. Patterns of Genome-Wide Variation in Glossina fuscipes fuscipes Tsetse Flies from Uganda

    Directory of Open Access Journals (Sweden)

    Andrea Gloria-Soria

    2016-06-01

    The tsetse fly Glossina fuscipes fuscipes (Gff) is the insect vector of the two forms of Human African Trypanosomiasis (HAT) that exist in Uganda. Understanding Gff population dynamics, and the underlying genetics of epidemiologically relevant phenotypes is key to reducing disease transmission. Using ddRAD sequence technology, complemented with whole-genome sequencing, we developed a panel of ∼73,000 single-nucleotide polymorphisms (SNPs) distributed across the Gff genome that can be used for population genomics and to perform genome-wide-association studies. We used these markers to estimate genomic patterns of linkage disequilibrium (LD) in Gff, and used the information, in combination with outlier-locus detection tests, to identify candidate regions of the genome under selection. LD in individual populations decays to half of its maximum value (r2max/2) between 1359 and 2429 bp. The overall LD estimated for the species reaches r2max/2 at 708 bp, an order of magnitude slower than in Drosophila. Using 53 infected (Trypanosoma spp.) and uninfected flies from four genetically distinct Ugandan populations adapted to different environmental conditions, we were able to identify SNPs associated with the infection status of the fly and local environmental adaptation. The extent of LD in Gff likely facilitated the detection of loci under selection, despite the small sample size. Furthermore, it is probable that LD in the regions identified is much higher than the average genomic LD due to strong selection. Our results show that even modest sample sizes can reveal significant genetic associations in this species, which has implications for future studies given the difficulties of collecting field specimens with contrasting phenotypes for association analysis.

  19. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

    Science.gov (United States)

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le 
Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-01-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip-based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802

  20. Single-Nucleotide Variations in Cardiac Arrhythmias: Prospects for Genomics and Proteomics Based Biomarker Discovery and Diagnostics

    Directory of Open Access Journals (Sweden)

    Ayman Abunimer

    2014-03-01

    Cardiovascular diseases are a large contributor to causes of early death in developed countries. Some of these conditions, such as sudden cardiac death and atrial fibrillation, stem from arrhythmias—a spectrum of conditions with abnormal electrical activity in the heart. Genome-wide association studies can identify single nucleotide variations (SNVs) that may predispose individuals to developing acquired forms of arrhythmias. Through manual curation of published genome-wide association studies, we have collected a comprehensive list of 75 SNVs associated with cardiac arrhythmias. Ten of the SNVs result in amino acid changes and can be used in proteomic-based detection methods. In an effort to identify additional non-synonymous mutations that affect the proteome, we analyzed the post-translational modification S-nitrosylation, which is known to affect cardiac arrhythmias. We identified loss of seven known S-nitrosylation sites due to non-synonymous single nucleotide variations (nsSNVs). For predicted nitrosylation sites we found 1429 proteins where the sites are modified due to nsSNV. Analysis of the predicted S-nitrosylation dataset for over- or under-representation (compared to the complete human proteome) of pathways and functional elements shows significant statistical over-representation of the blood coagulation pathway. Gene Ontology (GO) analysis displays statistically over-represented terms related to muscle contraction, receptor activity, motor activity, cytoskeleton components, and microtubule activity. Through the genomic and proteomic context of SNVs and S-nitrosylation sites presented in this study, researchers can look for variation that can predispose individuals to cardiac arrhythmias. Such attempts to elucidate mechanisms of arrhythmia thereby add yet another useful parameter in predicting susceptibility for cardiac diseases.

  1. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    Science.gov (United States)

    Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-07-22

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of

  2. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    Directory of Open Access Journals (Sweden)

    Sheri L Simmons

    2008-07-01

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. Thus, the most likely explanation for the

  3. Population-Genomic Insights into Variation in Prevotella intermedia and Prevotella nigrescens Isolates and Its Association with Periodontal Disease

    Directory of Open Access Journals (Sweden)

    Yifei Zhang

    2017-09-01

    Full Text Available High-throughput sequencing has helped to reveal the close relationship between Prevotella and periodontal disease, but the roles of subspecies diversity and genomic variation within this genus in periodontal diseases still need to be investigated. We performed a comparative genome analysis of 48 Prevotella intermedia and Prevotella nigrescens isolates from the same cohort of subjects to identify the main drivers of their pathogenicity and adaptation to different environments. The comparisons were done between two species and between disease and health based on pooled sequences. The results showed that both P. intermedia and P. nigrescens have highly dynamic genomes and can take up various exogenous factors through horizontal gene transfer. The major differences between disease-derived and health-derived samples of P. intermedia and P. nigrescens were factors related to genome modification and recombination, indicating that the Prevotella isolates from disease sites may be more capable of genomic reconstruction. We also identified genetic elements specific to each sample, and found that disease groups had more unique virulence factors related to capsule and lipopolysaccharide synthesis, secretion systems, proteinases, and toxins, suggesting that strains from disease sites may have more specific virulence, particularly for P. intermedia. The differentially represented pathways between samples from disease and health were related to energy metabolism, carbohydrate and lipid metabolism, and amino acid metabolism, consistent with data from the whole subgingival microbiome in periodontal disease and health. Disease-derived samples had gained or lost several metabolic genes compared to healthy-derived samples, which could be linked with the difference in virulence performance between diseased and healthy sample groups. Our findings suggest that P. intermedia and P. nigrescens may serve as “crucial substances” in subgingival plaque, which may

  4. Population-Genomic Insights into Variation in Prevotella intermedia and Prevotella nigrescens Isolates and Its Association with Periodontal Disease.

    Science.gov (United States)

    Zhang, Yifei; Zhen, Min; Zhan, Yalin; Song, Yeqing; Zhang, Qian; Wang, Jinfeng

    2017-01-01

    High-throughput sequencing has helped to reveal the close relationship between Prevotella and periodontal disease, but the roles of subspecies diversity and genomic variation within this genus in periodontal diseases still need to be investigated. We performed a comparative genome analysis of 48 Prevotella intermedia and Prevotella nigrescens isolates from the same cohort of subjects to identify the main drivers of their pathogenicity and adaptation to different environments. The comparisons were done between two species and between disease and health based on pooled sequences. The results showed that both P. intermedia and P. nigrescens have highly dynamic genomes and can take up various exogenous factors through horizontal gene transfer. The major differences between disease-derived and health-derived samples of P. intermedia and P. nigrescens were factors related to genome modification and recombination, indicating that the Prevotella isolates from disease sites may be more capable of genomic reconstruction. We also identified genetic elements specific to each sample, and found that disease groups had more unique virulence factors related to capsule and lipopolysaccharide synthesis, secretion systems, proteinases, and toxins, suggesting that strains from disease sites may have more specific virulence, particularly for P. intermedia. The differentially represented pathways between samples from disease and health were related to energy metabolism, carbohydrate and lipid metabolism, and amino acid metabolism, consistent with data from the whole subgingival microbiome in periodontal disease and health. Disease-derived samples had gained or lost several metabolic genes compared to healthy-derived samples, which could be linked with the difference in virulence performance between diseased and healthy sample groups. Our findings suggest that P. intermedia and P. nigrescens may serve as "crucial substances" in subgingival plaque, which may reflect changes in

  5. High-confidence assessment of functional impact of human mitochondrial non-synonymous genome variations by APOGEE.

    Directory of Open Access Journals (Sweden)

    Stefano Castellana

    2017-06-01

    Full Text Available 24,189 are all the possible non-synonymous amino acid changes potentially affecting the human mitochondrial DNA. Only a tiny subset was functionally evaluated with certainty so far, while the pathogenicity of the vast majority was only assessed in-silico by software predictors. Since these tools proved to be rather incongruent, we have designed and implemented APOGEE, a machine-learning algorithm that outperforms all existing prediction methods in estimating the harmfulness of mitochondrial non-synonymous genome variations. We provide a detailed description of the underlying algorithm, of the selected and manually curated training and test sets of variants, as well as of its classification ability.

  6. Ancestry variation and footprints of natural selection along the genome in Latin American populations.

    Science.gov (United States)

    Deng, Lian; Ruiz-Linares, Andrés; Xu, Shuhua; Wang, Sijia

    2016-02-18

    Latin American populations stem from the admixture of Europeans, Africans and Native Americans, which started over 400 years ago and had lasted for several centuries. Extreme deviation over the genome-wide average in ancestry estimations at certain genomic locations could reflect recent natural selection. We evaluated the distribution of ancestry estimations using 678 genome-wide microsatellite markers in 249 individuals from 13 admixed populations across Latin America. We found significant deviations in ancestry estimations including three locations with more than 3.5 times standard deviations from the genome-wide average: an excess of European ancestry at 1p36 and 14q32, and an excess of African ancestry at 6p22. Using simulations, we could show that at least the deviation at 6p22 was unlikely to result from genetic drift alone. By applying different linguistic groups as well as the most likely ancestral Native American populations as the ancestry, we showed that the choice of Native American ancestry could affect the local ancestry estimation. However, the signal at 6p22 consistently appeared in most of the analyses using various ancestral groups. This study provided important insights for recent natural selection in the context of the unique history of the New World and implications for disease mapping.
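    The outlier criterion described in this abstract (local ancestry deviating by more than 3.5 standard deviations from the genome-wide average) can be illustrated with a minimal Python sketch. The function name, the dictionary input format, and the use of the population standard deviation are assumptions made for illustration; the abstract does not describe the authors' actual implementation.

```python
from statistics import mean, pstdev


def flag_ancestry_outliers(local_ancestry, threshold=3.5):
    """Return (locus, z) pairs whose ancestry proportion deviates from the
    genome-wide mean by more than `threshold` standard deviations.

    `local_ancestry` maps a locus label to the estimated proportion of one
    ancestry at that locus (hypothetical input format).
    """
    values = list(local_ancestry.values())
    mu = mean(values)
    sd = pstdev(values)  # population SD across all loci (an assumption)
    if sd == 0:
        return []  # no variation, so nothing can be an outlier
    outliers = []
    for locus, value in local_ancestry.items():
        z = (value - mu) / sd
        if abs(z) > threshold:
            outliers.append((locus, z))
    return outliers
```

    A z-score of this kind only flags candidate loci; as the abstract notes, simulations are still needed to judge whether a deviation could arise from genetic drift alone.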

  7. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.
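    The coverage-dependent concordance figures quoted above can be illustrated with a short Python sketch: genotype calls are filtered by a minimum coverage and the fraction agreeing with a reference genotype set is reported. The function name and the tuple layout are hypothetical, chosen for illustration; this is not the authors' pipeline.

```python
def concordance_at_coverage(calls, min_coverage):
    """Fraction of genotype calls matching the reference genotype among
    sites with at least `min_coverage`-fold coverage.

    `calls` is an iterable of (coverage, sequenced_genotype,
    reference_genotype) tuples (hypothetical layout). Returns None when no
    site passes the coverage filter.
    """
    kept = [(seq, ref) for cov, seq, ref in calls if cov >= min_coverage]
    if not kept:
        return None
    matches = sum(1 for seq, ref in kept if seq == ref)
    return matches / len(kept)
```

    Calling the function with thresholds of 4 and 15 on the same call set mirrors the two-threshold comparison reported in the abstract.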

  8. Natural variation of histone modification and its impact on gene expression in the rat genome

    NARCIS (Netherlands)

    Rintisch, Carola; Heinig, Matthias; Bauerfeind, Anja; Schafer, Sebastian; Mieth, Christin; Patone, Giannino; Hummel, Oliver; Chen, Wei; Cook, Stuart; Cuppen, Edwin; Colomé-Tatché, Maria; Johannes, Frank; Jansen, Ritsert C; Neil, Helen; Werner, Michel; Pravenec, Michal; Vingron, Martin; Hubner, Norbert

    Histone modifications are epigenetic marks that play fundamental roles in many biological processes including the control of chromatin-mediated regulation of gene expression. Little is known about interindividual variability of histone modification levels across the genome and to what extent they

  9. Genome size and phenotypic variation of Nymphaea (Nymphaeaceae) species from Eastern Europe and temperate Asia

    Czech Academy of Sciences Publication Activity Database

    Dąbrowska, M. A.; Rola, K.; Volkova, P.; Suda, Jan; Zalewska-Gałosz, J.

    2015-01-01

    Roč. 84, č. 2 (2015), s. 277-286 ISSN 0001-6977 R&D Projects: GA ČR GB14-36079G Institutional support: RVO:67985939 Keywords : flow cytometry * genome size * morphometrics Subject RIV: EF - Botanics Impact factor: 1.213, year: 2015

  10. Using an online genome resource to identify myostatin variation in U.S. sheep

    Science.gov (United States)

    We created a public, searchable DNA sequence resource for sheep that contained approximately 14x whole genome sequence of 96 rams. The animals represent 10 popular U.S. breeds and share minimal pedigree relationships, making the resource suitable for viewing gene variants in the user-friendly Integ...

  11. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods

    NARCIS (Netherlands)

    Heidaritabar, M.; Vereijken, A.; Muir, W.M.; Meuwissen, T.H.E.; Cheng, H.; Megens, H.J.W.C.; Groenen, M.; Bastiaansen, J.W.M.

    2014-01-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the

  12. How genome size variation is linked with evolution within Chenopodium sensu lato

    Czech Academy of Sciences Publication Activity Database

    Mandák, Bohumil; Krak, Karol; Vít, Petr; Pavlíková, Zuzana; Lomonosova, M. N.; Habibi, Farzaneh; Lei, Wang; Jellen, E.N.; Douda, Jan

    2016-01-01

    Roč. 23, DEC 2016 (2016), s. 18-32 ISSN 1433-8319 R&D Projects: GA ČR GA13-02290S Institutional support: RVO:67985939 Keywords : Chenopodium * genome size evolution * flow cytometry Subject RIV: EF - Botanics Impact factor: 3.123, year: 2016

  13. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Full Text Available Abstract Background The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB

  14. Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes.

    Science.gov (United States)

    Sabir, Jamal; Schwarz, Erika; Ellison, Nicholas; Zhang, Jin; Baeshen, Nabih A; Mutwakil, Muhammed; Jansen, Robert; Ruhlman, Tracey

    2014-08-01

    Land plant plastid genomes (plastomes) provide a tractable model for evolutionary study in that they are relatively compact and gene dense. Among the groups that display an appropriate level of variation for structural features, the inverted-repeat-lacking clade (IRLC) of papilionoid legumes presents the potential to advance general understanding of the mechanisms of genomic evolution. Here, are presented six complete plastome sequences from economically important species of the IRLC, a lineage previously represented by only five completed plastomes. A number of characters are compared across the IRLC including gene retention and divergence, synteny, repeat structure and functional gene transfer to the nucleus. The loss of clpP intron 2 was identified in one newly sequenced member of IRLC, Glycyrrhiza glabra. Using deeply sequenced nuclear transcriptomes from two species helped clarify the nature of the functional transfer of accD to the nucleus in Trifolium, which likely occurred in the lineage leading to subgenus Trifolium. Legumes are second only to cereal crops in agricultural importance based on area harvested and total production. Genetic improvement via plastid transformation of IRLC crop species is an appealing proposition. Comparative analyses of intergenic spacer regions emphasize the need for complete genome sequences for developing transformation vectors for plastid genetic engineering of legume crops. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  15. Single-Cell-Based Platform for Copy Number Variation Profiling through Digital Counting of Amplified Genomic DNA Fragments.

    Science.gov (United States)

    Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi

    2017-04-26

    We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate a mfA platform through multiple displacement amplification (MDA) chemistry. When performing the mfA platform, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with mfA platform, which is impossible for conventional single-cell amplification methods.

  16. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods.

    Science.gov (United States)

    Heidaritabar, M; Vereijken, A; Muir, W M; Meuwissen, T; Cheng, H; Megens, H-J; Groenen, M A M; Bastiaansen, J W M

    2014-12-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the entire chicken genome, we compared the impact of GS and traditional BLUP (best linear unbiased prediction) selection methods applied side-by-side in three different lines of egg-laying chickens. Differences were demonstrated between methods, both at the level and genomic distribution of allele frequency changes. In all three lines, the average allele frequency changes were larger with GS, 0.056, 0.064 and 0.066, compared with BLUP, 0.044, 0.045 and 0.036 for lines B1, B2 and W1, respectively. With BLUP, 35 selected regions (empirical P selected regions were identified. Empirical thresholds for local allele frequency changes were determined from gene dropping, and differed considerably between GS (0.167-0.198) and BLUP (0.105-0.126). Between lines, the genomic regions with large changes in allele frequencies showed limited overlap. Our results show that GS applies selection pressure much more locally than BLUP, resulting in larger allele frequency changes. With these results, novel insights into the nature of selection on quantitative traits have been gained and important questions regarding the long-term impact of GS are raised. The rapid changes to a part of the genetic architecture, while another part may not be selected, at least in the short term, require careful consideration, especially when selection occurs before phenotypes are observed.

  17. Analyzing the genomic variation of microbial cell factories in the era of “New Biotechnology”

    DEFF Research Database (Denmark)

    Herrgard, Markus; Panagiotou, Gianni

    2012-01-01

    The application of genome-scale technologies, both experimental and in silico, to industrial biotechnology has allowed improving the conversion of biomass-derived feedstocks to chemicals, materials and fuels through microbial fermentation. In particular, due to rapidly decreasing costs and its...... technologies for finding the underlying molecular mechanisms for (a) improved carbon source utilization, (b) increased product formation, and (c) stress tolerance. We also discuss the strengths and weaknesses of different strategies for mapping industrially relevant genotype-to-phenotype links including...

  18. Demographic history and biologically relevant genetic variation of Native Mexicans inferred from whole-genome sequencing

    OpenAIRE

    Romero-Hidalgo, Sandra; Ochoa-Leyva, Adrián; Garcíarrubio, Alejandro; Acuña-Alonzo, Victor; Antúnez-Argüelles, Erika; Balcazar-Quintero, Martha; Barquera-Lozano, Rodrigo; Carnevale, Alessandra; Cornejo-Granados, Fernanda; Fernández-López, Juan Carlos; García-Herrera, Rodrigo; García-Ortíz, Humberto; Granados-Silvestre, Ángeles; Granados, Julio; Guerrero-Romero, Fernando

    2017-01-01

    Understanding the genetic structure of Native American populations is important to clarify their diversity, demographic history, and to identify genetic factors relevant for biomedical traits. Here, we show a demographic history reconstruction from 12 Native American whole genomes belonging to six distinct ethnic groups representing the three main described genetic clusters of Mexico (Northern, Southern, and Maya). Effective population size estimates of all Native American groups remained bel...

  19. Genome size variation in Macaronesian Angiosperms: Forty Percent of Canarian Endemic Flora Completed

    Czech Academy of Sciences Publication Activity Database

    Suda, Jan; Kyncl, Tomáš; Jarolímová, Vlasta

    2005-01-01

    Roč. 252, 3-4 (2005), s. 215-238 ISSN 0378-2697 R&D Projects: GA ČR(CZ) GA206/00/1445; GA ČR(CZ) GA206/04/0081; GA AV ČR(CZ) KSK6005114 Institutional research plan: CEZ:AV0Z60050516 Keywords : genome size * cytometry * Macaronesia Subject RIV: EF - Botanics Impact factor: 1.421, year: 2005

  20. Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis.

    Science.gov (United States)

    Marzo, Mar; Bello, Xabier; Puig, Marta; Maside, Xulio; Ruiz, Alfredo

    2013-02-04

    Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome.

  1. Spectrum of mitochondrial genomic variation and associated clinical presentation of prostate cancer in South African men.

    Science.gov (United States)

    McCrow, John P; Petersen, Desiree C; Louw, Melanie; Chan, Eva K F; Harmeyer, Katherine; Vecchiarelli, Stefano; Lyons, Ruth J; Bornman, M S Riana; Hayes, Vanessa M

    2016-03-01

    Prostate cancer incidence and mortality rates are significantly increased in African-American men, but limited studies have been performed within Sub-Saharan African populations. As mitochondria control energy metabolism and apoptosis we speculate that somatic mutations within mitochondrial genomes are candidate drivers of aggressive prostate carcinogenesis. We used matched blood and prostate tissue samples from 87 South African men (77 with African ancestry) to perform deep sequencing of complete mitochondrial genomes. Clinical presentation was biased toward aggressive disease (Gleason score >7, 64%), and compared with men without prostate cancer either with or without benign prostatic hyperplasia. We identified 144 somatic mtDNA single nucleotide variants (SNVs), of which 80 were observed in 39 men presenting with aggressive disease. Both the number and frequency of somatic mtDNA SNVs were associated with higher pathological stage. Besides doubling the total number of somatic PCa-associated mitochondrial genome mutations identified to date, we associate mutational load with aggressive prostate cancer status in men of African ancestry. © 2015 The Authors. The Prostate published by Wiley Periodicals, Inc.

  2. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Cardoso, Joao; Andersen, Mikael Rørdam; Herrgard, Markus

    2015-01-01

    scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function......Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology......, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic...

  3. Genetic Architecture of Natural Variation in Rice Chlorophyll Content Revealed by a Genome-Wide Association Study.

    Science.gov (United States)

    Wang, Quanxiu; Xie, Weibo; Xing, Hongkun; Yan, Ju; Meng, Xiangzhou; Li, Xinglei; Fu, Xiangkui; Xu, Jiuyue; Lian, Xingming; Yu, Sibin; Xing, Yongzhong; Wang, Gongwei

    2015-06-01

    Chlorophyll content is one of the most important physiological traits as it is closely related to leaf photosynthesis and crop yield potential. So far, few genes have been reported to be involved in natural variation of chlorophyll content in rice (Oryza sativa) and the extent of variations explored is very limited. We conducted a genome-wide association study (GWAS) using a diverse worldwide collection of 529 O. sativa accessions. A total of 46 significant association loci were identified. Three F2 mapping populations with parents selected from the association panel were tested for validation of GWAS signals. We clearly demonstrated that Grain number, plant height, and heading date7 (Ghd7) was a major locus for natural variation of chlorophyll content at the heading stage by combining evidence from near-isogenic lines and transgenic plants. The enhanced expression of Ghd7 decreased the chlorophyll content, mainly through down-regulating the expression of genes involved in the biosynthesis of chlorophyll and chloroplast. In addition, Narrow leaf1 (NAL1) corresponded to one significant association region repeatedly detected over two years. We revealed a high degree of polymorphism in the 5' UTR and four non-synonymous SNPs in the coding region of NAL1, and observed diverse effects of the major haplotypes. The loci or candidate genes identified would help to fine-tune and optimize the antenna size of canopies in rice breeding. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.

  4. A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging

    Science.gov (United States)

    Logsdon, Benjamin A.; Carty, Cara L.; Reiner, Alexander P.; Dai, James Y.; Kooperberg, Charles

    2012-01-01

    Motivation: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. Results: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort. Availability: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html. Contact: blogsdon@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22563072

  5. Epigenetic Variation in Monozygotic Twins: A Genome-Wide Analysis of DNA Methylation in Buccal Cells

    NARCIS (Netherlands)

    van Dongen, J.; Ehli, E.A.; Slieker, R.C.; Bartels, M.; Weber, Z.M.; Davies, G.E.; Slagboom, P.E.; Heijmans, B.T.; Boomsma, D.I.

    2014-01-01

    DNA methylation is one of the most extensively studied epigenetic marks in humans. Yet, it is largely unknown what causes variation in DNA methylation between individuals. The comparison of DNA methylation profiles of monozygotic (MZ) twins offers a unique experimental design to examine the extent

  6. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

    NARCIS (Netherlands)

    Aflitos, S.A.; Schijlen, E.G.W.M.; Jong, de J.H.S.G.M.; Ridder, de D.; Smit, S.; Finkers, H.J.; Bakker, F.T.; Geest, van de H.C.; Lintel Hekkert, te B.; Haarst, van J.C.; Smits, L.W.M.; Koops, A.J.; Sanchez-Perez, M.J.; Heusden, van A.W.; Visser, R.G.F.; Schranz, M.E.; Peters, S.A.

    2014-01-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative for the Lycopersicon, Arcanum, Eriopersicon, and Neolycopersicon groups which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new

  7. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

    NARCIS (Netherlands)

    Aflitos, S.; Schijlen, E.; de Jong, H.; de Ridder, D.; Smit, S.; Finkers, R.; Wang, J.; Zhang, G.; Li, N.; Mao, L.; Bakker, F.; Dirks, R.; Breit, T.; Gravendeel, B.; Huits, H.; Struss, D.; Swanson-Wagner, R.; van Leeuwen, H.; van Ham, R.C.H.J.; Fito, L.; Guignier, L.; Sevilla, M.; Ellul, P.; Ganko, E.; Kapur, A.; Reclus, E.; de Geus, B.; van de Geest, H.; te Lintel Hekkert, B.; van Haarst, J.; Smits, L.; Koops, A.; Sanchez-Perez, G.; van Heusden, A.W.; Visser, R.; Quan, Z.; Min, J.; Liao, L.; Wang, X.; Wang, G.; Yue, Z.; Yang, X.; Xu, N.; Schranz, E.; Smets, E.; Vos, R.; Rauwerda, J.; Ursem, R.; Schuit, C.; Kerns, M.; van den Berg, J.; Vriezen, W.; Janssen, A.; Datema, E.; Jahrman, T.; Moquet, F.; Bonnet, J.; Peters, S.

    2014-01-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new

  8. Defining the role of common variation in the genomic and biological architecture of adult human height.

    Science.gov (United States)

    Wood, Andrew R; Esko, Tonu; Yang, Jian; Vedantam, Sailaja; Pers, Tune H; Gustafsson, Stefan; Chu, Audrey Y; Estrada, Karol; Luan, Jian'an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna A E; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Mateo Leach, Irene; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Arnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex S F; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; 
Grönberg, Henrik; de Groot, Lisette C P G M; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik K E; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor V A; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; 
Arveiler, Dominique; Bakker, Stephan J L; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John J P; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela A F; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, D C; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter E H; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; 
Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul I W; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin N A; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L; Lettre, Guillaume; Loos, Ruth J F; Weedon, Michael N; Ingelsson, Erik; O'Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E; Visscher, Peter M; Hirschhorn, Joel N; Frayling, Timothy M

    2014-11-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

  9. Defining the role of common variation in the genomic and biological architecture of adult human height

    Science.gov (United States)

    Chu, Audrey Y; Estrada, Karol; Luan, Jian’an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna AE; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Leach, Irene Mateo; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Ärnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex SF; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; Grönberg, Henrik; de Groot, Lisette C.P.G.M.; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; 
Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik KE; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor VA; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; Arveiler, Dominique; Bakker, Stephan JL; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, 
Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John JP; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela AF; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, DC; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter EH; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; 
Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul IW; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J.; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin NA; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S.; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L.; Lettre, Guillaume; Loos, Ruth JF; Weedon, Michael N; Ingelsson, Erik; O’Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explain one-fifth of heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured the majority (60%) of heritability. The 697 variants clustered in 423 loci enriched for genes, pathways, and tissue-types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/beta-catenin, and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants. PMID:25282103

  10. Metabolome-genome-wide association study dissects genetic architecture for generating natural variation in rice secondary metabolism

    Science.gov (United States)

    Matsuda, Fumio; Nakabayashi, Ryo; Yang, Zhigang; Okazaki, Yozo; Yonemaru, Jun-ichi; Ebana, Kaworu; Yano, Masahiro; Saito, Kazuki

    2015-01-01

    Plants produce structurally diverse secondary (specialized) metabolites to increase their fitness for survival under adverse environments. Several bioactive compounds for new drugs have been identified through screening of plant extracts. In this study, genome-wide association studies (GWAS) were conducted to investigate the genetic architecture behind the natural variation of rice secondary metabolites. GWAS using the metabolome data of 175 rice accessions successfully identified 323 associations among 143 single nucleotide polymorphisms (SNPs) and 89 metabolites. The data analysis highlighted that levels of many metabolites are tightly associated with a small number of strong quantitative trait loci (QTLs). The tight association may be a mechanism generating strains with distinct metabolic composition through the crossing of two different strains. The results indicate that one plant species produces more diverse phytochemicals than previously expected, and plants still contain many useful compounds for human applications. PMID:25267402

  11. Bayesian Nonparametric Hidden Markov Models with application to the analysis of copy-number-variation in mammalian genomes.

    Science.gov (United States)

    Yau, C; Papaspiliopoulos, O; Roberts, G O; Holmes, C

    2011-01-01

    We consider the development of Bayesian Nonparametric methods for product partition models such as Hidden Markov Models and change point models. Our approach uses a Mixture of Dirichlet Process (MDP) model for the unknown sampling distribution (likelihood) for the observations arising in each state and a computationally efficient data augmentation scheme to aid inference. The method uses novel MCMC methodology which combines recent retrospective sampling methods with the use of slice sampler variables. The methodology is computationally efficient, both in terms of MCMC mixing properties and in its robustness to the length of the time series being investigated. Moreover, the method is easy to implement, requiring little or no user interaction. We apply our methodology to the analysis of genomic copy number variation.
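
    As a rough illustration of the Hidden Markov Model framing mentioned in this abstract, the sketch below segments a series of log2 intensity ratios into copy-number states. It is not the authors' method: it swaps the Mixture of Dirichlet Process likelihood and MCMC inference for a plain parametric Gaussian HMM decoded with the Viterbi algorithm. The state names, means, noise level, transition probability and sample data are all hypothetical values chosen for the example.

```python
import math

# Hypothetical three-state model: copy-number loss, neutral, gain.
STATES = ("loss", "neutral", "gain")
MEANS = (-1.0, 0.0, 0.58)   # assumed log2-ratio mean for each state
SIGMA = 0.25                # assumed shared noise standard deviation
STAY = 0.98                 # assumed probability of remaining in the same state


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def viterbi(obs):
    """Return the most probable state path for a sequence of log2 ratios."""
    if not obs:
        return []
    k = len(STATES)
    log_stay = math.log(STAY)
    log_move = math.log((1 - STAY) / (k - 1))
    # Uniform initial distribution over states.
    score = [math.log(1 / k) + log_gauss(obs[0], MEANS[s], SIGMA) for s in range(k)]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(k):
            # Best predecessor for state s at this position.
            cands = [score[p] + (log_stay if p == s else log_move) for p in range(k)]
            best = max(range(k), key=cands.__getitem__)
            pointers.append(best)
            new_score.append(cands[best] + log_gauss(x, MEANS[s], SIGMA))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(k), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Made-up probe values, for demonstration only.
probes = [0.05, -0.1, 0.02, 0.61, 0.55, 0.7, 0.5, 0.03, -0.95, -1.1, -0.9, 0.0]
print(viterbi(probes))
```

    The high self-transition probability is what discourages the decoded path from switching state on a single noisy probe. A nonparametric likelihood of the kind the abstract describes would replace the fixed Gaussian emission in `log_gauss`.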

  12. Rare Genome-Wide Copy Number Variation and Expression of Schizophrenia in 22q11.2 Deletion Syndrome.

    Science.gov (United States)

    Bassett, Anne S; Lowther, Chelsea; Merico, Daniele; Costain, Gregory; Chow, Eva W C; van Amelsvoort, Therese; McDonald-McGinn, Donna; Gur, Raquel E; Swillen, Ann; Van den Bree, Marianne; Murphy, Kieran; Gothelf, Doron; Bearden, Carrie E; Eliez, Stephan; Kates, Wendy; Philip, Nicole; Sashi, Vandana; Campbell, Linda; Vorstman, Jacob; Cubells, Joseph; Repetto, Gabriela M; Simon, Tony; Boot, Erik; Heung, Tracy; Evers, Rens; Vingerhoets, Claudia; van Duin, Esther; Zackai, Elaine; Vergaelen, Elfi; Devriendt, Koen; Vermeesch, Joris R; Owen, Michael; Murphy, Clodagh; Michaelovosky, Elena; Kushan, Leila; Schneider, Maude; Fremont, Wanda; Busa, Tiffany; Hooper, Stephen; McCabe, Kathryn; Duijff, Sasja; Isaev, Karin; Pellecchia, Giovanna; Wei, John; Gazzellone, Matthew J; Scherer, Stephen W; Emanuel, Beverly S; Guo, Tingwei; Morrow, Bernice E; Marshall, Christian R

    2017-11-01

    Chromosome 22q11.2 deletion syndrome (22q11.2DS) is associated with a more than 20-fold increased risk for developing schizophrenia. The aim of this study was to identify additional genetic factors (i.e., "second hits") that may contribute to schizophrenia expression. Through an international consortium, the authors obtained DNA samples from 329 psychiatrically phenotyped subjects with 22q11.2DS. Using a high-resolution microarray platform and established methods to assess copy number variation (CNV), the authors compared the genome-wide burden of rare autosomal CNV, outside of the 22q11.2 deletion region, between two groups: a schizophrenia group and those with no psychotic disorder at age ≥25 years. The authors assessed whether genes overlapped by rare CNVs were overrepresented in functional pathways relevant to schizophrenia. Rare CNVs overlapping one or more protein-coding genes revealed significant between-group differences. For rare exonic duplications, six of 19 gene sets tested were enriched in the schizophrenia group; genes associated with abnormal nervous system phenotypes remained significant in a stepwise logistic regression model and showed significant interactions with 22q11.2 deletion region genes in a connectivity analysis. For rare exonic deletions, the schizophrenia group had, on average, more genes overlapped. The additional rare CNVs implicated known (e.g., GRM7, 15q13.3, 16p12.2) and novel schizophrenia risk genes and loci. The results suggest that additional rare CNVs overlapping genes outside of the 22q11.2 deletion region contribute to schizophrenia risk in 22q11.2DS, supporting a multigenic hypothesis for schizophrenia. The findings have implications for understanding expression of psychotic illness and herald the importance of whole-genome sequencing to appreciate the overall genomic architecture of schizophrenia.

  13. Simultaneous inference of selection and population growth from patterns of variation in the human genome

    DEFF Research Database (Denmark)

    Williamson, Scott H.; Hernandez, Ryan; Fledel-Alon, Adi

    2005-01-01

    Natural selection and demographic forces can have similar effects on patterns of DNA polymorphism. Therefore, to infer selection from samples of DNA sequences, one must simultaneously account for demographic effects. Here we take a model-based approach to this problem by developing predictions for patterns of polymorphism in the presence of both population size change and natural selection. If data are available from different functional classes of variation, and a priori information suggests that mutations in one of those classes are selectively neutral, then the putatively neutral class can...... this method to a large polymorphism data set from 301 human genes and find (i) widespread negative selection acting on standing nonsynonymous variation, (ii) that the fitness effects of nonsynonymous mutations are well predicted by several measures of amino acid exchangeability, especially site......-specific methods, and (iii) strong evidence for very recent population growth.

  14. Genomic and proteomic analysis of soybean heritable variations induced by space flight

    Institute of Scientific and Technical Information of China (English)

    HE Jie; GAO Yong; SUN Ye-qing

    2009-01-01

    To analyze the biological effects of the space environment, the genomic DNA diversity between the space-flight soybean 194(4126), which has a phenotype of good yield and good fruit quality induced by space flight, and the ground-control soybean was studied by the amplified fragment length polymorphism (AFLP) method; the polymorphism of space-flight soybean 194(4126) was 3.56%. The differences in protein expression in seeds and leaves between the two kinds of soybeans were analysed by two-dimensional electrophoresis, PDQuest software and MALDI-TOF mass spectrometry. The results show that the loss and decrease of protein expression in 194(4126) soybean are attributable to the space flight of the seeds, and three special proteins, Dehydrin, MAT1 and ceQORH, are identified. It is concluded that the space environment changes the phenotype and genotype of soybeans as a result of the space flight of the seeds.

  15. Direct linkage of mitochondrial genome variation to risk factors for type 2 diabetes in conplastic strains

    Czech Academy of Sciences Publication Activity Database

    Pravenec, Michal; Hyakukoku, M.; Houštěk, Josef; Zídek, Václav; Landa, Vladimír; Mlejnek, Petr; Mikšík, Ivan; Mothejzíková-Dudová, Kristýna; Pecina, Petr; Vrbacký, Marek; Drahota, Zdeněk; Vojtíšková, Alena; Mráček, Tomáš; Kazdová, L.; Oliyarnyk, O.; Wang, Ji.; Ho, Ch.; Qi, N.; Sugimoto, K.; Kurtz, T.

    2007-01-01

    Vol. 17, No. 9 (2007), pp. 1319-1326. ISSN 1088-9051. R&D Projects: GA MŠk(CZ) 1M0520; GA ČR(CZ) GA301/06/0028; GA ČR GA303/07/0781. Grant - others: GA UK(CZ) 24/2005; GA UK(CZ) 26/2005; National Institutes of Health(US) HL35018; National Institutes of Health(US) HL56028; National Institutes of Health(US) HL63709; EURATOOLS(XE) LSHG-CT-2005-019015. Institutional research plan: CEZ:AV0Z50110509. Source of funding: R - EC framework project. Keywords: mitochondrial genome; conplastic strains; risk factors for type 2 diabetes. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 11.224, year: 2007

  16. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Hua Dong

    2015-12-01

    The interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process, and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We report human genome copy number changes and frequent transcriptome variations (e.g. TP53 and CTNNB1 mutations, and especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including the basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and is almost exclusive to HBV-MLL4 integration.

  17. Variation in the complex carbohydrate biosynthesis loci of Acinetobacter baumannii genomes.

    Directory of Open Access Journals (Sweden)

    Johanna J Kenyon

    Extracellular polysaccharides are major immunogenic components of the bacterial cell envelope. However, little is known about their biosynthesis in the genus Acinetobacter, which includes A. baumannii, an important nosocomial pathogen. Whether Acinetobacter sp. produce a capsule or a lipopolysaccharide carrying an O antigen or both is not resolved. To explore these issues, genes involved in the synthesis of complex polysaccharides were located in 10 complete A. baumannii genome sequences, and the function of each of their products was predicted via comparison to enzymes with a known function. The absence of a gene encoding a WaaL ligase, required to link the carbohydrate polymer to the lipid A-core oligosaccharide (lipooligosaccharide), forming lipopolysaccharide, suggests that only a capsule is produced. Nine distinct arrangements of a large capsule biosynthesis locus, designated KL1 to KL9, were found in the genomes. Three forms of a second, smaller variable locus, likely to be required for synthesis of the outer core of the lipid A-core moiety, were designated OCL1 to OCL3 and also annotated. Each K locus includes genes for capsule export as well as genes for synthesis of activated sugar precursors, and for glycosyltransfer, glycan modification and oligosaccharide repeat-unit processing. The K loci all include the export genes at one end and genes for synthesis of common sugar precursors at the other, with a highly variable region that includes the remaining genes in between. Five different capsule loci, KL2, KL6, KL7, KL8 and KL9, were detected in multiply antibiotic resistant isolates belonging to global clone 2, and two other loci, KL1 and KL4, in global clone 1. This indicates that this region is being substituted repeatedly in multiply antibiotic resistant isolates from these clones.

  18. Rhinovirus genome variation during chronic upper and lower respiratory tract infections.

    Directory of Open Access Journals (Sweden)

    Caroline Tapparel

    Routine screening of lung transplant recipients and hospital patients for respiratory virus infections made it possible to identify human rhinovirus (HRV) in the upper and lower respiratory tracts, including in immunocompromised hosts chronically infected with the same strain over weeks or months. Phylogenetic analysis of 144 HRV-positive samples showed no apparent correlation between a given viral genotype or species and its ability to invade the lower respiratory tract or lead to protracted infection. By contrast, protracted infections were found almost exclusively in immunocompromised patients, suggesting that host factors, in particular the immune response, rather than the virus genotype modulate disease outcome. Complete genome sequencing of five chronic cases to study rhinovirus genome adaptation showed that the calculated mutation frequency was in the range observed during acute human infections. Analysis of mutation hot spot regions between specimens collected at different times or in different body sites revealed that non-synonymous changes were mostly concentrated in the viral capsid genes VP1, VP2 and VP3, independent of the HRV type. In an immunosuppressed lung transplant recipient infected with the same HRV strain for more than two years, both classical and ultra-deep sequencing of samples collected at different time points in the upper and lower respiratory tracts showed that these virus populations were phylogenetically indistinguishable over the course of infection, except for the last month. Specific signatures were found in the last two lower respiratory tract populations, including changes in the 5'UTR polypyrimidine tract and the VP2 immunogenic site 2. These results highlight for the first time the ability of a given rhinovirus to evolve in the course of a natural infection in immunocompromised patients and complement data obtained from previous experimental inoculation studies in immunocompetent volunteers.

  19. A variational principle for computing nonequilibrium fluxes and potentials in genome-scale biochemical networks.

    Science.gov (United States)

    Fleming, R M T; Maes, C M; Saunders, M A; Ye, Y; Palsson, B Ø

    2012-01-07

    We derive a convex optimization problem on a steady-state nonequilibrium network of biochemical reactions, with the property that energy conservation and the second law of thermodynamics both hold at the problem solution. This suggests a new variational principle for biochemical networks that can be implemented in a computationally tractable manner. We derive the Lagrange dual of the optimization problem and use strong duality to demonstrate that a biochemical analogue of Tellegen's theorem holds at optimality. Each optimal flux is dependent on a free parameter that we relate to an elementary kinetic parameter when mass action kinetics is assumed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Aboriginal Australian mitochondrial genome variation - an increased understanding of population antiquity and diversity

    Science.gov (United States)

    Nagle, Nano; van Oven, Mannis; Wilcox, Stephen; van Holst Pellekaan, Sheila; Tyler-Smith, Chris; Xue, Yali; Ballantyne, Kaye N.; Wilcox, Leah; Papac, Luka; Cooke, Karen; van Oorschot, Roland A. H.; McAllister, Peter; Williams, Lesley; Kayser, Manfred; Mitchell, R. John; Adhikarla, Syama; Adler, Christina J.; Balanovska, Elena; Balanovsky, Oleg; Bertranpetit, Jaume; Clarke, Andrew C.; Comas, David; Cooper, Alan; der Sarkissian, Clio S. I.; Dulik, Matthew C.; Gaieski, Jill B.; Ganeshprasad, Arunkumar; Haak, Wolfgang; Haber, Marc; Hobbs, Angela; Javed, Asif; Jin, Li; Kaplan, Matthew E.; Li, Shilin; Martínez-Cruz, Begoña; Matisoo-Smith, Elizabeth A.; Melé, Marta; Merchant, Nirav C.; Owings, Amanda C.; Parida, Laxmi; Pitchappan, Ramasamy; Platt, Daniel E.; Quintana-Murci, Lluis; Renfrew, Colin; Royyuru, Ajay K.; Santhakumari, Arun Varatharajan; Santos, Fabrício R.; Schurr, Theodore G.; Soodyall, Himla; Soria Hernanz, David F.; Swamikrishnan, Pandikumar; Vilar, Miguel G.; Wells, R. Spencer; Zalloua, Pierre A.; Ziegle, Janet S.

    2017-03-01

    Aboriginal Australians represent one of the oldest continuous cultures outside Africa, with evidence indicating that their ancestors arrived in the ancient landmass of Sahul (present-day New Guinea and Australia) ~55 thousand years ago. Genetic studies, though limited, have demonstrated both the uniqueness and antiquity of Aboriginal Australian genomes. We have further resolved known Aboriginal Australian mitochondrial haplogroups and discovered novel indigenous lineages by sequencing the mitogenomes of 127 contemporary Aboriginal Australians. In particular, the more common haplogroups observed in our dataset included M42a, M42c, S, P5 and P12, followed by rarer haplogroups M15, M16, N13, O, P3, P6 and P8. We propose some major phylogenetic rearrangements, such as in haplogroup P where we delinked P4a and P4b and redefined them as P4 (New Guinean) and P11 (Australian), respectively. Haplogroup P2b was identified as a novel clade potentially restricted to Torres Strait Islanders. Nearly all Aboriginal Australian mitochondrial haplogroups detected appear to be ancient, with no evidence of later introgression during the Holocene. Our findings greatly increase knowledge about the geographic distribution and phylogenetic structure of mitochondrial lineages that have survived in contemporary descendants of Australia’s first settlers.

  1. Genome-Wide DNA Methylation Analysis and Epigenetic Variations Associated with Congenital Aortic Valve Stenosis (AVS).

    Directory of Open Access Journals (Sweden)

    Uppala Radhakrishna

    Congenital heart defect (CHD) is the most common cause of death from congenital anomaly. Among several candidate epigenetic mechanisms, DNA methylation may play an important role in the etiology of CHDs. We conducted a genome-wide DNA methylation analysis using an Illumina Infinium 450k human methylation assay in a cohort of 24 newborns who had aortic valve stenosis (AVS), with gestational-age matched controls. The study identified significantly-altered CpG methylation at 59 sites in 52 genes in AVS subjects as compared to controls (either hypermethylated or demethylated). Gene Ontology analysis identified biological processes and functions for these genes including positive regulation of receptor-mediated endocytosis. Consistent with prior clinical data, the molecular function categories as determined using DAVID identified low-density lipoprotein receptor binding, lipoprotein receptor binding and identical protein binding to be over-represented in the AVS group. A significant epigenetic change in the APOA5 and PCSK9 genes known to be involved in AVS was also observed. A large number of CpG methylation sites individually demonstrated good to excellent diagnostic accuracy for the prediction of AVS status, thus raising the possibility of molecular screening markers for this disorder. Using epigenetic analysis we were able to identify genes significantly involved in the pathogenesis of AVS.

  2. Genome-Wide DNA Methylation Analysis and Epigenetic Variations Associated with Congenital Aortic Valve Stenosis (AVS).

    Science.gov (United States)

    Radhakrishna, Uppala; Albayrak, Samet; Alpay-Savasan, Zeynep; Zeb, Amna; Turkoglu, Onur; Sobolewski, Paul; Bahado-Singh, Ray O

    2016-01-01

    Congenital heart defect (CHD) is the most common cause of death from congenital anomaly. Among several candidate epigenetic mechanisms, DNA methylation may play an important role in the etiology of CHDs. We conducted a genome-wide DNA methylation analysis using an Illumina Infinium 450k human methylation assay in a cohort of 24 newborns who had aortic valve stenosis (AVS), with gestational-age matched controls. The study identified significantly-altered CpG methylation at 59 sites in 52 genes in AVS subjects as compared to controls (either hypermethylated or demethylated). Gene Ontology analysis identified biological processes and functions for these genes including positive regulation of receptor-mediated endocytosis. Consistent with prior clinical data, the molecular function categories as determined using DAVID identified low-density lipoprotein receptor binding, lipoprotein receptor binding and identical protein binding to be over-represented in the AVS group. A significant epigenetic change in the APOA5 and PCSK9 genes known to be involved in AVS was also observed. A large number of CpG methylation sites individually demonstrated good to excellent diagnostic accuracy for the prediction of AVS status, thus raising the possibility of molecular screening markers for this disorder. Using epigenetic analysis we were able to identify genes significantly involved in the pathogenesis of AVS.

  3. Insights into mechanisms of bacterial antigenic variation derived from the complete genome sequence of Anaplasma marginale.

    Science.gov (United States)

    Palmer, Guy H; Futse, James E; Knowles, Donald P; Brayton, Kelly A

    2006-10-01

    Persistence of Anaplasma spp. in the animal reservoir host is required for efficient tick-borne transmission of these pathogens to animals and humans. Using A. marginale infection of its natural reservoir host as a model, persistent infection has been shown to reflect sequential cycles in which antigenic variants emerge, replicate, and are controlled by the immune system. Variation in the immunodominant outer-membrane protein MSP2 is generated by a process of gene conversion, in which unique hypervariable region sequences (HVRs) located in pseudogenes are recombined into a single operon-linked msp2 expression site. Although organisms expressing whole HVRs derived from pseudogenes emerge early in infection, long-term persistent infection is dependent on the generation of complex mosaics in which segments from different HVRs recombine into the expression site. The resulting combinatorial diversity generates the number of variants both predicted and shown to emerge during persistence.

  4. Variation in genome-wide levels of meiotic recombination is established at the onset of prophase in mammalian males.

    Directory of Open Access Journals (Sweden)

    Brian Baier

    2014-01-01

    Segregation of chromosomes during the first meiotic division relies on crossovers established during prophase. Although crossovers are strictly regulated so that at least one occurs per chromosome, individual variation in crossover levels is not uncommon. In an analysis of different inbred strains of male mice, we identified among-strain variation in the number of foci for the crossover-associated protein MLH1. We report studies of strains with "low" (CAST/EiJ), "medium" (C3H/HeJ), and "high" (C57BL/6J) genome-wide MLH1 values to define factors responsible for this variation. We utilized immunofluorescence to analyze the number and distribution of proteins that function at different stages in the recombination pathway: RAD51 and DMC1, strand invasion proteins acting shortly after double-strand break (DSB) formation; MSH4, part of the complex stabilizing double Holliday junctions; and the Bloom helicase BLM, thought to have anti-crossover activity. For each protein, we identified strain-specific differences that mirrored the results for MLH1; i.e., CAST/EiJ mice had the lowest values, C3H/HeJ mice intermediate values, and C57BL/6J mice the highest values. This indicates that differences in the numbers of DSBs (as identified by RAD51 and DMC1) are translated into differences in the number of crossovers, suggesting that variation in crossover levels is established by the time of DSB formation. However, DSBs per se are unlikely to be the primary determinant, since allelic variation for the DSB-inducing locus Spo11 resulted in differences in the numbers of DSBs but not the number of MLH1 foci. Instead, chromatin conformation appears to be a more important contributor, since analysis of synaptonemal complex length and DNA loop size also identified consistent strain-specific differences; i.e., crossover frequency increased with synaptonemal complex length and was inversely related to chromatin loop size.
This indicates a relationship between recombination

  5. Northeast African genomic variation shaped by the continuity of indigenous groups and Eurasian migrations.

    Directory of Open Access Journals (Sweden)

    Nina Hollfelder

    2017-08-01

    Northeast Africa has a long history of human habitation, with fossil-finds from the earliest anatomically modern humans, and housing ancient civilizations. The region is also the gate-way out of Africa, as well as a portal for migration into Africa from Eurasia via the Middle East and the Arabian Peninsula. We investigate the population history of northeast Africa by genotyping ~3.9 million SNPs in 221 individuals from 18 populations sampled in Sudan and South Sudan and combine this data with published genome-wide data from surrounding areas. We find a strong genetic divide between the populations from the northeastern parts of the region (Nubians, central Arab populations, and the Beja) and populations towards the west and south (Nilotes, Darfur and Kordofan populations). This differentiation is mainly caused by a large Eurasian ancestry component of the northeast populations likely driven by migration of Middle Eastern groups followed by admixture that affected the local populations in a north-to-south succession of events. Genetic evidence points to an early admixture event in the Nubians, concurrent with historical contact between North Sudanese and Arab groups. We estimate the admixture in current-day Sudanese Arab populations to about 700 years ago, coinciding with the fall of Dongola in 1315/1316 AD, a wave of admixture that reached the Darfurian/Kordofanian populations some 400-200 years ago. In contrast to the northeastern populations, the current-day Nilotic populations from the south of the region display little or no admixture from Eurasian groups indicating long-term isolation and population continuity in these areas of northeast Africa.

  6. Northeast African genomic variation shaped by the continuity of indigenous groups and Eurasian migrations.

    Science.gov (United States)

    Hollfelder, Nina; Schlebusch, Carina M; Günther, Torsten; Babiker, Hiba; Hassan, Hisham Y; Jakobsson, Mattias

    2017-08-01

    Northeast Africa has a long history of human habitation, with fossil-finds from the earliest anatomically modern humans, and housing ancient civilizations. The region is also the gate-way out of Africa, as well as a portal for migration into Africa from Eurasia via the Middle East and the Arabian Peninsula. We investigate the population history of northeast Africa by genotyping ~3.9 million SNPs in 221 individuals from 18 populations sampled in Sudan and South Sudan and combine this data with published genome-wide data from surrounding areas. We find a strong genetic divide between the populations from the northeastern parts of the region (Nubians, central Arab populations, and the Beja) and populations towards the west and south (Nilotes, Darfur and Kordofan populations). This differentiation is mainly caused by a large Eurasian ancestry component of the northeast populations likely driven by migration of Middle Eastern groups followed by admixture that affected the local populations in a north-to-south succession of events. Genetic evidence points to an early admixture event in the Nubians, concurrent with historical contact between North Sudanese and Arab groups. We estimate the admixture in current-day Sudanese Arab populations to about 700 years ago, coinciding with the fall of Dongola in 1315/1316 AD, a wave of admixture that reached the Darfurian/Kordofanian populations some 400-200 years ago. In contrast to the northeastern populations, the current-day Nilotic populations from the south of the region display little or no admixture from Eurasian groups indicating long-term isolation and population continuity in these areas of northeast Africa.

  7. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Liu, Siyang; Huang, Shujia; Rao, Junhua

    2015-01-01

    ...) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population with high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction...

  8. Genomic structural variation-mediated allelic suppression causes hybrid male sterility in rice.

    Science.gov (United States)

    Shen, Rongxin; Wang, Lan; Liu, Xupeng; Wu, Jiang; Jin, Weiwei; Zhao, Xiucai; Xie, Xianrong; Zhu, Qinlong; Tang, Huiwu; Li, Qing; Chen, Letian; Liu, Yao-Guang

    2017-11-03

    Hybrids between divergent populations commonly show hybrid sterility; this reproductive barrier hinders hybrid breeding of the japonica and indica rice (Oryza sativa L.) subspecies. Here we show that structural changes and copy number variation at the Sc locus confer japonica-indica hybrid male sterility. The japonica allele, Sc-j, contains a pollen-essential gene encoding a DUF1618-domain protein; the indica allele, Sc-i, contains two or three tandem-duplicated ~ 28-kb segments, each carrying an Sc-j-homolog with a distinct promoter. In Sc-j/Sc-i hybrids, the high-expression of Sc-i in sporophytic cells causes suppression of Sc-j expression in pollen and selective abortion of Sc-j-pollen, leading to transmission ratio distortion. Knocking out one or two of the three Sc-i copies by CRISPR/Cas9 rescues Sc-j expression and male fertility. Our results reveal the gene dosage-dependent allelic suppression as a mechanism of hybrid incompatibility, and provide an effective approach to overcome the reproductive barrier for hybrid breeding.

  9. Genomic Structural Variations Affecting Virulence During Clonal Expansion of Pseudomonas syringae pv. actinidiae Biovar 3 in Europe.

    Science.gov (United States)

    Firrao, Giuseppe; Torelli, Emanuela; Polano, Cesare; Ferrante, Patrizia; Ferrini, Francesca; Martini, Marta; Marcelletti, Simone; Scortichini, Marco; Ermacora, Paolo

    2018-01-01

    Pseudomonas syringae pv. actinidiae (Psa) biovar 3 has caused pandemic bacterial canker of Actinidia chinensis and Actinidia deliciosa since 2008. In Europe, the disease spread rapidly in the kiwifruit cultivation areas from a single introduction. In this study, we investigated the genomic diversity of Psa biovar 3 strains during the primary clonal expansion in Europe using single molecule real-time (SMRT), Illumina and Sanger sequencing technologies. We recorded evidence of frequent mobilization and loss of transposon Tn6212, large chromosome inversions, and ectopic integration of IS sequences (notably ISPsy31, ISPsy36, and ISPsy37). While no phenotype change associated with Tn6212 mobilization could be detected, strains CRAFRU 12.29 and CRAFRU 12.50 did not elicit the hypersensitivity response (HR) on tobacco and eggplant leaves and were limited in their growth in kiwifruit leaves due to insertion of ISPsy31 and ISPsy36 in the hrpS and hrpR genes, respectively, interrupting the hrp cluster. Both strains had been isolated from symptomatic plants, suggesting coexistence of variant strains with reduced virulence together with virulent strains in mixed populations. The structural differences caused by rearrangements of self-genetic elements within European and New Zealand strains were comparable in number and type to those occurring among the European strains, in contrast with the significant difference in terms of nucleotide polymorphisms. We hypothesize a relaxation, during clonal expansion, of the selection limiting the accumulation of deleterious mutations associated with genome structural variation due to transposition of mobile elements. This consideration may be relevant when evaluating strategies to be adopted for epidemic management.

  10. Genomic and Phenotypic Variation in Morphogenetic Networks of Two Candida albicans Isolates Subtends Their Different Pathogenic Potential

    Directory of Open Access Journals (Sweden)

    Duccio Cavalieri

    2018-01-01

    The transition from commensalism to pathogenicity of Candida albicans reflects both the host inability to mount specific immune responses and the microorganism’s dimorphic switch efficiency. In this study, we used whole genome sequencing and microarray analysis to investigate the genomic determinants of the phenotypic changes observed in two C. albicans clinical isolates (YL1 and YQ2). In vitro experiments employing epithelial, microglial, and peripheral blood mononuclear cells were thus used to evaluate the interaction of the C. albicans isolates with first line host defenses, measuring adhesion, susceptibility to phagocytosis, and induction of secretory responses. Moreover, a murine model of peritoneal infection was used to compare the in vivo pathogenic potential of the two isolates. Genome sequence and gene expression analysis of C. albicans YL1 and YQ2 showed significant changes in cellular pathways involved in environmental stress response, adhesion, filamentous growth, invasiveness, and dimorphic transition. This was in accordance with the observed marked phenotypic differences in biofilm production, dimorphic switch efficiency, cell adhesion, invasion, and survival to phagocyte-mediated host defenses. The mutations in key regulators of the hyphal growth pathway in the more virulent strain corresponded to an overall greater number of budding yeast cells released. Compared to YQ2, YL1 consistently showed enhanced pathogenic potential, since in vitro, it was less susceptible to ingestion by phagocytic cells and more efficient in invading epithelial cells, while in vivo YL1 was more effective than YQ2 in recruiting inflammatory cells, eliciting IL-1β response and eluding phagocytic cells. Overall, these results indicate an unexpected isolate-specific variation in pathways important for host invasion and colonization, showing how the genetic background of C. albicans may greatly affect its behavior both in vitro and in vivo. Based on this approach, we

  11. The distribution and impact of common copy-number variation in the genome of the domesticated apple, Malus x domestica Borkh.

    Science.gov (United States)

    Boocock, James; Chagné, David; Merriman, Tony R; Black, Michael A

    2015-10-23

    Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses. Here we use next generation sequence (NGS) data to detect copy-number variable regions (CNVRs) within the Malus x domestica genome, as well as to examine their distribution and impact. CNVRs were detected using NGS data derived from 30 accessions of M. x domestica analyzed using the read-depth method, as implemented in the CNVrd2 software. To improve the reliability of our results, we developed a quality control and analysis procedure that involved checking for organelle DNA, not repeat masking, and the determination of CNVR identity using a permutation testing procedure. Overall, we identified 876 CNVRs, which spanned 3.5% of the apple genome. To verify that detected CNVRs were not artifacts, we analyzed the B-allele frequencies (BAF) within a single nucleotide polymorphism (SNP) array dataset derived from a screening of 185 individual apple accessions and found the CNVRs were enriched for SNPs having aberrant BAFs (P apple scab. We present the first analysis and catalogue of CNVRs in the M. x domestica genome. The enrichment of the CNVRs with R gene models and their overlap with gene loci of agricultural significance draw attention to a form of unexplored genetic variation in apple. This research will underpin further investigation of the role that CNV plays within the apple genome.

  12. Genome-wide copy number variation study associates metabotropic glutamate receptor gene networks with attention deficit hyperactivity disorder

    Science.gov (United States)

    Elia, Josephine; Glessner, Joseph T; Wang, Kai; Takahashi, Nagahide; Shtir, Corina J; Hadley, Dexter; Sleiman, Patrick M A; Zhang, Haitao; Kim, Cecilia E; Robison, Reid; Lyon, Gholson J; Flory, James H; Bradfield, Jonathan P; Imielinski, Marcin; Hou, Cuiping; Frackelton, Edward C; Chiavacci, Rosetta M; Sakurai, Takeshi; Rabin, Cara; Middleton, Frank A; Thomas, Kelly A; Garris, Maria; Mentch, Frank; Freitag, Christine M; Steinhausen, Hans-Christoph; Todorov, Alexandre A; Reif, Andreas; Rothenberger, Aribert; Franke, Barbara; Mick, Eric O; Roeyers, Herbert; Buitelaar, Jan; Lesch, Klaus-Peter; Banaschewski, Tobias; Ebstein, Richard P; Mulas, Fernando; Oades, Robert D; Sergeant, Joseph; Sonuga-Barke, Edmund; Renner, Tobias J; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Meyer, Jobst; Pálmason, Haukur; Seitz, Christiane; Loo, Sandra K; Smalley, Susan L; Biederman, Joseph; Kent, Lindsey; Asherson, Philip; Anney, Richard J L; Gaynor, J William; Shaw, Philip; Devoto, Marcella; White, Peter S; Grant, Struan F A; Buxbaum, Joseph D; Rapoport, Judith L; Williams, Nigel M; Nelson, Stanley F; Faraone, Stephen V; Hakonarson, Hakon

    2014-01-01

    Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10⁻⁹). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10⁻⁶). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10⁻¹⁰) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts. PMID:22138692

  13. Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

    Science.gov (United States)

    Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S

    2015-02-01

    With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.

  14. Identification of balanced chromosomal rearrangements previously unknown among participants in the 1000 Genomes Project: implications for interpretation of structural variation in genomes and the future of clinical cytogenetics.

    Science.gov (United States)

    Dong, Zirui; Wang, Huilin; Chen, Haixiao; Jiang, Hui; Yuan, Jianying; Yang, Zhenjun; Wang, Wen-Jing; Xu, Fengping; Guo, Xiaosen; Cao, Ye; Zhu, Zhenzhen; Geng, Chunyu; Cheung, Wan Chee; Kwok, Yvonne K; Yang, Huanming; Leung, Tak Yeung; Morton, Cynthia C; Cheung, Sau Wai; Choy, Kwong Wai

    2017-11-02

    Purpose: Recent studies demonstrate that whole-genome sequencing enables detection of cryptic rearrangements in apparently balanced chromosomal rearrangements (also known as balanced chromosomal abnormalities, BCAs) previously identified by conventional cytogenetic methods. We aimed to assess our analytical tool for detecting BCAs in the 1000 Genomes Project without knowing which bands were affected. Methods: The 1000 Genomes Project provides an unprecedented integrated map of structural variants in phenotypically normal subjects, but there is no information on potential inclusion of subjects with apparent BCAs akin to those traditionally detected in diagnostic cytogenetics laboratories. We applied our analytical tool to 1,166 genomes from the 1000 Genomes Project with sufficient physical coverage (8.25-fold). Results: With this approach, we detected four reciprocal balanced translocations and four inversions, ranging in size from 57.9 kb to 13.3 Mb, all of which were confirmed by cytogenetic methods and polymerase chain reaction studies. One of these DNAs has a subtle translocation that is not readily identified by chromosome analysis because of the similarity of the banding patterns and size of exchanged segments, and another results in disruption of all transcripts of an OMIM gene. Conclusion: Our study demonstrates the extension of utilizing low-pass whole-genome sequencing for unbiased detection of BCAs including translocations and inversions previously unknown in the 1000 Genomes Project. Genetics in Medicine advance online publication, 2 November 2017; doi:10.1038/gim.2017.170.

  15. Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus.

    Science.gov (United States)

    Müller, Bárbara S F; Neves, Leandro G; de Almeida Filho, Janeo E; Resende, Márcio F R; Muñoz, Patricio R; Dos Santos, Paulo E T; Filho, Estefano Paludzyszyn; Kirst, Matias; Grattapaglia, Dario

    2017-07-11

    The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating the fact that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured into significant associations using GWAS in breeding populations of the size evaluated in this study.
This study provides further experimental data supporting positive prospects of using genome-wide data to

  16. Structural variation and rates of genome evolution in the grass family seen through comparison of sequences of genomes greatly differing in size.

    Science.gov (United States)

    Dvorak, Jan; Wang, Le; Zhu, Tingting; Jorgensen, Chad M; Deal, Karin R; Dai, Xiongtao; Dawson, Matthew W; Müller, Hans-Georg; Luo, Ming-Cheng; Ramasamy, Ramesh K; Dehghani, Hamid; Gu, Yong Q; Gill, Bikram S; Distelfeld, Assaf; Devos, Katrien M; Qi, Peng; You, Frank M; Gulick, Patrick J; McGuire, Patrick E

    2018-05-16

    Homology was searched with genes annotated in the Aegilops tauschii pseudomolecules against genes annotated in the pseudomolecules of tetraploid wild emmer wheat, Brachypodium distachyon, sorghum, and rice. Similar searches were initiated with genes annotated in the rice pseudomolecules. Matrices of colinear genes and rearrangements in their order were constructed. Optical Bionano genome maps were constructed and used to validate rearrangements unique to the wild emmer and Ae. tauschii genomes. The most common rearrangements were short paracentric inversions and short intrachromosomal translocations. Intrachromosomal translocations outnumbered segmental intrachromosomal duplications. The densities of paracentric inversion lengths were approximated by exponential distributions in all six genomes. Densities of colinear genes along the Ae. tauschii chromosomes were highly correlated with meiotic recombination rates but those of rearrangements were not, suggesting different causes of the erosion of gene colinearity and evolution of major chromosome rearrangements. Frequent rearrangements sharing breakpoints suggested that chromosomes have been rearranged recurrently at some sites. The distal 4 Mb of the short arms of rice chromosomes Os11 and Os12 and corresponding regions in the sorghum, B. distachyon, and Triticeae genomes contain clusters of interstitial translocations including from 1 to 7 colinear genes. The rates of acquisition of major rearrangements were greater in the wild emmer wheat and Ae. tauschii genomes than in the lineage preceding their divergence or in the B. distachyon, rice, and sorghum lineages. It is suggested that synergy between large quantities of dynamic transposable elements and annual growth habit caused the fast evolution of the Triticeae genomes. This article is protected by copyright. All rights reserved.

  17. Analysis of Genome-Wide Copy Number Variations in Chinese Indigenous and Western Pig Breeds by 60 K SNP Genotyping Arrays

    Science.gov (United States)

    Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

    2014-01-01

    Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps have been produced for many domestic animals, and several reports have been published about the CNVs of the porcine genome, the differences between Chinese and western pigs still remain to be elucidated. In this study, we used the Porcine SNP60 BeadChip and the PennCNV algorithm to perform genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV regions (CNVRs) across the genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. Of these CNVRs, 213 were found only in the six Chinese indigenous breeds, and 60 only in the three western breeds. The characteristics of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular functions and may play important roles in the phenotypic and production trait differences between Chinese and western breeds. Our results are an important complement to the CNV map of the pig genome; they provide new information about the diversity of Chinese and western pig breeds and facilitate further research on porcine genome CNVs. PMID:25198154

  18. Genome size variation in the pine fusiform rust pathogen Cronartium quercuum f.sp. fusiforme as determined by flow cytometry

    Science.gov (United States)

    Claire L Anderson; Thomas L Kubisiak; C Dana Nelson; Jason A Smith; John M Davis

    2010-01-01

    The genome size of the pine fusiform rust pathogen Cronartium quercuum f.sp. fusiforme (Cqf) was determined by flow cytometric analysis of propidium iodide-stained, intact haploid pycniospores with haploid spores of two genetically well characterized fungal species, Sclerotinia sclerotiorum and Puccinia graminis f.sp. tritici, as size standards. The Cqf haploid genome...

  19. Genome-wide DNA methylation alterations of Alternanthera philoxeroides in natural and manipulated habitats: implications for epigenetic regulation of rapid responses to environmental fluctuation and phenotypic variation.

    Science.gov (United States)

    Gao, Lexuan; Geng, Yupeng; Li, Bo; Chen, Jiakuan; Yang, Ji

    2010-11-01

    Alternanthera philoxeroides (alligator weed) is an invasive weed that can colonize both aquatic and terrestrial habitats. Individuals growing in different habitats exhibit extensive phenotypic variation but little genetic differentiation in its introduced range. The mechanisms underpinning the wide range of phenotypic variation and rapid adaptation to novel and changing environments remain uncharacterized. In this study, we examined the epigenetic variation and its correlation with phenotypic variation in plants exposed to natural and manipulated environmental variability. Genome-wide methylation profiling using methylation-sensitive amplified fragment length polymorphism (MSAP) revealed considerable DNA methylation polymorphisms within and between natural populations. Plants of different source populations not only underwent significant morphological changes in common garden environments, but also underwent a genome-wide epigenetic reprogramming in response to different treatments. Methylation alterations associated with response to different water availability were detected in 78.2% (169/216) of common garden induced polymorphic sites, demonstrating the environmental sensitivity and flexibility of the epigenetic regulatory system. These data provide evidence of the correlation between epigenetic reprogramming and the reversible phenotypic response of alligator weed to particular environmental factors. © 2010 Blackwell Publishing Ltd.

  20. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In the comparison between pepper lines, the FS4401 and Jeju mitochondrial DNAs contained the same complement of protein-coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined together. The incorporation of repeats and the overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was...

  1. Genome-wide association study identified genetic variations and candidate genes for plant architecture component traits in Chinese upland cotton.

    Science.gov (United States)

    Su, Junji; Li, Libei; Zhang, Chi; Wang, Caixiang; Gu, Lijiao; Wang, Hantao; Wei, Hengling; Liu, Qibao; Huang, Long; Yu, Shuxun

    2018-06-01

    Thirty significant associations between 22 SNPs and five plant architecture component traits in Chinese upland cotton were identified via GWAS. Four peak SNP loci located on chromosome D03 were simultaneously associated with more plant architecture component traits. A candidate gene, Gh_D03G0922, might be responsible for plant height in upland cotton. A compact plant architecture is increasingly required for mechanized harvesting processes in China. Therefore, cotton plant architecture is an important trait, and its components, such as plant height, fruit branch length and fruit branch angle, affect the suitability of a cultivar for mechanized harvesting. To determine the genetic basis of cotton plant architecture, a genome-wide association study (GWAS) was performed using a panel composed of 355 accessions and 93,250 single nucleotide polymorphisms (SNPs) identified using the specific-locus amplified fragment sequencing method. Thirty significant associations between 22 SNPs and five plant architecture component traits were identified via GWAS. Most importantly, four peak SNP loci located on chromosome D03 were simultaneously associated with more plant architecture component traits, and these SNPs were harbored in one linkage disequilibrium block. Furthermore, 21 candidate genes for plant architecture were predicted in a 0.95-Mb region including the four peak SNPs. One of these genes (Gh_D03G0922) was near the significant SNP D03_31584163 (8.40 kb), and its Arabidopsis homologs contain MADS-box domains that might be involved in plant growth and development. qRT-PCR showed that the expression of Gh_D03G0922 was upregulated in the apical buds and young leaves of the short and compact cotton varieties, and virus-induced gene silencing (VIGS) proved that the silenced plants exhibited increased PH. These results indicate that Gh_D03G0922 is likely the candidate gene for PH in cotton. 
    The genetic variations and candidate genes identified in this study lay a foundation...

  2. Metabolic and genomic analysis elucidates strain-level variation in Microbacterium spp. isolated from chromate contaminated sediment

    Data.gov (United States)

    U.S. Environmental Protection Agency — The data is in the form of genomic sequences deposited in a public database, growth curves, and bioinformatic analysis of sequences. This dataset is associated with...

  3. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice

    OpenAIRE

    Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong

    2014-01-01

    Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gen...

  4. Evolutionary origin of Rosaceae-specific active non-autonomous hAT elements and their contribution to gene regulation and genomic structural variation.

    Science.gov (United States)

    Wang, Lu; Peng, Qian; Zhao, Jianbo; Ren, Fei; Zhou, Hui; Wang, Wei; Liao, Liao; Owiti, Albert; Jiang, Quan; Han, Yuepeng

    2016-05-01

    Transposable elements account for approximately 30% of the Prunus genome; however, their evolutionary origin and functionality remain largely unclear. In this study, we identified a hAT transposon family, termed Moshan, in Prunus. The Moshan elements consist of three types, aMoshan, tMoshan, and mMoshan. The aMoshan and tMoshan types contain intact or truncated transposase genes, respectively, while the mMoshan type is a miniature inverted-repeat transposable element (MITE). The Moshan transposons are unique to Rosaceae, and the copy numbers of different Moshan types are significantly correlated. Sequence homology analysis reveals that the mMoshan MITEs are direct deletion derivatives of the tMoshan progenitors, and one kind of mMoshan containing a MuDR-derived fragment was amplified predominantly in the peach genome. The mMoshan sequences contain cis-regulatory elements that can enhance gene expression up to 100-fold. The mMoshan MITEs can serve as potential sources of micro and long noncoding RNAs. Whole-genome re-sequencing analysis indicates that mMoshan elements are highly active, and an insertion into an S-haplotype-specific F-box gene was reported to cause the breakdown of self-incompatibility in sour cherry. Taken together, these results suggest that the mMoshan elements play important roles in regulating gene expression and driving genomic structural variation in Prunus.

  5. Salix transect of Europe: variation in ploidy and genome size in willow-associated common nettle, Urtica dioica L. sens. lat., from Greece to arctic Norway.

    Science.gov (United States)

    Cronk, Quentin; Hidalgo, Oriane; Pellicer, Jaume; Percy, Diana; Leitch, Ilia J

    2016-01-01

    The common stinging nettle, Urtica dioica L. sensu lato, is an invertebrate "superhost", its clonal patches maintaining large populations of insects and molluscs. It is extremely widespread in Europe and highly variable, and two ploidy levels (diploid and tetraploid) are known. However, geographical patterns in cytotype variation require further study. We assembled a collection of nettles in conjunction with a transect of Europe from the Aegean to Arctic Norway (primarily conducted to examine the diversity of Salix and Salix-associated insects). Using flow cytometry to measure genome size, our sample of 29 plants reveals 5 diploids and 24 tetraploids. Two diploids were found in SE Europe (Bulgaria and Romania) and three diploids in S. Finland. More detailed cytotype surveys in these regions are suggested. The tetraploid genome size (2C value) varied between accessions from 2.36 to 2.59 pg. The diploids varied from 1.31 to 1.35 pg per 2C nucleus, equivalent to a haploid genome size of c. 650 Mbp. Within the tetraploids, we find that the most northerly samples (from N. Finland and arctic Norway) have a generally higher genome size. This is possibly indicative of a distinct population in this region.
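    The conversion from the diploid 2C values quoted above (1.31 to 1.35 pg) to a haploid size of c. 650 Mbp can be checked with a short sketch. It assumes the widely used factor of 978 Mbp per picogram of DNA, which the abstract does not state; the function name is illustrative only.

```python
# Assumption (not given in the abstract): 1 pg of DNA corresponds to ~978 Mbp.
MBP_PER_PG = 978.0


def one_c_mbp(two_c_pg: float) -> float:
    """Convert a 2C nuclear DNA content in pg to a 1C (haploid) size in Mbp."""
    return two_c_pg / 2.0 * MBP_PER_PG


if __name__ == "__main__":
    # Diploid range reported in the abstract: 1.31 to 1.35 pg per 2C nucleus.
    for two_c in (1.31, 1.35):
        print(f"2C = {two_c:.2f} pg -> 1C = {one_c_mbp(two_c):.1f} Mbp")
```

    Under that assumption the two endpoints bracket the rounded figure of c. 650 Mbp given in the abstract.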

  6. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949)

    Science.gov (United States)

    Davies, G; Armstrong, N; Bis, J C; Bressler, J; Chouraki, V; Giddaluru, S; Hofer, E; Ibrahim-Verbaas, C A; Kirin, M; Lahti, J; van der Lee, S J; Le Hellard, S; Liu, T; Marioni, R E; Oldmeadow, C; Postmus, I; Smith, A V; Smith, J A; Thalamuthu, A; Thomson, R; Vitart, V; Wang, J; Yu, L; Zgaga, L; Zhao, W; Boxall, R; Harris, S E; Hill, W D; Liewald, D C; Luciano, M; Adams, H; Ames, D; Amin, N; Amouyel, P; Assareh, A A; Au, R; Becker, J T; Beiser, A; Berr, C; Bertram, L; Boerwinkle, E; Buckley, B M; Campbell, H; Corley, J; De Jager, P L; Dufouil, C; Eriksson, J G; Espeseth, T; Faul, J D; Ford, I; Scotland, Generation; Gottesman, R F; Griswold, M E; Gudnason, V; Harris, T B; Heiss, G; Hofman, A; Holliday, E G; Huffman, J; Kardia, S L R; Kochan, N; Knopman, D S; Kwok, J B; Lambert, J-C; Lee, T; Li, G; Li, S-C; Loitfelder, M; Lopez, O L; Lundervold, A J; Lundqvist, A; Mather, K A; Mirza, S S; Nyberg, L; Oostra, B A; Palotie, A; Papenberg, G; Pattie, A; Petrovic, K; Polasek, O; Psaty, B M; Redmond, P; Reppermund, S; Rotter, J I; Schmidt, H; Schuur, M; Schofield, P W; Scott, R J; Steen, V M; Stott, D J; van Swieten, J C; Taylor, K D; Trollor, J; Trompet, S; Uitterlinden, A G; Weinstein, G; Widen, E; Windham, B G; Jukema, J W; Wright, A F; Wright, M J; Yang, Q; Amieva, H; Attia, J R; Bennett, D A; Brodaty, H; de Craen, A J M; Hayward, C; Ikram, M A; Lindenberger, U; Nilsson, L-G; Porteous, D J; Räikkönen, K; Reinvang, I; Rudan, I; Sachdev, P S; Schmidt, R; Schofield, P R; Srikanth, V; Starr, J M; Turner, S T; Weir, D R; Wilson, J F; van Duijn, C; Launer, L; Fitzpatrick, A L; Seshadri, S; Mosley, T H; Deary, I J

    2015-01-01

    General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N=53 949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, P=3.93 × 10^-9, MIR2113; rs17522122, P=2.55 × 10^-8, AKAP6; rs10119, P=5.67 × 10^-9, APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (P=1 × 10^-6). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N=6617) and the Health and Retirement Study (N=5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e.=5%) and 28% (s.e.=7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N=5487; P=1.5 × 10^-17). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer's disease: TOMM40, APOE, ABCG1 and MEF2C. PMID:25644384

  7. The role of copy number variation in susceptibility to amyotrophic lateral sclerosis: genome-wide association study and comparison with published loci.

    Directory of Open Access Journals (Sweden)

    Louise V Wain

    2009-12-01

    The genetic contribution to sporadic amyotrophic lateral sclerosis (ALS) has not been fully elucidated. There are increasing efforts to characterise the role of copy number variants (CNVs) in human diseases; two previous studies concluded that CNVs may influence risk of sporadic ALS, with multiple rare CNVs more important than common CNVs. A little-explored issue surrounding genome-wide CNV association studies is that of post-calling filtering and merging of raw CNV calls. We undertook simulations to define filter thresholds and considered optimal ways of merging overlapping CNV calls for association testing, taking into consideration possibly overlapping or nested, but distinct, CNVs and boundary estimation uncertainty. In this study we screened Illumina 300K SNP genotyping data from 730 ALS cases and 789 controls for copy number variation. Following quality control filters using thresholds defined by simulation, a total of 11321 CNV calls were made across 575 cases and 621 controls. Using region-based and gene-based association analyses, we identified several loci showing nominally significant association. However, the choice of criteria for combining calls for association testing has an impact on the ranking of the results by their significance. Several loci which were previously reported as being associated with ALS were identified here. However, of another 15 genes previously reported as exhibiting ALS-specific copy number variation, only four exhibited copy number variation in this study. Potentially interesting novel loci, including EEF1D, a translation elongation factor involved in the delivery of aminoacyl tRNAs to the ribosome (a process which has previously been implicated in genetic studies of spinal muscular atrophy), were identified but must be treated with caution due to concerns surrounding genomic location and platform suitability. Interpretation of CNV association findings must take into account the effects of filtering and combining...

  8. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full...

  9. Genome-wide assessment of the association of rare and common copy number variations to testicular germ cell cancer

    DEFF Research Database (Denmark)

    Edsgard, Stefan Daniel; Dalgaard, Marlene Danner; Weinhold, Nils

    2013-01-01

    Testicular germ cell cancer (TGCC) is one of the most heritable forms of cancer. Previous genome-wide association studies have focused on single nucleotide polymorphisms, largely ignoring the influence of copy number variants (CNVs). Here we present a genome-wide study of CNV on a cohort of 212...... of rare CNVs related to cell migration (false-discovery rate = 0.021, 1.8% of cases and 1.1% of controls). Dysregulation during migration of primordial germ cells has previously been suspected to be a part of TGCC development and this set of multiple rare variants may thereby have a minor contribution...

  10. Whole-Genome Resequencing of Experimental Populations Reveals Polygenic Basis of Egg-Size Variation in Drosophila melanogaster.

    Science.gov (United States)

    Jha, Aashish R; Miles, Cecelia M; Lippert, Nodia R; Brown, Christopher D; White, Kevin P; Kreitman, Martin

    2015-10-01

    Complete genome resequencing of populations holds great promise in deconstructing complex polygenic traits to elucidate molecular and developmental mechanisms of adaptation. Egg size is a classic adaptive trait in insects, birds, and other taxa, but its highly polygenic architecture has prevented high-resolution genetic analysis. We used replicated experimental evolution in Drosophila melanogaster and whole-genome sequencing to identify consistent signatures of polygenic egg-size adaptation. A generalized linear-mixed model revealed reproducible allele frequency differences between replicated experimental populations selected for large and small egg volumes at approximately 4,000 single nucleotide polymorphisms (SNPs). Several hundred distinct genomic regions contain clusters of these SNPs and have lower heterozygosity than the genomic background, consistent with selection acting on polymorphisms in these regions. These SNPs are also enriched among genes expressed in Drosophila ovaries and many of these genes have well-defined functions in Drosophila oogenesis. Additional genes regulating egg development, growth, and cell size show evidence of directional selection as genes regulating these biological processes are enriched for highly differentiated SNPs. Genetic crosses performed with a subset of candidate genes demonstrated that these genes influence egg size, at least in the large genetic background. These findings confirm the highly polygenic architecture of this adaptive trait, and suggest the involvement of many novel candidate genes in regulating egg size. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. Continuous Morphological Variation Correlated with Genome Size Indicates Frequent Introgressive Hybridization among Diphasiastrum Species (Lycopodiaceae) in Central Europe

    Czech Academy of Sciences Publication Activity Database

    Hanušová, K.; Ekrt, L.; Vít, Petr; Kolář, Filip; Urfus, Tomáš

    2014-01-01

    Vol. 9, No. 6 (2014), article no. e99552. E-ISSN 1932-6203. R&D Projects: GA ČR GB14-36079G. Institutional support: RVO:67985939. Keywords: genome size * morphometrics * Diphasiastrum. Subject RIV: EF - Botanics. Impact factor: 3.234, year: 2014

  12. Variation in Linked Selection and Recombination Drive Genomic Divergence during Allopatric Speciation of European and American Aspens.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-07-01

    Despite the global economic and ecological importance of forest trees, the genomic basis of differential adaptation and speciation in tree species is still poorly understood. Populus tremula and Populus tremuloides are two of the most widespread tree species in the Northern Hemisphere. Using whole-genome re-sequencing data of 24 P. tremula and 22 P. tremuloides individuals, we find that the two species diverged ∼2.2-3.1 million years ago, coinciding with the severing of the Bering land bridge and the onset of dramatic climatic oscillations during the Pleistocene. Both species have experienced substantial population expansions following long-term declines after species divergence. We detect widespread and heterogeneous genomic differentiation between species, and in accordance with the expectation of allopatric speciation, coalescent simulations suggest that neutral evolutionary processes can account for most of the observed patterns of genetic differentiation. However, there is an excess of regions exhibiting extreme differentiation relative to those expected under demographic simulations, which is indicative of the action of natural selection. Overall genetic differentiation is negatively associated with recombination rate in both species, providing strong support for a role of linked selection in generating the heterogeneous genomic landscape of differentiation between species. Finally, we identify a number of candidate regions and genes that may have been subject to positive and/or balancing selection during the speciation process. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Genome sequence variation in the constricta strain dramatically alters the protein interaction and localization map of Potato yellow dwarf virus

    Science.gov (United States)

    The genome sequence of the constricta strain of Potato yellow dwarf virus (CYDV) was determined to be 12,792 nucleotides long and organized into seven open reading frames with the gene order 3’-N-X-P-Y-M-G-L-5’, which encodes the nucleocapsid, phosphoprotein, movement, matrix, glycoprotein and RNA-d...

  14. The correlation of copy number variations with longevity in a genome-wide association study of Han Chinese

    DEFF Research Database (Denmark)

    Zhao, Xin; Liu, Xiaomin; Zhang, Aiping

    2018-01-01

    208), the risk of cancer (FOXA1, LAMA5, ZNF716), and vascular and immune-related diseases (ARHGEF10, TOR2A, SH2D3C). In addition, we found several pathways enriched in long-lived genomes, including FOXA1 and FOXA transcription factor networks involved in regulating aging or age-dependent diseases...

  15. The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection

    Science.gov (United States)

    Jiang, Yue; Turinsky, Andrei L.; Brudno, Michael

    2015-01-01

    With the development of High-Throughput Sequencing (HTS) thousands of human genomes have now been sequenced. Whenever different studies analyze the same genome they usually agree on the amount of single-nucleotide polymorphisms, but differ dramatically on the number of insertion and deletion variants (indels). Furthermore, there is evidence that indels are often severely under-reported. In this manuscript we derive the total number of indel variants in a human genome by combining data from different sequencing technologies, while assessing the indel detection accuracy. Our estimate of approximately 1 million indels in a Yoruban genome is much higher than the results reported in several recent HTS studies. We identify two key sources of difficulties in indel detection: the insufficient coverage, read length or alignment quality; and the presence of repeats, including short interspersed elements and homopolymers/dimers. We quantify the effect of these factors on indel detection. The quality of sequencing data plays a major role in improving indel detection by HTS methods. However, many indels exist in long homopolymers and repeats, where their detection is severely impeded. The true number of indel events is likely even higher than our current estimates, and new techniques and technologies will be required to detect them. PMID:26130710

  16. Genomewide variation in an introgression line of rice-Zizania revealed by whole-genome re-sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen-Hui Wang

    BACKGROUND: Hybridization between genetically diverged organisms is known as an important avenue that drives plant genome evolution. The possible outcomes of hybridization would be the occurrences of genetic instabilities in the resultant hybrids. It remained under-investigated, however, whether pollination by alien pollens of a closely related but sexually "incompatible" species could evoke genomic changes and to what extent it may result in phenotypic novelties in the derived progenies. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we have re-sequenced the genomes of Oryza sativa ssp. japonica cv. Matsumae and one of its derived introgressants, RZ35, which was obtained from an introgressive hybridization between Matsumae and Zizania latifolia Griseb. In general, 131 million 90 base pair (bp) paired-end reads were generated, which covered the Matsumae and RZ35 genomes 13.2- and 21.9-fold, respectively. Relative to Matsumae, a total of 41,724 homozygous single nucleotide polymorphisms (SNPs) and 17,839 homozygous insertions/deletions (indels) were identified in RZ35, of which 3,797 SNPs were nonsynonymous mutations. Furthermore, rampant mobilization of transposable elements (TEs) was found in the RZ35 genome. The results of pathogen inoculation revealed that RZ35 exhibited enhanced resistance to blast relative to Matsumae. Notably, one nonsynonymous mutation was found in the known blast resistance gene Pid3/Pi25, and real-time quantitative (q) RT-PCR analysis revealed constitutive up-regulation of its expression, suggesting that both altered function and expression of Pid3/Pi25 may be responsible for the enhanced resistance to rice blast in RZ35. CONCLUSIONS/SIGNIFICANCE: Our results demonstrate that introgressive hybridization by Zizania has provoked genomewide, extensive genomic changes in the rice genome, some of which have resulted in important phenotypic novelties. These findings suggest that introgressive hybridization by alien pollens of even a...

  17. A genome-wide scan study identifies a single nucleotide substitution in ASIP associated with white versus non-white coat-colour variation in sheep (Ovis aries).

    Science.gov (United States)

    Li, M-H; Tiirikka, T; Kantanen, J

    2014-02-01

    In sheep, coat colour (and pattern) is a trait of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three known pigmentation genes (TYRP1, ASIP and MITF) in sheep. Eighteen of these associations were confirmed in further tests between white versus non-white individuals, but none of the 35 associations were significant in the analysis of only non-white colours. Across the tests, the SNP s66432.1 in ASIP showed significant association (P = 4.2 × 10^-11 for all the colours; P = 2.3 × 10^-11 for white versus non-white colours) with the variation in coat colours and strong linkage disequilibrium with other significant variants surrounding the ASIP gene. The signals detected around the ASIP gene were explained by differences in white versus non-white alleles. Further, a genome scan for selection for white coat pigmentation identified a strong and striking selection signal spanning ASIP. Our study identified the main candidate gene for the coat colour variation between white and non-white as ASIP, an autosomal gene that has been directly implicated in the pathway regulating melanogenesis. Together with ASIP, the two other newly identified genes (TYRP1 and MITF) in the Finnsheep, bordering associated SNPs, represent a new resource for enriching sheep coat-colour genetics and breeding.

  18. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare genetic variants...... of developing SZ. However, these studies are designed to examine only “the common variant” proportion of the genomic landscape of SZ. Due to increased genetic drift during founding and potential bottlenecks, followed by population expansion, isolated populations may be particularly useful in identifying rare...... disease variants, that may appear at higher frequencies and/or within a more clearly distinct haplotype structure compared to outbred populations. Small isolated populations also typically show reduced phenotypic, genetic and environmental heterogeneity, thus making them advantageous in studies aiming...

  19. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice

    Science.gov (United States)

    Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong

    2014-01-01

    Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gene, SD1. Based on a performance evaluation of the HRPF and GWAS results, we demonstrate that high-throughput phenotyping has the potential to replace traditional phenotyping techniques and can provide valuable gene identification information. The combination of the multifunctional phenotyping tools HRPF and GWAS provides deep insights into the genetic architecture of important traits. PMID:25295980

  20. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    Directory of Open Access Journals (Sweden)

    Tong-Jian Liu

    2016-06-01

    The species-rich genus Primula L. is a typical plant group for studying genetic variance between species at different levels of relationship. Chloroplast genome sequences serve as an information resource for quantifying this difference and reconstructing evolutionary history. In this study, we report the complete chloroplast genome sequence of Primula sinensis and compare it with other related species. The chloroplast genome shows a typical circular quadripartite structure, 150,859 bp in length with a GC content of 37.2%. Two inverted repeat (IR) regions (25,535 bp each) are separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes, which lack intact open reading frames (ORFs), were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, in comparison with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave identical results that were consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region, and a putative plastid transit peptide was identified at its N-terminal region. The result strongly suggests that plastid accD has been functionally transferred to the nucleus in P. sinensis.
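    The region lengths and gene counts reported in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study; the function name `plastome_length` is hypothetical, and it assumes the usual quadripartite layout in which the inverted repeat occurs twice.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 82_064  # large single-copy region
SSC = 17_725  # small single-copy region
IR = 25_535   # one inverted repeat; assumed to occur twice


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total_length = plastome_length(LSC, SSC, IR)

# Gene counts as reported: protein-coding, tRNA and rRNA genes.
unique_genes = 78 + 30 + 4

print(total_length)  # compare with the reported 150,859 bp
print(unique_genes)  # compare with the reported 112 genes
```

    Both sums agree with the totals stated in the abstract, which supports reading 25,535 bp as the length of each inverted repeat rather than of the two combined.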

  1. Genome-wide association mapping identifies the genetic basis of discrete and quantitative variation in sexual weaponry in a wild sheep population.

    Science.gov (United States)

    Johnston, Susan E; McEwan, John C; Pickering, Natalie K; Kijas, James W; Beraldi, Dario; Pilkington, Jill G; Pemberton, Josephine M; Slate, Jon

    2011-06-01

    Understanding the genetic architecture of phenotypic variation in natural populations is a fundamental goal of evolutionary genetics. Wild Soay sheep (Ovis aries) have an inherited polymorphism for horn morphology in both sexes, controlled by a single autosomal locus, Horns. The majority of males have large normal horns, but a small number have vestigial, deformed horns, known as scurs; females have either normal horns, scurs or no horns (polled). Given that scurred males and polled females have reduced fitness within each sex, it is counterintuitive that the polymorphism persists within the population. Therefore, identifying the genetic basis of horn type will provide a vital foundation for understanding why the different morphs are maintained in the face of natural selection. We conducted a genome-wide association study using ∼36000 single nucleotide polymorphisms (SNPs) and determined the main candidate for Horns as RXFP2, an autosomal gene with a known involvement in determining primary sex characters in humans and mice. Evidence from additional SNPs in and around RXFP2 supports a new model of horn-type inheritance in Soay sheep, and for the first time, sheep with the same horn phenotype but different underlying genotypes can be identified. In addition, RXFP2 was shown to be an additive quantitative trait locus (QTL) for horn size in normal-horned males, accounting for up to 76% of additive genetic variation in this trait. This finding contrasts markedly from genome-wide association studies of quantitative traits in humans and some model species, where it is often observed that mapped loci only explain a modest proportion of the overall genetic variation. © 2011 Blackwell Publishing Ltd.

  2. Genomic regions, cellular components and gene regulatory basis underlying pod length variations in cowpea (V. unguiculata L. Walp).

    Science.gov (United States)

    Xu, Pei; Wu, Xinyi; Muñoz-Amatriaín, María; Wang, Baogen; Wu, Xiaohua; Hu, Yaowen; Huynh, Bao-Lam; Close, Timothy J; Roberts, Philip A; Zhou, Wen; Lu, Zhongfu; Li, Guojing

    2017-05-01

    Cowpea (V. unguiculata L. Walp) is a climate resilient legume crop important for food security. Cultivated cowpea (V. unguiculata L) generally comprises the bushy, short-podded grain cowpea dominant in Africa and the climbing, long-podded vegetable cowpea popular in Asia. How selection has contributed to the diversification of the two types of cowpea remains largely unknown. In the current study, a novel genotyping assay for over 50 000 SNPs was employed to delineate genomic regions governing pod length. Major, minor and epistatic QTLs were identified through QTL mapping. Seventy-two SNPs associated with pod length were detected by genome-wide association studies (GWAS). Population stratification analysis revealed subdivision among a cowpea germplasm collection consisting of 299 accessions, which is consistent with pod length groups. Genomic scan for selective signals suggested that domestication of vegetable cowpea was accompanied by selection of multiple traits including pod length, while the further improvement process was featured by selection of pod length primarily. Pod growth kinetics assay demonstrated that more durable cell proliferation rather than cell elongation or enlargement was the main reason for longer pods. Transcriptomic analysis suggested the involvement of sugar, gibberellin and nutritional signalling in regulation of pod length. This study establishes the basis for map-based cloning of pod length genes in cowpea and for marker-assisted selection of this trait in breeding programmes. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  3. Integrating genome-wide genetic variations and monocyte expression data reveals trans-regulated gene modules in humans.

    Directory of Open Access Journals (Sweden)

    Maxime Rotival

    2011-12-01

    One major expectation from the transcriptome in humans is to characterize the biological basis of associations identified by genome-wide association studies. So far, few cis expression quantitative trait loci (eQTLs) have been reliably related to disease susceptibility. Trans-regulating mechanisms may play a more prominent role in disease susceptibility. We analyzed 12,808 genes detected in at least 5% of circulating monocyte samples from a population-based sample of 1,490 European unrelated subjects. We applied a method of extraction of expression patterns, independent component analysis, to identify sets of co-regulated genes. These patterns were then related to 675,350 SNPs to identify major trans-acting regulators. We detected three genomic regions significantly associated with co-regulated gene modules. Association of these loci with multiple expression traits was replicated in Cardiogenics, an independent study in which expression profiles of monocytes were available in 758 subjects. The locus 12q13 (lead SNP rs11171739), previously identified as a type 1 diabetes locus, was associated with a pattern including two cis eQTLs, RPS26 and SUOX, and 5 trans eQTLs, one of which (MADCAM1) is a potential candidate for mediating T1D susceptibility. The locus 12q24 (lead SNP rs653178), which has demonstrated extensive disease pleiotropy, including type 1 diabetes, hypertension, and celiac disease, was associated to a pattern strongly correlating to blood pressure level. The strongest trans eQTL in this pattern was CRIP1, a known marker of cellular proliferation in cancer. The locus 12q15 (lead SNP rs11177644) was associated with a pattern driven by two cis eQTLs, LYZ and YEATS4, and including 34 trans eQTLs, several of them tumor-related genes. This study shows that a method exploiting the structure of co-expressions among genes can help identify genomic regions involved in trans regulation of sets of genes and can provide clues for understanding the

  4. Evaluation of potential novel variations and their interactions related to bipolar disorders: analysis of genome-wide association study data.

    Science.gov (United States)

    Acikel, Cengizhan; Aydin Son, Yesim; Celik, Cemil; Gul, Husamettin

    2016-01-01

    Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphisms that are the underlying molecular basis of the bipolar disorders. This study was performed on Whole-Genome Association Study of Bipolar Disorder (dbGaP [database of Genotypes and Phenotypes] study accession number: phs000017.v3.p1) data. After preprocessing of the genotyping data, three classification-based data mining methods (ie, random forest, naïve Bayes, and k-nearest neighbor) were performed. Additionally, as a nonparametric, model-free approach, the MDR method was used to evaluate the SNP profiles. The validity of these methods was evaluated using true classification rate, recall (sensitivity), precision (positive predictive value), and F-measure. Random forests, naïve Bayes, and k-nearest neighbors identified 16, 13, and ten candidate SNPs, respectively. Surprisingly, the top six SNPs were reported by all three methods. Random forests and k-nearest neighbors were more successful than naïve Bayes, with recall values >0.95. On the other hand, MDR generated a model with comparable predictive performance based on five SNPs. Although different SNP profiles were identified in MDR compared to the classification-based models, all models mapped SNPs to the DOCK10 gene. Three classification-based data mining approaches, random forests, naïve Bayes, and k-nearest neighbors, have prioritized similar SNP profiles as predictors of bipolar disorders, in contrast to MDR, which has found different SNPs through analysis of two-way and three-way interactions. The reduced number of associated SNPs discovered by MDR, without loss in the classification performance, would facilitate validation studies and decision support models, and would reduce the cost to develop predictive and
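    The four validity measures named in this abstract (true classification rate, recall, precision and F-measure) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard metrics from a 2x2 confusion matrix.

    Assumes at least one predicted positive and one actual positive,
    so that no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # true classification rate
    recall = tp / (tp + fn)                     # sensitivity
    precision = tp / (tp + fp)                  # positive predictive value
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only so that recall exceeds 0.95.
metrics = classification_metrics(tp=96, fp=10, fn=4, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Recall alone can be high for a model that over-predicts cases, which is presumably why the abstract reports precision and F-measure alongside it.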

  5. Clinical significance of rare copy number variations in epilepsy: a case-control survey using microarray-based comparative genomic hybridization.

    Science.gov (United States)

    Striano, Pasquale; Coppola, Antonietta; Paravidino, Roberta; Malacarne, Michela; Gimelli, Stefania; Robbiano, Angela; Traverso, Monica; Pezzella, Marianna; Belcastro, Vincenzo; Bianchi, Amedeo; Elia, Maurizio; Falace, Antonio; Gazzerro, Elisabetta; Ferlazzo, Edoardo; Freri, Elena; Galasso, Roberta; Gobbi, Giuseppe; Molinatto, Cristina; Cavani, Simona; Zuffardi, Orsetta; Striano, Salvatore; Ferrero, Giovanni Battista; Silengo, Margherita; Cavaliere, Maria Luigia; Benelli, Matteo; Magi, Alberto; Piccione, Maria; Dagna Bricarelli, Franca; Coviello, Domenico A; Fichera, Marco; Minetti, Carlo; Zara, Federico

    2012-03-01

    To perform an extensive search for genomic rearrangements by microarray-based comparative genomic hybridization in patients with epilepsy. Prospective cohort study. Epilepsy centers in Italy. Two hundred seventy-nine patients with unexplained epilepsy, 265 individuals with nonsyndromic mental retardation but no epilepsy, and 246 healthy control subjects were screened by microarray-based comparative genomic hybridization. Identification of copy number variations (CNVs) and gene enrichment. Rare CNVs occurred in 26 patients (9.3%) and 16 healthy control subjects (6.5%) (P = .26). The CNVs identified in patients were larger (P = .03) and showed higher gene content (P = .02) than those in control subjects. The CNVs larger than 1 megabase (P = .002) and including more than 10 genes (P = .005) occurred more frequently in patients than in control subjects. Nine patients (34.6%) among those harboring rare CNVs showed rearrangements associated with emerging microdeletion or microduplication syndromes. Mental retardation and neuropsychiatric features were associated with rare CNVs (P = .004), whereas epilepsy type was not. The CNV rate in patients with epilepsy and mental retardation or neuropsychiatric features is not different from that observed in patients with mental retardation only. Moreover, significant enrichment of genes involved in ion transport was observed within CNVs identified in patients with epilepsy. Patients with epilepsy show a significantly increased burden of large, rare, gene-rich CNVs, particularly when associated with mental retardation and neuropsychiatric features. The limited overlap between CNVs observed in the epilepsy group and those observed in the group with mental retardation only as well as the involvement of specific (ion channel) genes indicate a specific association between the identified CNVs and epilepsy. Screening for CNVs should be performed for diagnostic purposes preferentially in patients with epilepsy and mental retardation or
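    The carrier percentages in this abstract follow directly from the reported counts. The sketch below recomputes them and adds an uncorrected chi-square test on the patient versus control comparison. The abstract does not say which test produced P = .26, so the chi-square test here is an assumption and its P value need not match the reported one exactly; the function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its P value for one degree of freedom,
    using the identity P(X > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts as reported in the abstract above.
patients_total, patients_with_cnv = 279, 26
controls_total, controls_with_cnv = 246, 16
syndrome_carriers = 9  # patients whose rare CNVs match emerging syndromes

patient_rate = 100 * patients_with_cnv / patients_total
control_rate = 100 * controls_with_cnv / controls_total
syndrome_rate = 100 * syndrome_carriers / patients_with_cnv

chi2, p_value = chi_square_2x2(
    patients_with_cnv, patients_total - patients_with_cnv,
    controls_with_cnv, controls_total - controls_with_cnv,
)

print(f"{patient_rate:.1f}% vs {control_rate:.1f}%, syndromes {syndrome_rate:.1f}%")
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

    Under this assumed test the difference in overall carrier rate is likewise not significant, consistent with the abstract's point that the signal lies in CNV size and gene content rather than in raw frequency.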

  6. Variations in the G6PC2/ABCB11 genomic region are associated with fasting glucose levels

    DEFF Research Database (Denmark)

    Chen, Wei-Min; Erdos, Michael R; Jackson, Anne U

    2008-01-01

    Identifying the genetic variants that regulate fasting glucose concentrations may further our understanding of the pathogenesis of diabetes. We therefore investigated the association of fasting glucose levels with SNPs in 2 genome-wide scans including a total of 5,088 nondiabetic individuals from...... Finland and Sardinia. We found a significant association between the SNP rs563694 and fasting glucose concentrations (P = 3.5 × 10^-7). This association was further investigated in an additional 18,436 nondiabetic individuals of mixed European descent from 7 different studies. The combined P value...... for association in these follow-up samples was 6.9 × 10^-26, and combining results from all studies resulted in an overall P value for association of 6.4 × 10^-33. Across these studies, fasting glucose concentrations increased 0.01-0.16 mM with each copy of the major allele, accounting for approximately 1...

  7. Genetic control of environmental variation of two quantitative traits of Drosophila melanogaster revealed by whole-genome sequencing

    DEFF Research Database (Denmark)

    Sørensen, Peter; de los Campos, Gustavo; Morgante, Fabio

    2015-01-01

    and others more volatile performance. Understanding the mechanisms responsible for environmental variability not only informs medical questions but is relevant in evolution and in agricultural science. In this work fully sequenced inbred lines of Drosophila melanogaster were analyzed to study the nature...... of genetic control of environmental variance for two quantitative traits: starvation resistance (SR) and startle response (SL). The evidence for genetic control of environmental variance is compelling for both traits. Sequence information is incorporated in random regression models to study the underlying...... genetic signals, which are shown to be different in the two traits. Genomic variance in sexual dimorphism was found for SR but not for SL. Indeed, the proportion of variance captured by sequence information and the contribution to this variance from four chromosome segments differ between sexes in SR...

  8. Meta-analysis of genome-wide association studies identifies 8 novel loci involved in shape variation of human head hair.

    Science.gov (United States)

    Liu, Fan; Chen, Yan; Zhu, Gu; Hysi, Pirro G; Wu, Sijie; Adhikari, Kaustubh; Breslin, Krystal; Pospiech, Ewelina; Hamer, Merel A; Peng, Fuduan; Muralidharan, Charanya; Acuna-Alonzo, Victor; Canizales-Quinteros, Samuel; Bedoya, Gabriel; Gallo, Carla; Poletti, Giovanni; Rothhammer, Francisco; Bortolini, Maria Catira; Gonzalez-Jose, Rolando; Zeng, Changqing; Xu, Shuhua; Jin, Li; Uitterlinden, André G; Ikram, M Arfan; van Duijn, Cornelia M; Nijsten, Tamar; Walsh, Susan; Branicki, Wojciech; Wang, Sijia; Ruiz-Linares, Andrés; Spector, Timothy D; Martin, Nicholas G; Medland, Sarah E; Kayser, Manfred

    2018-02-01

    Shape variation of human head hair shows striking variation within and between human populations, while its genetic basis is far from being understood. We performed a series of genome-wide association studies (GWASs) and replication studies in a total of 28 964 subjects from 9 cohorts from multiple geographic origins. A meta-analysis of three European GWASs identified 8 novel loci (1p36.23 ERRFI1/SLC45A1, 1p36.22 PEX14, 1p36.13 PADI3, 2p13.3 TGFA, 11p14.1 LGR4, 12q13.13 HOXC13, 17q21.2 KRTAP, and 20q13.33 PTK6), and confirmed 4 previously known ones (1q21.3 TCHH/TCHHL1/LCE3E, 2q35 WNT10A, 4q21.21 FRAS1, and 10p14 LINC00708/GATA3), all showing genome-wide significant association with hair shape (P < 5e-8). All except one (1p36.22 PEX14) were replicated with nominal significance in at least one of the 6 additional cohorts of European, Native American and East Asian origins. Three additional previously known genes (EDAR, OFCC1, and PRSS53) were confirmed at the nominal significance level. A multivariable regression model revealed that 14 SNPs from different genes significantly and independently contribute to hair shape variation, reaching a cross-validated AUC value of 0.66 (95% CI: 0.62-0.70) and an AUC value of 0.64 in an independent validation cohort, providing an improved accuracy compared with a previous model. Prediction outcomes of 2504 individuals from a multiethnic sample were largely consistent with general knowledge on the global distribution of hair shape variation. Our study thus delivers target genes and DNA variants for future functional studies to further evaluate the molecular basis of hair shape in humans. © The Author(s) 2017. Published by Oxford University Press.

  9. Meta-analysis of genome-wide association studies identifies 8 novel loci involved in shape variation of human head hair

    Science.gov (United States)

    Liu, Fan; Chen, Yan; Zhu, Gu; Hysi, Pirro G; Wu, Sijie; Adhikari, Kaustubh; Breslin, Krystal; Pośpiech, Ewelina; Hamer, Merel A; Peng, Fuduan; Muralidharan, Charanya; Acuna-Alonzo, Victor; Canizales-Quinteros, Samuel; Bedoya, Gabriel; Gallo, Carla; Poletti, Giovanni; Rothhammer, Francisco; Bortolini, Maria Catira; Gonzalez-Jose, Rolando; Zeng, Changqing; Xu, Shuhua; Jin, Li; Uitterlinden, André G; Ikram, M Arfan; van Duijn, Cornelia M; Nijsten, Tamar; Walsh, Susan; Branicki, Wojciech; Wang, Sijia; Ruiz-Linares, Andrés; Spector, Timothy D; Martin, Nicholas G; Medland, Sarah E; Kayser, Manfred

    2018-01-01

    Abstract Shape variation of human head hair shows striking variation within and between human populations, while its genetic basis is far from being understood. We performed a series of genome-wide association studies (GWASs) and replication studies in a total of 28 964 subjects from 9 cohorts from multiple geographic origins. A meta-analysis of three European GWASs identified 8 novel loci (1p36.23 ERRFI1/SLC45A1, 1p36.22 PEX14, 1p36.13 PADI3, 2p13.3 TGFA, 11p14.1 LGR4, 12q13.13 HOXC13, 17q21.2 KRTAP, and 20q13.33 PTK6), and confirmed 4 previously known ones (1q21.3 TCHH/TCHHL1/LCE3E, 2q35 WNT10A, 4q21.21 FRAS1, and 10p14 LINC00708/GATA3), all showing genome-wide significant association with hair shape (P < 5e-8). All except one (1p36.22 PEX14) were replicated with nominal significance in at least one of the 6 additional cohorts of European, Native American and East Asian origins. Three additional previously known genes (EDAR, OFCC1, and PRSS53) were confirmed at the nominal significance level. A multivariable regression model revealed that 14 SNPs from different genes significantly and independently contribute to hair shape variation, reaching a cross-validated AUC value of 0.66 (95% CI: 0.62–0.70) and an AUC value of 0.64 in an independent validation cohort, providing an improved accuracy compared with a previous model. Prediction outcomes of 2504 individuals from a multiethnic sample were largely consistent with general knowledge on the global distribution of hair shape variation. Our study thus delivers target genes and DNA variants for future functional studies to further evaluate the molecular basis of hair shape in humans. PMID:29220522
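    The genome-wide significance level quoted in this abstract (P < 5e-8) is the conventional GWAS threshold. It is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests. The sketch below illustrates that arithmetic only; the test count is the conventional assumption, not a figure from this study, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000  # conventional assumption, not from the abstract

threshold = bonferroni_threshold(ALPHA, ASSUMED_INDEPENDENT_TESTS)


def is_genome_wide_significant(p_value: float, cutoff: float = threshold) -> bool:
    """True when an association P value falls below the genome-wide cutoff."""
    return p_value < cutoff


print(f"{threshold:.0e}")
```

    The replication step in the abstract uses nominal significance instead, since only a handful of pre-selected loci are tested in the additional cohorts.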

  10. Nucleotide variation in the mitochondrial genome provides evidence for dual routes of postglacial recolonization and genetic recombination in the northeastern brook trout (Salvelinus fontinalis).

    Science.gov (United States)

    Pilgrim, B L; Perry, R C; Barron, J L; Marshall, H D

    2012-09-26

    Levels and patterns of mitochondrial DNA (mtDNA) variation were examined to investigate the population structure and possible routes of postglacial recolonization of the world's northernmost native populations of brook trout (Salvelinus fontinalis), which are found in Labrador, Canada. We analyzed the sequence diversity of a 1960-bp portion of the mitochondrial genome (NADH dehydrogenase 1 gene and part of cytochrome oxidase 1) of 126 fish from 32 lakes distributed throughout seven regions of northeastern Canada. These populations were found to have low levels of mtDNA diversity, a characteristic trait of populations at northern extremes, with significant structuring at the level of the watershed. Upon comparison of northeastern brook trout sequences to the publicly available brook trout whole mitochondrial genome (GenBank AF154850), we infer that the GenBank sequence is from a fish whose mtDNA has recombined with that of Arctic charr (S. alpinus). The haplotype distribution provides evidence of two different postglacial founding groups contributing to present-day brook trout populations in the northernmost part of their range; the evolution of the majority of the haplotypes coincides with the timing of glacier retreat from Labrador. Our results exemplify the strong influence that historical processes such as glaciations have had on shaping the current genetic structure of northern species such as the brook trout.

  11. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing.

    Science.gov (United States)

    Cornman, Robert Scott; Boncristiani, Humberto; Dainat, Benjamin; Chen, Yanping; vanEngelsdorp, Dennis; Weaver, Daniel; Evans, Jay D

    2013-03-07

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li's D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens.

  12. PeachVar-DB: A Curated Collection of Genetic Variations for the Interactive Analysis of Peach Genome Data.

    Science.gov (United States)

    Cirilli, Marco; Flati, Tiziano; Gioiosa, Silvia; Tagliaferri, Ilario; Ciacciulli, Angelo; Gao, Zhongshan; Gattolin, Stefano; Geuna, Filippo; Maggi, Francesco; Bottoni, Paolo; Rossini, Laura; Bassi, Daniele; Castrignanò, Tiziana; Chillemi, Giovanni

    2018-01-01

    Applying next-generation sequencing (NGS) technologies to species of agricultural interest has the potential to accelerate the understanding and exploration of genetic resources. The storage, availability and maintenance of huge quantities of NGS-generated data remains a major challenge. The PeachVar-DB portal, available at http://hpc-bioinformatics.cineca.it/peach, is an open-source catalog of genetic variants present in peach (Prunus persica L. Batsch) and wild-related species of Prunus genera, annotated from 146 samples publicly released on the Sequence Read Archive (SRA). We designed a user-friendly web-based interface of the database, providing search tools to retrieve single nucleotide polymorphism (SNP) and InDel variants, along with useful statistics and information. PeachVar-DB results are linked to the Genome Database for Rosaceae (GDR) and the Phytozome database to allow easy access to other external useful plant-oriented resources. In order to extend the genetic diversity covered by the PeachVar-DB further, and to allow increasingly powerful comparative analysis, we will progressively integrate newly released data. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  13. Replication of genome wide association studies of alcohol dependence: support for association with variation in ADH1C.

    Directory of Open Access Journals (Sweden)

    Joanna M Biernacka

    Genome-wide association studies (GWAS) have revealed many single nucleotide polymorphisms (SNPs) associated with complex traits. Although these studies frequently fail to identify statistically significant associations, the top association signals from GWAS may be enriched for true associations. We therefore investigated the association of alcohol dependence with 43 SNPs selected from association signals in the first two published GWAS of alcoholism. Our analysis of 808 alcohol-dependent cases and 1,248 controls provided evidence of association of alcohol dependence with SNP rs1614972 in the ADH1C gene (unadjusted p = 0.0017). Because the GWAS study that originally reported association of alcohol dependence with this SNP [1] included only men, we also performed analyses in sex-specific strata. The results suggest that this SNP has a similar effect in both sexes (men: OR (95% CI) = 0.80 (0.66, 0.95); women: OR (95% CI) = 0.83 (0.66, 1.03)). We also observed marginal evidence of association of the rs1614972 minor allele with lower alcohol consumption in the non-alcoholic controls (p = 0.081), and independently in the alcohol-dependent cases (p = 0.046). Despite a number of potential differences between the samples investigated by the prior GWAS and the current study, data presented here provide additional support for the association of SNP rs1614972 in ADH1C with alcohol dependence and extend this finding by demonstrating association with consumption levels in both non-alcoholic and alcohol-dependent populations. Further studies should investigate the association of other polymorphisms in this gene with alcohol dependence and related alcohol-use phenotypes.

  14. Application of DETECTER, an evolutionary genomic tool to analyze genetic variation, to the cystic fibrosis gene family

    Directory of Open Access Journals (Sweden)

    De Kee Danny W

    2006-03-01

    Background: The medical community requires computational tools that distinguish missense genetic differences having phenotypic impact within the vast number of sense mutations that do not. Tools that do this will become increasingly important for those seeking to use human genome sequence data to predict disease, make prognoses, and customize therapy to individual patients. Results: An approach, termed DETECTER, is proposed to identify sites in a protein sequence where amino acid replacements are likely to have a significant effect on phenotype, including causing genetic disease. This approach uses a model-dependent tool to estimate the normalized replacement rate at individual sites in a protein sequence, based on a history of those sites extracted from an evolutionary analysis of the corresponding protein family. This tool identifies sites that have higher-than-average, average, or lower-than-average rates of change in the lineage leading to the sequence in the population of interest. The rates are then combined with sequence data to determine the likelihoods that particular amino acids were present at individual sites in the evolutionary history of the gene family. These likelihoods are used to predict whether any specific amino acid replacements, if introduced at the site in a modern human population, would have a significant impact on fitness. The DETECTER tool is used to analyze the cystic fibrosis transmembrane conductance regulator (CFTR) gene family. Conclusion: In this system, DETECTER retrodicts amino acid replacements associated with the cystic fibrosis disease with greater accuracy than alternative approaches. While this result validates this approach for this particular family of proteins only, the approach may be applicable to the analysis of polymorphisms generally, including SNPs in a human population.

  15. Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression

    Directory of Open Access Journals (Sweden)

    Vining Kelly J

    2012-01-01

    Background: DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood. Results: We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation. Conclusions: We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation.

  16. Genomic Variation and Evolution of Vibrio parahaemolyticus ST36 over the Course of a Transcontinental Epidemic Expansion

    Directory of Open Access Journals (Sweden)

    Jaime Martinez-Urtaza

    2017-11-01

    Vibrio parahaemolyticus is the leading cause of seafood-related infections with illnesses undergoing a geographic expansion. In this process of expansion, the most fundamental change has been the transition from infections caused by local strains to the surge of pandemic clonal types. Pandemic clone sequence type 3 (ST3) was the only example of transcontinental spreading until 2012, when ST36 was detected outside the region where it is endemic in the U.S. Pacific Northwest causing infections along the U.S. northeast coast and Spain. Here, we used genome-wide analyses to reconstruct the evolutionary history of the V. parahaemolyticus ST36 clone over the course of its geographic expansion during the previous 25 years. The origin of this lineage was estimated to be in ~1985. By 1995, a new variant emerged in the region and quickly replaced the old clone, which has not been detected since 2000. The new Pacific Northwest (PNW) lineage was responsible for the first cases associated with this clone outside the Pacific Northwest region. After several introductions into the northeast coast, the new PNW clone differentiated into a highly dynamic group that continues to cause illness on the northeast coast of the United States. Surprisingly, the strains detected in Europe in 2012 diverged from this ancestral group around 2000 and have conserved genetic features present only in the old PNW lineage. Recombination was identified as the major driver of diversification, with some preliminary observations suggesting a trend toward a more specialized lifestyle, which may represent a critical element in the expansion of epidemics under scenarios of coastal warming.

  17. Genome wide meta-analysis highlights the role of genetic variation in RARRES2 in the regulation of circulating serum chemerin.

    Directory of Open Access Journals (Sweden)

    Anke Tönjes

    2014-12-01

    Chemerin is an adipokine proposed to link obesity and chronic inflammation of adipose tissue. Genetic factors determining chemerin release from adipose tissue are yet unknown. We conducted a meta-analysis of genome-wide association studies (GWAS) for serum chemerin in three independent cohorts from Europe: Sorbs and KORA from Germany and PPP-Botnia from Finland (total N = 2,791). In addition, we measured mRNA expression of genes within the associated loci in peripheral mononuclear cells by micro-arrays, and within adipose tissue by quantitative RT-PCR and performed mRNA expression quantitative trait and expression-chemerin association studies to functionally substantiate our loci. Heritability estimate of circulating chemerin levels was 16.2% in the Sorbs cohort. Thirty single nucleotide polymorphisms (SNPs) at chromosome 7 within the retinoic acid receptor responder 2 (RARRES2)/Leucine Rich Repeat Containing 61 (LRRC61) locus reached genome-wide significance (p < 5.0×10⁻⁸) in the meta-analysis (the strongest evidence for association at rs7806429 with p = 7.8×10⁻¹⁴, beta = -0.067, explained variance 2.0%). All other SNPs within the cluster were in linkage disequilibrium with rs7806429 (minimum r² = 0.43 in the Sorbs cohort). The results of the subgroup analyses of males and females were consistent with the results found in the total cohort. No significant SNP-sex interaction was observed. rs7806429 was associated with mRNA expression of RARRES2 in visceral adipose tissue in women (p < 0.05) after adjusting for age and body mass index. In conclusion, the present meta-GWAS combined with mRNA expression studies highlights the role of genetic variation in the RARRES2 locus in the regulation of circulating chemerin concentrations.

  18. Whole-genome patterns of linkage disequilibrium across flycatcher populations clarify the causes and consequences of fine-scale recombination rate variation in birds.

    Science.gov (United States)

    Kawakami, Takeshi; Mugal, Carina F; Suh, Alexander; Nater, Alexander; Burri, Reto; Smeds, Linnéa; Ellegren, Hans

    2017-08-01

    Recombination rate is heterogeneous across the genome of various species and so are genetic diversity and differentiation as a consequence of linked selection. However, we still lack a clear picture of the underlying mechanisms for regulating recombination. Here we estimated fine-scale population recombination rate based on the patterns of linkage disequilibrium across the genomes of multiple populations of two closely related flycatcher species (Ficedula albicollis and F. hypoleuca). This revealed an overall conservation of the recombination landscape between these species at the scale of 200 kb, but we also identified differences in the local rate of recombination despite their recent divergence (recombination rate in a lineage-specific manner, indicating differences in the extent of linked selection between species. We detected 400-3,085 recombination hotspots per population. Location of hotspots was conserved between species, but the intensity of hotspot activity varied between species. Recombination hotspots were primarily associated with CpG islands (CGIs), regardless of whether CGIs were at promoter regions or away from genes. Recombination hotspots were also associated with specific transposable elements (TEs), but this association appears indirect due to shared preferences of the transposition machinery and the recombination machinery for accessible open chromatin regions. Our results suggest that CGIs are a major determinant of the localization of recombination hotspots, and we propose that both the distribution of TEs and fine-scale variation in recombination rate may be associated with the evolution of the epigenetic landscape. © 2017 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  19. Genetic variation in the mitochondrial genome of the giant grouper Epinephelus lanceolatus (Bloch, 1790) and its application for the identification of broodstock

    Directory of Open Access Journals (Sweden)

    Seng S. Cheng

    2015-11-01

    Mitochondrial DNA (mtDNA) markers are ideal for the validation of maternal inheritance and the identification of broodstock in aquaculture breeding programs. The complete mitochondrial genomes of 11 species of grouper are currently available in GenBank. This study was directed towards the characterization of mtDNA loci which can be applied for identification of interspecific F1 hybrids developed from Epinephelus fuscoguttatus and Epinephelus lanceolatus in aquaculture breeding programs. DNA was extracted from the fin clip of one specimen of E. lanceolatus which was the source of sperm for the artificial spawning of the interspecific F1 hybrid E. fuscoguttatus × E. lanceolatus. Specific primers were designed to amplify the DNA after comparative analysis of the mtDNA genomes available in GenBank. The primers were applied to test for cross-amplification in F1 hybrids as well as in the maternal parent E. fuscoguttatus (Forsskål, 1775) and the genetically related species Epinephelus coioides and Epinephelus corallicola (Valenciennes, 1828). DNA sequence analysis revealed that the Malaysian variety of E. lanceolatus exhibited variation at 11 of the 13 ORFs when compared to the variety from Taiwan. A distinct segmented duplication was observed in the D-loop region which was determined to be unique to the E. lanceolatus specimen obtained from Sabah, Malaysia. Cross amplification of mtDNA loci in the groupers E. fuscoguttatus, E. coioides, E. corallicola and the F1 hybrid of E. fuscoguttatus × E. lanceolatus revealed distinct profiles for each of the species with a clear indication that mtDNA were inherited from the maternal parent of the F1 hybrid. mtDNA loci can be applied by fish breeders to determine interspecific hybridization events.

  20. Production of HIV-1 vif mRNA Is Modulated by Natural Nucleotide Variations and SLSA1 RNA Structure in SA1D2prox Genomic Region

    Directory of Open Access Journals (Sweden)

    Masako Nomaguchi

    2017-12-01

    Genomic RNA of HIV-1 contains localized structures critical for viral replication. Its structural analysis has demonstrated a stem-loop structure, SLSA1, in a nearby region of HIV-1 genomic splicing acceptor 1 (SA1). We have previously shown that the expression level of vif mRNA is considerably altered by some natural single-nucleotide variations (nSNVs) clustering in SLSA1 structure. In this study, in addition to the eleven nSNVs we identified previously, we found a total of nine new nSNVs in the SLSA1-containing sequence from SA1, splicing donor 2, and through to the start codon of Vif that significantly affect the vif mRNA level, and designated the sequence SA1D2prox (142 nucleotides for HIV-1 NL4-3). We then examined by extensive variant and mutagenesis analyses how SA1D2prox sequence and SLSA1 secondary structure are related to vif mRNA level. While the secondary structure and stability of SLSA1 were largely changed by nSNVs and artificial mutations introduced to restore the original NL4-3 form from altered ones by nSNVs, no clear association of the two SLSA1 properties with vif mRNA level was observed. In contrast, when naturally occurring SA1D2prox sequences that contain multiple nSNVs were examined, we observed a significant inverse correlation between the vif level and SLSA1 stability. These results may suggest that SA1D2prox sequence adapts over time, and also that the altered SA1D2prox sequence, SLSA1 stability, and vif level are mutually related. In total, we show here that the entire SA1D2prox sequence and SLSA1 stability critically contribute to the modulation of vif mRNA level.

  1. Genome-wide copy number variation analysis identified deletions in SFMBT1 associated with fasting plasma glucose in a Han Chinese population.

    Science.gov (United States)

    Chung, Ren-Hua; Chiu, Yen-Feng; Hung, Yi-Jen; Lee, Wen-Jane; Wu, Kwan-Dun; Chen, Hui-Ling; Lin, Ming-Wei; Chen, Yii-Der I; Quertermous, Thomas; Hsiung, Chao A

    2017-08-08

    Fasting glucose and fasting insulin are glycemic traits closely related to diabetes, and understanding the role of genetic factors in these traits can help reveal the etiology of type 2 diabetes. Although single nucleotide polymorphisms (SNPs) in several candidate genes have been found to be associated with fasting glucose and fasting insulin, copy number variations (CNVs), which have been reported to be associated with several complex traits, have not been reported for association with these two traits. We aimed to identify CNVs associated with fasting glucose and fasting insulin. We conducted a genome-wide CNV association analysis for fasting plasma glucose (FPG) and fasting plasma insulin (FPI) using a family-based genome-wide association study sample from a Han Chinese population in Taiwan. A family-based CNV association test was developed in this study to identify common CNVs (i.e., CNVs with frequencies ≥ 5%), and a generalized estimating equation approach was used to test the associations between the traits and counts of global rare CNVs (i.e., CNVs with frequencies < 5%). We found a significant genome-wide association for common deletions with a frequency of 5.2% in the Scm-like with four mbt domains 1 (SFMBT1) gene with FPG (association p-value = 2×10⁻⁴ and an adjusted p-value = 0.0478 for multiple testing). No significant association was observed between global rare CNVs and FPG or FPI. The deletions in 20 individuals with DNA samples available were successfully validated using PCR-based amplification. The association of the deletions in SFMBT1 with FPG was further evaluated using an independent population-based replication sample obtained from the Taiwan Biobank. An association p-value of 0.065, which was close to the significance level of 0.05, for FPG was obtained by testing 9 individuals with CNVs in the SFMBT1 gene region and 11,692 individuals with normal copies in the replication cohort. Previous studies have found that SNPs in SFMBT1 are

  2. Exploring hepsin functional genetic variation association with disease specific protein expression in bipolar disorder: Applications of a proteomic informed genomic approach.

    Science.gov (United States)

    Nassan, Malik; Jia, Yun-Fang; Jenkins, Greg; Colby, Colin; Feeder, Scott; Choi, Doo-Sup; Veldic, Marin; McElroy, Susan L; Bond, David J; Weinshilboum, Richard; Biernacka, Joanna M; Frye, Mark A

    2017-12-01

    In a prior discovery study, increased levels of serum Growth Differentiation Factor 15 (GDF15), Hepsin (HPN), and Matrix Metalloproteinase-7 (MMP7) were observed in bipolar depressed patients vs controls. This exploratory post-hoc analysis applied a proteomic-informed genomic research strategy to study the potential functional role of these proteins in bipolar disorder (BP). Utilizing the Genotype-Tissue Expression (GTEx) database to identify cis-acting blood expression quantitative trait loci (cis-eQTLs), five eQTL variants from the HPN gene were analyzed for association with BP cases using genotype data of cases from the discovery study (n = 58) versus healthy controls (n = 777). After adjusting for relevant covariates, we analyzed the relationship between these 5 cis-eQTLs and HPN serum level in the BP cases. All 5 cis-eQTL minor alleles were significantly more frequent in BP cases vs controls [(rs62122114, OR = 1.6, p = 0.02), (rs67003112, OR = 1.6, p = 0.02), (rs4997929, OR = 1.7, p = 0.01), (rs12610663, OR = 1.7, p = 0.01), (rs62122148, OR = 1.7, P = 0.01)]. The minor allele (A) in rs62122114 was significantly associated with increased serum HPN level in BP cases (Beta = 0.12, P = 0.049). However, this same minor allele was associated with reduced gene expression in GTEx controls. These exploratory analyses suggest that genetic variation in/near the gene encoding for hepsin protein may influence risk of bipolar disorder. This genetic variation, at least for the rs62122114-A allele, may have functional impact (i.e. differential expression) as evidenced by serum HPN protein expression. Although limited by small sample size, this study highlights the merits of proteomic informed functional genomic studies as a tool to investigate with greater precision the genetic risk of bipolar disorder and secondary relationships to protein expression recognizing, and encouraging in subsequent studies, high likelihood of epigenetic modification of

  3. Revisiting the Iberian honey bee (Apis mellifera iberiensis) contact zone: maternal and genome-wide nuclear variations provide support for secondary contact from historical refugia.

    Science.gov (United States)

    Chávez-Galarza, Julio; Henriques, Dora; Johnston, J Spencer; Carneiro, Miguel; Rufino, José; Patton, John C; Pinto, M Alice

    2015-06-01

    Dissecting diversity patterns of organisms endemic to Iberia has been truly challenging for a variety of taxa, and the Iberian honey bee is no exception. Surveys of genetic variation in the Iberian honey bee are among the most extensive for any honey bee subspecies. From these, differential and complex patterns of diversity have emerged, which have yet to be fully resolved. Here, we used a genome-wide data set of 309 neutrally tested single nucleotide polymorphisms (SNPs), scattered across the 16 honey bee chromosomes, which were genotyped in 711 haploid males. These SNPs were analysed along with an intergenic locus of the mtDNA, to reveal historical patterns of population structure across the entire range of the Iberian honey bee. Overall, patterns of population structure inferred from nuclear loci by multiple clustering approaches and geographic cline analysis were consistent with two major clusters forming a well-defined cline that bisects Iberia along a northeastern-southwestern axis, a pattern that remarkably parallels that of the mtDNA. While a mechanism of primary intergradation or isolation by distance could explain the observed clinal variation, our results are more consistent with an alternative model of secondary contact between divergent populations previously isolated in glacial refugia, as proposed for a growing list of other Iberian taxa. Despite current intense honey bee management, human-mediated processes have seemingly played a minor role in shaping Iberian honey bee genetic structure. This study highlights the complexity of the Iberian honey bee patterns and reinforces the importance of Iberia as a reservoir of Apis mellifera diversity. © 2015 John Wiley & Sons Ltd.

  4. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar

    2015-01-21

    Background Mycobacterium tuberculosis (MTB) PE_PGRS genes belong to the PE multigene family. Although the function of PE_PGRS genes is unknown, it is hypothesized that the PE_PGRS genes may be associated with antigenic variability in MTB. Material and methods Whole genome sequencing analysis was performed on (n = 37) extensively drug-resistant (XDR) MTB strains from Pakistan, which included Lineage 1 (East African Indian, n = 2); Other lineage 1 (n = 3); Lineage 3 (Central Asian, n = 24); Other lineage 3 (n = 4); Lineage 4 (X3, n = 1) and T group (n = 3) MTB strains. Results There were 107 SNPs identified from the analysis of 42 PE_PGRS genes; of these, 13 were non-synonymous SNPs (nsSNPs). The nsSNPs identified in PE_PGRS genes – 6, 9 and 10 – were common in all EAI, CAS, Other lineages (1 and 3), T1 and X3. Deletions (DELs) in PE_PGRS genes – 3 and 19 – were observed in 17 (80.9%) CAS1 and 6 (85.7%) in Other lineages (1 and 3) XDR MTB strains, while DELs in the PE_PGRS49 were observed in all CAS1, CAS, CAS2 and Other lineages (1 and 3) XDR MTB strains. All CAS, EAI and Other lineages (1 and 3) strains showed insertions (INS) in PE_PGRS6 gene, while INS in the PE_PGRS genes 19 and 33 were observed in 20 (95.2%) CAS1, all CAS, CAS2, EAI and Other lineages (1 and 3) XDR MTB strains. Conclusion Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs and INDELs in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  5. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar

    2015-03-01

    Background: Mycobacterium tuberculosis (MTB) PE_PGRS genes belong to the PE multi-gene family. Although the function of the members of the PE_PGRS multi-gene family is not yet known, it is hypothesized that the PE_PGRS genes may be associated with genetic variability. Material and methods: Whole genome sequencing analysis was performed on (n= 37) extensively drug resistant (XDR) MTB strains from Pakistan which included Central Asian (n= 23), East African Indian (n= 2), X3 (n= 1), T group (n= 3) and Orphan (n= 8) MTB strains. Results: By analyzing 42 PE_PGRS genes, 111 SNPs were identified, of which 13 were non-synonymous SNPs (nsSNPs). The nsSNPs identified in the PE_PGRS genes were as follows: 6, 9, 10 and 55 present in each of the CAS, EAI, Orphan, T1 and X3 XDR MTB strains studied. Deletions in PE_PGRS genes: 19, 21 and 23 were observed in 7 (35.0%) CAS1 and 3 (37.5%) in Orphan XDR MTB strains, while deletions in the PE_PGRS genes: 49 and 50 were observed in 36 (95.0%) CAS1 and all CAS, CAS2 and Orphan XDR MTB strains. An insertion in PE_PGRS6 gene was observed in all CAS, EAI3 and Orphan, while insertions in the PE_PGRS genes 19 and 33 were observed in 19 (95%) CAS1 and all CAS, CAS2, EAI3 and Orphan XDR MTB strains. Conclusion: Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs, Insertions and Deletions in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  6. Population genomic analysis suggests strong influence of river network on spatial distribution of genetic variation in invasive saltcedar across the southwestern United States

    Science.gov (United States)

    Lee, Soo-Rang; Jo, Yeong-Seok; Park, Chan-Ho; Friedman, Jonathan M.; Olson, Matthew S.

    2018-01-01

    Understanding the complex influences of landscape and anthropogenic elements that shape the population genetic structure of invasive species provides insight into patterns of colonization and spread. The application of landscape genomics techniques to these questions may offer detailed, previously undocumented insights into factors influencing species invasions. We investigated the spatial pattern of genetic variation and the influences of landscape factors on population similarity in an invasive riparian shrub, saltcedar (Tamarix L.) by analysing 1,997 genomewide SNP markers for 259 individuals from 25 populations collected throughout the southwestern United States. Our results revealed a broad-scale spatial genetic differentiation of saltcedar populations between the Colorado and Rio Grande river basins and identified potential barriers to population similarity along both river systems. River pathways most strongly contributed to population similarity. In contrast, low temperature and dams likely served as barriers to population similarity. We hypothesize that large-scale geographic patterns in genetic diversity resulted from a combination of early introductions from distinct populations, the subsequent influence of natural selection, dispersal barriers and founder effects during range expansion.

  7. Ensembl Genomes 2016: more genomes, more complexity.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    Science.gov (United States)

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.

  9. Genome-wide copy number variation analysis in extended families and unrelated individuals characterized for musical aptitude and creativity in music.

    Science.gov (United States)

    Ukkola-Vuoti, Liisa; Kanduri, Chakravarthi; Oikkonen, Jaana; Buck, Gemma; Blancher, Christine; Raijas, Pirre; Karma, Kai; Lähdesmäki, Harri; Järvelä, Irma

    2013-01-01

    Music perception and practice represent complex cognitive functions of the human brain. Recently, evidence for the molecular genetic background of music related phenotypes has been obtained. In order to further elucidate the molecular background of musical phenotypes we analyzed genome wide copy number variations (CNVs) in five extended pedigrees and in 172 unrelated subjects characterized for musical aptitude and creative functions in music. Musical aptitude was defined by a combination of the scores of three music tests (COMB scores): auditory structuring ability, and Seashore's tests for pitch and for time. Data on creativity in music (herein composing, improvising and/or arranging music) were surveyed using a web-based questionnaire. Several CNVRs containing genes that affect neurodevelopment, learning and memory were detected. A deletion at 5q31.1 covering the protocadherin-α gene cluster (Pcdha 1-9) was found co-segregating with low music test scores (COMB) in both sample sets. Pcdha is involved in neural migration, differentiation and synaptogenesis. Creativity in music was found to co-segregate with a duplication covering glucose mutarotase gene (GALM) at 2p22. GALM has influence on serotonin release and membrane trafficking of the human serotonin transporter. Interestingly, genes related to serotonergic systems have been shown to associate not only with psychiatric disorders but also with creativity and music perception. Both Pcdha and GALM are related to the serotonergic systems influencing cognitive and motor functions, important for music perception and practice. Finally, a 1.3 Mb duplication was identified in a subject with low COMB scores in the region previously linked with absolute pitch (AP) at 8q24. No differences in the CNV burden were detected among the high/low music test scores or creative/non-creative groups. In summary, CNVs and genes found in this study are related to cognitive functions. Our result suggests new candidate genes for music perception

  10. Genetic variation of temperature-regulated curd induction in cauliflower: elucidation of floral transition by genome-wide association mapping and gene expression analysis

    Science.gov (United States)

    Matschegewski, Claudia; Zetzsche, Holger; Hasan, Yaser; Leibeguth, Lena; Briggs, William; Ordon, Frank; Uptmoor, Ralf

    2015-01-01

    Cauliflower (Brassica oleracea var. botrytis) is a vernalization-responsive crop. High ambient temperatures delay harvest time. The elucidation of the genetic regulation of floral transition is highly interesting for a precise harvest scheduling and to ensure stable market supply. This study aims at genetic dissection of temperature-dependent curd induction in cauliflower by genome-wide association studies and gene expression analysis. To assess temperature-dependent curd induction, two greenhouse trials under distinct temperature regimes were conducted on a diversity panel consisting of 111 cauliflower commercial parent lines, genotyped with 14,385 SNPs. Broad phenotypic variation and high heritability (0.93) were observed for temperature-related curd induction within the cauliflower population. GWA mapping identified a total of 18 QTL localized on chromosomes O1, O2, O3, O4, O6, O8, and O9 for curding time under two distinct temperature regimes. Among those, several QTL are localized within regions of promising candidate flowering genes. Inferring population structure and genetic relatedness among the diversity set assigned three main genetic clusters. Linkage disequilibrium (LD) patterns estimated global LD extent of r² = 0.06 and a maximum physical distance of 400 kb for genetic linkage. Transcriptional profiling of flowering genes FLOWERING LOCUS C (BoFLC) and VERNALIZATION 2 (BoVRN2) was performed, showing increased expression levels of BoVRN2 in genotypes with faster curding. However, functional relevance of BoVRN2 and BoFLC2 could not consistently be supported, which suggests that these genes may act facultatively and/or points to BoVRN2/BoFLC-independent mechanisms in temperature-regulated floral transition in cauliflower. Genetic insights in temperature-regulated curd induction can underpin genetically informed phenology models and benefit molecular breeding strategies toward the development of thermo-tolerant cultivars. PMID:26442034
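
    The r² statistic quoted in the abstract above is the usual pairwise measure of linkage disequilibrium between two biallelic loci. As a minimal sketch only, assuming phased haplotypes with alleles coded 0/1 (the function name `ld_r2` and the input layout are hypothetical and are not taken from the study), it could be computed like this:

```python
def ld_r2(haplotypes):
    """Pairwise LD (r^2) between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    with the allele at each locus coded 0 or 1 (an assumed layout).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    # Frequencies of allele 1 at each locus and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

    A value near 0, such as the genome-wide 0.06 reported here, means that alleles at the two loci are close to statistically independent; a value of 1 means they are perfectly correlated.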

  11. Genome-wide copy number variation analysis in extended families and unrelated individuals characterized for musical aptitude and creativity in music.

    Directory of Open Access Journals (Sweden)

    Liisa Ukkola-Vuoti

    Music perception and practice represent complex cognitive functions of the human brain. Recently, evidence for the molecular genetic background of music-related phenotypes has been obtained. In order to further elucidate the molecular background of musical phenotypes we analyzed genome-wide copy number variations (CNVs) in five extended pedigrees and in 172 unrelated subjects characterized for musical aptitude and creative functions in music. Musical aptitude was defined by combination of the scores of three music tests (COMB scores): auditory structuring ability and Seashore's tests for pitch and for time. Data on creativity in music (herein composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Several CNVRs containing genes that affect neurodevelopment, learning and memory were detected. A deletion at 5q31.1 covering the protocadherin-α gene cluster (Pcdha 1-9) was found co-segregating with low music test scores (COMB) in both sample sets. Pcdha is involved in neural migration, differentiation and synaptogenesis. Creativity in music was found to co-segregate with a duplication covering glucose mutarotase gene (GALM) at 2p22. GALM has influence on serotonin release and membrane trafficking of the human serotonin transporter. Interestingly, genes related to serotonergic systems have been shown to associate not only with psychiatric disorders but also with creativity and music perception. Both Pcdha and GALM are related to the serotonergic systems influencing cognitive and motor functions, important for music perception and practice. Finally, a 1.3 Mb duplication was identified in a subject with low COMB scores in the region previously linked with absolute pitch (AP) at 8q24. No difference in CNV burden was detected among the high/low music test scores or creative/non-creative groups. In summary, CNVs and genes found in this study are related to cognitive functions. Our result suggests new candidate genes for

  12. Genome-Wide Copy Number Variation Analysis in Extended Families and Unrelated Individuals Characterized for Musical Aptitude and Creativity in Music

    Science.gov (United States)

    Oikkonen, Jaana; Buck, Gemma; Blancher, Christine; Raijas, Pirre; Karma, Kai; Lähdesmäki, Harri; Järvelä, Irma

    2013-01-01

    Music perception and practice represent complex cognitive functions of the human brain. Recently, evidence for the molecular genetic background of music-related phenotypes has been obtained. In order to further elucidate the molecular background of musical phenotypes we analyzed genome-wide copy number variations (CNVs) in five extended pedigrees and in 172 unrelated subjects characterized for musical aptitude and creative functions in music. Musical aptitude was defined by combination of the scores of three music tests (COMB scores): auditory structuring ability and Seashore's tests for pitch and for time. Data on creativity in music (herein composing, improvising and/or arranging music) was surveyed using a web-based questionnaire. Several CNVRs containing genes that affect neurodevelopment, learning and memory were detected. A deletion at 5q31.1 covering the protocadherin-α gene cluster (Pcdha 1-9) was found co-segregating with low music test scores (COMB) in both sample sets. Pcdha is involved in neural migration, differentiation and synaptogenesis. Creativity in music was found to co-segregate with a duplication covering glucose mutarotase gene (GALM) at 2p22. GALM has influence on serotonin release and membrane trafficking of the human serotonin transporter. Interestingly, genes related to serotonergic systems have been shown to associate not only with psychiatric disorders but also with creativity and music perception. Both Pcdha and GALM are related to the serotonergic systems influencing cognitive and motor functions, important for music perception and practice. Finally, a 1.3 Mb duplication was identified in a subject with low COMB scores in the region previously linked with absolute pitch (AP) at 8q24. No difference in CNV burden was detected among the high/low music test scores or creative/non-creative groups. In summary, CNVs and genes found in this study are related to cognitive functions. Our result suggests new candidate genes for music

  13. Ensembl variation resources

    Directory of Open Access Journals (Sweden)

    Marin-Garcia Pablo

    2010-05-01

    Abstract Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.

  14. Sizing up arthropod genomes: an evaluation of the impact of environmental variation on genome size estimates by flow cytometry and the use of qPCR as a method of estimation.

    Science.gov (United States)

    Gregory, T Ryan; Nathwani, Paula; Bonnett, Tiffany R; Huber, Dezene P W

    2013-09-01

    A study was undertaken to evaluate both a pre-existing method and a newly proposed approach for the estimation of nuclear genome sizes in arthropods. First, concerns regarding the reliability of the well-established method of flow cytometry relating to impacts of rearing conditions on genome size estimates were examined. Contrary to previous reports, a more carefully controlled test found negligible environmental effects on genome size estimates in the fly Drosophila melanogaster. Second, a more recently touted method based on quantitative real-time PCR (qPCR) was examined in terms of ease of use, efficiency, and (most importantly) accuracy using four test species: the flies Drosophila melanogaster and Musca domestica and the beetles Tribolium castaneum and Dendroctonus ponderosa. The results of this analysis demonstrated that qPCR has the tendency to produce substantially different genome size estimates from other established techniques while also being far less efficient than existing methods.

  15. A genome-wide scan study identifies a single nucleotide substitution in ASIP associated with white versus non-white coat-colour variation in sheep (Ovis aries)

    OpenAIRE

    Li, M-H; Tiirikka, T; Kantanen, J

    2013-01-01

    In sheep, coat colour (and pattern) is one of the important traits of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three kno...

  16. pTC Plasmids from Sulfolobus Species in the Geothermal Area of Tengchong, China: Genomic Conservation and Naturally-Occurring Variations as a Result of Transposition by Mobile Genetic Elements.

    Science.gov (United States)

    Xiang, Xiaoyu; Huang, Xiaoxing; Wang, Haina; Huang, Li

    2015-02-12

    Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.

  17. pTC Plasmids from Sulfolobus Species in the Geothermal Area of Tengchong, China: Genomic Conservation and Naturally-Occurring Variations as a Result of Transposition by Mobile Genetic Elements

    Directory of Open Access Journals (Sweden)

    Xiaoyu Xiang

    2015-02-01

    Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.

  18. Accounting for Genotype-by-Environment Interactions and Residual Genetic Variation in Genomic Selection for Water-Soluble Carbohydrate Concentration in Wheat.

    Science.gov (United States)

    Ovenden, Ben; Milgate, Andrew; Wade, Len J; Rebetzke, Greg J; Holland, James B

    2018-05-31

    Abiotic stress tolerance traits are often complex and recalcitrant targets for conventional breeding improvement in many crop species. This study evaluated the potential of genomic selection to predict water-soluble carbohydrate concentration (WSCC), an important drought tolerance trait, in wheat under field conditions. A panel of 358 varieties and breeding lines constrained for maturity was evaluated under rainfed and irrigated treatments across two locations and two years. Whole-genome marker profiles and factor analytic mixed models were used to generate genomic estimated breeding values (GEBVs) for specific environments and environment groups. Additive genetic variance was smaller than residual genetic variance for WSCC, such that genotypic values were dominated by residual genetic effects rather than additive breeding values. As a result, GEBVs were not accurate predictors of genotypic values of the extant lines, but GEBVs should be reliable selection criteria to choose parents for intermating to produce new populations. The accuracy of GEBVs for untested lines was sufficient to increase predicted genetic gain from genomic selection per unit time compared to phenotypic selection if the breeding cycle is reduced by half by the use of GEBVs in off-season generations. Further, genomic prediction accuracy depended on having phenotypic data from environments with strong correlations with target production environments to build prediction models. By combining high-density marker genotypes, stress-managed field evaluations, and mixed models that model simultaneously covariances among genotypes and covariances of complex trait performance between pairs of environments, we were able to train models with good accuracy to facilitate genetic gain from genomic selection. Copyright © 2018 Ovenden et al.
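
    The comparison of genetic gain per unit time in the abstract above follows from the standard breeder's equation, in which annual gain is selection intensity times selection accuracy times additive genetic standard deviation, divided by the length of the breeding cycle. The sketch below only illustrates that arithmetic; the function name and every number in it are hypothetical assumptions and are not results from the study.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation.

    intensity   : standardized selection intensity (i)
    accuracy    : correlation between selection criterion and breeding value (r)
    sigma_a     : additive genetic standard deviation
    cycle_years : length of one breeding cycle in years (L)
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


# Illustrative values only (assumptions, not data from the paper):
# phenotypic selection with higher accuracy but a 4-year cycle,
# genomic selection with lower accuracy but a cycle cut in half.
phenotypic = gain_per_year(intensity=1.755, accuracy=0.7, sigma_a=1.0, cycle_years=4.0)
genomic = gain_per_year(intensity=1.755, accuracy=0.4, sigma_a=1.0, cycle_years=2.0)
```

    With the cycle halved and everything else equal, genomic selection gives more gain per year whenever its accuracy exceeds half the accuracy of phenotypic selection, which is why a moderately accurate GEBV can still be worthwhile.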

  19. Cytoplasmic genetic variation and extensive cytonuclear interactions influence natural variation in the metabolome

    DEFF Research Database (Denmark)

    Joseph, Bindu; Corwin, Jason A.; Li, Baohua

    2013-01-01

    Understanding genome to phenotype linkages has been greatly enabled by genomic sequencing. However, most genome analysis is typically confined to the nuclear genome. We conducted a metabolomic QTL analysis on a reciprocal RIL population structured to examine how variation in the organelle genomes...... was a central hub in the epistatic network controlling the plant metabolome. This epistatic influence manifested such that the cytoplasmic background could alter or hide pairwise epistasis between nuclear loci. Thus, cytoplasmic genetic variation plays a central role in controlling natural variation...... in metabolomic networks. This suggests that cytoplasmic genomes must be included in any future analysis of natural variation....

  20. Characterization of genomic variations in SNPs of PE_PGRS genes reveals deletions and insertions in extensively drug resistant (XDR) M. tuberculosis strains from Pakistan

    KAUST Repository

    Kanji, Akbar; Hasan, Zahra; Ali, Asho; McNerney, Ruth; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A.; Nair, Mridul; Clark, Taane G.; Zaver, Ambreen; Jafri, Sana; Hasan, Rumina

    2015-01-01

    Genetic diversity in PE_PGRS genes contributes to antigenic variability and may result in increased immunogenicity of strains. This is the first study identifying variations in nsSNPs and INDELs in the PE_PGRS genes of XDR-TB strains from Pakistan. It highlights common genetic variations which may contribute to persistence.

  1. Accounting for genotype-by-environment interactions and non-additive genetic variation in genomic selection for water-soluble carbohydrate concentration in wheat

    Science.gov (United States)

    Abiotic stress tolerance traits are often complex and recalcitrant targets for conventional breeding improvement in many crop species. This study evaluated the potential of genomic selection to predict water-soluble carbohydrate concentration (WSCC), an important drought tolerance trait, in wheat un...

  2. Molecular phylogeny and SNP variation of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) derived from genome sequences.

    Science.gov (United States)

    Cronin, Matthew A; Rincon, Gonzalo; Meredith, Robert W; MacNeil, Michael D; Islas-Trejo, Alma; Cánovas, Angela; Medrano, Juan F

    2014-01-01

    We assessed the relationships of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) with high throughput genomic sequencing data with an average coverage of 25× for each species. A total of 1.4 billion 100-bp paired-end reads were assembled using the polar bear and annotated giant panda (Ailuropoda melanoleuca) genome sequences as references. We identified 13.8 million single nucleotide polymorphisms (SNP) in the 3 species aligned to the polar bear genome. These data indicate that polar bears and brown bears share more SNP with each other than either does with black bears. Concatenation and coalescence-based analysis of consensus sequences of approximately 1 million base pairs of ultraconserved elements in the nuclear genome resulted in a phylogeny with black bears as the sister group to brown and polar bears, and all brown bears are in a separate clade from polar bears. Genotypes for 162 SNP loci of 336 bears from Alaska and Montana showed that the species are genetically differentiated and there is geographic population structure of brown and black bears but not polar bears.

  3. Comparative Genomic Analysis for Genetic Variation in Sacbrood Virus of Apis cerana and Apis mellifera Honeybees From Different Regions of Vietnam.

    Science.gov (United States)

    Reddy, Kondreddy Eswar; Thu, Ha Thi; Yoo, Mi Sun; Ramya, Mummadireddy; Reddy, Bheemireddy Anjana; Lien, Nguyen Thi Kim; Trang, Nguyen Thi Phuong; Duong, Bui Thi Thuy; Lee, Hyun-Jeong; Kang, Seung-Won; Quyen, Dong Van

    2017-09-01

    Sacbrood virus (SBV) is one of the most common viral infections of honeybees. The entire genome sequences of nine SBV strains infecting honeybees, Apis cerana and Apis mellifera, in Vietnam, namely AcSBV-Viet1, AcSBV-Viet2, AcSBV-Viet3, AmSBV-Viet4, AcSBV-Viet5, AmSBV-Viet6, AcSBV-Viet7, AcSBV-Viet8, and AcSBV-Viet9, were determined. These sequences were aligned with seven previously reported complete genome sequences of SBV from other countries, and various genomic regions were compared. The Vietnamese SBVs (VN-SBVs) shared 91-99% identity with each other, and shared 89-94% identity with strains from other countries. The open reading frames (ORFs) of the VN-SBV genomes differed greatly from those of SBVs from other countries, especially in their VP1 sequences. The AmSBV-Viet6 and AcSBV-Viet9 genomes encode 17 more amino acids within this region than the other VN-SBVs. In a phylogenetic analysis, the strains AmSBV-Viet4, AcSBV-Viet2, and AcSBV-Viet3 clustered in a group with the AmSBV-UK, AmSBV-Kor21, and AmSBV-Kor19 strains. In contrast, the strains AmSBV-Viet6 and AcSBV-Viet7 clustered separately with the AcSBV strains from Korea and AcSBV-VietSBM2. The strains AcSBV-Viet8, AcSBV-Viet1, AcSBV-Viet5, and AcSBV-Viet9 clustered with AcSBV-India, AcSBV-Kor and AcSBV-VietSBM2. In a Simplot graph, the VN-SBVs diverged more strongly in their ORF regions than in their 5' or 3' untranslated regions. The VN-SBVs possess genetic characteristics which are more similar to the Asian AcSBV strains than to the AmSBV-UK strain. Taken together, our data indicate that host specificity, geographic distance, and viral cross-infections between different bee species may explain the genetic diversity among the VN-SBVs in A. cerana and A. mellifera and other SBV strains. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America.

  4. Genome-wide immunity studies in the rabbit: transcriptome variations in peripheral blood mononuclear cells after in vitro stimulation by LPS or PMA-Ionomycin.

    Science.gov (United States)

    Jacquier, Vincent; Estellé, Jordi; Schmaltz-Panneau, Barbara; Lecardonnel, Jérôme; Moroldo, Marco; Lemonnier, Gaëtan; Turner-Maier, Jason; Duranthon, Véronique; Oswald, Isabelle P; Gidenne, Thierry; Rogel-Gaillard, Claire

    2015-01-23

    Our purpose was to obtain genome-wide expression data for the rabbit species on the responses of peripheral blood mononuclear cells (PBMCs) after in vitro stimulation by lipopolysaccharide (LPS) or phorbol myristate acetate (PMA) and ionomycin. This transcriptome profiling was carried out using microarrays enriched with immunity-related genes, and annotated with the most recent data available for the rabbit genome. LPS affected 15 to 20 times fewer genes than PMA-Ionomycin after both 4 hours (T4) and 24 hours (T24) of in vitro stimulation, in comparison with mock-stimulated PBMCs. LPS induced an inflammatory response as shown by a significant up-regulation of IL12A and CXCL11 at T4, followed by an increased transcription of IL6, IL1B, IL1A, IL36, IL37, TNF, and CCL4 at T24. Surprisingly, we could not find an up-regulation of IL8 either at T4 or at T24, and detected a down-regulation of DEFB1 and BPI at T24. A concerted up-regulation of SAA1, S100A12 and F3 was found upon stimulation by LPS. PMA-Ionomycin induced a very early expression of Th1, Th2, Treg, and Th17 responses by PBMCs at T4. The Th1 response increased at T24, as shown by the increased transcription of IFNG, in contrast to other cytokines (IL2, IL4, IL10, IL13, IL17A, CD69), which significantly decreased from T4 to T24 by comparison to mock-stimulation. The granulocyte-macrophage colony-stimulating factor (CSF2) was by far the most over-expressed gene at both T4 and T24 by comparison to mock-stimulated cells, confirming a major impact of PMA-Ionomycin on cell growth and proliferation. A significant down-regulation of IL16 was observed at T4 and T24, in agreement with a role of IL16 in PBMC apoptosis. We report new data on the responses of PBMCs to LPS and PMA-Ionomycin in the rabbit species, thus enlarging the set of mammalian species for which such reports exist. The availability of the rabbit genome assembly together with high throughput genomic tools should pave the way for more

  5. Comparative genome analysis of VSP-II and SNPs reveals heterogenic variation in contemporary strains of Vibrio cholerae O1 isolated from cholera patients in Kolkata, India.

    Science.gov (United States)

    Imamura, Daisuke; Morita, Masatomo; Sekizuka, Tsuyoshi; Mizuno, Tamaki; Takemura, Taichiro; Yamashiro, Tetsu; Chowdhury, Goutam; Pazhani, Gururaja P; Mukhopadhyay, Asish K; Ramamurthy, Thandavarayan; Miyoshi, Shin-Ichi; Kuroda, Makoto; Shinoda, Sumio; Ohnishi, Makoto

    2017-02-01

    Cholera is an acute diarrheal disease and a major public health problem in many developing countries in Asia, Africa, and Latin America. Since the Bay of Bengal is considered the epicenter for the seventh cholera pandemic, it is important to understand the genetic dynamism of Vibrio cholerae from Kolkata, as a representative of the Bengal region. We analyzed whole genome sequence data of V. cholerae O1 isolated from cholera patients in Kolkata, India, from 2007 to 2014 and identified the heterogeneous genomic region in these strains. In addition, we carried out a phylogenetic analysis based on the whole genome single nucleotide polymorphisms to determine the genetic lineage of strains in Kolkata. This analysis revealed the heterogeneity of the Vibrio seventh pandemic island (VSP)-II in Kolkata strains. The ctxB genotype was also heterogeneous and was highly related to VSP-II types. In addition, phylogenetic analysis revealed the shifts in predominant strains in Kolkata. Two distinct lineages, 1 and 2, were found between 2007 and 2010. However, the proportion changed markedly in 2010 and lineage 2 strains were predominant thereafter. Lineage 2 can be divided into four sublineages, I, II, III and IV. The results of this study indicate that lineages 1 and 2-I were concurrently prevalent between 2007 and 2009, that lineage 2-III was observed in 2010, and that lineage 2-IV became predominant in 2011 and remained so until 2014. Our findings demonstrate that the epidemic of cholera in Kolkata was caused by several distinct strains that have been constantly changing within the genetic lineages of V. cholerae O1 in recent years.

  6. The Salmon Smai Family of Short Interspersed Repetitive Elements (Sines): Interspecific and Intraspecific Variation of the Insertion of Sines in the Genomes of Chum and Pink Salmon

    OpenAIRE

    Takasaki, N.; Yamaki, T.; Hamada, M.; Park, L.; Okada, N.

    1997-01-01

    The genomes of chum salmon and pink salmon contain a family of short interspersed repetitive elements (SINEs), designated the salmon SmaI family. It is restricted to these two species, a distribution that suggests that this SINE family might have been generated in their common ancestor. When insertions of the SmaI SINEs at 10 orthologous loci of these species were analyzed, however, it was found that there were no shared insertion sites between chum and pink salmon. Furthermore, at six loci w...

  7. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  8. ANNOTATING INDIVIDUAL HUMAN GENOMES

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  9. Genome-wide association studies identify heavy metal ATPase3 as the primary determinant of natural variation in leaf cadmium in Arabidopsis thaliana

    Science.gov (United States)

    Understanding the mechanism of cadmium (Cd) accumulation in plants is important to help reduce its potential toxicity to both plants and humans through dietary and environmental exposure. Here, we report a study to uncover the genetic basis underlying natural variation in Cd accumulation in a world-...

  10. Genetic contributions to variation in general cognitive function: A meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949)

    NARCIS (Netherlands)

    G. Davies (Gail); N.J. Armstrong (Nicola J.); J.C. Bis (Joshua); J. Bressler (Jan); V. Chouraki (Vincent); S. Giddaluru (Sudheer); E. Hofer; C.A. Ibrahim-Verbaas (Carla); M. Kirin (Mirna); J. Lahti; S.J. van der Lee (Sven); S. Le Hellard (Stephanie); T. Liu; R.E. Marioni (Riccardo); C. Oldmeadow (Christopher); D. Postmus (Douwe); G.D. Smith; J.A. Smith (Jennifer A); A. Thalamuthu (Anbupalam); R. Thomson (Russell); V. Vitart (Veronique); J. Wang; L. Yu; L. Zgaga (Lina); W. Zhao (Wei); R. Boxall (Ruth); S.E. Harris (Sarah); W.D. Hill (W. David); D.C. Liewald (David C.); M. Luciano (Michelle); H.H.H. Adams (Hieab); D. Ames (David); N. Amin (Najaf); P. Amouyel (Philippe); A.A. Assareh; R. Au; J.T. Becker (James); A. Beiser; C. Berr (Claudine); L. Bertram (Lars); E.A. Boerwinkle (Eric); B.M. Buckley (Brendan M.); H. Campbell (Harry); J. Corley; P.L. De Jager; C. Dufouil (Carole); J.G. Eriksson (Johan G.); T. Espeseth (Thomas); J.D. Faul; I. Ford; G. Scotland (Generation); R.F. Gottesman (Rebecca); M.D. Griswold (Michael); V. Gudnason (Vilmundur); T.B. Harris; G. Heiss (Gerardo); A. Hofman (Albert); E.G. Holliday (Elizabeth); J.E. Huffman (Jennifer); S.L.R. Kardia (Sharon); N.A. Kochan (Nicole A.); D.S. Knopman (David); J.B. Kwok; J.-C. Lambert; T. Lee; G. Li; S.-C. Li; M. Loitfelder (Marisa); O.L. Lopez (Oscar); A.J. Lundervold; A. Lundqvist; R. Mather; S.S. Mirza (Saira); L. Nyberg; B.A. Oostra (Ben); A. Palotie (Aarno); G. Papenberg; A. Pattie (Alison); K. Petrovic (Katja); O. Polasek (Ozren); B.M. Psaty (Bruce); P. Redmond (Paul); S. Reppermund; J.I. Rotter; R. Schmidt (Reinhold); M. Schuur (Maaike); P.W. Schofield; R.J. Scott; V.M. Steen (Vidar); D.J. Stott (David J.); J.C. van Swieten (John); K.D. Taylor (Kent); J. Trollor; S. Trompet (Stella); A.G. Uitterlinden (André); G. Weinstein; E. Widen (Elisabeth); B.G. Windham (B Gwen); J.W. Jukema (Jan Wouter); A. Wright (Alan); M.J. Wright (Margaret); Q. Yang (Qiong Fang); H. Amieva (Hélène); J. Attia (John); D.A. Bennett (David); H. Brodaty (Henry); A.J. de Craen (Anton); C. Hayward; M.A. Ikram (Arfan); U. Lindenberger; L.-G. Nilsson; D.J. Porteous (David J.); K. Räikkönen (Katri); I. Reinvang (Ivar); I. Rudan (Igor); P.S. Sachdev (Perminder); R. Schmidt; P. Schofield (Peter); V. Srikanth; J.M. Starr (John); S.T. Turner (Stephen); D.R. Weir (David R.); J.F. Wilson (James F); C.M. van Duijn (Cornelia); L.J. Launer (Lenore); A.L. Fitzpatrick (Annette); S. Seshadri (Sudha); T.H. Mosley (Thomas H.); I.J. Deary (Ian J.)

    2015-01-01

    General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of

  11. Regions of the bread wheat D genome associated with variation in key photosynthesis traits and shoot biomass under both well watered and water deficient conditions.

    Science.gov (United States)

    Osipova, Svetlana; Permyakov, Alexey; Permyakova, Marina; Pshenichnikova, Tatyana; Verkhoturov, Vasiliy; Rudikovsky, Alexandr; Rudikovskaya, Elena; Shishparenok, Alexandr; Doroshkov, Alexey; Börner, Andreas

    2016-05-01

    A quantitative trait locus (QTL) approach was taken to reveal the genetic basis in wheat of traits associated with photosynthesis during a period of exposure to water deficit stress. The performance, with respect to shoot biomass, gas exchange and chlorophyll fluorescence, leaf pigment content and the activity of various ascorbate-glutathione cycle enzymes and catalase, of a set of 80 wheat lines, each containing a single chromosomal segment introgressed from the bread wheat D genome progenitor Aegilops tauschii, was monitored in plants exposed to various water regimes. Four of the seven D genome chromosomes (1D, 2D, 5D, and 7D) carried clusters of both major (LOD >3.0) and minor (LOD between 2.0 and 3.0) QTL. A major QTL underlying the activity of glutathione reductase was located on chromosome 2D, and another, controlling the activity of ascorbate peroxidase, on chromosome 7D. A region of chromosome 2D defined by the microsatellite locus Xgwm539 and a second on chromosome 7D flanked by the marker loci Xgwm1242 and Xgwm44 harbored a number of QTL associated with the water deficit stress response.

  12. Genomic variation in macrophage-cultured European porcine reproductive and respiratory syndrome virus Olot/91 revealed using ultra-deep next generation sequencing.

    Science.gov (United States)

    Lu, Zen H; Brown, Alexander; Wilson, Alison D; Calvert, Jay G; Balasch, Monica; Fuentes-Utrilla, Pablo; Loecherbach, Julia; Turner, Frances; Talbot, Richard; Archibald, Alan L; Ait-Ali, Tahar

    2014-03-04

    Porcine Reproductive and Respiratory Syndrome (PRRS) is a disease of major economic impact worldwide. The etiologic agent of this disease is the PRRS virus (PRRSV). Increasing evidence suggests that microevolution within a coexisting quasispecies population can give rise to high sequence heterogeneity in PRRSV. We developed a pipeline based on the ultra-deep next generation sequencing approach to first construct the complete genome of a European PRRSV, strain Olot/91, cultured on macrophages and then capture the rare variants representative of the mixed quasispecies population. Olot/91 differs from the reference Lelystad strain by about 5% and a total of 88 variants, with frequencies as low as 1%, were detected in the mixed population. These variants included 16 non-synonymous variants concentrated in the genes encoding structural and nonstructural proteins, including Glycoprotein 2a and 5. Using an ultra-deep sequencing methodology, the complete genome of Olot/91 was constructed without any prior knowledge of the sequence. Rare variants that constitute minor fractions of the heterogeneous PRRSV population could successfully be detected to allow further exploration of microevolutionary events.

  13. Traditional medicine and genomics

    Directory of Open Access Journals (Sweden)

    Kalpana Joshi

    2010-01-01

    'Omics' developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  14. Traditional medicine and genomics.

    Science.gov (United States)

    Joshi, Kalpana; Ghodke, Yogita; Shintre, Pooja

    2010-01-01

    'Omics' developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  15. Genomic variation in myeloma: design, content, and initial application of the Bank On A Cure SNP Panel to detect associations with progression-free survival

    Directory of Open Access Journals (Sweden)

    Fang Gang

    2008-09-01

    Abstract Background We have engaged in an international program designated the Bank On A Cure, which has established DNA banks from multiple cooperative and institutional clinical trials, and a platform for examining the association of genetic variations with disease risk and outcomes in multiple myeloma. We describe the development and content of a novel custom SNP panel that contains 3404 SNPs in 983 genes, representing cellular functions and pathways that may influence disease severity at diagnosis, toxicity, progression or other treatment outcomes. A systematic search of national databases was used to identify non-synonymous coding SNPs and SNPs within transcriptional regulatory regions. To explore SNP associations with PFS we compared SNP profiles of short term (less than 1 year, n = 70) versus long term progression-free survivors (greater than 3 years, n = 73) in two phase III clinical trials. Results Quality controls were established, demonstrating an accurate and robust screening panel for genetic variations, and some initial racial comparisons of allelic variation were done. A variety of analytical approaches, including machine learning tools for data mining and recursive partitioning analyses, demonstrated predictive value of the SNP panel in survival. While the entire SNP panel showed genotype predictive association with PFS, some SNP subsets were identified within drug response, cellular signaling and cell cycle genes. Conclusion A targeted gene approach was undertaken to develop an SNP panel that can test for associations with clinical outcomes in myeloma. The initial analysis provided some predictive power, demonstrating that genetic variations in the myeloma patient population may influence PFS.

  16. Salix transect of Europe: variation in ploidy and genome size in willow-associated common nettle, Urtica dioica L. sens. lat., from Greece to arctic Norway

    OpenAIRE

    Quentin Cronk; Oriane Hidalgo; Jaume Pellicer; Diana Percy; Ilia Leitch

    2016-01-01

    Abstract Background The common stinging nettle, Urtica dioica L. sensu lato, is an invertebrate "superhost", its clonal patches maintaining large populations of insects and molluscs. It is extremely widespread in Europe and highly variable, and two ploidy levels (diploid and tetraploid) are known. However, geographical patterns in cytotype variation require further study. New information We assembled a collection of nettles in conjunction with a transect of Europe from the Aegean to Arctic No...

  17. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori

    2015-01-01

    . Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all

  18. Comparative Genomics in Homo sapiens.

    Science.gov (United States)

    Oti, Martin; Sammeth, Michael

    2018-01-01

    Genomes can be compared at different levels of divergence, either between species or within species. Within species, genomes can be compared between different subpopulations, such as human subpopulations from different continents. Investigating the genomic differences between different human subpopulations is important when studying complex diseases that are affected by many genetic variants, as the variants involved can differ between populations. The 1000 Genomes Project collected genome-scale variation data for 2504 human individuals from 26 different populations, enabling a systematic comparison of variation between human subpopulations. In this chapter, we present step-by-step a basic protocol for the identification of population-specific variants employing the 1000 Genomes data. These variants are subsequently further investigated for those that affect the proteome or RNA splice sites, to investigate potentially biologically relevant differences between the populations.

  19. Mutation of the RDR1 gene caused genome-wide changes in gene expression, regional variation in small RNA clusters and localized alteration in DNA methylation in rice.

    Science.gov (United States)

    Wang, Ningning; Zhang, Di; Wang, Zhenhui; Xun, Hongwei; Ma, Jian; Wang, Hui; Huang, Wei; Liu, Ying; Lin, Xiuyun; Li, Ning; Ou, Xiufang; Zhang, Chunyu; Wang, Ming-Bo; Liu, Bao

    2014-06-30

    Endogenous small (sm) RNAs (primarily si- and miRNAs) are important trans/cis-acting regulators involved in diverse cellular functions. In plants, the RNA-dependent RNA polymerases (RDRs) are essential for smRNA biogenesis. It has been established that RDR2 is involved in the 24 nt siRNA-dependent RNA-directed DNA methylation (RdDM) pathway. Recent studies have suggested that RDR1 is involved in a second RdDM pathway that relies mostly on 21 nt smRNAs and functions to silence a subset of genomic loci that are usually refractory to the normal RdDM pathway in Arabidopsis. Whether and to what extent the homologs of RDR1 may have similar functions in other plants remained unknown. We characterized a loss-of-function mutant (Osrdr1) of the OsRDR1 gene in rice (Oryza sativa L.) derived from a retrotransposon Tos17 insertion. Microarray analysis identified 1,175 differentially expressed genes (5.2% of all expressed genes in the shoot-tip tissue of rice) between Osrdr1 and WT, of which 896 and 279 genes were up- and down-regulated, respectively, in Osrdr1. smRNA sequencing revealed regional alterations in smRNA clusters across the rice genome. Some of the regions with altered smRNA clusters were associated with changes in DNA methylation. In addition, altered expression of several miRNAs was detected in Osrdr1, at least some of which were associated with altered expression of predicted miRNA target genes. Despite these changes, no phenotypic difference was identified in Osrdr1 relative to WT under normal conditions; however, ephemeral phenotypic fluctuations occurred under some abiotic stress conditions. Our results showed that OsRDR1 plays a role in regulating a substantial number of endogenous genes with diverse functions in rice through smRNA-mediated pathways involving DNA methylation, and participates in the abiotic stress response.

  20. Genomic selection in plant breeding.

    Science.gov (United States)

    Newell, Mark A; Jannink, Jean-Luc

    2014-01-01

    Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor and major marker effects. Thus, the GEBV may capture more of the genetic variation for the particular trait under selection.

  1. Parentage assignment with genomic markers: a major advance for understanding and exploiting genetic variation of quantitative traits in farmed aquatic animals

    Directory of Open Access Journals (Sweden)

    Marc Vandeputte

    2014-12-01

    Since the middle of the 1990s, parentage assignment using microsatellite markers has been introduced as a tool in aquaculture breeding. It now allows close to 100% assignment success, and has offered new ways to develop aquaculture breeding using mixed family designs in industry conditions. Its main achievements are the knowledge and control of family representation and inbreeding, especially in mass spawning species, above all the capacity to estimate reliable genetic parameters in any species and rearing system with no prior investment in structures, and the development of new breeding programs in many species. Parentage assignment should not be seen as a way to replace physical tagging, but as a new way to conceive breeding programs, which have to be optimized with its specific constraints, one of the most important being to define well the number of individuals to genotype, in order to limit costs and maximize genetic gain while minimizing inbreeding. The recent possible shift to (for the moment more costly) SNP markers should benefit from future developments in genomics and MAS selection to combine parentage assignment and indirect prediction of breeding values.

  2. A global reference for human genetic variation

    DEFF Research Database (Denmark)

    Auton, Adam; Abecasis, Goncalo R.; Altshuler, David M.

    2015-01-01

    The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals ...

  3. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    Science.gov (United States)

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. Published by Elsevier Ireland Ltd.

  4. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    Science.gov (United States)

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. A Genome Wide Study of Copy Number Variation Associated with Nasopharyngeal Carcinoma in Malaysian Chinese Identifies CNVs at 11q14.3 and 6p21.3 as Candidate Loci

    Science.gov (United States)

    Low, Joyce Siew Yong; Chin, Yoon Ming; Mushiroda, Taisei; Kubo, Michiaki; Govindasamy, Gopala Krishnan; Pua, Kin Choo; Yap, Yoke Yeow; Yap, Lee Fah; Subramaniam, Selva Kumar; Ong, Cheng Ai; Tan, Tee Yong; Khoo, Alan Soo Beng; Ng, Ching Ching

    2016-01-01

    Background Nasopharyngeal carcinoma (NPC) is a neoplasm of the epithelial lining of the nasopharynx. Despite various reports linking genomic variants to NPC predisposition, very few reports were done on copy number variations (CNV). CNV is an inherent structural variation that has been found to be involved in cancer predisposition. Methods A discovery cohort of Malaysian Chinese descent (NPC patients, n = 140; Healthy controls, n = 256) were genotyped using Illumina® HumanOmniExpress BeadChip. PennCNV and cnvPartition calling algorithms were applied for CNV calling. Taqman CNV assays and digital PCR were used to validate CNV calls and replicate candidate copy number variant region (CNVR) associations in a follow-up Malaysian Chinese (NPC cases, n = 465; and Healthy controls, n = 677) and Malay cohort (NPC cases, n = 114; Healthy controls, n = 124). Results Six putative CNVRs overlapping GRM5, MICA/HCP5/HCG26, LILRB3/LILRA6, DPY19L2, RNase3/RNase2 and GOLPH3 genes were jointly identified by PennCNV and cnvPartition. CNVs overlapping GRM5 and MICA/HCP5/HCG26 were subjected to further validation by Taqman CNV assays and digital PCR. Combined analysis in Malaysian Chinese cohort revealed a strong association at CNVR on chromosome 11q14.3 (Pcombined = 1.54 × 10⁻⁵; odds ratio (OR) = 7.27; 95% CI = 2.96–17.88) overlapping GRM5 and a suggestive association at CNVR on chromosome 6p21.3 (Pcombined = 1.29 × 10⁻³; OR = 4.21; 95% CI = 1.75–10.11) overlapping MICA/HCP5/HCG26 genes. Conclusion Our results demonstrated the association of CNVs towards NPC susceptibility, implicating a possible role of CNVs in NPC development. PMID:26730743

  6. A Genome Wide Study of Copy Number Variation Associated with Nasopharyngeal Carcinoma in Malaysian Chinese Identifies CNVs at 11q14.3 and 6p21.3 as Candidate Loci.

    Directory of Open Access Journals (Sweden)

    Joyce Siew Yong Low

    Nasopharyngeal carcinoma (NPC) is a neoplasm of the epithelial lining of the nasopharynx. Despite various reports linking genomic variants to NPC predisposition, very few reports were done on copy number variations (CNV). CNV is an inherent structural variation that has been found to be involved in cancer predisposition. A discovery cohort of Malaysian Chinese descent (NPC patients, n = 140; Healthy controls, n = 256) were genotyped using Illumina® HumanOmniExpress BeadChip. PennCNV and cnvPartition calling algorithms were applied for CNV calling. Taqman CNV assays and digital PCR were used to validate CNV calls and replicate candidate copy number variant region (CNVR) associations in a follow-up Malaysian Chinese (NPC cases, n = 465; and Healthy controls, n = 677) and Malay cohort (NPC cases, n = 114; Healthy controls, n = 124). Six putative CNVRs overlapping GRM5, MICA/HCP5/HCG26, LILRB3/LILRA6, DPY19L2, RNase3/RNase2 and GOLPH3 genes were jointly identified by PennCNV and cnvPartition. CNVs overlapping GRM5 and MICA/HCP5/HCG26 were subjected to further validation by Taqman CNV assays and digital PCR. Combined analysis in Malaysian Chinese cohort revealed a strong association at CNVR on chromosome 11q14.3 (Pcombined = 1.54 × 10⁻⁵; odds ratio (OR) = 7.27; 95% CI = 2.96-17.88) overlapping GRM5 and a suggestive association at CNVR on chromosome 6p21.3 (Pcombined = 1.29 × 10⁻³; OR = 4.21; 95% CI = 1.75-10.11) overlapping MICA/HCP5/HCG26 genes. Our results demonstrated the association of CNVs towards NPC susceptibility, implicating a possible role of CNVs in NPC development.

  7. Extreme genomes

    OpenAIRE

    DeLong, Edward F

    2000-01-01

    The complete genome sequence of Thermoplasma acidophilum, an acid- and heat-loving archaeon, has recently been reported. Comparative genomic analysis of this 'extremophile' is providing new insights into the metabolic machinery, ecology and evolution of thermophilic archaea.

  8. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  9. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  10. RadGenomics project

    Energy Technology Data Exchange (ETDEWEB)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu [National Inst. of Radiological Sciences, Chiba (Japan). Frontier Research Center] [and others]

    2002-06-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  11. RadGenomics project

    International Nuclear Information System (INIS)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu

    2002-01-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  12. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  13. Genomics of Preterm Birth

    Science.gov (United States)

    Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.

    2015-01-01

    The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition has shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385

  14. GFVO: the Genomic Feature and Variation Ontology

    KAUST Repository

    Baran, Joachim; Durgahee, Bibi Sehnaaz Begum; Eilbeck, Karen; Antezana, Erick; Hoehndorf, Robert; Dumontier, Michel

    2015-01-01

    Availability and implementation. The latest stable release of the ontology is available via its base URI; previous and development versions are available at the ontology’s GitHub repository: https://github.com/BioInterchange/Ontologies; versions of the ontology are indexed through BioPortal (without external class-/property-equivalences due to BioPortal release 4.10 limitations); examples and reference documentation are provided on a separate web-page: http://www.biointerchange.org/ontologies.html. GFVO version 1.0.2 is licensed under the CC0 1.0 Universal license (https://creativecommons.org/publicdomain/zero/1.0) and therefore de facto within the public domain; the ontology can be appropriated without attribution for commercial and non-commercial use.

  15. Genomics and the human genome project: implications for psychiatry

    OpenAIRE

    Kelsoe, J R

    2004-01-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...

  16. Molecular subtypes in stage II-III colon cancer defined by genomic instability: early recurrence-risk associated with a high copy-number variation and loss of RUNX3 and CDKN2A.

    Directory of Open Access Journals (Sweden)

    Marianne Berg

    We sought to investigate various molecular subtypes defined by genomic instability that may be related to early death and recurrence in colon cancer. We sought to investigate various molecular subtypes defined by instability at microsatellites (MSI), changes in methylation patterns (CpG island methylator phenotype, CIMP) or copy number variation (CNV) in 8 genes. Stage II-III colon cancers (n = 64) were investigated by methylation-specific multiplex ligated probe amplification (MS-MLPA). Correlation of CNV, CIMP and MSI, with mutations in KRAS and BRAFV600E were assessed for overlap in molecular subtypes and early recurrence risk by uni- and multivariate regression. The CIMP phenotype occurred in 34% (22/64) and MSI in 27% (16/60) of the tumors, with noted CIMP/MSI overlap. Among the molecular subtypes, a high CNV phenotype had an associated odds ratio (OR) for recurrence of 3.2 (95% CI 1.1-9.3; P = 0.026). Losses of CACNA1G (OR of 2.9, 95% CI 1.4-6.0; P = 0.001), IGF2 (OR of 4.3, 95% CI 1.1-15.8; P = 0.007), CDKN2A (p16) (OR of 2.0, 95% CI 1.1-3.6; P = 0.024), and RUNX3 (OR of 3.4, 95% CI 1.3-8.7; P = 0.002) were associated with early recurrence, while MSI, CIMP, KRAS or BRAF V600E mutations were not. The CNV was significantly higher in deceased patients (CNV in 6 of 8) compared to survivors (CNV in 3 of 8). Only stage and loss of RUNX3 and CDKN2A were significant in the multivariable risk-model for early recurrence. A high copy number variation phenotype is a strong predictor of early recurrence and death, and may indicate a dose-dependent relationship between genetic instability and outcome. Loss of the tumor suppressors RUNX3 and CDKN2A was related to recurrence risk and warrants further investigation.

  17. Genome U-Plot: a whole genome visualization.

    Science.gov (United States)

    Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

    2018-05-15

    The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. Both representations have advantages, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allow researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making important features apparent and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. Contact: vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.

  18. Genomic Signatures of Reinforcement

    Directory of Open Access Journals (Sweden)

    Austin G. Garner

    2018-04-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved.

  19. Genomic Signatures of Reinforcement

    Science.gov (United States)

    Goulet, Benjamin E.

    2018-01-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved. PMID:29614048

  20. Genetic Variation in Cardiomyopathy and Cardiovascular Disorders.

    Science.gov (United States)

    McNally, Elizabeth M; Puckelwartz, Megan J

    2015-01-01

    With the wider deployment of massively-parallel, next-generation sequencing, it is now possible to survey human genome data for research and clinical purposes. The reduced cost of producing short-read sequencing has now shifted the burden to data analysis. Analysis of genome sequencing remains challenged by the complexity of the human genome, including redundancy and the repetitive nature of genome elements and the large amount of variation in individual genomes. Public databases of human genome sequences greatly facilitate interpretation of common and rare genetic variation, although linking database sequence information to detailed clinical information is limited by privacy and practical issues. Genetic variation is a rich source of knowledge for cardiovascular disease because many, if not all, cardiovascular disorders are highly heritable. The role of rare genetic variation in predicting risk and complications of cardiovascular diseases has been well established for hypertrophic and dilated cardiomyopathy, where the number of genes that are linked to these disorders is growing. Bolstered by family data, where genetic variants segregate with disease, rare variation can be linked to specific genetic variation that offers profound diagnostic information. Understanding genetic variation in cardiomyopathy is likely to help stratify forms of heart failure and guide therapy. Ultimately, genetic variation may be amenable to gene correction and gene editing strategies.

  1. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
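
    The k-mer scoring idea behind Compositional Spectra analysis can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes exact-match counting of overlapping k-mers, and the function names `kmer_spectrum` and `spectrum_distance`, the default `k = 3`, and the Manhattan distance used for comparison are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(sequence: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of every possible DNA k-mer in `sequence`.

    Simplifying assumption: exact matches only, overlapping windows,
    and windows containing anything other than A/C/G/T are skipped.
    """
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    # Enumerate all 4**k k-mers so spectra of different sequences share keys.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def spectrum_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Manhattan distance between two spectra built with the same k."""
    return sum(abs(a[kmer] - b[kmer]) for kmer in a)
```

    Under these assumptions, spectra computed for different chromosomes or genome fractions could be compared pairwise with `spectrum_distance`, with larger distances suggesting greater heterogeneity; a real analysis would need to choose k and the distance measure with care.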

  2. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  3. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  4. Prospects for Genomic Research in Forestry

    Directory of Open Access Journals (Sweden)

    K. V. Krutovsky

    2014-08-01

    Conifers are keystone species of boreal forests. Their whole genome sequencing, assembly and annotation will allow us to understand the evolution of the complex ancient giant conifer genomes that are 4 times larger in larch and 7–9 times larger in pines than the human genome. Genomic studies will also allow us to obtain important whole genome sequence data and develop highly polymorphic and informative genetic markers, such as microsatellites and single nucleotide polymorphisms (SNPs), that can be efficiently used in timber origin identification, for genetic variation monitoring, to study local and climate change adaptation and in tree improvement and conservation programs.

  5. Genome Imprinting

    Indian Academy of Sciences (India)

    the cell nucleus (mitochondrial and chloroplast genomes), and. (3) traits governed ... tively good embryonic development but very poor development of membranes and ... Human homologies for the type of situation described above are naturally ..... imprint; (b) New modifications of the paternal genome in germ cells of each ...

  6. Baculovirus Genomics

    NARCIS (Netherlands)

    Oers, van M.M.; Vlak, J.M.

    2007-01-01

    Baculovirus genomes are covalently closed circles of double-stranded DNA varying in size between 80 and 180 kilobase pairs. The genomes of more than forty-one baculoviruses have been sequenced to date. The majority of these (37) are pathogenic to lepidopteran hosts; three infect sawflies

  7. Genomic Testing

    Science.gov (United States)

    ... this database. Top of Page Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) In 2004, the Centers for Disease Control and Prevention launched the EGAPP initiative to establish and test a ... and other applications of genomic technology that are in transition from ...

  8. Ancient genomes

    OpenAIRE

    Hoelzel, A Rus

    2005-01-01

    Ever since its invention, the polymerase chain reaction has been the method of choice for work with ancient DNA. In an application of modern genomic methods to material from the Pleistocene, a recent study has instead undertaken to clone and sequence a portion of the ancient genome of the cave bear.

  9. Reference free phasing and representation of complex variation

    DEFF Research Database (Denmark)

    Jensen, Jacob Malte

    2017-01-01

    High throughput sequencing has revolutionized our ability to interrogate genomes and entire human genomes are sequenced daily across the world. Mapping of short reads to a reference genome has enhanced our ability to detect genetic variation and is currently the most widely used technology....... Therefore, new methods for detecting variation that reduce reference bias are needed including ways of representing genomes that account for the variability within and between populations. The major histocompatibility complex (MHC) region is one of the most diverse and complex regions of the human genome...... to detect and call variation in humans. However, it has become evident that mapping of short reads to a single reference genome is subject to ascertainment bias (reference bias). This bias is especially pronounced in complex regions of the genome and particularly hampers detection of structural variation...

  10. Genome Size Dynamics and Evolution in Monocots

    Directory of Open Access Journals (Sweden)

    Ilia J. Leitch

    2010-01-01

    Monocot genomic diversity includes striking variation at many levels. This paper compares various genomic characters (e.g., range of chromosome numbers and ploidy levels, occurrence of endopolyploidy, GC content, chromosome packaging and organization, genome size) between monocots and the remaining angiosperms to discern just how distinctive monocot genomes are. One of the most notable features of monocots is their wide range and diversity of genome sizes, including the species with the largest genome so far reported in plants. This genomic character is analysed in greater detail, within a phylogenetic context. By surveying available genome size and chromosome data it is apparent that different monocot orders follow distinctive modes of genome size and chromosome evolution. Further insights into genome size-evolution and dynamics were obtained using statistical modelling approaches to reconstruct the ancestral genome size at key nodes across the monocot phylogenetic tree. Such approaches reveal that while the ancestral genome size of all monocots was small (1C=1.9 pg), there have been several major increases and decreases during monocot evolution. In addition, notable increases in the rates of genome size-evolution were found in Asparagales and Poales compared with other monocot lineages.

  11. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

    Science.gov (United States)

    Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

    2014-01-01

    Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848

  12. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium.

  13. Utilizing linkage disequilibrium information from Indian Genome ...

    Indian Academy of Sciences (India)

    Using LD information derived from Indian Genome Variation database (IGVdb) on populations .... Line diagram represents the SNPs selected in Indian (upper panel) and CEPH .... out procedure for extracting DNA from human nucleated cells.

  14. Genome-scale neurogenetics: methodology and meaning.

    Science.gov (United States)

    McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

    2014-06-01

    Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.

  15. Ebolavirus comparative genomics

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. PMID:26175035

  16. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different...... correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome...

  17. Feast and famine in plant genomes.

    Science.gov (United States)

    Jonathan F. Wendel; Richard C. Cronn; J. Spencer Johnston; H. James Price

    2002-01-01

    Plant genomes vary over several orders of magnitude in size, even among closely related species, yet the origin, genesis and significance of this variation are not clear. Because DNA content varies over a sevenfold range among diploid species in the cotton genus (Gossypium) and its allies, this group offers opportunities for exploring patterns and mechanisms of genome...

  18. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute’s genomic medicine portfolio

    Science.gov (United States)

    Manolio, Teri A.

    2016-01-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual’s genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of “Genomic Medicine Meetings,” under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI’s genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. PMID:27612677

  19. Evolutionary significance of epigenetic variation

    NARCIS (Netherlands)

    Richards, C.L.; Verhoeven, K.J.F.; Bossdorf, O.; Wendel, J.F.; Greilhuber, J.; Dolezel, J.; Leitch, I.J.

    2012-01-01

    Several chapters in this volume demonstrate how epigenetic work at the molecular level over the last few decades has revolutionized our understanding of genome function and developmental biology. However, epigenetic processes not only further our understanding of variation and regulation at the

  20. Genomic individuality and its biological implications.

    Science.gov (United States)

    Zhao, J

    1996-06-01

    It is a widely accepted fundamental concept that all somatic genomes of a human individual are identical to each other. The theoretical basis of this concept is that all of these somatic genomes are the descendants of the genome of a single fertilized cell as well as the simple replicated products of asexual reproduction, thus not forming any new recombined genomes. The question here is whether such a concept might only represent one side of somatic genome biology and, even worse, whether it has perhaps already led to a very prevalent misconception that within the organism body, there exists no variability among individual somatic genomes. A hypothesis, called genomic individuality, is proposed, simply saying that every individual somatic genome, perhaps with rare exceptions, has its own unique or individual 'genetic identity' or 'fingerprint', which is characterized by its distinctive sequences or patterns of deoxyribonucleic acid molecules, or both. Thus, no two somatic genomes can be identical to each other in every or all aspects, and consequently, there must be a great deal of genomic variation present within the body of any multicellular organism. The concept or hypothesis of genomic individuality would not only provide a more complete understanding of genome biology, but also suggest a new insight into the studies of the biology of cells and organisms.

  1. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    Science.gov (United States)

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  2. The genome of Eucalyptus grandis

    Energy Technology Data Exchange (ETDEWEB)

    Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.; Hellsten, Uffe; Hayes, Richard D.; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M.; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R. K.; Hussey, Steven G.; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B.; Togawa, Roberto C.; Pappas, Marilia R.; Faria, Danielle A.; Sansaloni, Carolina P.; Petroli, Cesar D.; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J.; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A.; Bornberg-Bauer, Erich; Kersting, Anna R.; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E.; Liston, Aaron; Spatafora, Joseph W.; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H.; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C.; Steane, Dorothy A.; Vaillancourt, René E.; Potts, Brad M.; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J.; Strauss, Steven H.; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S.; Schmutz, Jeremy

    2014-06-11

    Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  3. Creation and genomic analysis of irradiation hybrids in Populus

    Science.gov (United States)

    Matthew S. Zinkgraf; K. Haiby; M.C. Lieberman; L. Comai; I.M. Henry; Andrew Groover

    2016-01-01

    Establishing efficient functional genomic systems for creating and characterizing genetic variation in forest trees is challenging. Here we describe protocols for creating novel gene-dosage variation in Populus through gamma-irradiation of pollen, followed by genomic analysis to identify chromosomal regions that have been deleted or inserted in...

  4. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria......, Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers...... active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  5. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  6. RPAN: rice pan-genome browser for ∼3000 rice genomes.

    Science.gov (United States)

    Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

    2017-01-25

    A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
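    The definition in this abstract (a pan-genome as the union of the gene sets of all individuals, with presence/absence variations among them) can be illustrated with a minimal sketch. The accession names and gene identifiers below are hypothetical toy data, and the code is not taken from RPAN or the 3K RGP; it only shows the set operations the definition implies.

```python
from typing import Dict, List, Set, Tuple


def pan_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the gene sets of all accessions."""
    pan: Set[str] = set()
    for genes in gene_sets.values():
        pan |= genes
    return pan


def core_genes(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes present in every accession."""
    sets = list(gene_sets.values())
    return set.intersection(*sets) if sets else set()


def pav_matrix(gene_sets: Dict[str, Set[str]]) -> Tuple[List[str], Dict[str, List[int]]]:
    """Return sorted gene names and, per accession, a 1/0 presence/absence row."""
    genes = sorted(pan_genome(gene_sets))
    rows = {acc: [int(g in present) for g in genes]
            for acc, present in gene_sets.items()}
    return genes, rows


# Hypothetical toy data: three accessions and five genes.
accessions = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g3", "g4"},
    "acc_C": {"g1", "g2", "g4", "g5"},
}

genes, rows = pav_matrix(accessions)
for acc in sorted(rows):
    print(acc, rows[acc])
```

    In this toy example, genes that are in the pan-genome but not in the core set are the ones showing presence/absence variation.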

  7. The South Asian genome.

    Directory of Open Access Journals (Sweden)

    John C Chambers

    The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians.

  8. Comparative Genomics

    Indian Academy of Sciences (India)

    Comparative Genomics - A Powerful New Tool in Biology. Anand K Bachhawat. General Article. Resonance – Journal of Science Education, Volume 11, Issue 8, August 2006, pp 22-40.

  9. CoNVaQ: a web tool for copy number variation-based association studies

    DEFF Research Database (Denmark)

    Larsen, Simon Jonas; do Canto, Luisa Matos; Rogatto, Silvia Regina

    2018-01-01

    Copy number variations (CNVs) are large segments of the genome that are duplicated or deleted. Structural variations in the genome have been linked to many complex diseases. Similar to how genome-wide association studies (GWAS) have helped discover single-nucleotide polymorphisms linked to diseas...

  10. Genome projects and the functional-genomic era.

    Science.gov (United States)

    Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

    2005-12-01

    The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.

  11. The Saccharomyces Genome Database Variant Viewer.

    Science.gov (United States)

    Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

    2016-01-04

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Comparing Mycobacterium tuberculosis genomes using genome topology networks.

    Science.gov (United States)

    Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

    2015-02-14

    Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/, and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. With this tool, many immunogenic-related and drug resistance-related genes

  13. Personal genomics services: whose genomes?

    Science.gov (United States)

    Gurwitz, David; Bregman-Eschet, Yael

    2009-07-01

    New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below.

  14. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    2010-04-01

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  15. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  16. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a

  17. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  18. A 1000 Arab genome project to study the Emirati population.

    Science.gov (United States)

    Al-Ali, Mariam; Osman, Wael; Tay, Guan K; AlSafar, Habiba S

    2018-04-01

    Discoveries from the human genome, HapMap, and 1000 genome projects have collectively contributed toward the creation of a catalog of human genetic variations that has improved our understanding of human diversity. Despite the collegial nature of many of these genome study consortiums, which has led to the cataloging of genetic variations of different ethnic groups from around the world, genome data on the Arab population remains overwhelmingly underrepresented. The National Arab Genome project in the United Arab Emirates (UAE) aims to address this deficiency by using Next Generation Sequencing (NGS) technology to provide data to improve our understanding of the Arab genome and catalog variants that are unique to the Arab population of the UAE. The project was conceived to shed light on the similarities and differences between the Arab genome and those of the other ethnic groups.

  19. Variational principles

    CERN Document Server

    Moiseiwitsch, B L

    2004-01-01

    This graduate-level text's primary objective is to demonstrate the expression of the equations of the various branches of mathematical physics in the succinct and elegant form of variational principles (and thereby illuminate their interrelationship). Its related intentions are to show how variational principles may be employed to determine the discrete eigenvalues for stationary state problems and to illustrate how to find the values of quantities (such as the phase shifts) that arise in the theory of scattering. Chapter-by-chapter treatment consists of analytical dynamics; optics, wave mecha

  20. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained...... by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans...

  1. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew David; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag......-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from...

  2. Genetic Variability of Myxoma Virus Genomes

    OpenAIRE

    Braun, Christoph; Thürmer, Andrea; Daniel, Rolf; Schultz, Anne-Kathrin; Bulla, Ingo; Schirrmeier, Horst; Mayer, Dietmar; Neubert, Andreas; Czerny, Claus-Peter

    2017-01-01

    Myxomatosis is a recurrent problem on rabbit farms throughout Europe despite the success of vaccines. To identify gene variations of field and vaccine strains that may be responsible for changes in virulence, immunomodulation, and immunoprotection, the genomes of 6 myxoma virus (MYXV) strains were sequenced: German field isolates Munich-1, FLI-H, 2604, and 3207; vaccine strain MAV; and challenge strain ZA. The analyzed genomes ranged from 147.6 kb (strain MAV) to 161.8 kb (strain 3207). All s...

  3. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana; Marcatili, Paolo; Tramontano, Anna

    2010-01-01

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and, if so, whether they also affect different isoforms of the same gene, is a time-consuming and often frustrating task. RESULTS: The PICMI server is an easy-to-use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.
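    To make the mapping task described in this abstract concrete, the following is a minimal sketch of how an amino acid position can be projected onto genomic coordinates. It is not PICMI's implementation; it assumes a forward-strand gene whose coding exons are given as 1-based inclusive intervals in transcript order, and the function name and the two isoforms are hypothetical.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive genomic coordinates


def codon_genomic_positions(aa_pos: int, cds_exons: List[Exon]) -> List[int]:
    """Return the three genomic positions of the codon for residue aa_pos.

    Assumes a forward-strand gene; cds_exons lists only coding sequence,
    in transcript order. Raises ValueError if the codon lies outside the CDS.
    """
    if aa_pos < 1:
        raise ValueError("amino acid positions are 1-based")
    positions = []
    for i in range(3):
        remaining = 3 * (aa_pos - 1) + i  # 0-based offset into the CDS
        for start, end in cds_exons:
            length = end - start + 1
            if remaining < length:
                positions.append(start + remaining)
                break
            remaining -= length
        else:
            raise ValueError("codon extends beyond the coding sequence")
    return positions


# Hypothetical isoforms of one gene: isoform_2 uses a different second exon.
isoform_1 = [(101, 104), (201, 210)]
isoform_2 = [(101, 104), (301, 310)]

print(codon_genomic_positions(2, isoform_1))  # [104, 201, 202]
print(codon_genomic_positions(2, isoform_2))  # [104, 301, 302]
```

    The same residue number lands on different genomic positions in the two isoforms, and its codon spans an exon boundary, which is why variants reported only at the amino acid level have to be mapped per isoform.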

  4. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana

    2010-10-12

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and, if so, whether they also affect different isoforms of the same gene, is a time-consuming and often frustrating task. RESULTS: The PICMI server is an easy-to-use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  5. Genomic definition of species. Revision 2

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1993-03-01

    A genome is the sum total of the DNA sequences in the cells of an individual organism. The common usage that species possess genomes comes naturally to biochemists, who have shown that all protein and nucleic acid molecules are at the same time species- and individual-specific, with minor individual variations being superimposed on a consensus sequence that is constant for a species. By extension, this property is attributed to the common features of DNA in the chromosomes of members of a given species and is called species genome. Our proposal for the definition of a biological species is as follows: A species comprises a group of actual and potential biological organisms built according to a unique genome program that is recorded, and at least in part expressed, in the structures of their genomic nucleic acid molecule(s), having intragroup sequence differences which can be fully interconverted in the process of organismal reproduction.

  6. Quantifying Temporal Genomic Erosion in Endangered Species.

    Science.gov (United States)

    Díez-Del-Molino, David; Sánchez-Barreiro, Fatima; Barnes, Ian; Gilbert, M Thomas P; Dalén, Love

    2018-03-01

    Many species have undergone dramatic population size declines over the past centuries. Although stochastic genetic processes during and after such declines are thought to elevate the risk of extinction, comparative analyses of genomic data from several endangered species suggest little concordance between genome-wide diversity and current population sizes. This is likely because species-specific life-history traits and ancient bottlenecks overshadow the genetic effect of recent demographic declines. Therefore, we advocate that temporal sampling of genomic data provides a more accurate approach to quantify genetic threats in endangered species. Specifically, genomic data from predecline museum specimens will provide valuable baseline data that enable accurate estimation of recent decreases in genome-wide diversity, increases in inbreeding levels, and accumulation of deleterious genetic variation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Genome instability: Linking ageing and brain degeneration.

    Science.gov (United States)

    Barzilai, Ari; Schumacher, Björn; Shiloh, Yosef

    2017-01-01

    Ageing is a multifactorial process affected by cumulative physiological changes resulting from stochastic processes combined with genetic factors, which together alter metabolic homeostasis. Genetic variation in maintenance of genome stability is emerging as an important determinant of ageing pace. Genome instability is also closely associated with a broad spectrum of conditions involving brain degeneration. Similarities and differences can be found between ageing-associated decline of brain functionality and the detrimental effect of genome instability on brain functionality and development. This review discusses these similarities and differences and highlights cell classes whose role in these processes might have been underestimated: glia and microglia. Copyright © 2016. Published by Elsevier B.V.

  8. A comparison of rice chloroplast genomes

    DEFF Research Database (Denmark)

    Tang, Jiabin; Xia, Hong'ai; Cao, Mengliang

    2004-01-01

    Using high quality sequence reads extracted from our whole genome shotgun repository, we assembled two chloroplast genome sequences from two rice (Oryza sativa) varieties, one from 93-11 (a typical indica variety) and the other from PA64S (an indica-like variety with maternal origin of japonica......), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S...

  9. Functional genomics of physiological plasticity and local adaptation in killifish.

    Science.gov (United States)

    Whitehead, Andrew; Galvez, Fernando; Zhang, Shujun; Williams, Larissa M; Oleksiak, Marjorie F

    2011-01-01

    Evolutionary solutions to the physiological challenges of life in highly variable habitats can span the continuum from evolution of a cosmopolitan plastic phenotype to the evolution of locally adapted phenotypes. Killifish (Fundulus sp.) have evolved both highly plastic and locally adapted phenotypes within different selective contexts, providing a comparative system in which to explore the genomic underpinnings of physiological plasticity and adaptive variation. Importantly, extensive variation exists among populations and species for tolerance to a variety of stressors, and we exploit this variation in comparative studies to yield insights into the genomic basis of evolved phenotypic variation. Notably, species of Fundulus occupy the continuum of osmotic habitats from freshwater to marine and populations within Fundulus heteroclitus span far greater variation in pollution tolerance than across all species of fish. Here, we explore how transcriptome regulation underpins extreme physiological plasticity on osmotic shock and how genomic and transcriptomic variation is associated with locally evolved pollution tolerance. We show that F. heteroclitus quickly acclimate to extreme osmotic shock by mounting a dramatic rapid transcriptomic response including an early crisis control phase followed by a tissue remodeling phase involving many regulatory pathways. We also show that convergent evolution of locally adapted pollution tolerance involves complex patterns of gene expression and genome sequence variation, which is confounded with body-weight dependence for some genes. Similarly, exploiting the natural phenotypic variation associated with other established and emerging model organisms is likely to greatly accelerate the pace of discovery of the genomic basis of phenotypic variation.

  10. Genomic Resource and Genome Guided Comparison of Twenty Type Strains of the Genus Methylobacterium

    Directory of Open Access Journals (Sweden)

    Vasvi Chaudhry

    2017-12-01

    Bacteria of the genus Methylobacterium are widespread in diverse habitats, including soil, water and plants (phyllosphere, rhizosphere and endosphere). In the present study, we generated in-house genomic data for six type strains and combined them with fourteen database genomes of the genus Methylobacterium to carry out phylogenomic, taxonomic, comparative and ecological studies of this genus. Overall, the genus shows high diversity and genetic variation, primarily due to its ability to acquire genetic material from diverse sources through horizontal gene transfer. The majority of species identified in this study are plant associated, and their genomes are equipped with methylotrophy- and photosynthesis-related genes along with genes for plant probiotic traits. Most of the species' genomes are also equipped with genes for adaptation to and defense against UV radiation, oxidative stress and desiccation. The genus has an open pan-genome, and we predicted the role of gain and loss of prophages and CRISPR elements in its diversity and evolution. Our genomic resource, with annotation and analysis, provides a platform for interspecies genomic comparisons in the genus Methylobacterium, for unraveling their natural genome diversity, and for studying how natural selection shapes their genomes through the adaptive mechanisms that allow them to adopt diverse habitat lifestyles. These type-strain genomic data display the power of Next Generation Sequencing in rapidly creating a resource that paves the way for studies on phylogeny and taxonomy as well as for basic and applied research on this important genus.

  11. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P.

  12. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

    Science.gov (United States)

    Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

    2014-07-01

    Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small genomes that are not rich in repeats, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large repeat-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved.

  13. The business value and cost-effectiveness of genomic medicine.

    Science.gov (United States)

    Crawford, James M; Aspinall, Mara G

    2012-05-01

    Genomic medicine offers the promise of more effective diagnosis and treatment of human diseases. Genome sequencing early in the course of disease may enable more timely and informed intervention, with reduced healthcare costs and improved long-term outcomes. However, genomic medicine strains current models for demonstrating value, challenging efforts to achieve fair payment for services delivered, both for laboratory diagnostics and for use of molecular information in clinical management. Current models of healthcare reform stipulate that care must be delivered at equal or lower cost, with better patient and population outcomes. To achieve demonstrated value, genomic medicine must overcome many uncertainties: the clinical relevance of genomic variation; potential variation in technical performance and/or computational analysis; management of massive information sets; and the availability of clinical interventions that can be informed by genomic analysis, so as to attain more favorable cost management of healthcare delivery and demonstrate improvements in cost-effectiveness.

  14. Metaleptic Variations

    OpenAIRE

    Pernot, Dominique

    2014-01-01

    Gabriel Josipovici's most recent novels offer great variety, ranging from parody and light comic fiction, in Only Joking and Making Mistakes, to graver, more personal, ontological subjects. In a short novel, Everything Passes, and in a major novel, Goldberg: Variations, the reader is led to question the mysterious nature of reality, which is too often accepted without question by many roma...

  15. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  16. The Arab genome: Health and wealth.

    Science.gov (United States)

    Zayed, Hatem

    2016-11-05

    The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease prediction and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and establishing one or more Arab reference genomes for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying meaningful genotype-phenotype correlations for complex diseases. Copyright © 2016. Published by Elsevier B.V.

  17. Assembly of viral genomes from metagenomes

    Directory of Open Access Journals (Sweden)

    Saskia L Smits

    2014-12-01

    Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a