WorldWideScience

Sample records for genome variation project

  1. The African Genome Variation Project shapes medical genetics in Africa.

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O; Choudhury, Ananyo; Ritchie, Graham R S; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N; Young, Elizabeth H; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S

    2015-01-15

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  2. The African Genome Variation Project shapes medical genetics in Africa

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2015-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  3. The African Genome Variation Project shapes medical genetics in Africa

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2014-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterisation of African genetic diversity is needed. The African Genome Variation Project (AGVP) provides a resource to help design, implement and interpret genomic studies in sub-Saharan Africa (SSA) and worldwide. The AGVP represents dense genotypes from 1,481 and whole genome sequences (WGS) from 320 individuals across SSA. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across SSA. We identify new loci under selection, including for malaria and hypertension. We show that modern imputation panels can identify association signals at highly differentiated loci across populations in SSA. Using WGS, we show further improvement in imputation accuracy supporting efforts for large-scale sequencing of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa, showing for the first time that such designs are feasible. PMID:25470054

  4. The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

    Science.gov (United States)

    Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

    2017-01-04

    The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data (previously only browseable through our FTP site) by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Relationship between Deleterious Variation, Genomic Autozygosity, and Disease Risk: Insights from The 1000 Genomes Project.

    Science.gov (United States)

    Pemberton, Trevor J; Szpiech, Zachary A

    2018-04-05

    Genomic regions of autozygosity (ROAs) represent segments of individual genomes that are homozygous for haplotypes inherited identical-by-descent (IBD) from a common ancestor. ROAs are nonuniformly distributed across the genome, and increased ROA levels are a reported risk factor for numerous complex diseases. Previously, we hypothesized that long ROAs are enriched for deleterious homozygotes as a result of young haplotypes with recent deleterious mutations (relatively untouched by purifying selection) being paired IBD as a consequence of recent parental relatedness, a pattern supported by ROA and whole-exome sequence data on 27 individuals. Here, we significantly bolster support for our hypothesis and expand upon our original analyses using ROA and whole-genome sequence data on 2,436 individuals from The 1000 Genomes Project. Considering CADD deleteriousness scores, we reaffirm our previous observation that long ROAs are enriched for damaging homozygotes worldwide. We show that strongly damaging homozygotes experience greater enrichment than weaker damaging homozygotes, while overall enrichment varies appreciably among populations. Mendelian disease genes and those encoding FDA-approved drug targets have significantly increased rates of gain in damaging homozygotes with increasing ROA coverage relative to all other genes. In genes implicated in eight complex phenotypes for which ROA levels have been identified as a risk factor, rates of gain in damaging homozygotes vary across phenotypes and populations but frequently differ significantly from non-disease genes. These findings highlight the potential confounding effects of population background in the assessment of associations between ROA levels and complex disease risk, which might underlie reported inconsistencies in ROA-phenotype associations. Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  6. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.

    Science.gov (United States)

    Peng, Ting; Wang, Li; Li, Guisen

    2017-08-11

    APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of disease, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene across the populations of the 1000 Genomes Project. APOL1 variants were obtained from the 1000 Genomes Project, and SNPs located in the regulatory or coding regions were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and admixed populations. Tag SNPs were selected to evaluate haplotype diversity in the four groups using HaploStats software. APOL1 is surrounded by some of the most polymorphic genes in the human genome, and variation in APOL1 was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR; of these, 12 SNPs in the URR and 24 SNPs in the 3'UTR were common variants with MAF ≥ 1%. Notably, URR-1 was present at lower frequencies in European populations, while the other three haplotypes showed the opposite pattern. The 3'UTR contains several high-frequency variant sites within a short segment, and its haplotype frequencies differed significantly among populations (P < 0.01): UTR-1 and UTR-5 were much more frequent in the African population, while UTR-2, UTR-3 and UTR-4 were much less frequent. In the coding region, the two higher-frequency G1 SNPs pull down the frequency of haplotype H-1 when all populations are pooled, and the diversity among the four populations is widened by the two G1 mutations (P1 = 3.33E-4 vs P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes differed significantly among the populations, in both regulatory and coding regions. It could provide
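
    The common-variant proportions quoted in this record follow from simple counts. The sketch below is illustrative only: the function names and the 1% default threshold parameter are assumptions introduced here, not part of the study's pipeline, and only the counts (99 of 613, 12 of 79, 24 of 92) come from the abstract.

```python
# Illustrative sketch; helper names are hypothetical, counts are from the abstract.

def is_common(maf: float, threshold: float = 0.01) -> bool:
    """Return True when a minor allele frequency meets the MAF >= 1% cutoff."""
    return maf >= threshold


def common_percentage(n_common: int, n_total: int) -> float:
    """Percentage of SNPs in a region that are classed as common."""
    return 100.0 * n_common / n_total


# (common SNPs, total SNPs) per region, as reported in the abstract.
region_counts = {
    "all APOL1 SNPs": (99, 613),
    "URR": (12, 79),
    "3'UTR": (24, 92),
}

for region, (n_common, n_total) in region_counts.items():
    print(f"{region}: {common_percentage(n_common, n_total):.1f}% common")
```

    The first line printed reproduces the 16.2% figure given in the abstract; the URR and 3'UTR percentages are derived here from the reported counts and are not stated in the record itself.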

  7. HGVA: the Human Genome Variation Archive

    OpenAIRE

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-01-01

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic...

  8. RadGenomics project

    Energy Technology Data Exchange (ETDEWEB)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu [National Inst. of Radiological Sciences, Chiba (Japan). Frontier Research Center] [and others]

    2002-06-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  9. RadGenomics project

    International Nuclear Information System (INIS)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu

    2002-01-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  10. HGVA: the Human Genome Variation Archive.

    Science.gov (United States)

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-07-03

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web-interface for easy data mining, a comprehensive RESTful API and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof-of-concept of the genome analysis developments being carried out by the University of Cambridge together with the UK's 100 000 Genomes Project and the National Institute for Health Research BioResource Rare-Diseases, in particular, deploying the open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. The human genome project

    International Nuclear Information System (INIS)

    Worton, R.

    1996-01-01

    The Human Genome Project is a massive international research project, costing 3 to 5 billion dollars and expected to take 15 years, which will identify all the genes in the human genome, i.e. the complete sequence of bases in human DNA. The prize will be the ability to identify genes causing or predisposing to disease, and in some cases the development of gene therapy, but this new knowledge will raise important ethical issues.

  12. Identification of balanced chromosomal rearrangements previously unknown among participants in the 1000 Genomes Project: implications for interpretation of structural variation in genomes and the future of clinical cytogenetics.

    Science.gov (United States)

    Dong, Zirui; Wang, Huilin; Chen, Haixiao; Jiang, Hui; Yuan, Jianying; Yang, Zhenjun; Wang, Wen-Jing; Xu, Fengping; Guo, Xiaosen; Cao, Ye; Zhu, Zhenzhen; Geng, Chunyu; Cheung, Wan Chee; Kwok, Yvonne K; Yang, Huanming; Leung, Tak Yeung; Morton, Cynthia C; Cheung, Sau Wai; Choy, Kwong Wai

    2017-11-02

    Purpose: Recent studies demonstrate that whole-genome sequencing enables detection of cryptic rearrangements in apparently balanced chromosomal rearrangements (also known as balanced chromosomal abnormalities, BCAs) previously identified by conventional cytogenetic methods. We aimed to assess our analytical tool for detecting BCAs in the 1000 Genomes Project without knowing which bands were affected. Methods: The 1000 Genomes Project provides an unprecedented integrated map of structural variants in phenotypically normal subjects, but there is no information on potential inclusion of subjects with apparent BCAs akin to those traditionally detected in diagnostic cytogenetics laboratories. We applied our analytical tool to 1,166 genomes from the 1000 Genomes Project with sufficient physical coverage (8.25-fold). Results: With this approach, we detected four reciprocal balanced translocations and four inversions, ranging in size from 57.9 kb to 13.3 Mb, all of which were confirmed by cytogenetic methods and polymerase chain reaction studies. One of these DNAs has a subtle translocation that is not readily identified by chromosome analysis because of the similarity of the banding patterns and size of exchanged segments, and another results in disruption of all transcripts of an OMIM gene. Conclusion: Our study demonstrates the extension of utilizing low-pass whole-genome sequencing for unbiased detection of BCAs including translocations and inversions previously unknown in the 1000 Genomes Project. GENETICS in MEDICINE advance online publication, 2 November 2017; doi:10.1038/gim.2017.170.

  13. Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Block, S. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Cornwall, J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, W. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, F. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Fortson, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Joyce, G. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Kimble, H. J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Lewis, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Max, C. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Prince, T. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, R. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, P. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Woodin, W. H. [The MITRE Corporation, McLean, VA (US). JASON Program Office

    1998-01-04

    The study reviews Department of Energy supported aspects of the United States Human Genome Project, the joint National Institutes of Health/Department of Energy program to characterize all human genetic material, to discover the set of human genes, and to render them accessible for further biological study. The study concentrates on issues of technology, quality assurance/control, and informatics relevant to current effort on the genome project and needs beyond it. Recommendations are presented on areas of the genome program that are of particular interest to and supported by the Department of Energy.

  14. Genomics and the human genome project: implications for psychiatry

    OpenAIRE

    Kelsoe, J R

    2004-01-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...

  15. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417 + 940 + 2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analysis of each of the three data sets alone. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.
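
    The combined sample size quoted in this record is a plain sum of the three cohort sizes. A minimal sketch of that arithmetic follows; the dictionary and function names are hypothetical, and the sum assumes that no individual is counted in more than one data set, a point the abstract does not address.

```python
# Illustrative sketch; names are hypothetical, cohort sizes are from the abstract.

cohort_sizes = {
    "HapMap": 1417,
    "HGDP": 940,
    "1000 Genomes": 2504,
}


def combined_sample_count(sizes: dict[str, int]) -> int:
    """Sum cohort sizes, assuming no individual is shared between cohorts."""
    return sum(sizes.values())


total = combined_sample_count(cohort_sizes)
print(f"Combined individuals: {total:,}")
```

    If the same individuals were present in more than one cohort, the merged set would be smaller than this sum, so the figure is best read as an upper bound under the stated assumption.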

  16. Genome Variation Map: a data repository of genome variations in BIG Data Center

    OpenAIRE

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2017-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research a...

  17. Structural genomic variations and Parkinson's disease.

    Science.gov (United States)

    Bandrés-Ciga, Sara; Ruz, Clara; Barrero, Francisco J; Escamilla-Sevilla, Francisco; Pelegrina, Javier; Vives, Francisco; Duran, Raquel

    2017-10-01

    Parkinson's disease (PD) is the second most common neurodegenerative disease, whose prevalence is projected to be between 8.7 and 9.3 million by 2030. Until about 20 years ago, PD was considered to be the textbook example of a "non-genetic" disorder. Nowadays, PD is generally considered a multifactorial disorder that arises from the combination and complex interaction of genes and environmental factors. To date, a total of 7 genes, including SNCA, LRRK2, PARK2, DJ-1, PINK1, VPS35 and ATP13A2, have been shown unequivocally to cause Mendelian PD. Also, variants with incomplete penetrance in the genes LRRK2 and GBA are considered to be strong risk factors for PD worldwide. Although genetic studies have provided valuable insights into the pathogenic mechanisms underlying PD, the role of structural variation in PD has been understudied in comparison with other genomic variations. Structural genomic variations might substantially account for such genetic substrates yet to be discovered. The present review aims to provide an overview of the structural genomic variants implicated in the pathogenesis of PD.

  18. The Ensembl genome database project.

    Science.gov (United States)

    Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

    2002-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

  19. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-01

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human

  20. The Pediatric Cancer Genome Project

    Science.gov (United States)

    Downing, James R; Wilson, Richard K; Zhang, Jinghui; Mardis, Elaine R; Pui, Ching-Hon; Ding, Li; Ley, Timothy J; Evans, William E

    2013-01-01

    The St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (PCGP) is participating in the international effort to identify somatic mutations that drive cancer. These cancer genome sequencing efforts will not only yield an unparalleled view of the altered signaling pathways in cancer but should also identify new targets against which novel therapeutics can be developed. Although these projects are still deep in the phase of generating primary DNA sequence data, important results are emerging and valuable community resources are being generated that should catalyze future cancer research. We describe here the rationale for conducting the PCGP, present some of the early results of this project and discuss the major lessons learned and how these will affect the application of genomic sequencing in the clinic. PMID:22641210

  1. From genomic variation to personalized medicine

    DEFF Research Database (Denmark)

    Wesolowska, Agata; Schmiegelow, Kjeld

    Genomic variation is the basis of interindividual differences in observable traits and disease susceptibility. Genetic studies are the driving force of personalized medicine, as many of the differences in treatment efficacy can be attributed to our genomic background. The rapid development...... a considerable amount of the phenotype variability, hence the major difficulty of interpretation lies in the complexity of molecular interactions. This PhD thesis describes the state-of-art of the functional human variation research (Chapter 1) and introduces childhood acute lymphoblastic leukaemia (ALL...... the thesis and includes some final remarks on the perspectives of genomic variation research and personalized medicine. In summary, this thesis demonstrates the feasibility of integrative analyses of genomic variations and introduces large-scale hypothesis-driven SNP exploration studies as an emerging...

  2. A 1000 Arab genome project to study the Emirati population.

    Science.gov (United States)

    Al-Ali, Mariam; Osman, Wael; Tay, Guan K; AlSafar, Habiba S

    2018-04-01

    Discoveries from the human genome, HapMap, and 1000 genome projects have collectively contributed toward the creation of a catalog of human genetic variations that has improved our understanding of human diversity. Despite the collegial nature of many of these genome study consortiums, which has led to the cataloging of genetic variations of different ethnic groups from around the world, genome data on the Arab population remains overwhelmingly underrepresented. The National Arab Genome project in the United Arab Emirates (UAE) aims to address this deficiency by using Next Generation Sequencing (NGS) technology to provide data to improve our understanding of the Arab genome and catalog variants that are unique to the Arab population of the UAE. The project was conceived to shed light on the similarities and differences between the Arab genome and those of the other ethnic groups.

  3. Genome projects and the functional-genomic era.

    Science.gov (United States)

    Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

    2005-12-01

    The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.

  4. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Genomic resources for grapevine have recently expanded greatly, supporting growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross-breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  5. The 1000 bull genome project

    Science.gov (United States)

    To meet growing global demands for high value protein from milk and meat, rates of genetic gain in domestic cattle must be accelerated. At the same time, animal health and welfare must be considered. The 1000 bull genomes project supports these goals by providing annotated sequence variants and ge...

  6. The Human Genome Diversity Project

    Energy Technology Data Exchange (ETDEWEB)

    Cavalli-Sforza, L. [Stanford Univ., CA (United States)

    1994-12-31

    The Human Genome Diversity Project (HGD Project) is an international anthropology project that seeks to study the genetic richness of the entire human species. This kind of genetic information can add a unique thread to the tapestry of knowledge of humanity. Culture, environment, history, and other factors are often more important, but humanity's genetic heritage, when analyzed with recent technology, brings another type of evidence for understanding the species' past and present. The Project will deepen the understanding of this genetic richness and show both humanity's diversity and its deep and underlying unity. The HGD Project is still largely in its planning stages, seeking the best ways to reach its goals. The continuing discussions of the Project, throughout the world, should improve the plans for the Project and their implementation. The Project is as global as humanity itself; its implementation will require the kinds of partnerships among different nations and cultures that make the involvement of UNESCO and other international organizations particularly appropriate. The author will briefly discuss the Project's history, describe the Project, set out the core principles of the Project, and demonstrate how the Project will help combat the scourge of racism.

  7. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  8. Genomic Sequence Variation Markup Language (GSVML).

    Science.gov (United States)

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. 
GSVML was developed as

  9. Copy number variation in the bovine genome

    DEFF Research Database (Denmark)

    Fadista, João; Thomsen, Bo; Holm, Lars-Erik

    2010-01-01

    to genetic variation in cattle. Results We designed and used a set of NimbleGen CGH arrays that tile across the assayable portion of the cattle genome with approximately 6.3 million probes, at a median probe spacing of 301 bp. This study reports the highest resolution map of copy number variation...... in the cattle genome, with 304 CNV regions (CNVRs) being identified among the genomes of 20 bovine samples from 4 dairy and beef breeds. The CNVRs identified covered 0.68% (22 Mb) of the genome, and ranged in size from 1.7 to 2,031 kb (median size 16.7 kb). About 20% of the CNVs co-localized with segmental...... duplications, while 30% encompass genes, of which the majority is involved in environmental response. About 10% of the human orthologous of these genes are associated with human disease susceptibility and, hence, may have important phenotypic consequences. Conclusions Together, this analysis provides a useful...

  10. GFVO: the Genomic Feature and Variation Ontology

    KAUST Repository

    Baran, Joachim

    2015-05-05

    Falling costs in genomic laboratory experiments have led to a steady increase of genomic feature and variation data. Multiple genomic data formats exist for sharing these data, and whilst they are similar, they are addressing slightly different data viewpoints and are consequently not fully compatible with each other. The fragmentation of data format specifications makes it hard to integrate and interpret data for further analysis with information from multiple data providers. As a solution, a new ontology is presented here for annotating and representing genomic feature and variation dataset contents. The Genomic Feature and Variation Ontology (GFVO) specifically addresses genomic data as it is regularly shared using the GFF3 (incl. FASTA), GTF, GVF and VCF file formats. GFVO simplifies data integration and enables linking of genomic annotations across datasets through common semantics of genomic types and relations. Availability and implementation. The latest stable release of the ontology is available via its base URI; previous and development versions are available at the ontology’s GitHub repository: https://github.com/BioInterchange/Ontologies; versions of the ontology are indexed through BioPortal (without external class-/property-equivalences due to BioPortal release 4.10 limitations); examples and reference documentation is provided on a separate web-page: http://www.biointerchange.org/ontologies.html. GFVO version 1.0.2 is licensed under the CC0 1.0 Universal license (https://creativecommons.org/publicdomain/zero/1.0) and therefore de facto within the public domain; the ontology can be appropriated without attribution for commercial and non-commercial use.
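
The abstract above names VCF among the file formats whose contents GFVO is meant to describe. As a rough illustration of what such a record looks like before any ontology mapping, the following minimal Python sketch splits one VCF data line into its eight fixed columns. It is not part of GFVO or BioInterchange; the function name `parse_vcf_line` and the sample record are assumptions made for illustration only.

```python
from typing import Dict, List

# The eight fixed, tab-separated columns of a VCF data line.
VCF_COLUMNS: List[str] = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line: str) -> Dict[str, object]:
    """Parse one VCF data line (hypothetical helper, not a GFVO API)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("a VCF data line needs at least 8 tab-separated fields")

    record: Dict[str, object] = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(fields[1])        # 1-based position on the reference
    record["ALT"] = fields[4].split(",")  # several alternate alleles are comma-separated

    # INFO holds semicolon-separated key=value pairs or bare flags.
    info: Dict[str, object] = {}
    for entry in fields[7].split(";"):
        if entry in ("", "."):
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    record["INFO"] = info
    return record


# Illustrative record only; the values are not taken from any dataset cited here.
example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB"
record = parse_vcf_line(example)
print(record["CHROM"], record["POS"], record["REF"], record["ALT"], record["INFO"])
```

A dictionary keyed by column name is used here only because it keeps the sketch short; a production parser would also handle header lines, per-sample genotype columns and missing values.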

  11. Genome Variation Map: a data repository of genome variations in BIG Data Center.

    Science.gov (United States)

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2018-01-04

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species; it accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes.

  12. Genome Variation Map: a data repository of genome variations in BIG Data Center

    Science.gov (United States)

    Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

    2018-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species; it accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473

  13. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with IS. Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  14. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, making possible the initiation of large-scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high-throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  15. Copy Number Variations in Tilapia Genomes.

    Science.gov (United States)

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions, has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R² > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.

  16. All about the Human Genome Project (HGP)

    Science.gov (United States)

  17. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Van Dijk, Freerk; Palamara, Pier Francesco; Elbers, Clara C.; Neerincx, Pieter B. T.; Ye, Kai; Guryev, Victor; Kloosterman, Wigard P.; Deelen, Patrick; Abdellaoui, Abdel; Van Leeuwen, Elisabeth M.; Van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F. J.; Karssen, Lennart C.; Kanterakis, Alexandros; Amin, Najaf; Hottenga, Jouke Jan; Lameijer, Eric-Wubbo; Kattenberg, Mathijs; Dijkstra, Martijn; Byelas, Heorhiy; Van Setten, Jessica; Van Schaik, Barbera D. C.; Bot, Jan; Nijman, Isaac J.; Renkens, Ivo; Marschall, Tobias; Schönhuth, Alexander; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Polak, Paz; Sohail, Mashaal; Vuzman, Dana; Hormozdiari, Fereydoun; Van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Moed, Matthijs H.; Van der Velde, K. Joeri; Rivadeneira, Fernando; Estrada, Karol; Medina-Gomez, Carolina; Isaacs, Aaron; Platteel, Mathieu; Swertz, Morris A.; Wijmenga, Cisca

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  18. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    The Genome of the Netherlands Consortium; T. Marschall (Tobias); A. Schönhuth (Alexander)

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch

  19. Parasite Genome Projects and the Trypanosoma cruzi Genome Initiative

    Directory of Open Access Journals (Sweden)

    Wim Degrave

    1997-11-01

    Since the start of the human genome project, a great number of genome projects on other "model" organisms have been initiated, some of them already completed. Several initiatives have also been started on parasite genomes, mainly through support from WHO/TDR, involving North-South and South-South collaborations, and there are great hopes that these initiatives will lead to new tools for disease control and prevention, as well as to the establishment of genomic research technology in developing countries. The Trypanosoma cruzi genome project, using the clone CL-Brener as starting point, has made considerable progress through the concerted action of more than 20 laboratories, most of them in the South. A brief overview of the current state of the project is given.

  20. The Human Genome Project: big science transforms biology and medicine.

    Science.gov (United States)

    Hood, Leroy; Rowen, Lee

    2013-01-01

    The Human Genome Project has transformed biology through its integrated big science approach to deciphering a reference human genome sequence along with the complete sequences of key model organisms. The project exemplifies the power, necessity and success of large, integrated, cross-disciplinary efforts - so-called 'big science' - directed towards complex major objectives. In this article, we discuss the ways in which this ambitious endeavor led to the development of novel technologies and analytical tools, and how it brought the expertise of engineers, computer scientists and mathematicians together with biologists. It established an open approach to data sharing and open-source software, thereby making the data resulting from the project accessible to all. The genome sequences of microbes, plants and animals have revolutionized many fields of science, including microbiology, virology, infectious disease and plant biology. Moreover, deeper knowledge of human sequence variation has begun to alter the practice of medicine. The Human Genome Project has inspired subsequent large-scale data acquisition initiatives such as the International HapMap Project, 1000 Genomes, and The Cancer Genome Atlas, as well as the recently announced Human Brain Project and the emerging Human Proteome Project.

  1. The Chlamydomonas genome project: a decade on

    Science.gov (United States)

    Blaby, Ian K.; Blaby-Haas, Crysten; Tourasse, Nicolas; Hom, Erik F. Y.; Lopez, David; Aksoy, Munevver; Grossman, Arthur; Umen, James; Dutcher, Susan; Porter, Mary; King, Stephen; Witman, George; Stanke, Mario; Harris, Elizabeth H.; Goodstein, David; Grimwood, Jane; Schmutz, Jeremy; Vallon, Olivier; Merchant, Sabeeha S.; Prochnik, Simon

    2014-01-01

    The green alga Chlamydomonas reinhardtii is a popular unicellular organism for studying photosynthesis, cilia biogenesis and micronutrient homeostasis. Ten years since its genome project was initiated, an iterative process of improvements to the genome and gene predictions has propelled this organism to the forefront of the “omics” era. Housed at Phytozome, the Joint Genome Institute’s (JGI) plant genomics portal, the most up-to-date genomic data include a genome arranged on chromosomes and high-quality gene models with alternative splice forms supported by an abundance of RNA-Seq data. Here, we present the past, present and future of Chlamydomonas genomics. Specifically, we detail progress on genome assembly and gene model refinement, discuss resources for gene annotations, functional predictions and locus ID mapping between versions and, importantly, outline a standardized framework for naming genes. PMID:24950814

  2. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
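
The abstract above mentions Bonferroni and false discovery rate corrections for multiple tests. The sketch below shows one common way to compute adjusted p-values for both, using the Benjamini-Hochberg procedure as the false discovery rate method. That choice is an assumption: the abstract does not say which FDR procedure the authors used. The function names and the p-values are illustrative and do not come from the study.

```python
from typing import List


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (one common FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five hypothetical SNP-expression tests.
p = [0.001, 0.008, 0.039, 0.041, 0.6]
bonf = bonferroni(p)
bh = benjamini_hochberg(p)
for raw, b, f in zip(p, bonf, bh):
    print(f"raw={raw:.3f}  bonferroni={b:.3f}  bh={f:.5f}")
```

With these inputs, two tests stay below 0.05 under either adjustment, while the Benjamini-Hochberg values for the remaining tests are smaller than the Bonferroni ones, reflecting that controlling the false discovery rate is less conservative than controlling the family-wise error rate.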

  3. Genome-Wide Associations of Gene Expression Variation in Humans.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  4. Genomic variation in Salmonella enterica core genes for epidemiological typing

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis

    2012-01-01

    Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...

  5. Helminth genome projects: all or nothing

    Czech Academy of Sciences Publication Activity Database

    Lukeš, Julius; Horák, Aleš; Scholz, Tomáš

    2005-01-01

    Vol. 21, No. 6 (2005), pp. 265-266. ISSN 1471-4922. Institutional research plan: CEZ:AV0Z60220518. Keywords: genome project; helminth; Dracunculus. Subject RIV: EG - Zoology. Impact factor: 4.526 (2005).

  6. Origins of the Human Genome Project.

    Science.gov (United States)

    Watson, J D; Cook-Deegan, R M

    1991-01-01

    The Human Genome Project has become a reality. Building on a debate that dates back to 1985, several genome projects are now in full stride around the world, and more are likely to form in the next several years. Italy began its genome program in 1987, and the United Kingdom and U.S.S.R. in 1988. The European communities mounted several genome projects on yeast, bacteria, Drosophila, and Arabidopsis thaliana (a rapidly growing plant with a small genome) in 1988, and in 1990 commenced a new 2-year program on the human genome. In the United States, we have completed the first year of operation of the National Center for Human Genome Research at the National Institutes of Health (NIH), now the largest single funding source for genome research in the world. There have been dedicated budgets focused on genome-scale research at NIH, the U.S. Department of Energy, and the Howard Hughes Medical Institute for several years, and results are beginning to accumulate. There were three annual meetings on genome mapping and sequencing at Cold Spring Harbor, New York, in the spring of 1988, 1989, and 1990; the talks have shifted from a discussion about how to approach problems to presenting results from experiments already performed. We have finally begun to work rather than merely talk. The purpose of genome projects is to assemble data on the structure of DNA in human chromosomes and those of other organisms. A second goal is to develop new technologies to perform mapping and sequencing. There have been impressive technical advances in the past 5 years since the debate about the human genome project began. We are on the verge of beginning pilot projects to test several approaches to sequencing long stretches of DNA, using both automation and manual methods. Ordered sets of yeast artificial chromosome and cosmid clones have been assembled to span more than 2 million base pairs of several human chromosomes, and a region of 10 million base pairs has been assembled for

  7. An integrated map of genetic variation from 1,092 human genomes

    DEFF Research Database (Denmark)

    Abecasis, Goncalo R.; Auton, Adam; Brooks, Lisa D.

    2012-01-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination ...

  8. The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

    Science.gov (United States)

    Papanicolaou, Alexie

    2016-01-01

    Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.

  9. Genome size variation in the genus Avena.

    Science.gov (United States)

    Yan, Honghai; Martin, Sara L; Bekele, Wubishet A; Latta, Robert G; Diederichsen, Axel; Peng, Yuanying; Tinker, Nicholas A

    2016-03-01

    Genome size is an indicator of evolutionary distance and a metric for genome characterization. Here, we report accurate estimates of genome size in 99 accessions from 26 species of Avena. We demonstrate that the average genome size of C genome diploid species (2C = 10.26 pg) is 15% larger than that of A genome species (2C = 8.95 pg), and that this difference likely accounts for a progression of size among tetraploid species, where AB genome configuration had similar genome sizes (average 2C = 25.74 pg). Genome size was mostly consistent within species and in general agreement with current information about evolutionary distance among species. Results also suggest that most of the polyploid species in Avena have experienced genome downsizing in relation to their diploid progenitors. Genome size measurements could provide additional quality control for species identification in germplasm collections, especially in cases where diploid and polyploid species have similar morphology.
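
    The percentage quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two mean 2C values given above; the picogram-to-base-pair factor (1 pg of DNA taken as roughly 978 Mbp) is a commonly used conversion and an assumption here, not something stated in the record. The function names are illustrative.

```python
# Mean 2C values (in picograms) quoted in the abstract above.
C_GENOME_2C_PG = 10.26
A_GENOME_2C_PG = 8.95

# Assumption, not from the abstract: 1 pg of DNA is roughly 978 Mbp.
MBP_PER_PG = 978.0


def percent_larger(larger: float, smaller: float) -> float:
    """Return how much bigger `larger` is than `smaller`, in percent."""
    return (larger - smaller) / smaller * 100.0


def pg_to_mbp(picograms: float) -> float:
    """Convert a DNA mass in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


if __name__ == "__main__":
    diff = percent_larger(C_GENOME_2C_PG, A_GENOME_2C_PG)
    print(f"C genome diploids are {diff:.1f}% larger than A genome diploids")
    print(f"A genome 2C size: about {pg_to_mbp(A_GENOME_2C_PG):.0f} Mbp")
```

    The difference comes out at about 14.6%, which is consistent with the rounded 15% reported in the abstract.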

  10. Intrapopulation genome size variation in D. melanogaster reflects life history variation and plasticity.

    Directory of Open Access Journals (Sweden)

    Lisa L Ellis

    2014-07-01

    We determined female genome sizes using flow cytometry for 211 Drosophila melanogaster sequenced inbred strains from the Drosophila Genetic Reference Panel, and found significant conspecific and intrapopulation variation in genome size. We also compared several life history traits for 25 lines with large and 25 lines with small genomes in three thermal environments, and found that genome size as well as genome size by temperature interactions significantly correlated with survival to pupation and adulthood, time to pupation, female pupal mass, and female eclosion rates. Genome size accounted for up to 23% of the variation in developmental phenotypes, but the contribution of genome size to variation in life history traits was plastic and varied according to the thermal environment. Expression data implicate differences in metabolism that correspond to genome size variation. These results indicate that significant genome size variation exists within D. melanogaster and this variation may impact the evolutionary ecology of the species. Genome size variation accounts for a significant portion of life history variation in an environmentally dependent manner, suggesting that potential fitness effects associated with genome size variation also depend on environmental conditions.

  11. Intrapopulation Genome Size Variation in D. melanogaster Reflects Life History Variation and Plasticity

    Science.gov (United States)

    Ellis, Lisa L.; Huang, Wen; Quinn, Andrew M.; Ahuja, Astha; Alfrejd, Ben; Gomez, Francisco E.; Hjelmen, Carl E.; Moore, Kristi L.; Mackay, Trudy F. C.; Johnston, J. Spencer; Tarone, Aaron M.

    2014-01-01

    We determined female genome sizes using flow cytometry for 211 Drosophila melanogaster sequenced inbred strains from the Drosophila Genetic Reference Panel, and found significant conspecific and intrapopulation variation in genome size. We also compared several life history traits for 25 lines with large and 25 lines with small genomes in three thermal environments, and found that genome size as well as genome size by temperature interactions significantly correlated with survival to pupation and adulthood, time to pupation, female pupal mass, and female eclosion rates. Genome size accounted for up to 23% of the variation in developmental phenotypes, but the contribution of genome size to variation in life history traits was plastic and varied according to the thermal environment. Expression data implicate differences in metabolism that correspond to genome size variation. These results indicate that significant genome size variation exists within D. melanogaster and this variation may impact the evolutionary ecology of the species. Genome size variation accounts for a significant portion of life history variation in an environmentally dependent manner, suggesting that potential fitness effects associated with genome size variation also depend on environmental conditions. PMID:25057905

  12. Attitudes towards the Human Genome Project.

    Science.gov (United States)

    Shahroudi, Julie; Shaw, Geraldine

    Attitudes concerning the Human Genome Project were reported by faculty (N=40) and students (N=66) from a liberal arts college. Positive attitudes toward the project involved privacy, insurance and health, economic purposes, reproductive purposes, genetic counseling, religion and overall opinions. Negative attitudes were expressed regarding…

  13. Copy Number Variation in the Horse Genome

    Science.gov (United States)

    Ghosh, Sharmila; Qu, Zhipeng; Das, Pranab J.; Fang, Erica; Juras, Rytis; Cothran, E. Gus; McDonell, Sue; Kenney, Daniel G.; Lear, Teri L.; Adelson, David L.; Chowdhary, Bhanu P.; Raudsepp, Terje

    2014-01-01

    We constructed a 400K WG tiling oligoarray for the horse and applied it for the discovery of copy number variations (CNVs) in 38 normal horses of 16 diverse breeds, and the Przewalski horse. Probes on the array represented 18,763 autosomal and X-linked genes, and intergenic, sub-telomeric and chrY sequences. We identified 258 CNV regions (CNVRs) across all autosomes, chrX and chrUn, but not in chrY. CNVs comprised 1.3% of the horse genome with chr12 being most enriched. American Miniature horses had the highest and American Quarter Horses the lowest number of CNVs in relation to Thoroughbred reference. The Przewalski horse was similar to native ponies and draft breeds. The majority of CNVRs involved genes, while 20% were located in intergenic regions. Similar to previous studies in horses and other mammals, molecular functions of CNV-associated genes were predominantly in sensory perception, immunity and reproduction. The findings were integrated with previous studies to generate a composite genome-wide dataset of 1476 CNVRs. Of these, 301 CNVRs were shared between studies, while 1174 were novel and require further validation. Integrated data revealed that to date, 41 out of over 400 breeds of the domestic horse have been analyzed for CNVs, of which 11 new breeds were added in this study. Finally, the composite CNV dataset was applied in a pilot study for the discovery of CNVs in 6 horses with XY disorders of sexual development. A homozygous deletion involving AKR1C gene cluster in chr29 in two affected horses was considered possibly causative because of the known role of AKR1C genes in testicular androgen synthesis and sexual development. While the findings improve and integrate the knowledge of CNVs in horses, they also show that for effective discovery of variants of biomedical importance, more breeds and individuals need to be analyzed using comparable methodological approaches. PMID:25340504

  14. Copy number variation in the horse genome.

    Directory of Open Access Journals (Sweden)

    Sharmila Ghosh

    2014-10-01

    We constructed a 400K WG tiling oligoarray for the horse and applied it for the discovery of copy number variations (CNVs) in 38 normal horses of 16 diverse breeds, and the Przewalski horse. Probes on the array represented 18,763 autosomal and X-linked genes, and intergenic, sub-telomeric and chrY sequences. We identified 258 CNV regions (CNVRs) across all autosomes, chrX and chrUn, but not in chrY. CNVs comprised 1.3% of the horse genome with chr12 being most enriched. American Miniature horses had the highest and American Quarter Horses the lowest number of CNVs in relation to Thoroughbred reference. The Przewalski horse was similar to native ponies and draft breeds. The majority of CNVRs involved genes, while 20% were located in intergenic regions. Similar to previous studies in horses and other mammals, molecular functions of CNV-associated genes were predominantly in sensory perception, immunity and reproduction. The findings were integrated with previous studies to generate a composite genome-wide dataset of 1476 CNVRs. Of these, 301 CNVRs were shared between studies, while 1174 were novel and require further validation. Integrated data revealed that to date, 41 out of over 400 breeds of the domestic horse have been analyzed for CNVs, of which 11 new breeds were added in this study. Finally, the composite CNV dataset was applied in a pilot study for the discovery of CNVs in 6 horses with XY disorders of sexual development. A homozygous deletion involving AKR1C gene cluster in chr29 in two affected horses was considered possibly causative because of the known role of AKR1C genes in testicular androgen synthesis and sexual development. While the findings improve and integrate the knowledge of CNVs in horses, they also show that for effective discovery of variants of biomedical importance, more breeds and individuals need to be analyzed using comparable methodological approaches.

  15. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  16. Human genomics projects and precision medicine.

    Science.gov (United States)

    Carrasco-Ramiro, F; Peiró-Pastor, R; Aguado, B

    2017-09-01

    The completion of the Human Genome Project (HGP) in 2001 opened the floodgates to a deeper understanding of medicine. There are dozens of HGP-like projects which involve from a few tens to several million genomes currently in progress, which vary from having specialized goals or a more general approach. However, data generation, storage, management and analysis in public and private cloud computing platforms have raised concerns about privacy and security. The knowledge gained from further research has changed the field of genomics and is now slowly permeating into clinical medicine. The new precision (personalized) medicine, where genome sequencing and data analysis are essential components, allows tailored diagnosis and treatment according to the information from the patient's own genome and specific environmental factors. P4 (predictive, preventive, personalized and participatory) medicine is introducing new concepts, challenges and opportunities. This review summarizes current sequencing technologies, concentrates on ongoing human genomics projects, and provides some examples in which precision medicine has already demonstrated clinical impact in diagnosis and/or treatment.

  17. Justice and the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, T.F.; Lappe, M. (eds.)

    1992-01-01

    Most of the essays gathered in this volume were first presented at a conference, Justice and the Human Genome, in Chicago in early November, 1991. The goal of the conference was to consider questions of justice as they are and will be raised by the Human Genome Project. To achieve its goal of identifying and elucidating the challenges of justice inherent in genomic research and its social applications, the conference drew together in one forum members from academia, medicine, and industry with interests as divergent as rate-setting for insurance, the care of newborns, and the history of ethics. The essays in this volume address a number of theoretical and practical concerns relative to the meaning of genomic research.

  18. Justice and the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, T.F.; Lappe, M. [eds.]

    1992-12-31

    Most of the essays gathered in this volume were first presented at a conference, Justice and the Human Genome, in Chicago in early November, 1991. The goal of the conference was to consider questions of justice as they are and will be raised by the Human Genome Project. To achieve its goal of identifying and elucidating the challenges of justice inherent in genomic research and its social applications, the conference drew together in one forum members from academia, medicine, and industry with interests as divergent as rate-setting for insurance, the care of newborns, and the history of ethics. The essays in this volume address a number of theoretical and practical concerns relative to the meaning of genomic research.

  19. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...

  20. The Human Genome Project and Biology Education.

    Science.gov (United States)

    McInerney, Joseph D.

    1996-01-01

    Highlights the importance of the Human Genome Project in educating the public about genetics. Discusses four challenges that science educators must address: teaching for conceptual understanding, the nature of science, the personal and social impact of science and technology, and the principles of technology. Contains 45 references. (JRH)

  1. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    DEFF Research Database (Denmark)

    Zhan, Bujie; Fadista, João; Thomsen, Bo

    2011-01-01

    Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome...... of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation...

  2. Implications of the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Kitcher, P.

    1998-11-01

    The Human Genome Project (HGP), launched in 1991, aims to map and sequence the human genome by 2006. During the fifteen-year life of the project, it is projected that $3 billion in federal funds will be allocated to it. The ultimate aims of spending this money are to analyze the structure of human DNA, to identify all human genes, to recognize the functions of those genes, and to prepare for the biology and medicine of the twenty-first century. The following summary examines some of the implications of the program, concentrating on its scientific import and on the ethical and social problems that it raises. Its aim is to expose principles that might be used in applying the information which the HGP will generate. There is no attempt here to translate the principles into detailed proposals for legislation. Arguments and discussion can be found in the full report, but, like this summary, that report does not contain any legislative proposals.

  3. Origins of the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, Robert

    1993-07-01

    The human genome project was born of technology, grew into a science bureaucracy in the US and throughout the world, and is now being transformed into a hybrid academic and commercial enterprise. The next phase of the project promises to veer more sharply toward commercial application, harnessing both the technical prowess of molecular biology and the rapidly growing body of knowledge about DNA structure to the pursuit of practical benefits. Faith that the systematic analysis of DNA structure will prove to be a powerful research tool underlies the rationale behind the genome project. The notion that most genetic information is embedded in the sequence of DNA base pairs comprising chromosomes is a central tenet. A rough analogy is to liken an organism's genetic code to computer code. The goal of the genome project, in this parlance, is to identify and catalog 75,000 or more files (genes) in the software that directs construction of a self-modifying and self-replicating system -- a living organism.

  4. Origins of the Human Genome Project

    Science.gov (United States)

    Cook-Deegan, Robert (Affiliation: Institute of Medicine, National Academy of Sciences)

    1993-07-01

    The human genome project was born of technology, grew into a science bureaucracy in the United States and throughout the world, and is now being transformed into a hybrid academic and commercial enterprise. The next phase of the project promises to veer more sharply toward commercial application, harnessing both the technical prowess of molecular biology and the rapidly growing body of knowledge about DNA structure to the pursuit of practical benefits. Faith that the systematic analysis of DNA structure will prove to be a powerful research tool underlies the rationale behind the genome project. The notion that most genetic information is embedded in the sequence of DNA base pairs comprising chromosomes is a central tenet. A rough analogy is to liken an organism's genetic code to computer code. The goal of the genome project, in this parlance, is to identify and catalog 75,000 or more files (genes) in the software that directs construction of a self-modifying and self-replicating system -- a living organism.

  5. A comprehensive crop genome research project: the Superhybrid Rice Genome Project in China.

    Science.gov (United States)

    Yu, Jun; Wong, Gane Ka-Shu; Liu, Siqi; Wang, Jian; Yang, Huanming

    2007-06-29

    In May 2000, the Beijing Institute of Genomics formally announced the launch of a comprehensive crop genome research project on rice genomics, the Chinese Superhybrid Rice Genome Project. SRGP is not simply a sequencing project targeted to a single rice (Oryza sativa L.) genome, but a full-swing research effort with an ultimate goal of providing inclusive basic genomic information and molecular tools not only to understand biology of the rice, both as an important crop species and a model organism of cereals, but also to focus on a popular superhybrid rice landrace, LYP9. We have completed the first phase of SRGP and provide the rice research community with a finished genome sequence of an indica variety, 93-11 (the paternal cultivar of LYP9), together with ample data on subspecific (between subspecies) polymorphisms, transcriptomes and proteomes, useful for within-species comparative studies. In the second phase, we have acquired the genome sequence of the maternal cultivar, PA64S, together with the detailed catalogues of genes uniquely expressed in the parental cultivars and the hybrid as well as allele-specific markers that distinguish parental alleles. Although SRGP in China is not an open-ended research programme, it has been designed to pave a way for future plant genomics research and application, such as to interrogate fundamentals of plant biology, including genome duplication, polyploidy and hybrid vigour, as well as to provide genetic tools for crop breeding and to carry along a social burden: leading a fight against the world's hunger. It began with genomics, the newly developed and industry-scale research field, and from the world's most populous country. In this review, we summarize our scientific goals and noteworthy discoveries that exploit new territories of systematic investigations on basic and applied biology of rice and other major cereal crops.

  6. Variation in heterozygosity predicts variation in human substitution rates between populations, individuals and genomic regions.

    Directory of Open Access Journals (Sweden)

    William Amos

    The "heterozygote instability" (HI) hypothesis suggests that gene conversion events focused on heterozygous sites during meiosis locally increase the mutation rate, but this hypothesis remains largely untested. As humans left Africa they lost variability, which, if HI operates, should have reduced the mutation rate in non-Africans. Relative substitution rates were quantified in diverse humans using aligned whole genome sequences from the 1,000 genomes project. Substitution rate is consistently greater in Africans than in non-Africans, but only in diploid regions of the genome, consistent with a role for heterozygosity. Analysing the same data partitioned into a series of non-overlapping 2 Mb windows reveals a strong, non-linear correlation between the amount of heterozygosity lost "out of Africa" and the difference in substitution rate between Africans and non-Africans. Putative recent mutations, derived variants that occur only once among the 80 human chromosomes sampled, occur preferentially at the centre of 2 Kb windows that have elevated heterozygosity compared both with the same region in a closely related population and with an immediately adjacent region in the same population. More than half of all substitutions appear attributable to variation in heterozygosity. This observation provides strong support for HI with implications for many branches of evolutionary biology.
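
    The partitioning into non-overlapping 2 Mb windows mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy coordinates are hypothetical, and the code only shows how site positions on one chromosome might be binned into fixed windows and compared between two populations.

```python
from collections import Counter

WINDOW_BP = 2_000_000  # 2 Mb, the window size named in the abstract


def het_counts_per_window(het_positions, window_bp=WINDOW_BP):
    """Count heterozygous sites in each non-overlapping window.

    het_positions: iterable of 1-based coordinates on a single chromosome.
    Returns a dict mapping window index (0 = first window) to site count.
    """
    counts = Counter((pos - 1) // window_bp for pos in het_positions)
    return dict(counts)


def heterozygosity_lost(pop_a_counts, pop_b_counts):
    """Per-window difference in counts (population A minus population B)."""
    windows = sorted(set(pop_a_counts) | set(pop_b_counts))
    return {w: pop_a_counts.get(w, 0) - pop_b_counts.get(w, 0) for w in windows}


if __name__ == "__main__":
    # Hypothetical toy coordinates, for illustration only.
    pop_a = het_counts_per_window([1, 2_000_000, 2_000_001, 4_500_000])
    pop_b = het_counts_per_window([150, 4_600_000])
    print(heterozygosity_lost(pop_a, pop_b))
```

    Integer division by the window size assigns every position to exactly one window, which is what keeps the windows non-overlapping.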

  7. Genome size, morphological and palynological variations, and ...

    African Journals Online (AJOL)

    The present study compares the morphological, palynological and genome size (C-value content) characteristics in the long-styled and short-styled plants in three Linum species, that is, ... The analysis of variance (ANOVA) test performed among the three Linum species showed a significant difference in 2C-value content.

  8. An overview of the human genome project

    Energy Technology Data Exchange (ETDEWEB)

    Batzer, M.A.

    1994-01-01

    The human genome project is one of the most ambitious scientific projects to date, with the ultimate goal being a nucleotide sequence for all four billion bases of human DNA. In the process of determining the nucleotide sequence for each base, the location, function, and regulatory regions from the estimated 100,000 human genes will be identified. The genome project itself relies upon maps of the human genetic code derived from several different levels of resolution. Genetic linkage analysis provides a low resolution genome map. The information for genetic linkage maps is derived from the analysis of chromosome specific markers such as Sequence Tagged Sites (STSs), Variable Number of Tandem Repeats (VNTRs) or other polymorphic (highly informative) loci in a number of different families. Using this information the location of an unknown disease gene can be limited to a region comprised of one million base pairs of DNA or less. After this point, one must construct or have access to a physical map of the region of interest. Physical mapping involves the construction of an ordered overlapping (contiguous) set of recombinant DNA clones. These clones may be derived from a number of different vectors including cosmids, Bacterial Artificial Chromosomes (BACs), P1 derived Artificial Chromosomes (PACs), somatic cell hybrids, or Yeast Artificial Chromosomes (YACs). The ultimate goal for physical mapping is to establish a completely overlapping (contiguous) set of clones for the entire genome. After a gene or region of interest has been localized using physical mapping the nucleotide sequence is determined. The overlap between genetic mapping, physical mapping and DNA sequencing has proven to be a powerful tool for the isolation of disease genes through positional cloning.

  9. Genome Architecture and Its Roles in Human Copy Number Variation

    Directory of Open Access Journals (Sweden)

    Lu Chen

    2014-12-01

    Besides single-nucleotide variants in the human genome, large-scale genomic variants, such as copy number variations (CNVs), are being increasingly discovered as a genetic source of human diversity and the pathogenic factors of diseases. Recent experimental findings have shed light on the links between different genome architectures and CNV mutagenesis. In this review, we summarize various genomic features and discuss their contributions to CNV formation. Genomic repeats, including both low-copy and high-copy repeats, play important roles in CNV instability, which was initially known as DNA recombination events. Furthermore, it has been found that human genomic repeats can also induce DNA replication errors and consequently result in CNV mutations. Some recent studies showed that DNA replication timing, which reflects the high-order information of genomic organization, is involved in human CNV mutations. Our review highlights that genome architecture, from DNA sequence to high-order genomic organization, is an important molecular factor in CNV mutagenesis and human genomic instability.

  10. The Human Genome Diversity (HGD) Project. Summary document

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1993-12-31

    In 1991 a group of human geneticists and molecular biologists proposed to the scientific community that a world wide survey be undertaken of variation in the human genome. To aid their considerations, the committee therefore decided to hold a small series of international workshops to explore the major scientific issues involved. The intention was to define a framework for the project which could provide a basis for much wider and more detailed discussion and planning; it was recognized that the successful implementation of the proposed project, which has come to be known as the Human Genome Diversity (HGD) Project, would not only involve scientists but also various national and international non-scientific groups, all of which should contribute to the project's development. The international HGD workshop held in Sardinia in September 1993 was the last in the initial series of planning workshops. As such it not only explored new ground but also pulled together into a more coherent form much of the formal and informal discussion that had taken place in the preceding two years. This report presents the deliberations of the Sardinia workshop within a consideration of the overall development of the HGD Project to date.

  11. The Organelle Genomes of Hassawi Rice (Oryza sativa L.) and Its Hybrid in Saudi Arabia: Genome Variation, Rearrangement, and Origins

    Science.gov (United States)

    Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun

    2012-01-01

    Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNP (single nucleotide polymorphism) is present between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV) in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93–11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicate ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta since genetic diversity between the two Hassawi cultivars is very low albeit an unknown historic origin of the wild-type Hassawi rice. PMID:22870184

  12. Salmon and steelhead genetics and genomics - Epigenetic and genomic variation in salmon and steelhead

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Conduct analyses of epigenetic and genomic variation in Chinook salmon and steelhead to determine influence on phenotypic expression of life history traits. Genetic,...

  13. Bonobos fall within the genomic variation of chimpanzees.

    Directory of Open Access Journals (Sweden)

    Anne Fischer

    To gain insight into the patterns of genetic variation and evolutionary relationships within and between bonobos and chimpanzees, we sequenced 150,000 base pairs of nuclear DNA divided among 15 autosomal regions as well as the complete mitochondrial genomes from 20 bonobos and 58 chimpanzees. Except for western chimpanzees, we found poor genetic separation of chimpanzees based on sample locality. In contrast, bonobos consistently cluster together but fall as a group within the variation of chimpanzees for many of the regions. Thus, while chimpanzees retain genomic variation that predates bonobo-chimpanzee speciation, extensive lineage sorting has occurred within bonobos such that much of their genome traces its ancestry back to a single common ancestor that postdates their origin as a group separate from chimpanzees.

  14. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is......, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  15. The Genome of the Netherlands: design, and project goals

    Science.gov (United States)

    Boomsma, Dorret I; Wijmenga, Cisca; Slagboom, Eline P; Swertz, Morris A; Karssen, Lennart C; Abdellaoui, Abdel; Ye, Kai; Guryev, Victor; Vermaat, Martijn; van Dijk, Freerk; Francioli, Laurent C; Hottenga, Jouke Jan; Laros, Jeroen F J; Li, Qibin; Li, Yingrui; Cao, Hongzhi; Chen, Ruoyan; Du, Yuanping; Li, Ning; Cao, Sujie; van Setten, Jessica; Menelaou, Androniki; Pulit, Sara L; Hehir-Kwa, Jayne Y; Beekman, Marian; Elbers, Clara C; Byelas, Heorhiy; de Craen, Anton J M; Deelen, Patrick; Dijkstra, Martijn; den Dunnen, Johan T; de Knijff, Peter; Houwing-Duistermaat, Jeanine; Koval, Vyacheslav; Estrada, Karol; Hofman, Albert; Kanterakis, Alexandros; Enckevort, David van; Mai, Hailiang; Kattenberg, Mathijs; van Leeuwen, Elisabeth M; Neerincx, Pieter B T; Oostra, Ben; Rivadeneira, Fernanodo; Suchiman, Eka H D; Uitterlinden, Andre G; Willemsen, Gonneke; Wolffenbuttel, Bruce H; Wang, Jun; de Bakker, Paul I W; van Ommen, Gert-Jan; van Duijn, Cornelia M

    2014-01-01

    Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project. PMID:23714750

  16. Genomic variation landscape of the human gut microbiome

    DEFF Research Database (Denmark)

    Schloissnig, Siegfried; Arumugam, Manimozhiyan; Sunagawa, Shinichi

    2013-01-01

    Whereas large-scale efforts have rapidly advanced the understanding and practical impact of human genomic variation, the practical impact of variation is largely unexplored in the human microbiome. We therefore developed a framework for metagenomic variation analysis and applied it to 252 faecal...... polymorphism rates of 0.11 was more variable between gut microbial species than across human hosts. Subjects sampled at varying time intervals exhibited individuality and temporal stability of SNP variation patterns, despite considerable composition changes of their gut microbiota. This indicates...

  17. The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

    Science.gov (United States)

    Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

    2018-02-05

    The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants, associated with cancer, cardiac or neurodegenerative phenotypes, remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.

  18. Chambolle's Projection Algorithm for Total Variation Denoising

    Directory of Open Access Journals (Sweden)

    Joan Duran

    2013-12-01

    Full Text Available Denoising is the problem of removing the inherent noise from an image. The standard noise model is additive white Gaussian noise, where the observed image f is related to the underlying true image u by the degradation model f = u + n, and the noise n is assumed to be independently and identically distributed at each pixel as a zero-mean Gaussian random variable. Since this is an ill-posed problem, Rudin, Osher and Fatemi introduced the total variation as a regularizing term. It has proved to be quite efficient for regularizing images without smoothing the boundaries of the objects. This paper focuses on a simple description of the theory and on the implementation of Chambolle's projection algorithm for minimizing the total variation of a grayscale image. Furthermore, we adapt the algorithm to the vectorial total variation for color images. The implementation is described in detail and its parameters are analyzed and varied to come up with a reliable implementation.
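
    The projection iteration summarized in this abstract can be sketched in a few lines. The version below is a minimal pure-Python illustration for a small grayscale image stored as a list of rows, not the paper's implementation. It assumes the convention in which one minimizes TV(u) + ||u - f||^2 / (2*lam), uses forward differences with a zero boundary, and takes a step size tau of 1/8. The function names (grad, div, tv_denoise, total_variation) and the default parameter values are illustrative choices, not taken from the record.

```python
import math


def grad(u):
    """Forward differences; the last row/column gets a zero difference."""
    h, w = len(u), len(u[0])
    gx = [[u[i + 1][j] - u[i][j] if i < h - 1 else 0.0 for j in range(w)]
          for i in range(h)]
    gy = [[u[i][j + 1] - u[i][j] if j < w - 1 else 0.0 for j in range(w)]
          for i in range(h)]
    return gx, gy


def div(px, py):
    """Discrete divergence, defined as the negative adjoint of grad."""
    h, w = len(px), len(px[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            dx = (px[i][j] if i < h - 1 else 0.0) - (px[i - 1][j] if i > 0 else 0.0)
            dy = (py[i][j] if j < w - 1 else 0.0) - (py[i][j - 1] if j > 0 else 0.0)
            out[i][j] = dx + dy
    return out


def total_variation(u):
    """Isotropic discrete total variation: sum of gradient magnitudes."""
    gx, gy = grad(u)
    return sum(math.hypot(a, b)
               for row_x, row_y in zip(gx, gy)
               for a, b in zip(row_x, row_y))


def tv_denoise(f, lam=0.2, tau=0.125, n_iter=100):
    """Fixed-point iteration on the dual field p; returns u = f - lam * div(p)."""
    h, w = len(f), len(f[0])
    px = [[0.0] * w for _ in range(h)]
    py = [[0.0] * w for _ in range(h)]
    for _ in range(n_iter):
        d = div(px, py)
        g = [[d[i][j] - f[i][j] / lam for j in range(w)] for i in range(h)]
        gx, gy = grad(g)
        for i in range(h):
            for j in range(w):
                norm = math.hypot(gx[i][j], gy[i][j])
                # Semi-implicit update keeps the dual field bounded.
                px[i][j] = (px[i][j] + tau * gx[i][j]) / (1.0 + tau * norm)
                py[i][j] = (py[i][j] + tau * gy[i][j]) / (1.0 + tau * norm)
    d = div(px, py)
    return [[f[i][j] - lam * d[i][j] for j in range(w)] for i in range(h)]
```

    In this convention a larger lam gives stronger smoothing, and a constant image is returned unchanged because its gradient is zero everywhere. The nested Python loops are only practical for small test images.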

  19. Variational principles for the projected breakup amplitude

    International Nuclear Information System (INIS)

    Hahn, Y.

    1976-01-01

    Two alternate forms of variational principles for the breakup amplitude describing the two- to three-cluster transition are derived such that all the integrals involved in the intermediate stages are well defined. The first form contains a trial Green's function with which both the initial and final state trial wave functions are constructed. The earlier form of the Kohn-type variational principle derived by Lieber, Rosenberg, and Spruch is recovered, however, when this connection between the trial functions is removed. The second form of the variational principle is derived by projecting out from the trial functions all the open channel components which correspond to the two-cluster structures including the rearrangement channels. The remaining part of the wave functions describes the channels with three-cluster structures, and the integrals involving this part are then mathematically well defined

  20. The Human Genome Project (HGP): dividends and challenges: a ...

    African Journals Online (AJOL)

    The Human Genome Project (HGP): dividends and challenges: a review. ... Genomic studies have given profound insights into the genetic organization of ... with it will be an essential part of modern medicine and biology for years to come.

  1. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of which were synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  2. Freedom and Responsibility in Synthetic Genomics: The Synthetic Yeast Project

    OpenAIRE

    Sliva, Anna; Yang, Huanming; Boeke, Jef D.; Mathews, Debra J. H.

    2015-01-01

    First introduced in 2011, the Synthetic Yeast Genome (Sc2.0) Project is a large international synthetic genomics project that will culminate in the first eukaryotic cell (Saccharomyces cerevisiae) with a fully synthetic genome. With collaborators from across the globe and from a range of institutions spanning from do-it-yourself biology (DIYbio) to commercial enterprises, it is important that all scientists working on this project are cognizant of the ethical and policy issues associated with...

  3. The modest beginnings of one genome project.

    Science.gov (United States)

    Kaback, David B

    2013-06-01

    One of the top things on a geneticist's wish list has to be a set of mutants for every gene in their particular organism. Such a set was produced for the yeast, Saccharomyces cerevisiae near the end of the 20th century by a consortium of yeast geneticists. However, the functional genomic analysis of one chromosome, its smallest, had already begun more than 25 years earlier as a project that was designed to define most or all of that chromosome's essential genes by temperature-sensitive lethal mutations. When far fewer than expected genes were uncovered, the relatively new field of molecular cloning enabled us and indeed, the entire community of yeast researchers to approach this problem more definitively. These studies ultimately led to cloning, genomic sequencing, and the production and phenotypic analysis of the entire set of knockout mutations for this model organism as well as a better concept of what defines an essential function, a wish fulfilled that enables this model eukaryote to continue at the forefront of research in modern biology.

  4. Transformation of natural genetic variation into Haemophilus influenzae genomes.

    Directory of Open Access Journals (Sweden)

    Joshua Chang Mell

    2011-07-01

    Full Text Available Many bacteria are able to efficiently bind and take up double-stranded DNA fragments, and the resulting natural transformation shapes bacterial genomes, transmits antibiotic resistance, and allows escape from immune surveillance. The genomes of many competent pathogens show evidence of extensive historical recombination between lineages, but the actual recombination events have not been well characterized. We used DNA from a clinical isolate of Haemophilus influenzae to transform competent cells of a laboratory strain. To identify which of the ~40,000 polymorphic differences had recombined into the genomes of four transformed clones, their genomes and their donor and recipient parents were deep sequenced to high coverage. Each clone was found to contain ~1000 donor polymorphisms in 3-6 contiguous runs (8.1±4.5 kb in length) that collectively comprised ~1-3% of each transformed chromosome. Seven donor-specific insertions and deletions were also acquired as parts of larger donor segments, but the presence of other structural variation flanking 12 of 32 recombination breakpoints suggested that these often disrupt the progress of recombination events. This is the first genome-wide analysis of chromosomes directly transformed with DNA from a divergent genotype, connecting experimental studies of transformation with the high levels of natural genetic variation found in isolates of the same species.

  5. Gene copy number variation throughout the Plasmodium falciparum genome

    Directory of Open Access Journals (Sweden)

    Stewart Lindsay B

    2009-08-01

    Full Text Available Background: Gene copy number variation (CNV) is responsible for several important phenotypes of the malaria parasite Plasmodium falciparum, including drug resistance, loss of infected erythrocyte cytoadherence and alteration of receptor usage for erythrocyte invasion. Despite the known effects of CNV, little is known about its extent throughout the genome. Results: We performed a whole-genome survey of CNV genes in P. falciparum using comparative genome hybridisation of a diverse set of 16 laboratory culture-adapted isolates to a custom designed high density Affymetrix GeneChip array. Overall, 186 genes showed hybridisation signals consistent with deletion or amplification in one or more isolates. There is a strong association of CNV with gene length, genomic location, and low orthology to genes in other Plasmodium species. Sub-telomeric regions of all chromosomes are strongly associated with CNV genes independent from members of previously described multigene families. However, ~40% of CNV genes were located in more central regions of the chromosomes. Among the previously undescribed CNV genes, several that are of potential phenotypic relevance are identified. Conclusion: CNV represents a major form of genetic variation within the P. falciparum genome; the distribution of gene features indicates the involvement of highly non-random mutational and selective processes. Additional studies should be directed at examining CNV in natural parasite populations to extend conclusions to clinical settings.

  6. Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

    Science.gov (United States)

    Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

    2014-12-01

    Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.

  7. Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome.

    Science.gov (United States)

    Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O'Connor, Timothy D; Abecasis, Gonçalo R; Wojcik, Genevieve L; Gignoux, Christopher R; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E; Bustamante, Carlos; Beaty, Terri H; Mathias, Rasika A; Barnes, Kathleen C; Qin, Zhaohui S

    2017-04-21

    A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.

  8. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    Science.gov (United States)

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyberinfrastructure Consortium (NECC), aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was also designed to develop expertise in large-scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  9. Genome-wide variation in recombination rate in Eucalyptus.

    Science.gov (United States)

    Gion, Jean-Marc; Hudson, Corey J; Lesur, Isabelle; Vaillancourt, René E; Potts, Brad M; Freeman, Jules S

    2016-08-09

    Meiotic recombination is a fundamental evolutionary process. It not only generates diversity, but influences the efficacy of natural selection and genome evolution. There can be significant heterogeneity in recombination rates within and between species, however this variation is not well understood outside of a few model taxa, particularly in forest trees. Eucalypts are forest trees of global economic importance, and dominate many Australian ecosystems. We studied recombination rate in Eucalyptus globulus using genetic linkage maps constructed in 10 unrelated individuals, and markers anchored to the Eucalyptus reference genome. This experimental design provided the replication to study whether recombination rate varied between individuals and chromosomes, and allowed us to study the genomic attributes and population genetic parameters correlated with this variation. Recombination rate varied significantly between individuals (range = 2.71 to 3.51 centimorgans/megabase [cM/Mb]), but was not significantly influenced by sex or cross type (F1 vs. F2). Significant differences in recombination rate between chromosomes were also evident (range = 1.98 to 3.81 cM/Mb), beyond those which were due to variation in chromosome size. Variation in chromosomal recombination rate was significantly correlated with gene density (r = 0.94), GC content (r = 0.90), and the number of tandem duplicated genes (r = -0.72) per chromosome. Notably, chromosome level recombination rate was also negatively correlated with the average genetic diversity across six species from an independent set of samples (r = -0.75). The correlations with genomic attributes are consistent with findings in other taxa, however, the direction of the correlation between diversity and recombination rate is opposite to that commonly observed. We argue this is likely to reflect the interaction of selection and specific genome architecture of Eucalyptus. Interestingly, the differences amongst

  10. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

    Science.gov (United States)

    Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

    2012-10-05

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  11. The Human Genome Project: how do we protect Australians?

    Science.gov (United States)

    Stott Despoja, N

    It is the moon landing of the nineties: the ambitious Human Genome Project--identifying the up to 100,000 genes that make up human DNA and the sequences of the three billion base-pairs that comprise the human genome. However, unlike the moon landing, the effects of the genome project will have a fundamental impact on the way we see ourselves and each other.

  12. Genic intolerance to functional variation and the interpretation of personal genomes.

    Directory of Open Access Journals (Sweden)

    Slavé Petrovski

    Full Text Available A central challenge in interpreting personal genomes is determining which mutations most likely influence disease. Although progress has been made in scoring the functional impact of individual mutations, the characteristics of the genes in which those mutations are found remain largely unexplored. For example, genes known to carry few common functional variants in healthy individuals may be judged more likely to cause certain kinds of disease than genes known to carry many such variants. Until now, however, it has not been possible to develop a quantitative assessment of how well genes tolerate functional genetic variation on a genome-wide scale. Here we describe an effort that uses sequence data from 6503 whole exome sequences made available by the NHLBI Exome Sequencing Project (ESP). Specifically, we develop an intolerance scoring system that assesses whether genes have relatively more or less functional genetic variation than expected based on the apparently neutral variation found in the gene. To illustrate the utility of this intolerance score, we show that genes responsible for Mendelian diseases are significantly more intolerant to functional genetic variation than genes that do not cause any known disease, but with striking variation in intolerance among genes causing different classes of genetic disease. We conclude by showing that use of an intolerance ranking system can aid in interpreting personal genomes and identifying pathogenic mutations.
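
    One way to make the idea of "more or less functional variation than expected" concrete is a regression residual. The sketch below fits an ordinary least-squares line predicting each gene's functional variant count from its neutral variant count, then reports the standardized residual as a score. The abstract does not specify the statistical form of the score, so the regression approach, the function name intolerance_scores, and the toy counts in the usage note are all assumptions for illustration.

```python
from statistics import mean, pstdev


def intolerance_scores(neutral_counts, functional_counts):
    """Standardized residuals of functional counts regressed on neutral counts.

    A negative score means a gene carries fewer functional variants than the
    fitted line predicts from its neutral variation (more "intolerant").
    """
    x_bar = mean(neutral_counts)
    y_bar = mean(functional_counts)
    sxx = sum((x - x_bar) ** 2 for x in neutral_counts)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(neutral_counts, functional_counts))
    if sxx == 0:
        raise ValueError("neutral counts must not all be equal")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(neutral_counts, functional_counts)]
    spread = pstdev(residuals)
    if spread == 0:
        raise ValueError("all genes lie exactly on the fitted line")
    return [r / spread for r in residuals]
```

    For example, with made-up neutral counts [10, 20, 30, 40] and functional counts [5, 10, 5, 20] for four hypothetical genes, the third gene falls furthest below the fitted line and so receives the lowest, most intolerant score.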

  13. Discrepancy variation of dinucleotide microsatellite repeats in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    HUAN GAO

    2009-01-01

    Full Text Available To address whether there are differences of variation among repeat motif types and among taxonomic groups, we present here an analysis of variation and correlation of dinucleotide microsatellite repeats in eukaryotic genomes. Ten taxonomic groups were compared: primates, mammalia (excluding primates and rodentia), rodentia, birds, fish, amphibians and reptiles, insects, molluscs, plants, and fungi. The data used in the analysis are from the literature published in the Journal of Molecular Ecology Notes. Analysis of variation reveals that there are no significant differences between AC and AG repeat motif types. Moreover, the number of alleles correlates positively with the copy number in both AG and AC repeats. Similar conclusions can be obtained from each taxonomic group. These results strongly suggest that the increase of SSR variation is almost linear with the increase of the copy number of each repeat motif. As well, the results suggest that the variability of SSR in the genomes of low-ranking species seems to be greater than that of high-ranking species, excluding primates and fungi.

  14. Potential Value of Genomic Copy Number Variations in Schizophrenia

    Directory of Open Access Journals (Sweden)

    Chuanjun Zhuo

    2017-06-01

    Full Text Available Schizophrenia is a devastating neuropsychiatric disorder affecting approximately 1% of the global population, and the disease has imposed a considerable burden on families and society. Although the exact cause of schizophrenia remains unknown, several lines of scientific evidence have revealed that genetic variants are strongly correlated with the development and early onset of the disease. In fact, the heritability among patients suffering from schizophrenia is as high as 80%. Genomic copy number variations (CNVs) are one of the main forms of genomic variation, ubiquitously occurring in the human genome. An increasing number of studies have shown that CNVs account for population diversity and genetically related diseases, including schizophrenia. The last decade has witnessed rapid advances in the development of novel genomic technologies, which have led to the identification of schizophrenia-associated CNVs, insight into the roles of the affected genes in their intervals in schizophrenia, and successful manipulation of the target CNVs. In this review, we focus on the recent discoveries of important CNVs that are associated with schizophrenia and outline the potential values that the study of CNVs will bring to the areas of schizophrenia research, diagnosis, and therapy. Furthermore, with the help of the novel genetic tool known as the Clustered Regularly Interspaced Short Palindromic Repeats-associated nuclease 9 (CRISPR/Cas9) system, the pathogenic CNVs as genomic defects could be corrected. In conclusion, the recent novel findings of schizophrenia-associated CNVs offer an exciting opportunity for schizophrenia research to decipher the pathological mechanisms underlying the onset and development of schizophrenia as well as to provide potential clinical applications in genetic counseling, diagnosis, and therapy for this complex mental disease.

  15. Genomic Variation in Natural Populations of Drosophila melanogaster

    Science.gov (United States)

    Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

    2012-01-01

    This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804

  16. GI-POP: a combinational annotation and genomic island prediction pipeline for ongoing microbial genome projects.

    Science.gov (United States)

    Lee, Chi-Ching; Chen, Yi-Ping Phoebe; Yao, Tzu-Jung; Ma, Cheng-Yu; Lo, Wei-Cheng; Lyu, Ping-Chiang; Tang, Chuan Yi

    2013-04-10

    Sequencing of microbial genomes is important because microbes carry antibiotic and pathogenetic activities. However, even with the help of new assembling software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes that are still being sequenced. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of the ongoing genome projects in contigs or scaffolds can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information including possible GIs, coding/non-coding sequences and functional analysis from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project. Copyright © 2012 Elsevier B.V. All rights reserved.

  17. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  18. Regulatory hotspots in the malaria parasite genome dictate transcriptional variation.

    Directory of Open Access Journals (Sweden)

    Joseph M Gonzales

    2008-09-01

    The determinants of transcriptional regulation in malaria parasites remain elusive. The presence of a well-characterized gene expression cascade shared by different Plasmodium falciparum strains could imply that transcriptional regulation and its natural variation do not contribute significantly to the evolution of parasite drug resistance. To clarify the role of transcriptional variation as a source of strain-specific diversity in the most deadly malaria species and to find genetic loci that dictate variations in gene expression, we examined genome-wide expression level polymorphisms (ELPs) in a genetic cross between phenotypically distinct parasite clones. Significant variation in gene expression is observed through direct co-hybridizations of RNA from different P. falciparum clones. Nearly 18% of genes were regulated by a significant expression quantitative trait locus. The genetic determinants of most of these ELPs resided in hotspots that are physically distant from their targets. The most prominent regulatory locus, influencing 269 transcripts, coincided with a Chromosome 5 amplification event carrying the drug resistance gene, pfmdr1, and 13 other genes. Drug selection pressure in the Dd2 parental clone lineage led not only to a copy number change in the pfmdr1 gene but also to an increased copy number of putative neighboring regulatory factors that, in turn, broadly influence the transcriptional network. Previously unrecognized transcriptional variation, controlled by polymorphic regulatory genes and possibly master regulators within large copy number variants, contributes to sweeping phenotypic evolution in drug-resistant malaria parasites.

  19. A map of human genome variation from population-scale sequencing.

    Science.gov (United States)

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
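
    To give a sense of scale for the per-base mutation rate quoted in this abstract, the following sketch converts it into an expected number of new substitutions per child. The haploid genome length of 3.1 billion base pairs is an assumption made here for illustration and is not stated in the record; the function name is likewise hypothetical.

```python
import math

# Rate quoted in the abstract: about 1e-8 substitutions per base pair per generation.
MUTATION_RATE_PER_BP = 1e-8

# Assumption (not from the abstract): a haploid human genome of roughly 3.1e9 bp.
ASSUMED_HAPLOID_GENOME_BP = 3.1e9


def expected_de_novo_substitutions(rate_per_bp, haploid_bp, ploidy=2):
    """Expected new substitutions per child: rate x genome length x number of copies."""
    return rate_per_bp * haploid_bp * ploidy


expected = expected_de_novo_substitutions(MUTATION_RATE_PER_BP, ASSUMED_HAPLOID_GENOME_BP)
print(f"Expected de novo substitutions per child: about {expected:.0f}")
```

    Under these assumptions the expectation is a few tens of new substitutions per generation; a different assumed genome length scales the result proportionally.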

  20. Comparative genomic data of the Avian Phylogenomics Project.

    Science.gov (United States)

    Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun

    2014-01-01

    genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, development biology, and other related areas.

  1. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  2. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  3. Theories of Population Variation in Genes and Genomes

    DEFF Research Database (Denmark)

    Christiansen, Freddy

    This textbook provides an authoritative introduction to both classical and coalescent approaches to population genetics. Written for graduate students and advanced undergraduates by one of the world’s leading authorities in the field, the book focuses on the theoretical background of population...... genetics, while emphasizing the close interplay between theory and empiricism. Traditional topics such as genetic and phenotypic variation, mutation, migration, and linkage are covered and advanced by contemporary coalescent theory, which describes the genealogy of genes in a population, ultimately...... connecting them to a single common ancestor. Effects of selection, particularly genomic effects, are discussed with reference to molecular genetic variation. The book is designed for students of population genetics, bioinformatics, evolutionary biology, molecular evolution, and theoretical biology—as well...

  4. The Human Genome Project: An Imperative for International Collaboration.

    Science.gov (United States)

    Allende, J. E.

    1989-01-01

    Discussed is the Human Genome Project which aims to decipher the totality of the human genetic information. The historical background, the objectives, international cooperation, ethical discussion, and the role of UNESCO are included. (KR)

  5. The Human Genome Project: big science transforms biology and medicine

    OpenAIRE

    Hood, Leroy; Rowen, Lee

    2013-01-01

    The Human Genome Project has transformed biology through its integrated big science approach to deciphering a reference human genome sequence along with the complete sequences of key model organisms. The project exemplifies the power, necessity and success of large, integrated, cross-disciplinary efforts - so-called ‘big science’ - directed towards complex major objectives. In this article, we discuss the ways in which this ambitious endeavor led to the development of novel technologies and a...

  6. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. 2014 Elsevier Ltd. All rights reserved.

  7. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    NARCIS (Netherlands)

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D.

    2012-01-01

    Background Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools while validated also include a number of

  8. National human genome projects: an update and an agenda.

    Science.gov (United States)

    An, Joon Yong

    2017-01-01

    Population genetic and human genetic studies are being accelerated with genome technology and data sharing. Accordingly, in the past 10 years, several countries have initiated genetic research using genome technology and identified the genetic architecture of the ethnic groups living in the corresponding country or suggested the genetic foundation of a social phenomenon. Genetic research has been conducted from epidemiological studies that previously described the health or disease conditions in defined population. This perspective summarizes national genome projects conducted in the past 10 years and introduces case studies to utilize genomic data in genetic research.

  9. [Analysis of genomic copy number variations in two sisters with primary amenorrhea and hyperandrogenism].

    Science.gov (United States)

    Zhang, Yanliang; Xu, Qiuyue; Cai, Xuemei; Li, Yixun; Song, Guibo; Wang, Juan; Zhang, Rongchen; Dai, Yong; Duan, Yong

    2015-12-01

    To analyze genomic copy number variations (CNVs) in two sisters with primary amenorrhea and hyperandrogenism. G-banding was performed for karyotype analysis. The whole genomes of the two sisters were scanned and analyzed by array-based comparative genomic hybridization (array-CGH). The results were confirmed with real-time quantitative PCR (RT-qPCR). No abnormality was found by conventional G-banded chromosome analysis. Array-CGH identified 11 identical CNVs from the sisters which, however, overlapped with CNVs reported by the Database of Genomic Variants (http://projects.tcag.ca/variation/). Therefore, they are likely to be benign. In addition, a ~8.44 Mb 9p11.1-p13.1 duplication (38,561,587-47,002,387 bp, hg18) and a ~80.9 kb 4q13.2 deletion (70,183,990-70,264,889 bp, hg18) were also detected in the elder and younger sister, respectively. The relationship between such CNVs and primary amenorrhea and hyperandrogenism was however uncertain. RT-qPCR results were in accordance with array-CGH. Two CNVs were detected in two sisters by array-CGH, for which further studies are needed to clarify their correlation with primary amenorrhea and hyperandrogenism.
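
    The two CNV sizes in this abstract can be checked against the hg18 coordinates it reports. The sketch below is only an arithmetic check; it assumes the size is taken as end minus start (adding 1 for inclusive coordinates would not change the rounded values), and the function name is hypothetical.

```python
def interval_size_bp(start, end):
    """Length of a genomic interval in base pairs, taken here as end minus start."""
    return end - start


# Coordinates (hg18) exactly as given in the abstract.
duplication_9p_bp = interval_size_bp(38_561_587, 47_002_387)
deletion_4q_bp = interval_size_bp(70_183_990, 70_264_889)

print(f"9p11.1-p13.1 duplication: {duplication_9p_bp / 1e6:.2f} Mb")
print(f"4q13.2 deletion: {deletion_4q_bp / 1e3:.1f} kb")
```

    The coordinates give about 8.44 Mb for the duplication and about 80.9 kb for the deletion, matching the magnitudes stated in the abstract.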

  10. The Genome 10K Project: a way forward.

    Science.gov (United States)

    Koepfli, Klaus-Peter; Paten, Benedict; O'Brien, Stephen J

    2015-01-01

    The Genome 10K Project was established in 2009 by a consortium of biologists and genome scientists determined to facilitate the sequencing and analysis of the complete genomes of 10,000 vertebrate species. Since then the number of selected and initiated species has risen from ∼26 to 277 sequenced or ongoing with funding, an approximately tenfold increase in five years. Here we summarize the advances and commitments that have occurred by mid-2014 and outline the achievements and present challenges of reaching the 10,000-species goal. We summarize the status of known vertebrate genome projects, recommend standards for pronouncing a genome as sequenced or completed, and provide our present and future vision of the landscape of Genome 10K. The endeavor is ambitious, bold, expensive, and uncertain, but together the Genome 10K Consortium of Scientists and the worldwide genomics community are moving toward their goal of delivering to the coming generation the gift of genome empowerment for many vertebrate species.

  11. Los Alamos Science: The Human Genome Project. Number 20, 1992

    Science.gov (United States)

    Cooper, N. G.; Shea, N. eds.

    1992-01-01

    This document provides a broad overview of the Human Genome Project, with particular emphasis on work being done at Los Alamos. It tries to emphasize the scientific aspects of the project, compared to the more speculative information presented in the popular press. There is a brief introduction to modern genetics, including a review of classic work. There is a broad overview of the Genome Project, describing what the project is, what are some of its major five-year goals, what are major technological challenges ahead of the project, and what can the field of biology, as well as society expect to see as benefits from this project. Specific results on the efforts directed at mapping chromosomes 16 and 5 are discussed. A brief introduction to DNA libraries is presented, bearing in mind that Los Alamos has housed such libraries for many years prior to the Genome Project. Information on efforts to do applied computational work related to the project are discussed, as well as experimental efforts to do rapid DNA sequencing by means of single-molecule detection using applied spectroscopic methods. The article introduces the Los Alamos staff which are working on the Genome Project, and concludes with brief discussions on ethical, legal, and social implications of this work; a brief glimpse of genetics as it may be practiced in the next century; and a glossary of relevant terms.

  12. Los Alamos Science: The Human Genome Project. Number 20, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Cooper, N G; Shea, N [eds.]

    1992-01-01

    This article provides a broad overview of the Human Genome Project, with particular emphasis on work being done at Los Alamos. It tries to emphasize the scientific aspects of the project, compared to the more speculative information presented in the popular press. There is a brief introduction to modern genetics, including a review of classic work. There is a broad overview of the Genome Project, describing what the project is, what are some of its major five-year goals, what are major technological challenges ahead of the project, and what can the field of biology, as well as society expect to see as benefits from this project. Specific results on the efforts directed at mapping chromosomes 16 and 5 are discussed. A brief introduction to DNA libraries is presented, bearing in mind that Los Alamos has housed such libraries for many years prior to the Genome Project. Information on efforts to do applied computational work related to the project are discussed, as well as experimental efforts to do rapid DNA sequencing by means of single-molecule detection using applied spectroscopic methods. The article introduces the Los Alamos staff which are working on the Genome Project, and concludes with brief discussions on ethical, legal, and social implications of this work; a brief glimpse of genetics as it may be practiced in the next century; and a glossary of relevant terms.

  13. A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.

    Science.gov (United States)

    Moraes, Fernanda; Góes, Andréa

    2016-05-06

    The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of genome arose. The ENCODE project was initiated in 2003 and targeted to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. Besides, a gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, HGP and ENCODE projects accomplished a great map of the human genome, but the data generated still requires further in-depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016.

  14. The 1000 Genomes Project: new opportunities for research and social challenges

    Science.gov (United States)

    2010-01-01

    The 1000 Genomes Project, an international collaboration, is sequencing the whole genome of approximately 2,000 individuals from different worldwide populations. The central goal of this project is to describe most of the genetic variation that occurs at a population frequency greater than 1%. The results of this project will allow scientists to identify genetic variation at an unprecedented degree of resolution and will also help improve the imputation methods for determining unobserved genetic variants that are not represented on current genotyping arrays. By identifying novel or rare functional genetic variants, researchers will be able to pinpoint disease-causing genes in genomic regions initially identified by association studies. This level of detailed sequence information will also improve our knowledge of the evolutionary processes and the genomic patterns that have shaped the human species as we know it today. The new data will also lay the foundation for future clinical applications, such as prediction of disease susceptibility and drug response. However, the forthcoming availability of whole genome sequences at affordable prices will raise ethical concerns and pose potential threats to individual privacy. Nevertheless, we believe that these potential risks are outweighed by the benefits in terms of diagnosis and research, so long as rigorous safeguards are kept in place through legislation that prevents discrimination on the basis of the results of genetic testing. PMID:20193048

  15. Human genome project: revolutionizing biology through leveraging technology

    Science.gov (United States)

    Dahl, Carol A.; Strausberg, Robert L.

    1996-04-01

    The Human Genome Project (HGP) is an international project to develop genetic, physical, and sequence-based maps of the human genome. Since the inception of the HGP it has been clear that substantially improved technology would be required to meet the scientific goals, particularly in order to acquire the complete sequence of the human genome, and that these technologies coupled with the information forthcoming from the project would have a dramatic effect on the way biomedical research is performed in the future. In this paper, we discuss the state-of-the-art for genomic DNA sequencing, technological challenges that remain, and the potential technological paths that could yield substantially improved genomic sequencing technology. The impact of the technology developed from the HGP is broad-reaching and a discussion of other research and medical applications that are leveraging HGP-derived DNA analysis technologies is included. The multidisciplinary approach to the development of new technologies that has been successful for the HGP provides a paradigm for facilitating new genomic approaches toward understanding the biological role of functional elements and systems within the cell, including those encoded within genomic DNA and their molecular products.

  16. The human genome project and the future of medical practice ...

    African Journals Online (AJOL)

    Contrary to the scepticism that characterised the planning stages of the human genome project, the technology and sequence data resulting from the project are set to revolutionise medical practice for good. The expected benefits include: enhanced discovery of disease genes, which will lead to improved knowledge on the ...

  17. Genomic and gene variation in Mycoplasma hominis strains

    DEFF Research Database (Denmark)

    Christiansen, Gunna; Andersen, H; Birkelund, Svend

    1987-01-01

    DNAs from 14 strains of Mycoplasma hominis isolated from various habitats, including strain PG21, were analyzed for genomic heterogeneity. DNA-DNA filter hybridization values were from 51 to 91%. Restriction endonuclease digestion patterns, analyzed by agarose gel electrophoresis, revealed...... no identity or cluster formation between strains. Variation within M. hominis rRNA genes was analyzed by Southern hybridization of EcoRI-cleaved DNA hybridized with a cloned fragment of the rRNA gene from the mycoplasma strain PG50. Five of the M. hominis strains showed identical hybridization patterns....... These hybridization patterns were compared with those of 12 other mycoplasma species, which showed a much more complex band pattern. Cloned nonribosomal RNA gene fragments of M. hominis PG21 DNA were analyzed, and the fragments were used to demonstrate heterogeneity among the strains. A monoclonal antibody against...

  18. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole
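
    The calibration idea described in this abstract, scoring calls against simulated data to obtain sensitivity and positive predictive value, can be illustrated with a toy example. This is a minimal sketch and not the SRiC implementation: the event tuples are invented, exact-match comparison is a simplification, and the correction formula (observed count times PPV divided by sensitivity) is a common bias adjustment assumed here rather than taken from the record.

```python
def calibrate(truth, calls):
    """Return (sensitivity, ppv) for a call set scored against a simulated truth set."""
    truth, calls = set(truth), set(calls)
    true_positives = len(truth & calls)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    ppv = true_positives / len(calls) if calls else float("nan")
    return sensitivity, ppv


def corrected_count(observed, sensitivity, ppv):
    """Assumed adjustment: drop expected false positives, then scale up for missed events."""
    return observed * ppv / sensitivity


# Hypothetical simulated events: (chromosome, position, type, length in bp).
simulated_truth = {
    ("chr1", 1000, "DEL", 50),
    ("chr1", 5000, "INS", 12),
    ("chr1", 9000, "DEL", 300),
    ("chr1", 12000, "INS", 4),
}
# Hypothetical calls made on the simulated reads: two correct, one false positive.
simulated_calls = {
    ("chr1", 1000, "DEL", 50),
    ("chr1", 9000, "DEL", 300),
    ("chr1", 20000, "DEL", 7),
}

sensitivity, ppv = calibrate(simulated_truth, simulated_calls)
estimate = corrected_count(observed=300, sensitivity=sensitivity, ppv=ppv)
print(f"sensitivity={sensitivity:.2f}, ppv={ppv:.2f}, corrected estimate={estimate:.0f}")
```

    In practice the two rates would be computed separately for each event class and length range, as the abstract indicates, and breakpoints would be matched with a tolerance rather than exactly.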

  19. Genomic copy number variations in three Southeast Asian populations.

    Science.gov (United States)

    Ku, Chee-Seng; Pawitan, Yudi; Sim, Xueling; Ong, Rick T H; Seielstad, Mark; Lee, Edmund J D; Teo, Yik-Ying; Chia, Kee-Seng; Salim, Agus

    2010-07-01

    Research on the role of copy number variations (CNVs) in the genetic risk of diseases in Asian populations has been hampered by a relative lack of reference CNV maps for Asian populations outside the East Asians. In this article, we report the population characteristics of CNVs in Chinese, Malay, and Asian Indian populations in Singapore. Using the Illumina Human 1M Beadchip array, we identify 1,174 CNV loci in these populations that corroborated with findings when the same samples were typed on the Affymetrix 6.0 platform. We identify 441 novel loci not previously reported in the Database of Genomic Variations (DGV). We observe a considerable number of loci that span all three populations and were previously unreported, as well as population-specific loci that are quite common in the respective populations. From this we observe the distribution of CNVs in the Asian Indian population to be considerably different from the Chinese and Malay populations. About half of the deletion loci and three-quarters of duplication loci overlap UCSC genes. Tens of loci show population differentiation and overlap with genes previously known to be associated with genetic risk of diseases. One of these loci is the CYP2A6 deletion, previously linked to reduced susceptibility to lung cancer. (c) 2010 Wiley-Liss, Inc.

  20. Assessing genome-wide copy number variation in the Han Chinese population.

    Science.gov (United States)

    Lu, Jianqi; Lou, Haiyi; Fu, Ruiqing; Lu, Dongsheng; Zhang, Feng; Wu, Zhendong; Zhang, Xi; Li, Changhua; Fang, Baijun; Pu, Fangfang; Wei, Jingning; Wei, Qian; Zhang, Chao; Wang, Xiaoji; Lu, Yan; Yan, Shi; Yang, Yajun; Jin, Li; Xu, Shuhua

    2017-10-01

    Copy number variation (CNV) is a valuable source of genetic diversity in the human genome and a well-recognised cause of various genetic diseases. However, CNVs have been considerably under-represented in population-based studies, particularly in the Han Chinese, which is the largest ethnic group in the world. We aimed to build a representative CNV map for the Han Chinese population. We conducted a genome-wide CNV study involving 451 male Han Chinese samples from 11 geographical regions encompassing 28 dialect groups, representing a less-biased panel compared with the currently available data. We detected CNVs by using 4.2M NimbleGen comparative genomic hybridisation array and whole-genome deep sequencing of 51 samples to optimise the filtering conditions in CNV discovery. A comprehensive Han Chinese CNV map was built based on a set of high-quality variants (positive predictive value >0.8, with sizes ranging from 369 bp to 4.16 Mb and a median of 5907 bp). The map consists of 4012 CNV regions (CNVRs), and more than half are novel to the 30 East Asian CNV Project and the 1000 Genomes Project Phase 3. We further identified 81 CNVRs specific to regional groups, which was indicative of the subpopulation structure within the Han Chinese population. Our data are complementary to public data sources, and the CNV map may facilitate the identification of pathogenic CNVs and further biomedical research studies involving the Han Chinese population. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  1. Harvard Personal Genome Project: lessons from participatory public research

    Science.gov (United States)

    2014-01-01

    Background: Since its initiation in 2005, the Harvard Personal Genome Project has enrolled thousands of volunteers interested in publicly sharing their genome, health and trait data. Because these data are highly identifiable, we use an ‘open consent’ framework that purposefully excludes promises about privacy and requires participants to demonstrate comprehension prior to enrollment. Discussion: Our model of non-anonymous, public genomes has led us to a highly participatory model of researcher-participant communication and interaction. The participants, who are highly committed volunteers, self-pursue and donate research-relevant datasets, and are actively engaged in conversations with both our staff and other Personal Genome Project participants. We have quantitatively assessed these communications and donations, and report our experiences with returning research-grade whole genome data to participants. We also observe some of the community growth and discussion that has occurred related to our project. Summary: We find that public non-anonymous data is valuable and leads to a participatory research model, which we encourage others to consider. The implementation of this model is greatly facilitated by web-based tools and methods and participant education. Project results are long-term proactive participant involvement and the growth of a community that benefits both researchers and participants. PMID:24713084

  2. Harvard Personal Genome Project: lessons from participatory public research.

    Science.gov (United States)

    Ball, Madeleine P; Bobe, Jason R; Chou, Michael F; Clegg, Tom; Estep, Preston W; Lunshof, Jeantine E; Vandewege, Ward; Zaranek, Alexander; Church, George M

    2014-02-28

    Since its initiation in 2005, the Harvard Personal Genome Project has enrolled thousands of volunteers interested in publicly sharing their genome, health and trait data. Because these data are highly identifiable, we use an 'open consent' framework that purposefully excludes promises about privacy and requires participants to demonstrate comprehension prior to enrollment. Our model of non-anonymous, public genomes has led us to a highly participatory model of researcher-participant communication and interaction. The participants, who are highly committed volunteers, self-pursue and donate research-relevant datasets, and are actively engaged in conversations with both our staff and other Personal Genome Project participants. We have quantitatively assessed these communications and donations, and report our experiences with returning research-grade whole genome data to participants. We also observe some of the community growth and discussion that has occurred related to our project. We find that public non-anonymous data is valuable and leads to a participatory research model, which we encourage others to consider. The implementation of this model is greatly facilitated by web-based tools and methods and participant education. Project results are long-term proactive participant involvement and the growth of a community that benefits both researchers and participants.

  3. The human Genome project and the future of oncology

    International Nuclear Information System (INIS)

    Collins, Francis S.

    1996-01-01

    The Human Genome Project is an ambitious 15-year effort to devise maps and sequence of the 3-billion base pair human genome, including all 100,000 genes. The project is running ahead of schedule and under budget. Already the effects on progress in disease gene discovery have been dramatic, especially for cancer. The most appropriate uses of susceptibility testing for breast, ovarian, and colon cancer are being investigated in research protocols, and the need to prevent genetic discrimination in employment and health insurance is becoming more urgent. In the longer term, these gene discoveries are likely to usher in a new era of therapeutic molecular medicine

  4. Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

    Science.gov (United States)

    Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

    2006-11-01

    The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these aforementioned diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.

  5. No evidence that sex and transposable elements drive genome size variation in evening primroses.

    Science.gov (United States)

    Ågren, J Arvid; Greiner, Stephan; Johnson, Marc T J; Wright, Stephen I

    2015-04-01

    Genome size varies dramatically across species, but despite an abundance of attention there is little agreement on the relative contributions of selective and neutral processes in governing this variation. The rate of sex can potentially play an important role in genome size evolution because of its effect on the efficacy of selection and transmission of transposable elements (TEs). Here, we used a phylogenetic comparative approach and whole genome sequencing to investigate the contribution of sex and TE content to genome size variation in the evening primrose (Oenothera) genus. We determined genome size using flow cytometry for 30 species that vary in genetic system and find that variation in sexual/asexual reproduction cannot explain the almost twofold variation in genome size. Moreover, using whole genome sequences of three species of varying genome sizes and reproductive system, we found that genome size was not associated with TE abundance; instead the larger genomes had a higher abundance of simple sequence repeats. Although it has long been clear that sexual reproduction may affect various aspects of genome evolution in general and TE evolution in particular, it does not appear to have played a major role in genome size evolution in the evening primroses. © 2015 The Author(s).

  6. Total Variation and Tomographic Imaging from Projections

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Jørgensen, Jakob Heide

    2011-01-01

    or 3D reconstruction from noisy projections. We demonstrate that for a small signal-to-noise ratio, this new approach allows us to compute better (i.e., more reliable) reconstructions than those obtained by classical methods. This is possible due to the use of the TV reconstruction model, which...

  7. Karyotype diversity and genome size variation in Neotropical Maxillariinae orchids.

    Science.gov (United States)

    Moraes, A P; Koehler, S; Cabral, J S; Gomes, S S L; Viccini, L F; Barros, F; Felix, L P; Guerra, M; Forni-Martins, E R

    2017-03-01

    Orchidaceae is a widely distributed plant family with very diverse vegetative and floral morphology, and such variability is also reflected in their karyotypes. However, since only a low proportion of Orchidaceae has been analysed for chromosome data, greater diversity may await to be unveiled. Here we analyse both genome size (GS) and karyotype in two subtribes recently included in the broadened Maxillariinae to detect how much chromosome and GS variation there is in these groups and to evaluate which genome rearrangements are involved in the species evolution. To do so, the GS (14 species) and the karyotype, based on chromosome number, heterochromatic banding and 5S and 45S rDNA localisation (18 species), were characterised and analysed along with published data using phylogenetic approaches. The GS presented a high phylogenetic correlation and it was related to morphological groups in Bifrenaria (larger plants - higher GS). The two largest GS found among genera were caused by different mechanisms: polyploidy in Bifrenaria tyrianthina and accumulation of repetitive DNA in Scuticaria hadwenii. The chromosome number variability was caused mainly through descending dysploidy, and x=20 was estimated as the base chromosome number. Combining GS and karyotype data with molecular phylogeny, our data provide a more complete scenario of the karyotype evolution in Maxillariinae orchids, allowing us to suggest, besides dysploidy, inversions and transposable elements as two mechanisms involved in the karyotype evolution. Such karyotype modifications could be associated with niche changes that occurred during species evolution. © 2016 German Botanical Society and The Royal Botanical Society of the Netherlands.

  8. Genomic variation in recently collected maize landraces from Mexico

    Directory of Open Access Journals (Sweden)

    María Clara Arteaga

    2016-03-01

    The present dataset comprises 36,931 SNPs genotyped in 46 maize landraces native to Mexico as well as the teosinte subspecies Zea mays ssp. parviglumis and ssp. mexicana. These landraces were collected directly from farmers mostly between 2006 and 2010. We accompany these data with a short description of the variation within each landrace, as well as maps, principal component analyses and neighbor joining trees showing the distribution of the genetic diversity relative to landrace, geographical features and maize biogeography. High levels of genetic variation were detected for the maize landraces (HE = 0.234 to 0.318 (mean 0.311)), while slightly lower levels were detected in Zea m. mexicana and Zea m. parviglumis (HE = 0.262 and 0.234, respectively). The distribution of genetic variation was better explained by environmental variables given by the interaction of altitude and latitude than by landrace identity. This dataset is a follow up product of the Global Native Maize Project, an initiative to update the data on Mexican maize landraces and their wild relatives, and to generate information that is necessary for implementing the Mexican Biosafety Law. Keywords: Maize, Teosinte, Maize SNP50K BeadChip, Mexican landraces, Proyecto Global de Maíces Nativos

  9. Genomic variation in recently collected maize landraces from Mexico

    Science.gov (United States)

    Arteaga, María Clara; Moreno-Letelier, Alejandra; Mastretta-Yanes, Alicia; Vázquez-Lobo, Alejandra; Breña-Ochoa, Alejandra; Moreno-Estrada, Andrés; Eguiarte, Luis E.; Piñero, Daniel

    2015-01-01

    The present dataset comprises 36,931 SNPs genotyped in 46 maize landraces native to Mexico as well as the teosinte subspecies Zea mays ssp. parviglumis and ssp. mexicana. These landraces were collected directly from farmers mostly between 2006 and 2010. We accompany these data with a short description of the variation within each landrace, as well as maps, principal component analyses and neighbor joining trees showing the distribution of the genetic diversity relative to landrace, geographical features and maize biogeography. High levels of genetic variation were detected for the maize landraces (HE = 0.234 to 0.318 (mean 0.311)), while slightly lower levels were detected in Zea m. mexicana and Zea m. parviglumis (HE = 0.262 and 0.234, respectively). The distribution of genetic variation was better explained by environmental variables given by the interaction of altitude and latitude than by landrace identity. This dataset is a follow up product of the Global Native Maize Project, an initiative to update the data on Mexican maize landraces and their wild relatives, and to generate information that is necessary for implementing the Mexican Biosafety Law. PMID:26981357
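
    The HE values in this abstract denote expected heterozygosity. The abstract does not say which estimator was used, so the following is only a hedged sketch of the textbook definition, HE = 1 - sum(p_i^2) per locus, averaged over loci. The function names are hypothetical and no small-sample correction is applied.

```python
import math


def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared
    allele frequencies. For a biallelic SNP this equals 2 * p * (1 - p)."""
    if not math.isclose(sum(allele_freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_expected_heterozygosity(loci):
    """Average the per-locus values over a list of allele-frequency lists."""
    if not loci:
        raise ValueError("at least one locus is required")
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)
```

    Under this definition a biallelic SNP has a maximum HE of 0.5, reached when both alleles are at frequency 0.5, which gives a scale for reading the values reported above.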

  10. Freedom and Responsibility in Synthetic Genomics: The Synthetic Yeast Project.

    Science.gov (United States)

    Sliva, Anna; Yang, Huanming; Boeke, Jef D; Mathews, Debra J H

    2015-08-01

    First introduced in 2011, the Synthetic Yeast Genome (Sc2.0) PROJECT is a large international synthetic genomics project that will culminate in the first eukaryotic cell (Saccharomyces cerevisiae) with a fully synthetic genome. With collaborators from across the globe and from a range of institutions spanning from do-it-yourself biology (DIYbio) to commercial enterprises, it is important that all scientists working on this project are cognizant of the ethical and policy issues associated with this field of research and operate under a common set of principles. In this commentary, we survey the current ethics and regulatory landscape of synthetic biology and present the Sc2.0 Statement of Ethics and Governance to which all members of the project adhere. This statement focuses on four aspects of the Sc2.0 PROJECT: societal benefit, intellectual property, safety, and self-governance. We propose that such project-level agreements are an important, valuable, and flexible model of self-regulation for similar global, large-scale synthetic biology projects in order to maximize the benefits and minimize potential harms. Copyright © 2015 by the Genetics Society of America.

  11. Somatic genomic variations in extra-embryonic tissues

    Energy Technology Data Exchange (ETDEWEB)

    Weier, Jingly F.; Ferlatte, Christy; Weier, Heinz-Ulli G.

    2010-05-21

    In the mature chorion, one of the membranes that exist during pregnancy between the developing fetus and mother, human placental cells form highly specialized tissues composed of mesenchyme and floating or anchoring villi. Using fluorescence in situ hybridization, we found that human invasive cytotrophoblasts isolated from anchoring villi or the uterine wall had gained individual chromosomes; however, chromosome losses were detected infrequently. With chromosomes gained in what appeared to be a chromosome-specific manner, more than half of the invasive cytotrophoblasts in normal pregnancies were found to be hyperdiploid. Interestingly, the rates of hyperdiploid cells depended not only on gestational age, but were strongly associated with the extraembryonic compartment at the fetal-maternal interface from which they were isolated. Since hyperdiploid cells showed drastically reduced DNA replication as measured by bromodeoxyuridine incorporation, we conclude that aneuploidy is a part of the normal process of placentation potentially limiting the proliferative capabilities of invasive cytotrophoblasts. Thus, under the special circumstances of human reproduction, somatic genomic variations may exert a beneficial, anti-neoplastic effect on the organism.

  12. Reconsidering democracy. History of the Human Genome Project.

    NARCIS (Netherlands)

    Marli Huijer

    2003-01-01

    What options are open for people—citizens, politicians, and other nonscientists—to become actively involved in and anticipate new directions in the life sciences? In addressing this question, this article focuses on the start of the Human Genome Project (1985-1990). By contrasting various models of

  13. The Human Genome Project: Biology, Computers, and Privacy.

    Science.gov (United States)

    Cutter, Mary Ann G.; Drexler, Edward; Gottesman, Kay S.; Goulding, Philip G.; McCullough, Laurence B.; McInerney, Joseph D.; Micikas, Lynda B.; Mural, Richard J.; Murray, Jeffrey C.; Zola, John

    This module, for high school teachers, is the second of two modules about the Human Genome Project (HGP) produced by the Biological Sciences Curriculum Study (BSCS). The first section of this module provides background information for teachers about the structure and objectives of the HGP, aspects of the science and technology that underlie the…

  14. Reconsidering democracy - History of the human genome project

    NARCIS (Netherlands)

    Huijer, M

    What options are open for people-citizens, politicians, and other nonscientists-to become actively involved in and anticipate new directions in the life sciences? In addressing this question, this article focuses on the start of the Human Genome Project (1985-1990). By contrasting various models of

  15. Enhancing Biology Instruction with the Human Genome Project

    Science.gov (United States)

    Buxeda, Rosa J.; Moore-Russo, Deborah A.

    2003-01-01

    The Human Genome Project (HGP) is a recent scientific milestone that has received notable attention. This article shows how a biology course is using the HGP to enhance students' experiences by providing awareness of cutting edge research, with information on new emerging career options, and with opportunities to consider ethical questions raised…

  16. Comparing genetic variants detected in the 1000 genomes project ...

    Indian Academy of Sciences (India)

    Single-nucleotide polymorphisms (SNPs) determined based on SNP arrays from the international HapMap consortium (HapMap) and the genetic variants detected in the 1000 genomes project (1KGP) can serve as two references for genomewide association studies (GWAS). We conducted comparative analyses to provide ...

  17. Mapping our genes: The genome projects: How big, how fast

    Energy Technology Data Exchange (ETDEWEB)

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  18. Mapping Our Genes: The Genome Projects: How Big, How Fast

    Science.gov (United States)

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  19. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub (https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), (http://soykb.org/Pegasus/index.php). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser (http://soykb.org/NGS_Resequence/NGS_index.php) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most

  20. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

    Science.gov (United States)

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

    2015-08-29

    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional Annotation of Animal Genomes project : open letter

    NARCIS (Netherlands)

    Archibald, A.L.; Bottema, C.D.; Brauning, R.; Burgess, S.C.; Burt, D.W.; Casas, E.; Cheng, H.H.; Clarke, L.; Couldrey, C.; Dalrymple, B.P.; Elsik, C.G.; Foissac, S.; Giuffra, E.; Groenen, M.A.M.; Hayes, B.J.; Huang, L.S.; Khatib, H.; Kijas, J.W.; Kim, H.; Lunney, J.K.; McCarthy, F.M.; McEwan, J.; Moore, S.; Nanduri, B.; Notredame, C.; Palti, Y.; Plastow, G.S.; Reecy, J.M.; Rohrer, G.; Sarropoulou, E.; Schmidt, C.J.; Silverstein, J.; Tellam, R.L.; Tixier-Boichard, M.; Tosser-klopp, G.; Tuggle, C.K.; Vilkki, J.; White, S.N.; Zhao, S.; Zhou, H.

    2015-01-01

    We describe the organization of a nascent international effort, the Functional Annotation of Animal Genomes (FAANG) project, whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species.

  2. Genetic variation architecture of mitochondrial genome reveals the differentiation in Korean landrace and weedy rice

    OpenAIRE

    Wei Tong; Qiang He; Yong-Jin Park

    2017-01-01

    Mitochondrial genome variations have been detected despite the overall conservation of this gene content, which has been valuable for plant population genetics and evolutionary studies. Here, we describe mitochondrial variation architecture and our performance of a phylogenetic dissection of Korean landrace and weedy rice. A total of 4,717 variations across the mitochondrial genome were identified adjunct with 10 wild rice. Genetic diversity assessment revealed that wild rice has higher nucle...

  3. The Qatar genome project: translation of whole-genome sequencing into clinical practice.

    Science.gov (United States)

    Zayed, Hatem

    2016-10-01

    The Qatar Genome Project was launched in 2013 with the intent to sequence the genome of each Qatari citizen in an effort to protect Qataris from the high rate of indigenous genetic diseases by allowing the mapping of disease-causing variants/rare variants and establishing a Qatari reference genome. Indeed, this project is expected to have numerous global benefits because of the elevated homogeneity of the Qatari population, which will make Qatar an excellent genetic laboratory that will generate a wealth of data that will allow us to make sense of the genotype-phenotype correlations of many diseases, especially the complex multifactorial diseases, and will pave the way for changing the traditional medical practice of looking first at the phenotype rather than the genotype. © 2016 John Wiley & Sons Ltd.

  4. nGASP - the nematode genome annotation assessment project

    Energy Technology Data Exchange (ETDEWEB)

    Coghlan, A; Fiedler, T J; McKay, S J; Flicek, P; Harris, T W; Blasiar, D; Allen, J; Stein, L D

    2008-12-19

    While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets for 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second place. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy as reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs were the most challenging for gene-finders.

  5. Analysis of the genetic variation in Mycobacterium tuberculosis strains by multiple genome alignments

    Directory of Open Access Journals (Sweden)

    Morales Juan

    2008-11-01

    Background: The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB) genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C), along with the genomes of laboratory strains (H37Rv and H37Ra), provides new insights into the mechanisms of adaptation of this bacterium to the human host. Findings: The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using the Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicate that the different strains contain a limited number of unique polymorphisms. Conclusion: The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives on the evolution and the understanding of the pathogenesis of this bacterium.

  6. Genomic and karyotypic variation in Drosophila parasitoids (Hymenoptera, Cynipoidea, Figitidae)

    Directory of Open Access Journals (Sweden)

    Vladimir Gokhman

    2011-08-01

    Drosophila melanogaster Meigen, 1830 has served as a model insect for over a century. Sequencing of the 11 additional Drosophila Fallén, 1823 species marks substantial progress in comparative genomics of this genus. By comparison, practically nothing is known about the genome size or genome sequences of parasitic wasps of Drosophila. Here, we present the first comparative analysis of genome size and karyotype structures of Drosophila parasitoids of the Leptopilina Förster, 1869 and Ganaspis Förster, 1869 species. The gametic genome size of Ganaspis xanthopoda (Ashmead, 1896) is larger than those of the three Leptopilina species studied. The genome sizes of all parasitic wasps studied here are also larger than those known for all Drosophila species. Surprisingly, genome sizes of these Drosophila parasitoids exceed the average value known for all previously studied Hymenoptera. The haploid chromosome number of both Leptopilina heterotoma (Thomson, 1862) and L. victoriae Nordlander, 1980 is ten. A chromosomal fusion appears to have produced a distinct karyotype for L. boulardi (Barbotin, Carton et Keiner-Pillault, 1979) (n = 9), whose genome size is smaller than that of wasps of the L. heterotoma clade. Like L. boulardi, G. xanthopoda has a haploid chromosome number of nine. Our studies reveal a positive, but nonlinear, correlation between the genome size and total chromosome length in Drosophila parasitoids. These Drosophila parasitoids differ widely in their host range, and utilize different infection strategies to overcome host defense. Their comparative genomics, in relation to their exceptionally well-characterized hosts, will prove to be valuable for understanding the molecular basis of the host-parasite arms race and how such mechanisms shape the genetic structures of insect communities.

  7. Overview of the creative genome: effects of genome structure and sequence on the generation of variation and evolution.

    Science.gov (United States)

    Caporale, Lynn Helena

    2012-09-01

    This overview of a special issue of Annals of the New York Academy of Sciences discusses the uneven distribution of distinct types of variation across the genome, the dependence of specific types of variation upon distinct classes of DNA sequences and/or the induction of specific proteins, the circumstances in which distinct variation-generating systems are activated, and the implications of this work for our understanding of evolution and of cancer. Also discussed are the value of non-text-based computational methods for analyzing information carried by DNA, early insights into organizational frameworks that affect genome behavior, and implications of this work for comparative genomics. © 2012 New York Academy of Sciences.

  8. EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics

    Directory of Open Access Journals (Sweden)

    Coutant Sophie

    2012-09-01

    Background: Whole exome sequencing (WES) has become the strategy of choice to identify a coding allelic variant for a rare human monogenic disorder. This approach is a revolution in the history of medical genetics, impacting both fundamental research and diagnostic methods leading to personalized medicine. A plethora of efficient algorithms has been developed to ensure variant discovery. They generally yield ~20,000 variations that have to be narrowed down to find the potential pathogenic allelic variant(s) and the affected gene(s). For this purpose, commonly adopted procedures involving various filtering strategies have emerged: exclusion of common variations, type of the allelic variants, pathogenicity effect prediction, modes of inheritance and comparison of exomes from multiple individuals. To deal with the expansion of WES in individual medical genomics laboratories, new user-friendly and versatile software tools have to implement these filtering steps. Non-programmer biologists have to be able to combine different filtering criteria autonomously and conduct a personal strategy depending on their assumptions and study design. Results: We describe EVA (Exome Variation Analyzer), a user-friendly web-interfaced software dedicated to the filtering strategies for medical WES. Through different modules, EVA (i) integrates and stores annotated exome variation data as strictly confidential to the project owner, (ii) allows the main filters dealing with common variations, molecular types, inheritance mode and multiple samples to be combined, (iii) offers the browsing of annotated data and filtered results in various interactive tables, graphical visualizations and statistical charts, and (iv) offers export files and cross-links to useful external databases and software for further prioritization of the small subset of sorted candidate variations and genes. We report a demonstrative case study that allowed the identification of a new candidate gene
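
    The filtering steps listed in this abstract (exclusion of common variations, variant type, inheritance mode, multiple individuals) can be illustrated with a minimal sketch. This is not EVA's implementation or interface: the `Variant` record, the `filter_variants` function, the frequency threshold, the effect categories and the gene names are all hypothetical and chosen only to show how such filters combine.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    pop_freq: float   # allele frequency in a reference population
    effect: str       # e.g. "missense", "nonsense", "synonymous"
    genotypes: dict   # sample name -> "het", "hom" or "ref"


# Assumed set of molecular types treated as potentially pathogenic.
DAMAGING = {"missense", "nonsense", "frameshift", "splice"}


def filter_variants(variants, affected, max_freq=0.01, inheritance="recessive"):
    """Keep rare, potentially damaging variants shared by all affected samples."""
    # Recessive model: every affected sample must be homozygous.
    # Dominant model: heterozygous or homozygous carriers both qualify.
    wanted = {"hom"} if inheritance == "recessive" else {"het", "hom"}
    kept = []
    for v in variants:
        if v.pop_freq > max_freq:       # exclusion of common variations
            continue
        if v.effect not in DAMAGING:    # molecular type filter
            continue
        if all(v.genotypes.get(s, "ref") in wanted for s in affected):
            kept.append(v)              # inheritance mode across multiple samples
    return kept


# Hypothetical toy data for two affected individuals, P1 and P2.
variants = [
    Variant("GENE_A", 0.20, "missense", {"P1": "hom", "P2": "hom"}),
    Variant("GENE_B", 0.001, "synonymous", {"P1": "hom", "P2": "hom"}),
    Variant("GENE_C", 0.0005, "nonsense", {"P1": "hom", "P2": "hom"}),
    Variant("GENE_D", 0.002, "missense", {"P1": "hom", "P2": "het"}),
]
recessive_hits = filter_variants(variants, ["P1", "P2"])
dominant_hits = filter_variants(variants, ["P1", "P2"], inheritance="dominant")
```

    Changing only the inheritance assumption changes the candidate list, which is why the abstract stresses that biologists need to combine criteria according to their own study design.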

  9. Are we Genomic Mosaics? Variations of the Genome of Somatic Cells can Contribute to Diversify our Phenotypes.

    Science.gov (United States)

    Astolfi, P A; Salamini, F; Sgaramella, V

    2010-09-01

    Theoretical and experimental evidence supports the hypothesis that the genomes and the epigenomes may differ among the somatic cells of complex organisms. In the genome, the differences range from single base substitutions to chromosome number; in the epigenome, they entail multiple postsynthetic modifications of the chromatin. Somatic genome variations (SGV) may accumulate during development in response both to genetic programs, which may differ from tissue to tissue, and to environmental stimuli, which are often undetected and generally irreproducible. SGV may jeopardize physiological cellular functions, but also create novel coding and regulatory sequences, to be exposed to intraorganismal Darwinian selection. Genomes acknowledged as comparatively poor in genes, such as humans', could thus increase their pristine informational endowment. A better understanding of SGV will contribute to basic issues such as the "nature vs nurture" dualism and the inheritance of acquired characters. On the applied side, they may explain the low yield of cloning via somatic cell nuclear transfer, provide clues to some of the problems associated with transdifferentiation, and interfere with individual DNA analysis. SGV may be unique to different cell types and different developmental stages, and thus explain the several hundred gaps persisting in the human genomes "completed" so far. They may compound the variations associated with our epigenomes and make each of us an "(epi)genomic" mosaic. An ensuing paradigm is the possibility that a single genome (the ephemeral one assembled at fertilization) has the capacity to generate several different brains in response to different environments.

  10. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    Background Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful...... for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...
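
    The distinction drawn in this abstract between a strict core (clusters present in 100% of genomes), a 'soft core' (present in at least 95%) and the pan-genome (all clusters) can be computed from a presence/absence table. The sketch below is only an illustration under assumed inputs: the function name, genome names and cluster identifiers are hypothetical, and the toy call lowers the soft-core threshold to 75% because it uses only four genomes.

```python
def summarize_pan_genome(presence, soft_core_fraction=0.95):
    """Return (pan, soft_core, core) sets of gene cluster identifiers.

    presence maps each genome name to the set of gene clusters found in it.
    """
    n_genomes = len(presence)
    counts = {}
    for clusters in presence.values():
        for cluster in clusters:
            counts[cluster] = counts.get(cluster, 0) + 1
    pan = set(counts)                                            # seen in any genome
    core = {c for c, k in counts.items() if k == n_genomes}      # seen in all genomes
    soft_core = {c for c, k in counts.items()
                 if k >= soft_core_fraction * n_genomes}         # seen in most genomes
    return pan, soft_core, core


# Hypothetical toy data: four genomes and three gene clusters.
presence = {
    "genome1": {"clusterA", "clusterB", "clusterC"},
    "genome2": {"clusterA", "clusterB"},
    "genome3": {"clusterA", "clusterB"},
    "genome4": {"clusterA"},
}
pan, soft_core, core = summarize_pan_genome(presence, soft_core_fraction=0.75)
```

    A relaxed threshold keeps clusters that are missing from a few genomes, which is relevant when some assemblies are draft quality, as the abstract notes.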

  11. Transposable element distribution, abundance and role in genome size variation in the genus Oryza.

    Science.gov (United States)

    Zuccolo, Andrea; Sebastian, Aswathy; Talag, Jayson; Yu, Yeisoo; Kim, HyeRan; Collura, Kristi; Kudrna, Dave; Wing, Rod A

    2007-08-29

    The genus Oryza is composed of 10 distinct genome types, 6 diploid and 4 polyploid, and includes the world's most important food crop - rice (Oryza sativa [AA]). Genome size variation in Oryza is more than 3-fold and ranges from 357 Mbp in Oryza glaberrima [AA] to 1283 Mbp in the polyploid Oryza ridleyi [HHJJ]. Because repetitive elements are known to play a significant role in genome size variation, we constructed random sheared small insert genomic libraries from 12 representative Oryza species and conducted a comprehensive study of the repetitive element composition, distribution and phylogeny in this genus. Particular attention was paid to the role played by the most important classes of transposable elements (Long Terminal Repeat retrotransposons, Long Interspersed Nuclear Elements, helitrons, DNA transposable elements) in shaping these genomes and in contributing to genome size variation. We identified the elements primarily responsible for the most striking genome size variation in Oryza. We demonstrated how Long Terminal Repeat retrotransposons belonging to the same families have proliferated to very different extents in various species. We also showed that the pool of Long Terminal Repeat retrotransposons is substantially conserved and ubiquitous throughout Oryza, so its origin is ancient and its existence predates the speciation events that originated the genus. Finally, we described the peculiar behavior of repeats in the species Oryza coarctata [HHKK], whose placement in the Oryza genus is controversial. Long Terminal Repeat retrotransposons are the major component of the Oryza genomes analyzed and, along with polyploidization, are the most important contributors to the genome size variation across the Oryza genus. Two families of Ty3-gypsy elements (RIRE2 and Atlantys) account for a significant portion of the genome size variations present in the Oryza genus.

  12. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome

    NARCIS (Netherlands)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E.; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T.; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A.; Lucente, Diane; Levy, Brynn; Sanders, Jan-Stephan; Wapner, Ronald J.; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E.

    2017-01-01

    Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other

  13. Host genome variations and risk of infections during induction treatment for childhood acute lymphoblastic leukaemia

    DEFF Research Database (Denmark)

    Lund, Bendik; Wesolowska-Andersen, Agata; Lausen, Birgitte

    2014-01-01

    Objectives: To investigate association of host genomic variation and risk of infections during treatment for childhood acute lymphoblastic leukaemia (ALL). Methods: We explored association of 34 000 single-nucleotide polymorphisms (SNPs) related primarily to pharmacogenomics and immune function...

  14. New Regions of the Human Genome Linked to Skin Color Variation in Some African Populations

    Science.gov (United States)

    In the first study of its kind, an international team of genomics researchers has identified new regions of the human genome that are associated with skin color variation in some African populations, opening new avenues for research on skin diseases and cancer in all populations.

  15. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  16. Draft genome sequence of an elite Dura palm and whole-genome patterns of DNA variation in oil palm.

    Science.gov (United States)

    Jin, Jingjing; Lee, May; Bai, Bin; Sun, Yanwei; Qu, Jing; Rahmadsyah; Alfiko, Yuzer; Lim, Chin Huat; Suwanto, Antonius; Sugiharti, Maria; Wong, Limsoon; Ye, Jian; Chua, Nam-Hai; Yue, Gen Hua

    2016-12-01

    Oil palm is the world's leading source of vegetable oil and fat. Dura, Pisifera and Tenera are three forms of oil palm. The genome sequence of Pisifera is available whereas the Dura form has not been sequenced yet. We sequenced the genome of one elite Dura palm, and re-sequenced 17 palm genomes. The assembled genome sequence of the elite Dura tree contained 10,971 scaffolds and was 1.701 Gb in length, covering 94.49% of the oil palm genome. 36,105 genes were predicted. Re-sequencing of 17 additional palm trees identified 18.1 million SNPs. We found high genetic variation among palms from different geographical regions, but lower variation among Southeast Asian Dura and Pisifera palms. We mapped 10,000 SNPs on the linkage map of oil palm. In addition, high linkage disequilibrium (LD) was detected in the oil palms used in breeding populations of Southeast Asia, suggesting that LD mapping is likely to be practical in this important oil crop. Our data provide a valuable resource for accelerating genetic improvement and studying the mechanism underlying phenotypic variations of important oil palm traits. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  17. Detecting microsatellites within genomes: significant variation among algorithms

    Directory of Open Access Journals (Sweden)

    Rivals Eric

    2007-04-01

    Background: Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results: Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion: Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
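
    The abstract's point that user-defined parameters shape the detected microsatellite distribution can be seen even with a toy detector. The sketch below is not any of the five tools compared; it is a minimal scanner for perfect tandem repeats only, and the function name, default parameters and test sequence are hypothetical.

```python
def find_perfect_microsatellites(seq, max_motif=6, min_length=12):
    """Return sorted (start, motif, length_bp) tuples for perfect tandem repeats.

    A run with period k is a maximal stretch in which every base equals the
    base k positions earlier. Motifs that are themselves repeats of a shorter
    motif (e.g. "CACA") are skipped so that each run is reported once.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        i = 0
        while i + k <= len(seq):
            j = i + k
            while j < len(seq) and seq[j] == seq[j - k]:
                j += 1
            run = j - i
            motif = seq[i:i + k]
            primitive = all(motif != motif[:d] * (k // d)
                            for d in range(1, k) if k % d == 0)
            if primitive and run >= max(min_length, 2 * k):
                hits.append((i, motif, run))
            # The earliest position where a new period-k run can begin.
            i = j - k + 1
    return sorted(hits)


# Hypothetical sequence: a (CA)8 repeat and an (AAT)4 repeat in unique flanks.
sequence = "GGG" + "CA" * 8 + "TTGACC" + "AAT" * 4 + "G"
hits_12 = find_perfect_microsatellites(sequence, min_length=12)
hits_14 = find_perfect_microsatellites(sequence, min_length=14)
```

    Raising the minimum length from 12 to 14 bp drops the 12 bp (AAT)4 repeat while keeping the 16 bp (CA)8 repeat, a small-scale version of the sensitivity to thresholds for short microsatellites that the study reports.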

  18. Pan-Genome Analysis Links the Hereditary Variation of Leptospirillum ferriphilum With Its Evolutionary Adaptation

    Directory of Open Access Journals (Sweden)

    Xian Zhang

    2018-03-01

    Niche adaptation has long been recognized to drive intra-species differentiation and speciation, yet knowledge about its relationship with hereditary variation of microbial genomes is relatively limited. Using Leptospirillum ferriphilum as a case study, we present a detailed analysis of genomic features of five recognized strains. Genome-to-genome distance calculation preliminarily determined the roles of spatial distance and environmental heterogeneity that potentially contribute to intra-species variation within L. ferriphilum at the genome level. Mathematical models were further constructed to extrapolate the expansion of L. ferriphilum genomes (an ‘open’ pan-genome), indicating the emergence of novel genes with newly sequenced genomes. The identification of diverse mobile genetic elements (MGEs), such as transposases, integrases, and phage-associated genes, revealed the prevalence of horizontal gene transfer events, which is an important evolutionary mechanism that provides avenues for the recruitment of novel functionalities and further for the genetic divergence of microbial genomes. Comprehensive analysis also demonstrated that genome reduction by gene loss in a broad sense might contribute to the observed diversification. We thus inferred a plausible explanation for this observation: community-dependent adaptation that potentially economizes the limiting resources of the entire community. Given that the introduction of new genes is accompanied by a parallel abandonment of some others, our results provide snapshots of the biological fitness cost of environmental adaptation within the L. ferriphilum genomes. In short, our genome-wide analyses link the genetic variation of L. ferriphilum to its evolutionary adaptation.

  19. ChickVD: a sequence variation database for the chicken genome

    DEFF Research Database (Denmark)

    Wang, Jing; He, Ximiao; Ruan, Jue

    2005-01-01

    Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DN...... on quantitative trait loci using data from collaborating institutions and public resources. Our data can be queried by search engine and homology-based BLAST searches. ChickVD is publicly accessible at http://chicken.genomics.org.cn. Publication date: 2005-Jan-1...

  20. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

    Science.gov (United States)

    Holt, Carson; Yandell, Mark

    2011-12-22

    Second-generation sequencing technologies are precipitating major shifts with regard to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

  1. Factors causing cost variation for constructing wastewater projects in Egypt

    Directory of Open Access Journals (Sweden)

    Remon Fayek Aziz

    2013-03-01

    Cost is one of the major considerations throughout the project management life cycle and can be regarded as one of the most important parameters of a project and the driving force of project success. Despite its proven importance, it is common to see a construction project fail to achieve its objectives within the specified cost. Cost variation is a very frequent phenomenon and is associated with nearly all wastewater construction projects. Maintaining a steady cost projection on wastewater projects has recently been an issue of serious concern, both to the client and to project contractors. Cost deviation from the initial cost plan has been prevalent on construction sites. However, little or no effort has been made to curtail the phenomenon. This research work attempts to identify, investigate and rank the factors perceived to affect cost variation in Egyptian wastewater projects with respect to their relative importance, so as to proffer possible ways of coping with this phenomenon. To achieve this objective, the author invited practitioners and experts, comprising a statistically representative sample, to participate in a structured questionnaire survey. Brainstorming was also used, through which a number of cost variation factors were identified for constructing wastewater projects. In total, 52 factors were short-listed for the questionnaire survey, which was conducted with experts and representatives from private, public and local general construction firms. The data were analyzed using the Relative Importance Index, ranking and simple percentages. The analysis showed that factors such as (1) Lowest bidding procurement method; (2) Additional work; (3) Bureaucracy in bidding/tendering method; (4) Wrong method of cost estimation; and (5) Funding problems were critical for causing cost variation, while (1) Inaccurate cost estimation; (2) Mode of financing and payment for completed work; (3) Unexpected ground
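
    The Relative Importance Index mentioned in this abstract is commonly defined as RII = ΣW / (A × N), where W is the rating each respondent gives a factor, A is the highest possible rating and N is the number of respondents. The sketch below assumes that common definition and a 1 to 5 rating scale; neither is confirmed by this record, and the factor names and ratings are invented placeholders rather than survey data.

```python
def relative_importance_index(ratings, highest_weight=5):
    """RII = sum of ratings / (highest possible rating * number of respondents)."""
    return sum(ratings) / (highest_weight * len(ratings))


def rank_factors(responses, highest_weight=5):
    """Return (factor, RII) pairs sorted from most to least important."""
    scores = {factor: relative_importance_index(ratings, highest_weight)
              for factor, ratings in responses.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings from four respondents on an assumed 1-5 scale.
responses = {
    "Factor B": [4, 5, 4, 4],
    "Factor A": [5, 5, 4, 5],
    "Factor C": [3, 4, 3, 4],
}
ranking = rank_factors(responses)
```

    An RII close to 1 means respondents consistently gave a factor the top rating, so sorting by RII yields the kind of ranked list of critical factors the abstract describes.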

  2. Structural genomic variation as risk factor for idiopathic recurrent miscarriage

    DEFF Research Database (Denmark)

    Nagirnaja, Liina; Palta, Priit; Kasak, Laura

    2014-01-01

    Recurrent miscarriage (RM) is a multifactorial disorder with acknowledged genetic heritability that affects ∼3% of couples aiming at childbirth. As copy number variants (CNVs) have been shown to contribute to reproductive disease susceptibility, we aimed to describe genome-wide profile of CNVs an...

  3. Documenting genomics: Applying archival theory to preserving the records of the Human Genome Project.

    Science.gov (United States)

    Shaw, Jennifer

    2016-02-01

    The Human Genome Archive Project (HGAP) aimed to preserve the documentary heritage of the UK's contribution to the Human Genome Project (HGP) by using archival theory to develop a suitable methodology for capturing the results of modern, collaborative science. After assessing past projects and different archival theories, the HGAP used an approach based on the theory of documentation strategy to try to capture the records of a scientific project that had an influence beyond the purely scientific sphere. The HGAP was an archival survey that ran for two years. It led to ninety scientists being contacted and has, so far, led to six collections being deposited in the Wellcome Library, with additional collections being deposited in other UK repositories. In applying documentation strategy the HGAP was attempting to move away from traditional archival approaches to science, which have generally focused on retired Nobel Prize winners. It has been partially successful in this aim, having managed to secure collections from people who are not 'big names', but who made an important contribution to the HGP. However, the attempt to redress the gender imbalance in scientific collections and to improve record-keeping in scientific organisations has continued to be difficult to achieve. Copyright © 2015 The Author. Published by Elsevier Ltd. All rights reserved.

  4. Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

    Science.gov (United States)

    Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

    2014-11-07

    Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.

  5. Gene disruptions using P transposable elements: an integral component of the Drosophila genome project.

    OpenAIRE

    Spradling, A C; Stern, D M; Kiss, I; Roote, J; Laverty, T; Rubin, G M

    1995-01-01

    Biologists require genetic as well as molecular tools to decipher genomic information and ultimately to understand gene function. The Berkeley Drosophila Genome Project is addressing these needs with a massive gene disruption project that uses individual, genetically engineered P transposable elements to target open reading frames throughout the Drosophila genome. DNA flanking the insertions is sequenced, thereby placing an extensive series of genetic markers on the physical genomic map and a...

  6. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc; Preston, Mark; Guerra-Assunção, José Afonso; Hill-Cawthorn, Grant; Harris, David; Perdigão, João; Viveiros, Miguel; Portugal, Isabel; Drobniewski, Francis; Gagneux, Sebastien; Glynn, Judith R.; Pain, Arnab; Parkhill, Julian; McNerney, Ruth; Martin, Nigel; Clark, Taane G.

    2014-01-01

    ://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest

  7. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Human endogenous retrovirus (HERV) sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them, focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphism test and observed that twelve of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into the human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  8. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...

  9. Genetic Variation in the Nuclear and Organellar Genomes Modulates Stochastic Variation in the Metabolome, Growth, and Defense

    Science.gov (United States)

    Joseph, Bindu; Corwin, Jason A.; Kliebenstein, Daniel J.

    2015-01-01

    Recent studies are starting to show that genetic control over stochastic variation is a key evolutionary solution of single celled organisms in the face of unpredictable environments. This has been expanded to show that genetic variation can alter stochastic variation in transcriptional processes within multi-cellular eukaryotes. However, little is known about how genetic diversity can control stochastic variation within more non-cell autonomous phenotypes. Using an Arabidopsis reciprocal RIL population, we showed that there is significant genetic diversity influencing stochastic variation in the plant metabolome, defense chemistry, and growth. This genetic diversity included loci specific for the stochastic variation of each phenotypic class that did not affect the other phenotypic classes or the average phenotype. This suggests that the organism's networks are established so that noise can exist in one phenotypic level like metabolism and not permeate up or down to different phenotypic levels. Further, the genomic variation within the plastid and mitochondria also had significant effects on the stochastic variation of all phenotypic classes. The genetic influence over stochastic variation within the metabolome was highly metabolite specific, with neighboring metabolites in the same metabolic pathway frequently showing different levels of noise. As expected from bet-hedging theory, there was more genetic diversity and a wider range of stochastic variation for defense chemistry than found for primary metabolism. Thus, it is possible to begin dissecting the stochastic variation of whole organismal phenotypes in multi-cellular organisms. Further, there are loci that modulate stochastic variation at different phenotypic levels. Finding the identity of these genes will be key to developing complete models linking genotype to phenotype. PMID:25569687

  10. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

    Science.gov (United States)

    Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

    2017-12-19

    While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.
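
    The distinction the abstract draws between genes present in all lines and genes present in only some lines can be illustrated with a small sketch. This is not the authors' pipeline: the function name `partition_pan_genome`, the line labels and the gene identifiers are hypothetical, and the sketch assumes each line has already been reduced to a set of gene identifiers.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    lines: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, accessory) gene sets.

    pan       = genes found in at least one line
    core      = genes found in every line
    accessory = genes found in only some lines (pan minus core)
    """
    if not lines:
        return set(), set(), set()
    gene_sets = list(lines.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    accessory = pan - core
    return pan, core, accessory


# Hypothetical toy input: three lines with made-up gene identifiers.
toy_lines = {
    "line_A": {"g1", "g2", "g3"},
    "line_B": {"g1", "g2", "g4"},
    "line_C": {"g1", "g5"},
}
pan, core, accessory = partition_pan_genome(toy_lines)
print(len(pan), sorted(core), sorted(accessory))
```

    In this toy input the pan-genome holds five genes while no single line carries more than three, which mirrors, on a tiny scale, the abstract's observation that a pan-genome can be much larger than any individual genome.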

  11. Within-Host Variations of Human Papillomavirus Reveal APOBEC-Signature Mutagenesis in the Viral Genome.

    Science.gov (United States)

    Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2018-03-28

    Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations in the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here we explored within-host genetic diversity of HPV by performing deep sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52 and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC), and were deep-sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with > 0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with varying numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. Here, by conducting deep sequencing analyses, we show for the first time a comprehensive snapshot of the "within

  12. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    Science.gov (United States)

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploid strain, and large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain, YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while the aneuploidy stress response may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characteristics of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains. © FEMS 2016.

  13. Genome size variation and incidence of polyploidy in Scrophulariaceae sensu lato from the Iberian Peninsula.

    Science.gov (United States)

    Castro, Mariana; Castro, Sílvia; Loureiro, João

    2012-01-01

    In the last decade, genomic studies using DNA markers have strongly influenced the current phylogeny of angiosperms. Genome size and ploidy level have contributed to this discussion, being considered important characters in biosystematics, ecology and population biology. Despite the recent increase in studies related to genome size evolution and polyploidy incidence, only a few are available for Scrophulariaceae. In this context, we assessed the value of genome size, mostly as a taxonomic marker, and the role of polyploidy as a process of genesis and maintenance of plant diversity in Scrophulariaceae sensu lato in the Iberian Peninsula. Large-scale analyses of genome size and ploidy-level variation across the Iberian Peninsula were performed using flow cytometry. One hundred and sixty-two populations of 59 distinct taxa were analysed. A bibliographic review of chromosome counts was also performed. Of the 59 sampled taxa, 51 represent first estimates of genome size. The majority of the Scrophulariaceae species presented very small to small genome sizes (2C ≤ 7.0 pg). Furthermore, in most of the analysed genera it was possible to use this character to separate several taxa, regardless of whether these genera were homoploid or heteroploid. Also, some genome-related phenomena were detected, such as intraspecific variation of genome size in some genera and the possible occurrence of dysploidy in Verbascum spp. With respect to polyploidy, despite a few new DNA ploidy levels having been detected in Veronica, no multiple cytotypes have been found in any taxa. This work contributes basic scientific knowledge on genome size and polyploid incidence in the Scrophulariaceae, providing background information and opening several perspectives for future studies.

  14. Insights into the genome structure and copy-number variation of Eimeria tenella

    Directory of Open Access Journals (Sweden)

    Lim Lik-Sin

    2012-08-01

    Background: Eimeria is a genus of parasites in the same phylum (Apicomplexa) as human parasites such as Toxoplasma, Cryptosporidium and the malaria parasite Plasmodium. As an apicomplexan whose life-cycle involves a single host, Eimeria is a convenient model for understanding this group of organisms. Although the genomes of the Apicomplexa are diverse, that of Eimeria is unique in being composed of large alternating blocks of sequence with very different characteristics, an arrangement seen in no other organism. This arrangement has impeded efforts to fully sequence the genome of Eimeria, which remains the last of the major apicomplexans to be fully analyzed. In order to increase the value of the genome sequence data and aid in the effort to gain a better understanding of the Eimeria tenella genome, we constructed a whole genome map for the parasite. Results: A total of 1245 contigs representing 70.0% of the whole genome assembly sequences (Wellcome Trust Sanger Institute) were selected and subjected to marker selection. Subsequently, 2482 HAPPY markers were developed and typed. Of these, 795 were considered usable markers and were utilized in the construction of a HAPPY map. Markers developed from chromosomally-assigned genes were then integrated into the HAPPY map, and this aided the assignment of a number of linkage groups to their respective chromosomes. BAC-end sequences and contigs from whole genome sequencing were also integrated to improve and validate the HAPPY map. This resulted in an integrated HAPPY map consisting of 60 linkage groups that covers approximately half of the estimated 60 Mb genome. Further analysis suggests that the segmental organization first seen in Chromosome 1 is present throughout the genome, with repeat-poor (P) regions alternating with repeat-rich (R) regions. Evidence of copy-number variation between strains was also uncovered. Conclusions: This paper describes the application of a whole genome mapping

  15. Variation in genome composition of blue-aleurone wheat

    Czech Academy of Sciences Publication Activity Database

    Burešová, Veronika; Kopecký, David; Bartoš, Jan; Martinek, P.; Watanabe, N.; Vyhnánek, T.; Doležel, Jaroslav

    2015-01-01

    Vol. 128, No. 2 (2015), pp. 273-282 ISSN 0040-5752 R&D Projects: GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords: TRITICUM-AESTIVUM L * COMMON WHEAT * THINOPYRUM-PONTICUM Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.900, year: 2015

  16. Genomes to life project quarterly report June 2004.

    Energy Technology Data Exchange (ETDEWEB)

    Heffelfinger, Grant S.

    2005-01-01

    This SAND report provides the technical progress through June 2004 of the Sandia-led project, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling", funded by the DOE Office of Science Genomes to Life Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacterium known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, and identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes.

  17. The GenABEL Project for statistical genomics.

    Science.gov (United States)

    Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.

  18. The human genome project and the Catholic Church (1)

    Science.gov (United States)

    Moraczewski, Albert S

    1991-12-01

    The Catholic Church has not made any formal statements about the Human Genome Project as such. But the present Pope, John Paul II, has commented, albeit very briefly, on various aspects of genetic manipulation. Genetic interventions which are therapeutic (e.g. gene therapy), namely, directed to the correction or amelioration of a disorder, are acceptable, in principle, provided they promote the personal well-being of the individual being so treated. Genetic interventions which are not therapeutic for the specific individual involved but are experimental and directed primarily to improving humans as biological entities are of dubious moral probity, but are not necessarily to be totally rejected out of hand. To be morally acceptable such genetic intervention should meet certain conditions which include due respect for the given psychological nature of each individual human being. In addition, no harm should be inflicted on the process of human generation, and its fundamental design should not be altered. Any genetic manipulation which results in, or tends to, the creation of groups with different qualities such that there would result a fresh marginalization of these people must be avoided. It has also been suggested by a few that because the Son of God took on a human nature in Jesus Christ, one may not so alter the human genome that a new distinct species would be created....

  19. The UK Human Genome Mapping Project online computing service.

    Science.gov (United States)

    Rysavy, F R; Bishop, M J; Gibbs, G P; Williams, G W

    1992-04-01

    This paper presents an overview of computing and networking facilities developed by the Medical Research Council to provide online computing support to the Human Genome Mapping Project (HGMP) in the UK. The facility is connected to a number of other computing facilities in various centres of genetics and molecular biology research excellence, either directly via high-speed links or through national and international wide-area networks. The paper describes the design and implementation of the current system, a 'client/server' network of Sun, IBM, DEC and Apple servers, gateways and workstations. A short outline of online computing services currently delivered by this system to the UK human genetics research community is also provided. More information about the services and their availability could be obtained by a direct approach to the UK HGMP-RC.

  20. The impact of the human genome project on risk assessment

    International Nuclear Information System (INIS)

    Katarzyna Doerffer; Paul Unrau.

    1996-01-01

    The radiation protection approach to risk assessment assumes that cancer induction following radiation exposure is purely random. Present risk assessment methods derive risk from cancer incidence frequencies in exposed populations and associate disease outcomes totally with the level of exposure to ionizing radiation. Exposure defines a risk factor that affects the probability of the disease outcome. But cancer risk can be affected by other risk factors such as underlying genetic factors (predisposition) of the exposed organism. These genetic risk factors are now becoming available for incorporation into ionizing radiation risk assessment. Progress in the Human Genome Project (HGP) will lead to direct assays to measure the effects of genetic risk determinants in disease outcomes. When all genetic risk determinants are known and incorporated into risk assessment, it will be possible to reevaluate the role of ionizing radiation in the causation of cancer. (author)

  1. Genomes to Life Project Quarterly Report October 2004.

    Energy Technology Data Exchange (ETDEWEB)

    Heffelfinger, Grant S.; Martino, Anthony; Rintoul, Mark Daniel; Geist, Al; Gorin, Andrey; Xu, Ying; Palenik, Brian

    2005-02-01

    This SAND report provides the technical progress through October 2004 of the Sandia-led project, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling," funded by the DOE Office of Science Genomes to Life Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacterium known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, and identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop

  2. Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

    DEFF Research Database (Denmark)

    Hou, Yong; Wu, Kui; Shi, Xulian

    2015-01-01

    methods, focusing particularly on variations detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly...... performance using SCRS amplified by different WGA methods. It will guide researchers to determine which WGA method is best suited to individual experimental needs at single-cell level....

  3. Genomic variation among populations of threatened coral: Acropora cervicornis.

    Science.gov (United States)

    Drury, C; Dale, K E; Panlilio, J M; Miller, S V; Lirman, D; Larson, E A; Bartels, E; Crawford, D L; Oleksiak, M F

    2016-04-13

    Acropora cervicornis, a threatened, keystone reef-building coral has undergone severe declines (>90 %) throughout the Caribbean. These declines could reduce genetic variation and thus hamper the species' ability to adapt. Active restoration strategies are a common conservation approach to mitigate species' declines and require genetic data on surviving populations to efficiently respond to declines while maintaining the genetic diversity needed to adapt to changing conditions. To evaluate active restoration strategies for the staghorn coral, the genetic diversity of A. cervicornis within and among populations was assessed in 77 individuals collected from 68 locations along the Florida Reef Tract (FRT) and in the Dominican Republic. Genotyping by Sequencing (GBS) identified 4,764 single nucleotide polymorphisms (SNPs). Pairwise nucleotide differences (π) within a population are large (~37 %) and similar to π across all individuals. This high level of genetic diversity along the FRT is similar to the diversity within a small, isolated reef. Much of the genetic diversity (>90 %) exists within a population, yet GBS analysis shows significant variation along the FRT, including 300 SNPs with significant FST values and significant divergence relative to distance. There are also significant differences in SNP allele frequencies over small spatial scales, exemplified by the large FST values among corals collected within Miami-Dade county. Large standing diversity was found within each population even after recent declines in abundance, including significant, potentially adaptive divergence over short distances. The data here inform conservation and management actions by uncovering population structure and high levels of diversity maintained within coral collections among sites previously shown to have little genetic divergence. More broadly, this approach demonstrates the power of GBS to resolve differences among individuals and identify subtle genetic structure

  4. Phylogeny, rate variation, and genome size evolution of Pelargonium (Geraniaceae).

    Science.gov (United States)

    Weng, Mao-Lun; Ruhlman, Tracey A; Gibby, Mary; Jansen, Robert K

    2012-09-01

    The phylogeny of 58 Pelargonium species was estimated using five plastid markers (rbcL, matK, ndhF, rpoC1, trnL-F) and one mitochondrial gene (nad5). The results confirmed the monophyly of three major clades and four subclades within Pelargonium but also indicate the need to revise some sectional classifications. This phylogeny was used to examine karyotype evolution in the genus: plotting chromosome sizes, numbers and 2C-values indicates that genome size is significantly correlated with chromosome size but not number. Accelerated rates of nucleotide substitution have been previously detected in both plastid and mitochondrial genes in Pelargonium, but sparse taxon sampling did not enable identification of the phylogenetic distribution of these elevated rates. Using the multigene phylogeny as a constraint, we investigated lineage- and locus-specific heterogeneity of substitution rates in Pelargonium for an expanded number of taxa and demonstrated that both plastid and mitochondrial genes have had accelerated substitution rates but with markedly disparate patterns. In the plastid, the exons of rpoC1 have significantly accelerated substitution rates compared to its intron and the acceleration was mainly due to nonsynonymous substitutions. In contrast, the mitochondrial gene, nad5, experienced substantial acceleration of synonymous substitution rates in three internal branches of Pelargonium, but this acceleration ceased in all terminal branches. Several lineages also have dN/dS ratios significantly greater than one for rpoC1, indicating that positive selection is acting on this gene, whereas the accelerated synonymous substitutions in the mitochondrial gene are the result of elevated mutation rates. Published by Elsevier Inc.

  5. The Epilepsy Phenome/Genome Project (EPGP) informatics platform.

    Science.gov (United States)

    Nesbitt, Gerry; McKenna, Kevin; Mays, Vickie; Carpenter, Alan; Miller, Kevin; Williams, Michael

    2013-04-01

    The Epilepsy Phenome/Genome Project (EPGP) is a large-scale, multi-institutional, collaborative network of 27 epilepsy centers throughout the U.S., Australia, and Argentina, with the objective of collecting detailed phenotypic and genetic data on a large number of epilepsy participants. The goals of EPGP are (1) to perform detailed phenotyping on 3750 participants with specific forms of non-acquired epilepsy and 1500 parents without epilepsy, (2) to obtain DNA samples on these individuals, and (3) to ultimately genotype the samples in order to discover novel genes that cause epilepsy. To carry out the project, a reliable and robust informatics platform was needed for standardized electronic data collection and storage, data quality review, and phenotypic analysis involving cases from multiple sites. EPGP developed its own suite of web-based informatics applications for participant tracking, electronic data collection (using electronic case report forms/surveys), data management, phenotypic data review and validation, specimen tracking, electroencephalograph and neuroimaging storage, and issue tracking. We implemented procedures to train and support end-users at each clinical site. Thus far, 3780 study participants have been enrolled and 20,957 web-based study activities have been completed using this informatics platform. Over 95% of respondents to an end-user satisfaction survey felt that the informatics platform was successful almost always or most of the time. The EPGP informatics platform has successfully and effectively allowed study management and efficient and reliable collection of phenotypic data. Our novel informatics platform met the requirements of a large, multicenter research project. The platform has had a high level of end-user acceptance by principal investigators and study coordinators, and can serve as a model for new tools to support future large scale, collaborative research projects collecting extensive phenotypic data. Copyright © 2012

  6. Quality of the restricted variation after projection method with angular momentum projection

    International Nuclear Information System (INIS)

    Rodriguez, Tomas R.; Egido, J.L.; Robledo, L.M.; Rodriguez-Guzman, R.

    2005-01-01

    Recently, the restricted angular momentum variation after projection method, using the quadrupole degree of freedom as a variational coordinate in conjunction with effective interactions of the Skyrme or Gogny type, has been used very successfully to study a variety of phenomena concerning the quadrupole degree of freedom. In this paper, we study the quality of such an approach by considering additional degrees of freedom as variational coordinates: the hexadecapole moment and the fluctuations on the quadrupole moment, particle number, and angular momentum operators. The study has been performed with the Gogny interaction (D1S parametrization) for the nuclei 32Mg and 34Mg. The results of the angular momentum projection and the subsequent generator coordinate calculations show that the extra degrees of freedom considered are irrelevant for the description of the lowest lying states for each angular momentum.

  7. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.
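
    As a rough illustration of why such deep coverage matters for low-frequency variants, the arithmetic below estimates how many reads would be expected to carry a variant present at 0.025% when the average coverage is 124,221X. The function name is hypothetical, and the calculation assumes uniform coverage and ignores sequencing error, which the abstract does not discuss.

```python
def expected_variant_reads(mean_coverage: float, variant_frequency: float) -> float:
    """Expected number of reads supporting a variant at one site.

    Assumes every read at the site is an independent draw from the
    population, so the expectation is simply coverage * frequency.
    """
    return mean_coverage * variant_frequency


# Figures taken from the abstract: 124,221X coverage, 0.025% frequency.
reads = expected_variant_reads(124_221, 0.025 / 100)
print(f"about {reads:.0f} supporting reads expected per site")
```

    Under these simplifying assumptions a variant at the reported detection limit would still be backed by a few dozen reads.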

  8. Ethical considerations of research policy for personal genome analysis: the approach of the Genome Science Project in Japan.

    Science.gov (United States)

    Minari, Jusaku; Shirai, Tetsuya; Kato, Kazuto

    2014-12-01

    As evidenced by high-throughput sequencers, genomic technologies have recently undergone radical advances. These technologies enable comprehensive sequencing of personal genomes considerably more efficiently and less expensively than heretofore. These developments present a challenge to the conventional framework of biomedical ethics; under these changing circumstances, each research project has to develop a pragmatic research policy. Based on the experience with a new large-scale project, the Genome Science Project, this article presents a novel approach to conducting a specific policy for personal genome research in the Japanese context. In creating an original informed-consent form template for the project, we present a two-tiered process: making the draft of the template following an analysis of national and international policies; refining the draft template in conjunction with genome project researchers for practical application. Through practical use of the template, we have gained valuable experience in addressing challenges in the ethical review process, such as the importance of sharing details of the latest developments in genomics with members of research ethics committees. We discuss certain limitations of the conventional concept of informed consent and its governance system and suggest the potential of an alternative process using information technology.

  9. Human Genome Teacher Networking Project, Final Report, April 1, 1992 - March 31, 1998

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Debra

    1999-10-01

    Project to provide education regarding the ethical, legal, and social implications of the Human Genome Project to high school science teachers through two consecutive summer workshops, in-class activities, and peer teaching workshops.

  10. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome

    DEFF Research Database (Denmark)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui

    2011-01-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries...... these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination...... and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations...

  11. The lawful uses of knowledge from the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Grad, F.P.

    1994-04-15

    Part I of this study deals with the right to know or not to know personal genetic information, and examines available legal protections of the right of privacy and the adverse effect of the disclosure of genetic information both on employment and insurance interests and on self esteem and protection of personal integrity. The study examines the rationale for the legal protection of privacy as the protection of a public interest. It examines the very limited protections currently available for privacy interests, including genetic privacy interests, and concludes that there is a need for broader, more far-reaching legal protections. The second part of the study is based on the assumption that as major a project as the Human Genome Project, spending billions of dollars on science which is health related, will indeed be applied for preventive and therapeutic public health purposes, as it has been in the past. It also addresses the recurring fear that public health initiatives in the genetic area must evolve a new eugenic agenda, that we must not repeat the miserable discriminatory experiences of the past.

  12. Extreme Recombination Frequencies Shape Genome Variation and Evolution in the Honeybee, Apis mellifera

    Science.gov (United States)

    Wallberg, Andreas; Glémin, Sylvain; Webster, Matthew T.

    2015-01-01

    Meiotic recombination is a fundamental cellular process, with important consequences for evolution and genome integrity. However, we know little about how recombination rates vary across the genomes of most species and the molecular and evolutionary determinants of this variation. The honeybee, Apis mellifera, has extremely high rates of meiotic recombination, although the evolutionary causes and consequences of this are unclear. Here we use patterns of linkage disequilibrium in whole genome resequencing data from 30 diploid honeybees to construct a fine-scale map of rates of crossing over in the genome. We find that, in contrast to vertebrate genomes, the recombination landscape is not strongly punctate. Crossover rates strongly correlate with levels of genetic variation, but not divergence, which indicates a pervasive impact of selection on the genome. Germ-line methylated genes have reduced crossover rate, which could indicate a role of methylation in suppressing recombination. Controlling for the effects of methylation, we do not infer a strong association between gene expression patterns and recombination. The site frequency spectrum is strongly skewed from neutral expectations in honeybees: rare variants are dominated by AT-biased mutations, whereas GC-biased mutations are found at higher frequencies, indicative of a major influence of GC-biased gene conversion (gBGC), which we infer to generate an allele fixation bias 5 – 50 times the genomic average estimated in humans. We uncover further evidence that this repair bias specifically affects transitions and favours fixation of CpG sites. Recombination, via gBGC, therefore appears to have profound consequences on genome evolution in honeybees and interferes with the process of natural selection. These findings have important implications for our understanding of the forces driving molecular evolution. PMID:25902173

  13. Genome size variation among and within Camellia species by using flow cytometric analysis.

    Directory of Open Access Journals (Sweden)

    Hui Huang

    Full Text Available BACKGROUND: The genus Camellia, belonging to the family Theaceae, is an economically important group of flowering plants. Frequent interspecific hybridization together with polyploidization has made them taxonomically "difficult taxa". The DNA content is often used to measure genome size variation and has largely advanced our understanding of plant evolution and genome variation. The goals of this study were to investigate patterns of interspecific and intraspecific variation of DNA contents and further explore genome size evolution in a phylogenetic context of the genus. METHODOLOGY/PRINCIPAL FINDINGS: The DNA amount in the genus was determined by using propidium iodide flow cytometry analysis for a total of 139 individual plants representing almost all sections of the two subgenera, Camellia and Thea. An improved WPB buffer was proven to be suitable for the Camellia species; it was able to counteract the negative effects of secondary metabolites and generated high-quality results with low coefficient of variation values (CV <5%). Our results showed that tissue type (flowers, leaves, or buds) and cytosolic compounds had only trivial effects on the estimation of DNA amount. The DNA content of C. sinensis var. assamica was estimated to be 1C = 3.01 pg by flow cytometric analysis, which is equal to a genome size of about 2940 Mb. CONCLUSION: Intraspecific and interspecific variations were observed in the genus Camellia, and as expected, the latter was larger than the former. Our study suggests a directional trend of increasing genome size in the genus Camellia, probably owing to frequent polyploidization events.
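
    The conversion from DNA mass to genome size quoted in this record can be checked with a short calculation. The sketch below assumes the commonly used factor of 1 pg of DNA being roughly 978 Mb; the abstract does not state which factor the authors applied, and the function and constant names are illustrative only.

```python
# Assumption: 1 pg of double-stranded DNA corresponds to about 978 Mb.
# This widely used factor is not stated in the abstract itself.
MB_PER_PG = 978.0


def picograms_to_megabases(c_value_pg: float) -> float:
    """Convert a 1C DNA content in picograms to a genome size in Mb."""
    return c_value_pg * MB_PER_PG


# 1C value reported for C. sinensis var. assamica in the abstract.
size_mb = picograms_to_megabases(3.01)
print(f"{size_mb:.0f} Mb")
```

    With this factor, 3.01 pg comes out near 2944 Mb, in line with the "about 2940 Mb" given in the abstract.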

  14. Enabling a Community to Dissect an Organism: Overview of the Neurospora Functional Genomics Project

    OpenAIRE

    Dunlap, Jay C.; Borkovich, Katherine A.; Henn, Matthew R.; Turner, Gloria E.; Sachs, Matthew S.; Glass, N. Louise; McCluskey, Kevin; Plamann, Michael; Galagan, James E.; Birren, Bruce W.; Weiss, Richard L.; Townsend, Jeffrey P.; Loros, Jennifer J.; Nelson, Mary Anne; Lambreghts, Randy

    2007-01-01

    A consortium of investigators is engaged in a functional genomics project centered on the filamentous fungus Neurospora, with an eye to opening up the functional genomic analysis of all the filamentous fungi. The overall goal of the four interdependent projects in this effort is to accomplish functional genomics, annotation, and expression analyses of Neurospora crassa, a filamentous fungus that is an established model for the assemblage of over 250,000 species of nonyeast fungi. Building fr...

  15. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Science.gov (United States)

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  16. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Directory of Open Access Journals (Sweden)

    Jiří Macas

    Full Text Available The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  17. Genomic Diversity Using Copy Number Variations in Worldwide Chicken Populations

    Directory of Open Access Journals (Sweden)

    Erica Gorla

    2018-06-01

    Full Text Available Recently, many studies in livestock have focused on the identification of Copy Number Variants (CNVs) using high-density Single Nucleotide Polymorphism (SNP) arrays, but few have focused on studying chicken ecotypes coming from many locations. CNVs are polymorphisms that may influence phenotype and are an important source of genetic variation in populations. The aim of this study was to explore the genetic difference and structure, using a high density SNP chip in 936 individuals from seven different countries (Brazil, Italy, Egypt, Mexico, Rwanda, Sri Lanka and Uganda). The DNA was genotyped with the Affymetrix Axiom® 600k Chicken Genotyping Array and processed with stringent quality controls to obtain 559,201 SNPs in 915 individuals. The Log R Ratio (LRR) and the B Allele Frequency of SNPs were used to perform the CNV calling with PennCNV software based on a Hidden Markov Model analysis and the LRR was used to perform CNV detection with SVS Golden Helix software. After filtering, a total of 19,027 CNVs were detected with the SVS software, while 9,065 CNVs were identified with the PennCNV software. The CNVs were summarized in 7,001 Copy Number Variant Regions (CNVRs) and 4,414 CNVRs, using the software BedTool. The consensus analysis across the CNVRs allowed the identification of 2,820 consensus CNVRs, of which 1,721 were gains, 637 losses and 462 complex, for a total length of 53 Mb corresponding to 5% of the GalGal5 chicken autosomes. Only the consensus CNV regions obtained from both detections were considered for further analysis. The intersection analysis performed between the chicken gene database (Gallus_gallus-5.0) and the 1,927 consensus CNVRs allowed the identification (within or partial overlap) of a total of 2,354 unique genes with an official gene ID. The CNVRs identified here represent the first comprehensive mapping in several worldwide populations, using a high-density SNP chip.
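
    The step of summarizing individual CNV calls into CNV regions can be pictured as merging overlapping intervals on each chromosome. The sketch below is a generic interval merge written for illustration; it is not the pipeline used in the study, and the function name, the tuple layout, and the choice to merge book-ended intervals are all assumptions.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping or touching CNV calls into regions.

    cnvs: iterable of (chrom, start, end) tuples.
    Returns a list of (chrom, start, end) regions sorted by chromosome
    name and start position.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Overlaps (or touches) the previous region: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions.extend((chrom, s, e) for s, e in merged)
    return regions


calls = [("chr1", 100, 200), ("chr1", 150, 300), ("chr1", 400, 500), ("chr2", 10, 20)]
for region in merge_cnvs(calls):
    print(region)
```

    Regions built this way for each calling method could then be intersected to obtain consensus regions, which is the kind of comparison the abstract describes.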

  18. A Perfect Match Genomic Landscape Provides a Unified Framework for the Precise Detection of Variation in Natural and Synthetic Haploid Genomes.

    Science.gov (United States)

    Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo

    2018-04-01

    We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. Copyright © 2018 by the Genetics Society of America.
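
    The idea of a landscape that drops wherever the query differs from the reference can be illustrated with a toy example. The sketch below counts, for each reference position, how many times the fixed-length string starting there occurs exactly in the reads. The use of fixed-length substrings (k-mers), the function name, and the parameter k are assumptions made for illustration and are not taken from the PMGL publication or its pipeline.

```python
from collections import Counter


def perfect_match_landscape(reference: str, reads: list, k: int) -> list:
    """Toy landscape: exact-match count of each reference k-mer in the reads."""
    read_kmers = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            read_kmers[read[i:i + k]] += 1
    # A Counter returns 0 for k-mers never seen in the reads.
    return [read_kmers[reference[i:i + k]] for i in range(len(reference) - k + 1)]


reference = "AACCGGTTAC"
query_read = "AACCGATTAC"  # differs from the reference at index 5 (G -> A)
print(perfect_match_landscape(reference, [query_read], k=4))
```

    In this toy case the counts fall to zero for every window that covers the substituted base and recover on either side, which is the kind of signature of variation the abstract refers to.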

  19. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata

    Science.gov (United States)

    Liolios, Konstantinos; Chen, I-Min A.; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor M.; Kyrpides, Nikos C.

    2010-01-01

    The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2009, GOLD contains information for more than 5800 sequencing projects, of which 1100 have been completed and their sequence data deposited in a public repository. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification. GOLD is available at: http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece, at: http://gold.imbb.forth.gr/ PMID:19914934

  20. Bat biology, genomes, and the Bat1K project

    DEFF Research Database (Denmark)

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M

    2018-01-01

    and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any...

  1. Limits of variation, specific infectivity, and genome packaging of massively recoded poliovirus genomes.

    Science.gov (United States)

    Song, Yutong; Gorbatsevych, Oleksandr; Liu, Ying; Mugavero, JoAnn; Shen, Sam H; Ward, Charles B; Asare, Emmanuel; Jiang, Ping; Paul, Aniko V; Mueller, Steffen; Wimmer, Eckard

    2017-10-10

    Computer design and chemical synthesis generated viable variants of poliovirus type 1 (PV1), whose ORF (6,189 nucleotides) carried up to 1,297 "Max" mutations (excess of overrepresented synonymous codon pairs) or up to 2,104 "SD" mutations (randomly scrambled synonymous codons). "Min" variants (excess of underrepresented synonymous codon pairs) are nonviable except for P2 Min, a variant temperature-sensitive at 33 and 39.5 °C. Compared with WT PV1, P2 Min displayed a vastly reduced specific infectivity (si) (WT, 1 PFU/118 particles vs. P2 Min, 1 PFU/35,000 particles), a phenotype that will be discussed broadly. Si of haploid PV presents cellular infectivity of a single genotype. We performed a comprehensive analysis of sequence and structures of the PV genome to determine if evolutionary conserved cis-acting packaging signal(s) were preserved after recoding. We showed that conserved synonymous sites and/or local secondary structures that might play a role in determining packaging specificity do not survive codon pair recoding. This makes it unlikely that numerous "cryptic, sequence-degenerate, dispersed RNA packaging signals mapping along the entire viral genome" [Patel N, et al. (2017) Nat Microbiol 2:17098] play the critical role in poliovirus packaging specificity. Considering all available evidence, we propose a two-step assembly strategy for +ssRNA viruses: step I, acquisition of packaging specificity, either (a) by specific recognition between capsid protein(s) and replication proteins (poliovirus), or (b) by the high affinity interaction of a single RNA packaging signal (PS) with capsid protein(s) (most +ssRNA viruses so far studied); step II, cocondensation of genome/capsid precursors in which an array of hairpin structures plays a role in virion formation.

  2. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Full Text Available Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  3. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Science.gov (United States)

    Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
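
    The core comparison in a pooled design, contrasting allele frequencies between light and dark pools across replicates, can be sketched as follows. This is a descriptive illustration only: the function names and the read-count layout are hypothetical, and a real analysis would apply a formal statistical test across replicates rather than the simple sign check shown here.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Estimate the alternate-allele frequency in a pool from read counts."""
    return alt_reads / total_reads


def frequency_contrast(light, dark):
    """Per-replicate difference (dark minus light) in allele frequency.

    light, dark: lists of (alt_reads, total_reads), one entry per replicate.
    """
    return [allele_frequency(*d) - allele_frequency(*l) for l, d in zip(light, dark)]


def consistent_direction(diffs) -> bool:
    """True if every replicate shows a difference with the same sign."""
    return all(x > 0 for x in diffs) or all(x < 0 for x in diffs)


# Hypothetical read counts for one SNP in two replicate pool pairs.
light_pools = [(10, 100), (20, 100)]
dark_pools = [(60, 100), (50, 100)]
diffs = frequency_contrast(light_pools, dark_pools)
print(diffs, consistent_direction(diffs))
```

    A SNP whose frequency shifts in the same direction in every replicate is the kind of candidate that a replicated pooled design is intended to highlight.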

  4. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome.

    Science.gov (United States)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui; Kim, Su Yeon; Korneliussen, Thorfinn; Vinckenbosch, Nicolas; Tian, Geng; Huerta-Sanchez, Emilia; Feder, Alison F; Grarup, Niels; Jørgensen, Torben; Jiang, Tao; Witte, Daniel R; Sandbæk, Annelli; Hellmann, Ines; Lauritzen, Torsten; Hansen, Torben; Pedersen, Oluf; Wang, Jun; Nielsen, Rasmus

    2011-10-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.

  5. VarB Plus: An Integrated Tool for Visualization of Genome Variation Datasets

    KAUST Repository

    Hidayah, Lailatul

    2012-07-01

    Research on genomic sequences has been improving significantly as more advanced technology for sequencing has been developed. This opens enormous opportunities for sequence analysis. Various analytical tools have been built for purposes such as sequence assembly, read alignments, genome browsing, comparative genomics, and visualization. From the visualization perspective, there is an increasing trend towards use of large-scale computation. However, more than power is required to produce an informative image. This is a challenge that we address by providing several ways of representing biological data in order to advance the inference endeavors of biologists. This thesis focuses on visualization of variations found in genomic sequences. We develop several visualization functions and embed them in an existing variation visualization tool as extensions. The tool we improved is named VarB, hence the nomenclature for our enhancement is VarB Plus. To the best of our knowledge, besides VarB, there is no tool that provides the capability of dynamic visualization of genome variation datasets as well as statistical analysis. Dynamic visualization allows users to toggle different parameters on and off and see the results on the fly. The statistical analysis includes Fixation Index, Relative Variant Density, and Tajima’s D. Hence we focused our efforts on this tool. The scope of our work includes plots of per-base genome coverage, Principal Coordinate Analysis (PCoA), integration with a read alignment viewer named LookSeq, and visualization of geo-biological data. In addition to description of embedded functionalities, significance, and limitations, future improvements are discussed. The result is four extensions embedded successfully in the original tool, which is built on the Qt framework in C++. Hence it is portable to numerous platforms. Our extensions have shown acceptable execution time in a beta testing with various high-volume published datasets, as well as positive
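
    One of the statistics named in this record, Tajima's D, can be computed from three summary values: the number of sampled sequences, the number of segregating sites, and the average number of pairwise differences. The sketch below follows the standard textbook formula; it is not code from VarB or VarB Plus, and the function name and argument names are illustrative.

```python
import math


def tajimas_d(n: int, segregating_sites: int, pi: float) -> float:
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences pi."""
    if n < 4 or segregating_sites < 1:
        # For n < 4 the variance term is zero and D is undefined.
        raise ValueError("need at least 4 sequences and 1 segregating site")
    s = segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: difference between the two estimators of theta.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical inputs: 10 sequences, 16 segregating sites, pi = 3.0.
print(round(tajimas_d(10, 16, 3.0), 3))
```

    A negative value indicates an excess of rare variants relative to the neutral expectation, and a positive value indicates an excess of intermediate-frequency variants.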

  6. Background selection as baseline for nucleotide variation across the Drosophila genome.

    Directory of Open Access Journals (Sweden)

    Josep M Comeron

    2014-06-01

    Full Text Available The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms at 100-kb scale, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and, thus, local BGS effects that may differ between extant and past phases. Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and

  7. Genome size and phenotypic variation of Nymphaea (Nymphaeaceae) species from Eastern Europe and temperate Asia

    Czech Academy of Sciences Publication Activity Database

    Dąbrowska, M. A.; Rola, K.; Volkova, P.; Suda, Jan; Zalewska-Gałosz, J.

    2015-01-01

    Vol. 84, No. 2 (2015), pp. 277-286 ISSN 0001-6977 R&D Projects: GA ČR GB14-36079G Institutional support: RVO:67985939 Keywords: flow cytometry * genome size * morphometrics Subject RIV: EF - Botanics Impact factor: 1.213, year: 2015

  8. How genome size variation is linked with evolution within Chenopodium sensu lato

    Czech Academy of Sciences Publication Activity Database

    Mandák, Bohumil; Krak, Karol; Vít, Petr; Pavlíková, Zuzana; Lomonosova, M. N.; Habibi, Farzaneh; Lei, Wang; Jellen, E.N.; Douda, Jan

    2016-01-01

    Vol. 23, Dec 2016 (2016), pp. 18-32 ISSN 1433-8319 R&D Projects: GA ČR GA13-02290S Institutional support: RVO:67985939 Keywords: Chenopodium * genome size evolution * flow cytometry Subject RIV: EF - Botanics Impact factor: 3.123, year: 2016

  9. Human Variome Project Quality Assessment Criteria for Variation Databases.

    Science.gov (United States)

    Vihinen, Mauno; Hancock, John M; Maglott, Donna R; Landrum, Melissa J; Schaafsma, Gerard C P; Taschner, Peter

    2016-06-01

    Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or groups of genes for a certain disease(s). These databases are widely considered as the most reliable information source for a particular gene/protein/disease, but it should also be made clear they may have widely varying contents, infrastructure, and quality. Quality is very important to evaluate because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The HVP quality evaluation criteria that resulted are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and how implementation of the quality scheme can be achieved. Examples are provided for the current status of the quality items in two different databases, BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance. © 2016 WILEY PERIODICALS, INC.

  10. Genomic analysis of natural selection and phenotypic variation in high-altitude Mongolians.

    Directory of Open Access Journals (Sweden)

    Jinchuan Xing

    Full Text Available Deedu (DU) Mongolians, who migrated from the Mongolian steppes to the Qinghai-Tibetan Plateau approximately 500 years ago, are challenged by environmental conditions similar to native Tibetan highlanders. Identification of adaptive genetic factors in this population could provide insight into coordinated physiological responses to this environment. Here we examine genomic and phenotypic variation in this unique population and present the first complete analysis of a Mongolian whole-genome sequence. High-density SNP array data demonstrate that DU Mongolians share genetic ancestry with other Mongolian as well as Tibetan populations, specifically in genomic regions related to adaptation to high altitude. Several selection candidate genes identified in DU Mongolians are shared with other Asian groups (e.g., EDAR), neighboring Tibetan populations (including high-altitude candidates EPAS1, PKLR, and CYP2E1), as well as genes previously hypothesized to be associated with metabolic adaptation (e.g., PPARG). Hemoglobin concentration, a trait associated with high-altitude adaptation in Tibetans, is at an intermediate level in DU Mongolians compared to Tibetans and Han Chinese at comparable altitude. Whole-genome sequence from a DU Mongolian (Tianjiao1) shows that about 2% of the genomic variants, including more than 300 protein-coding changes, are specific to this individual. Our analyses of DU Mongolians and the first Mongolian genome provide valuable insight into genetic adaptation to extreme environments.

  11. Genome-wide detection of copy number variations among diverse horse breeds by array CGH.

    Directory of Open Access Journals (Sweden)

    Wei Wang

    Full Text Available Recent studies have found that copy number variations (CNVs) are widespread in human and animal genomes. CNVs are a significant source of genetic variation, and have been shown to be associated with phenotypic diversity. However, the effect of CNVs on genetic variation in horses is not well understood. In the present study, CNVs in 6 different breeds of mare horses, Mongolia horse, Abaga horse, Hequ horse and Kazakh horse (all plateau breeds) and Debao pony and Thoroughbred, were determined using aCGH. In total, seven hundred CNVs were identified ranging in size from 6.1 Kb to 0.57 Mb across all autosomes, with an average size of 43.08 Kb and a median size of 15.11 Kb. By merging overlapping CNVs, we found a total of three hundred and fifty-three CNV regions (CNVRs). The length of the CNVRs ranged from 6.1 Kb to 1.45 Mb with average and median sizes of 38.49 Kb and 13.1 Kb. Collectively, 13.59 Mb of copy number variation was identified among the horses investigated and accounted for approximately 0.61% of the horse genome sequence. Five hundred and eighteen annotated genes were affected by CNVs, which corresponded to about 2.26% of all horse genes. Through the gene ontology (GO), genetic pathway analysis and comparison of CNV genes among different breeds, we found evidence that CNVs involving 7 genes may be related to the adaptation of these plateau horses to a severe environment. This study is the first report of copy number variations in Chinese horses, which indicates that CNVs are ubiquitous in the horse genome and influence many biological processes of the horse. These results will be helpful not only in mapping the horse whole-genome CNVs, but also for further research on the adaptation of plateau horses to the severe high-altitude environment.
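
    The step of merging overlapping CNVs into CNV regions can be illustrated with a standard interval merge. This is a minimal sketch only: the tuple layout (chromosome, start, end), the function name and the rule that touching intervals are merged are assumptions made for illustration, not details taken from the study.

```python
from typing import List, Tuple

# Hypothetical record layout: (chromosome, start, end), coordinates in base pairs.
Interval = Tuple[str, int, int]


def merge_cnvs(cnvs: List[Interval]) -> List[Interval]:
    """Merge overlapping CNV calls on the same chromosome into regions."""
    merged: List[Interval] = []
    # Sorting by chromosome, then start, guarantees that any overlap
    # can only occur with the most recently emitted region.
    for chrom, start, end in sorted(cnvs):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 120, 180),
]
regions = merge_cnvs(calls)
```

    With the four illustrative calls above, the two overlapping calls on chr1 collapse into one region and the remaining calls are kept unchanged.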

  12. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Directory of Open Access Journals (Sweden)

    McGuire Patrick E

    2010-12-01

    chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.

  13. Variations and classification of toxic epitopes related to celiac disease among α-gliadin genes from four Aegilops genomes.

    Science.gov (United States)

    Li, Jie; Wang, Shunli; Li, Shanshan; Ge, Pei; Li, Xiaohui; Ma, Wujun; Zeller, F J; Hsam, Sai L K; Yan, Yueming

    2012-07-01

    The α-gliadins are associated with human celiac disease. A total of 23 noninterrupted full open reading frame α-gliadin genes and 19 pseudogenes were cloned and sequenced from C, M, N, and U genomes of four diploid Aegilops species. Sequence comparison of α-gliadin genes from Aegilops and Triticum species demonstrated an existence of extensive allelic variations in Gli-2 loci of the four Aegilops genomes. Specific structural features were found including the compositions and variations of two polyglutamine domains (QI and QII) and four T cell stimulatory toxic epitopes. The mean numbers of glutamine residues in the QI domain in C and N genomes and the QII domain in C, N, and U genomes were much higher than those in Triticum genomes, and the QI domain in C and N genomes and the QII domain in C, M, N, and U genomes displayed greater length variations. Interestingly, the types and numbers of four T cell stimulatory toxic epitopes in α-gliadins from the four Aegilops genomes were significantly less than those from Triticum A, B, D, and their progenitor genomes. Relationships between the structural variations of the two polyglutamine domains and the distributions of four T cell stimulatory toxic epitopes were found, resulting in the α-gliadin genes from the Aegilops and Triticum genomes to be classified into three groups.

  14. The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata

    Science.gov (United States)

    Pagani, Ioanna; Liolios, Konstantinos; Jansson, Jakob; Chen, I-Min A.; Smirnova, Tatyana; Nosrat, Bahador; Markowitz, Victor M.; Kyrpides, Nikos C.

    2012-01-01

    The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11 472 sequencing projects, of which 2907 have been completed and their sequence data has been deposited in a public repository. Out of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about any (x) Sequence specification and beyond. PMID:22135293

  15. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Science.gov (United States)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  16. Meiotic gene-conversion rate and tract length variation in the human genome.

    Science.gov (United States)

    Padhukasahasram, Badri; Rannala, Bruce

    2013-02-27

    Meiotic recombination occurs in the form of two different mechanisms called crossing-over and gene-conversion and both processes have an important role in shaping genetic variation in populations. Although variation in crossing-over rates has been studied extensively using sperm-typing experiments, pedigree studies and population genetic approaches, our knowledge of variation in gene-conversion parameters (i.e., rates and mean tract lengths) remains far from complete. To explore variability in population gene-conversion rates and its relationship to crossing-over rate variation patterns, we have developed and validated using coalescent simulations a comprehensive Bayesian full-likelihood method that can jointly infer crossing-over and gene-conversion rates as well as tract lengths from population genomic data under general variable rate models with recombination hotspots. Here, we apply this new method to SNP data from multiple human populations and attempt to characterize for the first time the fine-scale variation in gene-conversion parameters along the human genome. We find that the estimated ratio of gene-conversion to crossing-over rates varies considerably across genomic regions as well as between populations. However, there is a great degree of uncertainty associated with such estimates. We also find substantial evidence for variation in the mean conversion tract length. The estimated tract lengths did not show any negative relationship with the local heterozygosity levels in our analysis. European Journal of Human Genetics advance online publication, 27 February 2013; doi:10.1038/ejhg.2013.30.

  17. Distribution and diversity of cytotypes in Dianthus broteri as evidenced by genome size variations.

    Science.gov (United States)

    Balao, Francisco; Casimiro-Soriguer, Ramón; Talavera, María; Herrera, Javier; Talavera, Salvador

    2009-10-01

    Studying the spatial distribution of cytotypes and genome size in plants can provide valuable information about the evolution of polyploid complexes. Here, the spatial distribution of cytological races and the amount of DNA in Dianthus broteri, an Iberian carnation with several ploidy levels, is investigated. Sample chromosome counts and flow cytometry (using propidium iodide) were used to determine overall genome size (2C value) and ploidy level in 244 individuals of 25 populations. Both fresh and dried samples were investigated. Differences in 2C and 1Cx values among ploidy levels within biogeographical provinces were tested using ANOVA. Geographical correlations of genome size were also explored. Extensive variation in chromosome numbers (2n = 2x = 30, 2n = 4x = 60, 2n = 6x = 90 and 2n = 12x = 180) was detected, and the dodecaploid cytotype is reported for the first time in this genus. As regards cytotype distribution, six populations were diploid, 11 were tetraploid, three were hexaploid and five were dodecaploid. Except for one diploid population containing some triploid plants (2n = 45), the remaining populations showed a single cytotype. Diploids appeared in two disjunct areas (south-east and south-west), and so did tetraploids (although with a considerably wider geographic range). Dehydrated leaf samples provided reliable measurements of DNA content. Genome size varied significantly among some cytotypes, and also extensively within diploid (up to 1.17-fold) and tetraploid (1.22-fold) populations. Nevertheless, variations were not straightforwardly congruent with ecology and geographical distribution. Dianthus broteri shows the highest diversity of cytotypes known to date in the genus Dianthus. Moreover, some cytotypes present remarkable internal genome size variation. The evolution of the complex is discussed in terms of autopolyploidy, with primary and secondary contact zones.

  18. Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

    Science.gov (United States)

    Song, Yun S.

    2012-01-01

    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and

  19. Chromosome Numbers and Genome Size Variation in Indian Species of Curcuma (Zingiberaceae)

    Science.gov (United States)

    Leong-Škorničková, Jana; Šída, Otakar; Jarolímová, Vlasta; Sabu, Mamyil; Fér, Tomáš; Trávníček, Pavel; Suda, Jan

    2007-01-01

    Background and Aims: Genome size and chromosome numbers are important cytological characters that significantly influence various organismal traits. However, geographical representation of these data is seriously unbalanced, with tropical and subtropical regions being largely neglected. In the present study, an investigation was made of chromosomal and genome size variation in the majority of Curcuma species from the Indian subcontinent, and an assessment was made of the value of these data for taxonomic purposes. Methods: Genome size of 161 homogeneously cultivated plant samples classified into 51 taxonomic entities was determined by propidium iodide flow cytometry. Chromosome numbers were counted in actively growing root tips using conventional rapid squash techniques. Key Results: Six different chromosome counts (2n = 22, 42, 63, >70, 77 and 105) were found, the last two representing new generic records. The 2C-values varied from 1·66 pg in C. vamana to 4·76 pg in C. oligantha, representing a 2·87-fold range. Three groups of taxa with significantly different homoploid genome sizes (Cx-values) and distinct geographical distribution were identified. Five species exhibited intraspecific variation in nuclear DNA content, reaching up to 15·1 % in cultivated C. longa. Chromosome counts and genome sizes of three Curcuma-like species (Hitchenia caulina, Kaempferia scaposa and Paracautleya bhatii) corresponded well with typical hexaploid (2n = 6x = 42) Curcuma spp. Conclusions: The basic chromosome number in the majority of Indian taxa (belonging to subgenus Curcuma) is x = 7; published counts correspond to 6x, 9x, 11x, 12x and 15x ploidy levels. Only a few species-specific C-values were found, but karyological and/or flow cytometric data may support taxonomic decisions in some species alliances with morphological similarities. Close evolutionary relationships among some cytotypes are suggested based on the similarity in homoploid genome sizes and geographical grouping

  20. Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster.

    Science.gov (United States)

    Machado, Heather E; Bergland, Alan O; O'Brien, Katherine R; Behrman, Emily L; Schmidt, Paul S; Petrov, Dmitri A

    2016-02-01

    Examples of clinal variation in phenotypes and genotypes across latitudinal transects have served as important models for understanding how spatially varying selection and demographic forces shape variation within species. Here, we examine the selective and demographic contributions to latitudinal variation through the largest comparative genomic study to date of Drosophila simulans and Drosophila melanogaster, with genomic sequence data from 382 individual fruit flies, collected across a spatial transect of 19 degrees latitude and at multiple time points over 2 years. Consistent with phenotypic studies, we find less clinal variation in D. simulans than D. melanogaster, particularly for the autosomes. Moreover, we find that clinally varying loci in D. simulans are less stable over multiple years than comparable clines in D. melanogaster. D. simulans shows a significantly weaker pattern of isolation by distance than D. melanogaster and we find evidence for a stronger contribution of migration to D. simulans population genetic structure. While population bottlenecks and migration can plausibly explain the differences in stability of clinal variation between the two species, we also observe a significant enrichment of shared clinal genes, suggesting that the selective forces associated with climate are acting on the same genes and phenotypes in D. simulans and D. melanogaster. © 2015 John Wiley & Sons Ltd.

  1. Genome-wide association study identified CNP12587 region underlying height variation in Chinese females.

    Directory of Open Access Journals (Sweden)

    Yin-Ping Zhang

    Full Text Available Human height is a highly heritable trait considered as an important factor for health. There has been limited success in identifying the genetic factors underlying height variation. We aim to identify sequence variants associated with adult height by a genome-wide association study of copy number variants (CNVs) in Chinese. Genome-wide CNV association analyses were conducted in 1,625 unrelated Chinese adults and sex specific subgroup for height variation, respectively. Height was measured with a stadiometer. Affymetrix SNP6.0 genotyping platform was used to identify copy number polymorphisms (CNPs). We constructed a genomic map containing 1,009 CNPs in Chinese individuals and performed a genome-wide association study of CNPs with height. We detected 10 significant association signals for height (p<0.05) in the whole population, and 9 and 11 association signals for the Chinese female and male populations, respectively. A copy number polymorphism (CNP12587, chr18:54081842-54086942, p = 2.41 × 10^-4) was found to be significantly associated with height variation in Chinese females even after strict Bonferroni correction (p = 0.048). Confirmatory real time PCR experiments lent further support for CNV validation. Compared to female subjects with two copies of the CNP, carriers of three copies had an average of 8.1% decrease in height. An important candidate gene, ubiquitin-protein ligase NEDD4-like (NEDD4L), was detected at this region, which plays important roles in bone metabolism by binding to bone formation regulators. Our findings suggest the important genetic variants underlying height variation in Chinese.

  2. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    Directory of Open Access Journals (Sweden)

    Amaury Vaysse

    2011-10-01

    Full Text Available The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.
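
    The idea of scanning for long blocks of homozygosity can be sketched for a single sample as follows. This is a simplified illustration under stated assumptions: the function name, the input layout (sorted marker positions with a heterozygosity flag each) and the use of a plain 1 Mb span threshold are hypothetical, and the study itself works at the breed level with multiple test statistics.

```python
from typing import List, Sequence, Tuple


def homozygous_blocks(
    positions: Sequence[int],
    is_het: Sequence[bool],
    min_length: int = 1_000_000,
) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive homozygous markers
    whose span is at least min_length base pairs.

    positions must be sorted and lie on one chromosome.
    """
    blocks: List[Tuple[int, int]] = []
    run_start = None  # position of the first marker in the current run
    run_end = None    # position of the last marker in the current run
    for pos, het in zip(positions, is_het):
        if het:
            # A heterozygous marker closes the current run, if any.
            if run_start is not None and run_end - run_start >= min_length:
                blocks.append((run_start, run_end))
            run_start = None
        else:
            if run_start is None:
                run_start = pos
            run_end = pos
    # Close a run that extends to the last marker.
    if run_start is not None and run_end - run_start >= min_length:
        blocks.append((run_start, run_end))
    return blocks


positions = [0, 500_000, 1_200_000, 1_300_000, 1_400_000, 2_600_000]
is_het = [False, False, False, True, False, False]
blocks = homozygous_blocks(positions, is_het)
```

    In this toy input the single heterozygous marker at 1.3 Mb splits the chromosome into two homozygous runs, each spanning 1.2 Mb.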

  3. A genomic overview of short genetic variations in a basal chordate, Ciona intestinalis

    Directory of Open Access Journals (Sweden)

    Satou Yutaka

    2012-05-01

    Full Text Available Background: Although the Ciona intestinalis genome contains many allelic polymorphisms, there is only limited data analyzed systematically. Establishing a dense map of genetic variations in C. intestinalis is necessary not only for linkage analysis, but also for other experimental biology including molecular developmental and evolutionary studies, because animals from natural populations are typically used for experiments. Results: Here, we identified over three million candidate short genomic variations within a 110 Mb euchromatin region among five C. intestinalis individuals. The average nucleotide diversity was approximately 1.1%. Genetic variations were found at a similar density in intergenic and gene regions. Non-synonymous and nonsense nucleotide substitutions were found in 12,493 and 1,214 genes accounting for 81.9% and 8.0% of the entire gene set, respectively, and over 60% of genes in the single animal encode non-identical proteins between maternal and paternal alleles. Conclusions: Our results provide a framework for studying evolution of the animal genome, as well as a useful resource for a wide range of C. intestinalis researchers.
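
    Nucleotide diversity, as quoted above, is conventionally defined as the average proportion of sites that differ between two sequences drawn from the sample. A minimal sketch of that textbook definition follows; the function name and the toy aligned sequences are hypothetical, and the abstract does not state how the authors computed their estimate.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(haplotypes: Sequence[str]) -> float:
    """Average pairwise differences per site among aligned sequences."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(haplotypes, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: three sequences of ten sites each (illustrative data only).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
```

    For the toy alignment the three pairs differ at 1, 1 and 2 sites, so the value is 4 differences over 3 pairs times 10 sites.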

  4. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae).

    Science.gov (United States)

    Walker, Joseph F; Zanis, Michael J; Emery, Nancy C

    2014-04-01

    Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.

  5. Genome variations associated with viral susceptibility and calcification in Emiliania huxleyi.

    Science.gov (United States)

    Kegel, Jessica U; John, Uwe; Valentin, Klaus; Frickenhaus, Stephan

    2013-01-01

    Emiliania huxleyi, a key player in the global carbon cycle is one of the best studied coccolithophores with respect to biogeochemical cycles, climatology, and host-virus interactions. Strains of E. huxleyi show phenotypic plasticity regarding growth behaviour, light-response, calcification, acidification, and virus susceptibility. This phenomenon is likely a consequence of genomic differences, or transcriptomic responses, to environmental conditions or threats such as viral infections. We used an E. huxleyi genome microarray based on the sequenced strain CCMP1516 (reference strain) to perform comparative genomic hybridizations (CGH) of 16 E. huxleyi strains of different geographic origin. We investigated the genomic diversity and plasticity and focused on the identification of genes related to virus susceptibility and coccolith production (calcification). Among the tested 31940 gene models a core genome of 14628 genes was identified by hybridization among 16 E. huxleyi strains. 224 probes were characterized as specific for the reference strain CCMP1516. Compared to the sequenced E. huxleyi strain CCMP1516 variation in gene content of up to 30 percent among strains was observed. Comparison of core and non-core transcripts sets in terms of annotated functions reveals a broad, almost equal functional coverage over all KOG-categories of both transcript sets within the whole annotated genome. Within the variable (non-core) genome we identified genes associated with virus susceptibility and calcification. Genes associated with virus susceptibility include a Bax inhibitor-1 protein, three LRR receptor-like protein kinases, and mitogen-activated protein kinase. Our list of transcripts associated with coccolith production will stimulate further research, e.g. by genetic manipulation. In particular, the V-type proton ATPase 16 kDa proteolipid subunit is proposed to be a plausible target gene for further calcification studies.
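
    The split into a core genome (genes detected in every strain) and a variable, non-core genome can be expressed as a set intersection. The sketch below assumes that presence calls per strain are already available as sets of gene identifiers; the function name, strain labels and gene names are hypothetical, and the thresholds used to call presence from hybridization signals are not covered.

```python
from typing import Dict, Set, Tuple


def split_core_variable(
    presence: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (core, variable) gene sets from per-strain presence calls.

    core: genes present in every strain.
    variable: genes present in at least one strain but not in all.
    """
    strains = list(presence.values())
    if not strains:
        raise ValueError("need at least one strain")
    pan = set().union(*strains)         # every gene seen in any strain
    core = set.intersection(*strains)   # genes shared by all strains
    return core, pan - core


# Illustrative presence calls for three made-up strains.
calls = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g3"},
    "strain_C": {"g1", "g3", "g4"},
}
core, variable = split_core_variable(calls)
```

    Here g1 and g3 form the core set, while g2 and g4 fall into the variable set.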

  6. Mitogenomes from The 1000 Genome Project reveal new Near Eastern features in present-day Tuscans.

    Directory of Open Access Journals (Sweden)

    Alberto Gómez-Carballa

    Full Text Available Genetic analyses have recently been carried out on present-day Tuscans (Central Italy) in order to investigate their presumable recent Near East ancestry in connection with the long-standing debate on the origins of the Etruscan civilization. We retrieved mitogenomes and genome-wide SNP data from 110 Tuscans analyzed within the context of The 1000 Genome Project. For phylogeographic and evolutionary analysis we made use of a large worldwide database of entire mitogenomes (>26,000) and partial control region sequences (>180,000). Different analyses reveal the presence of typical Near East haplotypes in Tuscans representing isolated members of various mtDNA phylogenetic branches. As a whole, the Near East component in Tuscan mitogenomes can be estimated at about 8%; a proportion that is comparable to previous estimates but significantly lower than admixture estimates obtained from autosomal SNP data (21%). Phylogeographic and evolutionary inter-population comparisons indicate that the main signal of Near Eastern Tuscan mitogenomes comes from Iran. Mitogenomes of recent Near East origin in present-day Tuscans do not show local or regional variation. This points to a demographic scenario that is compatible with a recent arrival of Near Easterners to this region in Italy with no founder events or bottlenecks.

  7. Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.

    Science.gov (United States)

    Cai, Binghuang; Li, Biao; Kiga, Nikki; Thusberg, Janita; Bergquist, Timothy; Chen, Yun-Ching; Niknafs, Noushin; Carter, Hannah; Tokheim, Collin; Beleva-Guthrie, Violeta; Douville, Christopher; Bhattacharya, Rohit; Yeo, Hui Ting Grace; Fan, Jean; Sengupta, Sohini; Kim, Dewey; Cline, Melissa; Turner, Tychele; Diekhans, Mark; Zaucha, Jan; Pal, Lipika R; Cao, Chen; Yu, Chen-Hsin; Yin, Yizhou; Carraro, Marco; Giollo, Manuel; Ferrari, Carlo; Leonardi, Emanuela; Tosatto, Silvio C E; Bobe, Jason; Ball, Madeleine; Hoskins, Roger A; Repo, Susanna; Church, George; Brenner, Steven E; Moult, John; Gough, Julian; Stanke, Mario; Karchin, Rachel; Mooney, Sean D

    2017-09-01

    The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features. © 2017 Wiley Periodicals, Inc.
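
    The ROC AUC measure mentioned in the abstract above can be illustrated with a minimal sketch. It uses the rank-based definition (the probability that a randomly chosen individual with the trait receives a higher predicted score than one without it, ties counting half). The function name `roc_auc` and the input layout are assumptions made for illustration; this is not the CAGI assessment code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 if the individual has the trait, 0 otherwise (hypothetical layout).
    scores: predicted probability or score for each individual.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

    A value of 0.5 corresponds to random guessing, which is consistent with the abstract's point that trait prediction depends heavily on knowing the trait frequency in the population.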

  8. Genome-wide variation and demographic history of small cats with a focus on Felis species

    Directory of Open Access Journals (Sweden)

    Anubhab Khan

    2017-10-01

    Full Text Available The majority of the 38 known cat species are classified as small, and they inhabit five of the seven continents. They survive in a vast range of habitats, but still 12 out of the 18 threatened felids are small cats. However, there has not been enough progress in the field of small cat research, as they generally get overshadowed by the charismatic big cats. Here we attempt to create a resource for small cat research, especially of the genus Felis, which has six species, of which two are classified as vulnerable by IUCN and at least one more is at risk. We collected tissue samples of four Felis chaus (Jungle cat) from central India and used available whole genome sequences of nine individuals from four other Felis species, two individuals of Prionailurus bengalensis and an Otocolobus manul. These whole genome sequences were filtered and aligned with the already published domestic cat (Felis catus) genome assembly. Felids are closely related species, and reads from all species in our study aligned with the domestic cat genome at a rate of at least 93%. We estimated the existing genomic variation by calculating the heterozygous SNP encounter rate. So far, it seems that all wild cats have more genetic variation than Felis catus. This can be attributed to the inbreeding in these cats. Among the wild cats, Felis silvestris seems to have the highest level of genetic variation. To understand the reasons behind the distribution of genetic variation in small cats, we estimated the demographic histories of each of the species using PSMC. This method can only detect demographic changes more than 1000 generations ago. We observe that roughly all species share a parallel history in terms of population increase. The most interesting and important feature might be that all wild small cat population sizes increased exponentially around twenty thousand years ago, as opposed to domestic cat and big cats, which declined around this time. Another interesting feature of

  9. The pig genome project has plenty to squeal about.

    Science.gov (United States)

    Fan, B; Gorbach, D M; Rothschild, M F

    2011-01-01

    Significant progress on pig genetics and genomics research has been witnessed in recent years due to the integration of advanced molecular biology techniques, bioinformatics and computational biology, and the collaborative efforts of researchers in the swine genomics community. Progress on expanding the linkage map has slowed down, but the efforts have created a higher-resolution physical map integrating the clone map and BAC end sequence. The number of QTL mapped is still growing and most of the updated QTL mapping results are available through PigQTLdb. Additionally, expression studies using high-throughput microarrays and other gene expression techniques have made significant advancements. The number of identified non-coding RNAs is rapidly increasing and their exact regulatory functions are being explored. A publishable draft (build 10) of the swine genome sequence was available for the pig genomics community by the end of December 2010. Build 9 of the porcine genome is currently available with Ensembl annotation; manual annotation is ongoing. These drafts provide useful tools for such endeavors as comparative genomics and SNP scans for fine QTL mapping. A recent community-wide effort to create a 60K porcine SNP chip has greatly facilitated whole-genome association analyses, haplotype block construction and linkage disequilibrium mapping, which can contribute to whole-genome selection. The future 'systems biology' that integrates and optimizes the information from all research levels can enhance the pig community's understanding of the full complexity of the porcine genome. These recent technological advances and where they may lead are reviewed. Copyright © 2011 S. Karger AG, Basel.

  10. Epigenetic Variation in Monozygotic Twins: A Genome-Wide Analysis of DNA Methylation in Buccal Cells

    Directory of Open Access Journals (Sweden)

    Jenny van Dongen

    2014-05-01

    Full Text Available DNA methylation is one of the most extensively studied epigenetic marks in humans. Yet, it is largely unknown what causes variation in DNA methylation between individuals. The comparison of DNA methylation profiles of monozygotic (MZ) twins offers a unique experimental design to examine the extent to which such variation is related to individual-specific environmental influences and stochastic events or to familial factors (DNA sequence and shared environment). We measured genome-wide DNA methylation in buccal samples from ten MZ pairs (age 8–19) using the Illumina 450k array and examined twin correlations for methylation level at 420,921 CpGs after QC. After selecting CpGs showing the most variation in the methylation level between subjects, the mean genome-wide correlation (rho) was 0.54. The correlation was higher, on average, for CpGs within CpG islands (CGIs), compared to CGI shores, shelves and non-CGI regions, particularly at hypomethylated CpGs. This finding suggests that individual-specific environmental and stochastic influences account for more variation in DNA methylation in CpG-poor regions. Our findings also indicate that it is worthwhile to examine heritable and shared environmental influences on buccal DNA methylation in larger studies that also include dizygotic twins.

  11. Common genetic variation and susceptibility to partial epilepsies: a genome-wide association study.

    Science.gov (United States)

    Kasperaviciūte, Dalia; Catarino, Claudia B; Heinzen, Erin L; Depondt, Chantal; Cavalleri, Gianpiero L; Caboclo, Luis O; Tate, Sarah K; Jamnadas-Khoda, Jenny; Chinthapalli, Krishna; Clayton, Lisa M S; Shianna, Kevin V; Radtke, Rodney A; Mikati, Mohamad A; Gallentine, William B; Husain, Aatif M; Alhusaini, Saud; Leppert, David; Middleton, Lefkos T; Gibson, Rachel A; Johnson, Michael R; Matthews, Paul M; Hosford, David; Heuser, Kjell; Amos, Leslie; Ortega, Marcos; Zumsteg, Dominik; Wieser, Heinz-Gregor; Steinhoff, Bernhard J; Krämer, Günter; Hansen, Jörg; Dorn, Thomas; Kantanen, Anne-Mari; Gjerstad, Leif; Peuralinna, Terhi; Hernandez, Dena G; Eriksson, Kai J; Kälviäinen, Reetta K; Doherty, Colin P; Wood, Nicholas W; Pandolfo, Massimo; Duncan, John S; Sander, Josemir W; Delanty, Norman; Goldstein, David B; Sisodiya, Sanjay M

    2010-07-01

    Partial epilepsies have a substantial heritability. However, the actual genetic causes are largely unknown. In contrast to many other common diseases for which genetic association studies have successfully revealed common variants associated with disease risk, the role of common variation in partial epilepsies has not yet been explored in a well-powered study. We undertook a genome-wide association study to identify common variants which influence risk for epilepsy shared amongst partial epilepsy syndromes, in 3445 patients and 6935 controls of European ancestry. We did not identify any genome-wide significant association. A few single nucleotide polymorphisms may warrant further investigation. We exclude common genetic variants with effect sizes above a modest 1.3 odds ratio for a single variant as contributors to genetic susceptibility shared across the partial epilepsies. We show that, at best, common genetic variation can only have a modest role in predisposition to the partial epilepsies when considered across syndromes in Europeans. The genetic architecture of the partial epilepsies is likely to be very complex, reflecting genotypic and phenotypic heterogeneity. Larger meta-analyses are required to identify variants of smaller effect sizes (odds ratio < 1.3) or syndrome-specific variants. Further, our results suggest research efforts should also be directed towards identifying the multiple rare variants likely to account for at least part of the heritability of the partial epilepsies. Data emerging from genome-wide association studies will be valuable during the next serious challenge of interpreting all the genetic variation emerging from whole-genome sequencing studies.

  12. Understanding the Human Genome Project — A Fact Sheet | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... The Human Genome Project spurred a revolution in biotechnology innovation around the world and played a key ... the U.S. the global leader in the new biotechnology sector. In April 2003, researchers successfully completed the ...

  13. An initial comparative map of copy number variations in the goat (Capra hircus) genome

    Directory of Open Access Journals (Sweden)

    Casadio Rita

    2010-11-01

    Full Text Available Abstract Background: The goat (Capra hircus) represents one of the most important farm animal species. It is reared on all continents, with an estimated world population of about 800 million animals. Despite its importance, studies on the goat genome are still in their infancy compared to those in other farm animal species. Comparative mapping between cattle and goat showed only a few rearrangements, in agreement with the similarity of chromosome banding. We carried out a cross-species cattle-goat array comparative genome hybridization (aCGH) experiment in order to identify copy number variations (CNVs) in the goat genome, analysing animals of different breeds (Saanen, Camosciata delle Alpi, Girgentana, and Murciano-Granadina) using a tiling oligonucleotide array with ~385,000 probes designed on the bovine genome. Results: We identified a total of 161 CNVs (an average of 17.9 CNVs per goat), with the largest number in the Saanen breed and the lowest in the Camosciata delle Alpi goat. By aggregating overlapping CNVs identified in different animals we determined CNV regions (CNVRs): on the whole, we identified 127 CNVRs covering about 11.47 Mb of the virtual goat genome referred to the bovine genome (0.435% of the latter genome). These 127 CNVRs included 86 losses and 41 gains and ranged from about 24 kb to about 1.07 Mb, with a mean and median equal to 90,292 bp and 49,530 bp, respectively. To evaluate whether the identified goat CNVRs overlap with those reported in the cattle genome, we compared our results with those obtained in four independent cattle experiments. Overlapping between goat and cattle CNVRs was highly significant (P ...). Conclusions: We describe a first map of goat CNVRs. This provides information on a comparative basis with the cattle genome by identifying putative recurrent interspecies CNVs between these two ruminant species. Several goat CNVs affect genes with important biological functions. Further studies are needed to evaluate the
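
    The aggregation step described in the record above (overlapping CNVs from different animals collapsed into CNV regions) can be sketched as a standard interval merge. This is a minimal illustration under stated assumptions: the function name `merge_cnvs` and the `(chromosome, start, end)` tuple layout are hypothetical, coordinates are treated as closed intervals, and the authors' actual merging criteria are not given in the abstract.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end) tuples pooled across animals
          (hypothetical layout; closed intervals, start <= end).
    Returns a list of (chrom, start, end) regions sorted by chromosome and start.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Overlaps (or touches) the current region: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                # No overlap: open a new region.
                merged.append([start, end])
        regions.extend((chrom, s, e) for s, e in merged)
    return regions
```

    With regions in this form, summary figures such as the total length covered or the mean and median region size follow directly from `end - start` per region.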

  14. National human genome projects: an update and an agenda

    OpenAIRE

    An, Joon Yong

    2017-01-01

    Population genetic and human genetic studies are being accelerated with genome technology and data sharing. Accordingly, in the past 10 years, several countries have initiated genetic research using genome technology and identified the genetic architecture of the ethnic groups living in the corresponding country or suggested the genetic foundation of a social phenomenon. Genetic research has been conducted from epidemiological studies that previously described the health or disease conditions...

  15. Genome size variation in Macaronesian Angiosperms: Forty Percent of Canarian Endemic Flora Completed

    Czech Academy of Sciences Publication Activity Database

    Suda, Jan; Kyncl, Tomáš; Jarolímová, Vlasta

    2005-01-01

    Vol. 252, No. 3-4 (2005), pp. 215-238. ISSN 0378-2697. R&D Projects: GA ČR(CZ) GA206/00/1445; GA ČR(CZ) GA206/04/0081; GA AV ČR(CZ) KSK6005114. Institutional research plan: CEZ:AV0Z60050516. Keywords: genome size * cytometry * Macaronesia. Subject RIV: EF - Botanics. Impact factor: 1.421, year: 2005

  16. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Full Text Available Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09) and dw3 (SBI-07) had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated; however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.

  17. The Human Genome Project: Information access, management, and regulation. Final report

    Energy Technology Data Exchange (ETDEWEB)

    McInerney, J.D.; Micikas, L.B.

    1996-08-31

    The Human Genome Project is a large, internationally coordinated effort in biological research directed at creating a detailed map of human DNA. This report describes the access of information, management, and regulation of the project. The project led to the development of an instructional module titled The Human Genome Project: Biology, Computers, and Privacy, designed for use in high school biology classes. The module consists of print materials and both Macintosh and Windows versions of related computer software. Appendix A contains a copy of the print materials and discs containing the two versions of the software.

  18. The Human Genome Project: applications in the diagnosis and treatment of neurologic disease.

    Science.gov (United States)

    Evans, G A

    1998-10-01

    The Human Genome Project (HGP), an international program to decode the entire DNA sequence of the human genome in 15 years, represents the largest biological experiment ever conducted. This set of information will contain the blueprint for the construction and operation of a human being. While the primary driving force behind the genome project is the potential to vastly expand the amount of genetic information available for biomedical research, the ramifications for other fields of study in biological research, the biotechnology and pharmaceutical industry, our understanding of evolution, effects on agriculture, and implications for bioethics are likely to be profound.

  19. Illumina based whole mitochondrial genome of Junonia iphita reveals minor intraspecific variation

    Directory of Open Access Journals (Sweden)

    Catherine Vanlalruati

    2015-12-01

    Full Text Available In the present study, the near complete mitochondrial genome (mitogenome) of Junonia iphita (Lepidoptera: Nymphalidae: Nymphalinae) was determined to be 14,892 bp. The gene order and orientation are identical to those in other butterfly species. The phylogenetic tree constructed from the whole mitogenomes using the 13 protein coding genes (PCGs) defines the genetic relatedness of the two J. iphita species collected from two different regions. All the Junonia species clustered together, and were further subdivided into clade one, consisting of J. almana and J. orithya, and clade two, comprising the two J. iphita which were collected from Indo and Indochinese subregions separated by a river barrier. Comparison between the two J. iphita sequences revealed minor variations, and Single Nucleotide Polymorphisms were identified at 51 sites amounting to 0.4% of the entire mitochondrial genome.

  20. Copy number variation is a fundamental aspect of the placental genome.

    Directory of Open Access Journals (Sweden)

    Roberta L Hannibal

    2014-05-01

    Full Text Available Discovery of lineage-specific somatic copy number variation (CNV) in mammals has led to debate over whether CNVs are mutations that propagate disease or whether they are a normal, and even essential, aspect of cell biology. We show that 1,000N polyploid trophoblast giant cells (TGCs) of the mouse placenta contain 47 regions, totaling 138 Megabases, where genomic copies are underrepresented (UR). UR domains originate from a subset of late-replicating heterochromatic regions containing gene deserts and genes involved in cell adhesion and neurogenesis. While lineage-specific CNVs have been identified in mammalian cells, classically in the immune system where V(D)J recombination occurs, we demonstrate that CNVs form during gestation in the placenta by an underreplication mechanism, not by recombination or deletion. Our results reveal that large scale CNVs are a normal feature of the mammalian placental genome, which are regulated systematically during embryogenesis and are propagated by a mechanism of underreplication.

  1. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity) and biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on 35 genotypes of chickpea, with an average sequencing depth of ~10X for each line. On average, 92.18% of reads from each genotype were aligned to the chickpea reference genome with 82.17% coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) were identified; the highest percentage in coding regions was found in ICC 1496 (21%), followed by ICCV 97105 (12%). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  2. Sequence variation of the feline immunodeficiency virus genome and its clinical relevance.

    Science.gov (United States)

    Stickney, A L; Dunowska, M; Cave, N J

    2013-06-08

    The ongoing evolution of feline immunodeficiency virus (FIV) has resulted in the existence of a diverse continuum of viruses. FIV isolates differ with regard to their mutation and replication rates, plasma viral loads, cell tropism and the ability to induce apoptosis. Clinical disease in FIV-infected cats is also inconsistent. Genomic sequence variation of FIV is likely to be responsible for some of the variation in viral behaviour. The specific genetic sequences that influence these key viral properties remain to be determined. With knowledge of the specific key determinants of pathogenicity, there is the potential for veterinarians in the future to apply this information for prognostic purposes. Genomic sequence variation of FIV also presents an obstacle to effective vaccine development. Most challenge studies demonstrate acceptable efficacy of a dual-subtype FIV vaccine (Fel-O-Vax FIV) against FIV infection under experimental settings; however, vaccine efficacy in the field still remains to be proven. It is important that we discover the key determinants of immunity induced by this vaccine; such data would complement vaccine field efficacy studies and provide the basis to make informed recommendations on its use.

  3. Genome Wide Distributions and Functional Characterization of Copy Number Variations between Chinese and Western Pigs.

    Directory of Open Access Journals (Sweden)

    Hongyang Wang

    Full Text Available Copy number variations (CNVs) refer to large insertions, deletions and duplications in the genomic structure ranging from one thousand to several million bases in size. Since the development of next generation sequencing technology, several methods have been well built for detection of copy number variations with high credibility and accuracy. Evidence has shown that CNVs occurring in gene regions could lead to phenotypic changes due to the alteration in gene structure and dosage. However, it still remains unexplored whether CNVs underlie the phenotypic differences between Chinese and Western domestic pigs. Based on the read-depth methods, we investigated copy number variations using 49 individuals derived from both Chinese and Western pig breeds. A total of 3,131 copy number variation regions (CNVRs) were identified with an average size of 13.4 Kb in all individuals during domestication, harboring 1,363 genes. Among them, 129 and 147 CNVRs were Chinese and Western pig specific, respectively. Gene functional enrichments revealed that these CNVRs contribute to strong disease resistance and high prolificacy in Chinese domestic pigs, but strong muscle tissue development in Western domestic pigs. This finding is strongly consistent with the morphologic characteristics of Chinese and Western pigs, indicating that these group-specific CNVRs might have been preserved by artificial selection for the favored phenotypes during independent domestication of Chinese and Western pigs. In this study, we built high-resolution CNV maps in several domestic pig breeds and discovered the group-specific CNVs by comparing Chinese and Western pigs, which could provide new insight into genomic variations during pigs' independent domestication, and facilitate further functional studies of CNV-associated genes.

  4. Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

    Science.gov (United States)

    Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

    2009-01-01

    Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary

  5. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.
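
    The coverage-dependent concordance figures quoted in the record above suggest a simple calculation, sketched here under stated assumptions. The function name `concordance`, the input layout, and the two-letter genotype strings are hypothetical; the abstract does not describe how the authors computed their percentages.

```python
def concordance(calls, min_coverage):
    """Fraction of SNP genotypes that match a reference panel at a coverage cutoff.

    calls: iterable of (coverage, sequenced_genotype, reference_genotype),
           with genotypes as two-letter strings such as "AG" (hypothetical layout).
    Returns None when no site passes the coverage filter.
    """
    kept = [(seq, ref) for cov, seq, ref in calls if cov >= min_coverage]
    if not kept:
        return None
    # Alleles are unordered, so "AG" and "GA" are treated as the same genotype.
    matches = sum(1 for seq, ref in kept if sorted(seq) == sorted(ref))
    return matches / len(kept)
```

    Evaluating the same call set at cutoffs of 4 and 15 would reproduce the kind of comparison reported in the abstract, where the stricter cutoff keeps fewer sites but yields higher agreement.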

  6. Genomic regulation of natural variation in cortical and noncortical brain volume

    Directory of Open Access Journals (Sweden)

    Laughlin Rick E

    2006-02-01

    Full Text Available Abstract Background: The relative growth of the neocortex parallels the emergence of complex cognitive functions across species. To determine the regions of the mammalian genome responsible for natural variations in cortical volume, we conducted a complex trait analysis using 34 strains of recombinant inbred (RI) strains of mice (BXD), as well as their two parental strains (C57BL/6J and DBA/2J). We measured both neocortical volume and total brain volume in 155 coronally sectioned mouse brains that were Nissl stained and embedded in celloidin. After correction for shrinkage, the measured cortical and noncortical brain volumes were entered into a multiple regression analysis, which removed the effects of body size and age from the measurements. Marker regression and interval mapping were computed using WebQTL. Results: An ANOVA revealed that more than half of the variance of these regressed phenotypes is genetically determined. We then identified the regions of the genome regulating this heritability. We located genomic regions in which a linkage disequilibrium was present using WebQTL as both a mapping engine and genomic database. For neocortex, we found a genome-wide significant quantitative trait locus (QTL) on chromosome 11 (marker D11Mit19), as well as a suggestive QTL on chromosome 16 (marker D16Mit100). In contrast, for noncortex the effect of chromosome 11 was markedly reduced, and a significant QTL appeared on chromosome 19 (D19Mit22). Conclusion: This classic pattern of double dissociation argues strongly for different genetic factors regulating relative cortical size, as opposed to brain volume more generally. It is likely, however, that the effects of proximal chromosome 11 extend beyond the neocortex strictly defined. An analysis of single nucleotide polymorphisms in these regions indicated that ciliary neurotrophic factor (Cntf) is quite possibly the gene underlying the noncortical QTL. Evidence for a candidate gene modulating neocortical

  7. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.

    Science.gov (United States)

    Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander

    2014-10-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  8. Genomic variation in CYP3A4: type, frequencies and potential implications for pharmacogenetic understanding.

    OpenAIRE

    Creemer, O.

    2012-01-01

    The human cytochrome P450 3A subfamily metabolises endogenous substances and approximately half of all currently available drugs. There is marked inter-individual variation in hepatic expression of the major adult isoform, CYP3A4; the genetic component of this variability is estimated at 60-90% and, as yet, remains largely uncharacterised. Elucidation of genetic factors determining CYP3A4 activity would permit personalised dose-adjustment in therapies with CYP3A4 drug substrates. CYP3A4 genom...

  9. SV2: accurate structural variation genotyping and de novo mutation detection from whole genomes.

    Science.gov (United States)

    Antaki, Danny; Brandler, William M; Sebat, Jonathan

    2018-05-15

    Structural variation (SV) detection from short-read whole genome sequencing is error prone, presenting significant challenges for population or family-based studies of disease. Here, we describe SV2, a machine-learning algorithm for genotyping deletions and duplications from paired-end sequencing data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified call set with high genotyping accuracy and capability to detect de novo mutations. SV2 is freely available on GitHub (https://github.com/dantaki/SV2). Contact: jsebat@ucsd.edu. Supplementary data are available at Bioinformatics online.

  10. Genomic dissection of variation in clutch size and egg mass in a wild great tit (Parus major) population.

    Science.gov (United States)

    Santure, Anna W; De Cauwer, Isabelle; Robinson, Matthew R; Poissant, Jocelyn; Sheldon, Ben C; Slate, Jon

    2013-08-01

    Clutch size and egg mass are life history traits that have been extensively studied in wild bird populations, as life history theory predicts a negative trade-off between them, either at the phenotypic or at the genetic level. Here, we analyse the genomic architecture of these heritable traits in a wild great tit (Parus major) population, using three marker-based approaches - chromosome partitioning, quantitative trait locus (QTL) mapping and a genome-wide association study (GWAS). The variance explained by each great tit chromosome scales with predicted chromosome size, no location in the genome contains genome-wide significant QTL, and no individual SNPs are associated with a large proportion of phenotypic variation, all of which may suggest that variation in both traits is due to many loci of small effect, located across the genome. There is no evidence that any regions of the genome contribute significantly to both traits, which combined with a small, nonsignificant negative genetic covariance between the traits, suggests the absence of genetic constraints on the independent evolution of these traits. Our findings support the hypothesis that variation in life history traits in natural populations is likely to be determined by many loci of small effect spread throughout the genome, which are subject to continued input of variation by mutation and migration, although we cannot exclude the possibility of an additional input of major effect genes influencing either trait. © 2013 John Wiley & Sons Ltd.

  11. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics.

    Science.gov (United States)

    Gussow, Ayal B; Copeland, Brett R; Dhindsa, Ryan S; Wang, Quanli; Petrovski, Slavé; Majoros, William H; Allen, Andrew S; Goldstein, David B

    2017-01-01

    There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage. We show that Orion is highly correlated with known intolerant regions as well as regions that harbor putatively pathogenic variation. This approach provides a mechanism to identify pathogenic variation in the human non-coding genome and will have immediate utility in the diagnostic interpretation of patient genomes and in large case control studies using whole-genome sequences.

  12. Complete chloroplast genomes from apomictic Taraxacum (Asteraceae): Identity and variation between three microspecies

    Science.gov (United States)

    Majeský, Ľuboš; Schwarzacher, Trude; Gornall, Richard; Heslop-Harrison, Pat

    2017-01-01

    Chloroplast DNA sequences show substantial variation between higher plant species, and less variation within species, so are typically excellent markers to investigate evolutionary, population and genetic relationships and phylogenies. We sequenced the plastomes of Taraxacum obtusifrons Markl. (O978); T. stridulum Trávniček ined. (S3); and T. amplum Markl. (A978), three apomictic triploid (2n = 3x = 24) dandelions from the T. officinale agg. We aimed to characterize the variation in plastomes, define relationships and correlations with the apomictic microspecies status, and refine placement of the microspecies in the evolutionary or phylogenetic context of the Asteraceae. The chloroplast genomes of accessions O978 and S3 were identical and 151,322 bp long (where the nuclear genes are known to show variation), while A978 was 151,349 bp long. All three genomes contained 135 unique genes, with an additional copy of the trnF-GGA gene in the LSC region and 20 duplicated genes in the IR region, along with short repeats, the typical major Inverted Repeats (IR1 and IR2, 24,431 bp long), and Large and Small Single Copy regions (LSC 83,889 bp and SSC 18,571 bp in O978). Between the two Taraxacum plastome types, we identified 28 SNPs. The distribution of polymorphisms suggests some parts of the Taraxacum plastome are evolving at a slower rate. There was a hemi-nested inversion in the LSC region that is common to Asteraceae, and an SSC inversion from ndhF to rps15 found only in some Asteraceae lineages. A comparative repeat analysis showed variation between Taraxacum and the phylogenetically close genus Lactuca, with many more direct repeats of 40 bp or more in Lactuca (1% larger plastome than Taraxacum). When individual genes and non-coding regions were used for Asteraceae phylogeny reconstruction, not all showed the same evolutionary scenario, suggesting care is needed for interpretation of relationships if a limited number of markers are used. Studying genotypic diversity in

  13. A genome wide survey of SNP variation reveals the genetic structure of sheep breeds.

    Directory of Open Access Journals (Sweden)

    James W Kijas

    Full Text Available The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.

  14. A Bayesian method and its variational approximation for prediction of genomic breeding values in multiple traits

    Directory of Open Access Journals (Sweden)

    Hayashi Takeshi

    2013-01-01

    Full Text Available Abstract Background: Genomic selection is an effective tool for animal and plant breeding, allowing effective individual selection without phenotypic records through the prediction of genomic breeding value (GBV). To date, genomic selection has focused on a single trait. However, actual breeding often targets multiple correlated traits; a joint analysis that takes the correlation between traits into account, and which might therefore predict GBVs more accurately than analyzing each trait separately, is suitable for multi-trait genomic selection. This would require an extension of the prediction model for single-trait GBV to the multi-trait case. As the computational burden of multi-trait analysis is even higher than that of single-trait analysis, an effective computational method for constructing a multi-trait prediction model is also needed. Results: We described a Bayesian regression model incorporating variable selection for jointly predicting GBVs of multiple traits and devised both an MCMC iteration and variational approximation for Bayesian estimation of parameters in this multi-trait model. The proposed Bayesian procedures with MCMC iteration and variational approximation were referred to as MCBayes and varBayes, respectively. Using simulated datasets of SNP genotypes and phenotypes for three traits with high and low heritabilities, we compared the accuracy in predicting GBVs between multi-trait and single-trait analyses as well as between MCBayes and varBayes. The results showed that, compared to single-trait analysis, multi-trait analysis enabled much more accurate GBV prediction for low-heritability traits correlated with high-heritability traits, by utilizing the correlation structure between traits, while the prediction accuracy for uncorrelated low-heritability traits was comparable or less with multi-trait analysis in comparison with single-trait analysis depending on the setting for prior probability that a SNP has zero

  15. Genome-wide recombination dynamics are associated with phenotypic variation in maize.

    Science.gov (United States)

    Pan, Qingchun; Li, Lin; Yang, Xiaohong; Tong, Hao; Xu, Shutu; Li, Zhigang; Li, Weiya; Muehlbauer, Gary J; Li, Jiansheng; Yan, Jianbing

    2016-05-01

    Meiotic recombination is a major driver of genetic diversity, species evolution, and agricultural improvement. Thus, an understanding of the genetic recombination landscape across the maize (Zea mays) genome will provide insight and tools for further study of maize evolution and improvement. Here, we used c. 50 000 single nucleotide polymorphisms to precisely map recombination events in 12 artificial maize segregating populations. We observed substantial variation in the recombination frequency and distribution along the ten maize chromosomes among the 12 populations and identified 143 recombination hot regions. Recombination breakpoints were partitioned into intragenic and intergenic events. Interestingly, an increase in the number of genes containing recombination events was accompanied by a decrease in the number of recombination events per gene. This kept the overall number of intragenic recombination events nearly invariable in a given population, suggesting that the recombination variation observed among populations was largely attributed to intergenic recombination. However, significant associations between intragenic recombination events and variation in gene expression and agronomic traits were observed, suggesting potential roles for intragenic recombination in plant phenotypic diversity. Our results provide a comprehensive view of the maize recombination landscape, and show an association between recombination, gene expression and phenotypic variation, which may enhance crop genetic improvement. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  16. Genome-size Variation in Switchgrass (Panicum virgatum): Flow Cytometry and Cytology Reveal Rampant Aneuploidy

    Directory of Open Access Journals (Sweden)

    Denise E. Costich

    2010-11-01

    Full Text Available Switchgrass (Panicum virgatum L.), a native perennial dominant of the prairies of North America, has been targeted as a model herbaceous species for biofeedstock development. A flow-cytometric survey of a core set of 11 primarily upland polyploid switchgrass accessions indicated that there was considerable variation in genome size within each accession, particularly at the octoploid (2n = 8x = 72 chromosome) ploidy level. Highly variable chromosome counts in mitotic cell preparations indicated that aneuploidy was more common in octoploids (86.3%) than tetraploids (23.2%). Furthermore, the incidence of hyper- versus hypoaneuploidy is equivalent in tetraploids. This is clearly not the case in octoploids, where close to 90% of the aneuploid counts are lower than the euploid number. Cytogenetic investigation using fluorescent in situ hybridization (FISH) revealed an unexpected degree of variation in chromosome structure underlying the apparent genomic instability of this species. These results indicate that rapid advances in the breeding of polyploid biofuel feedstocks, based on the molecular-genetic dissection of biomass characteristics and yield, will be predicated on the continual improvement of our understanding of the cytogenetics of these species.
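
    The hyper- versus hypoaneuploidy comparison in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names and the example counts are hypothetical, and the only value taken from the record is the octoploid euploid number of 72.

```python
from collections import Counter


def classify_count(count, euploid):
    """Label one mitotic chromosome count relative to the euploid number."""
    if count == euploid:
        return "euploid"
    return "hyperaneuploid" if count > euploid else "hypoaneuploid"


def summarize(counts, euploid):
    """Return the aneuploid fraction and the share of aneuploid counts
    that fall below the euploid number (hypoaneuploid)."""
    tally = Counter(classify_count(c, euploid) for c in counts)
    n_aneuploid = tally["hyperaneuploid"] + tally["hypoaneuploid"]
    hypo_share = tally["hypoaneuploid"] / n_aneuploid if n_aneuploid else 0.0
    return {
        "aneuploid_fraction": n_aneuploid / len(counts),
        "hypo_share_of_aneuploid": hypo_share,
    }


# Hypothetical counts for an octoploid accession (euploid number 72).
example_counts = [72, 70, 71, 73, 69]
summary = summarize(example_counts, euploid=72)
```

    In this made-up example four of the five counts are aneuploid, and three of those four lie below 72, which is the kind of skew toward hypoaneuploidy the abstract reports for octoploids.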

  17. A genome-wide analysis of putative functional and exonic variation associated with extremely high intelligence.

    Science.gov (United States)

    Spain, S L; Pedroso, I; Kadeva, N; Miller, M B; Iacono, W G; McGue, M; Stergiakouli, E; Davey Smith, G; Putallaz, M; Lubinski, D; Meaburn, E L; Plomin, R; Simpson, M A

    2016-08-01

    Although individual differences in intelligence (general cognitive ability) are highly heritable, molecular genetic analyses to date have had limited success in identifying specific loci responsible for its heritability. This study is the first to investigate exome variation in individuals of extremely high intelligence. Under the quantitative genetic model, sampling from the high extreme of the distribution should provide increased power to detect associations. We therefore performed a case-control association analysis with 1409 individuals drawn from the top 0.0003 (IQ >170) of the population distribution of intelligence and 3253 unselected population-based controls. Our analysis focused on putative functional exonic variants assayed on the Illumina HumanExome BeadChip. We did not observe any individual protein-altering variants that are reproducibly associated with extremely high intelligence and within the entire distribution of intelligence. Moreover, no significant associations were found for multiple rare alleles within individual genes. However, analyses using genome-wide similarity between unrelated individuals (genome-wide complex trait analysis) indicate that the genotyped functional protein-altering variation yields a heritability estimate of 17.4% (s.e. 1.7%) based on a liability model. In addition, investigation of nominally significant associations revealed fewer rare alleles associated with extremely high intelligence than would be expected under the null hypothesis. This observation is consistent with the hypothesis that rare functional alleles are more frequently detrimental than beneficial to intelligence.

  18. Human genome education model project. Ethical, legal, and social implications of the human genome project: Education of interdisciplinary professionals

    Energy Technology Data Exchange (ETDEWEB)

    Weiss, J.O. [Alliance of Genetic Support Groups, Chevy Chase, MD (United States); Lapham, E.V. [Georgetown Univ., Washington, DC (United States). Child Development Center

    1996-12-31

    This meeting was held June 10, 1996 at Georgetown University. The purpose of this meeting was to provide a multidisciplinary forum for exchange of state-of-the-art information on the human genome education model. Topics of discussion include the following: psychosocial issues; ethical issues for professionals; legislative issues and update; and education issues.

  19. Enabling a community to dissect an organism: overview of the Neurospora functional genomics project.

    Science.gov (United States)

    Dunlap, Jay C; Borkovich, Katherine A; Henn, Matthew R; Turner, Gloria E; Sachs, Matthew S; Glass, N Louise; McCluskey, Kevin; Plamann, Michael; Galagan, James E; Birren, Bruce W; Weiss, Richard L; Townsend, Jeffrey P; Loros, Jennifer J; Nelson, Mary Anne; Lambreghts, Randy; Colot, Hildur V; Park, Gyungsoon; Collopy, Patrick; Ringelberg, Carol; Crew, Christopher; Litvinkova, Liubov; DeCaprio, Dave; Hood, Heather M; Curilla, Susan; Shi, Mi; Crawford, Matthew; Koerhsen, Michael; Montgomery, Phil; Larson, Lisa; Pearson, Matthew; Kasuga, Takao; Tian, Chaoguang; Baştürkmen, Meray; Altamirano, Lorena; Xu, Junhuan

    2007-01-01

    A consortium of investigators is engaged in a functional genomics project centered on the filamentous fungus Neurospora, with an eye to opening up the functional genomic analysis of all the filamentous fungi. The overall goal of the four interdependent projects in this effort is to accomplish functional genomics, annotation, and expression analyses of Neurospora crassa, a filamentous fungus that is an established model for the assemblage of over 250,000 species of non-yeast fungi. Building from the completely sequenced 43-Mb Neurospora genome, Project 1 is pursuing the systematic disruption of genes through targeted gene replacements, phenotypic analysis of mutant strains, and their distribution to the scientific community at large. Project 2, through a primary focus in Annotation and Bioinformatics, has developed a platform for electronically capturing community feedback and data about the existing annotation, while building and maintaining a database to capture and display information about phenotypes. Oligonucleotide-based microarrays created in Project 3 are being used to collect baseline expression data for the nearly 11,000 distinguishable transcripts in Neurospora under various conditions of growth and development, and eventually to begin to analyze the global effects of loss of novel genes in strains created by Project 1. cDNA libraries generated in Project 4 document the overall complexity of expressed sequences in Neurospora, including alternative splicing, alternative promoters and antisense transcripts. In addition, these studies have driven the assembly of an SNP map presently populated by nearly 300 markers that will greatly accelerate the positional cloning of genes.

  20. Functional conservation of nucleosome formation selectively biases presumably neutral molecular variation in yeast genomes.

    Science.gov (United States)

    Babbitt, Gregory A; Cotter, C R

    2011-01-01

    One prominent pattern of mutational frequency, long appreciated in comparative genomics, is the bias of purine/pyrimidine conserving substitutions (transitions) over purine/pyrimidine altering substitutions (transversions). Traditionally, this transitional bias has been thought to be driven by the underlying rates of DNA mutation and/or repair. However, recent sequencing studies of mutation accumulation lines in model organisms demonstrate that substitutions generally do not accumulate at rates that would indicate a transitional bias. These observations have called into question a very basic assumption of molecular evolution; that naturally occurring patterns of molecular variation in noncoding regions accurately reflect the underlying processes of randomly accumulating neutral mutation in nuclear genomes. Here, in Saccharomyces yeasts, we report a very strong inverse association (r = -0.951, P < 0.004) between the genome-wide frequency of substitutions and their average energetic effect on nucleosome formation, as predicted by a structurally based energy model of DNA deformation around the nucleosome core. We find that transitions occurring at sites positioned nearest the nucleosome surface, which are believed to function most importantly in nucleosome formation, alter the deformation energy of DNA to the nucleosome core by only a fraction of the energy changes typical of most transversions. When we examined the same substitutions set against random background sequences as well as an existing study reporting substitutions arising in mutation accumulation lines of Saccharomyces cerevisiae, we failed to find a similar relationship. These results support the idea that natural selection acting to functionally conserve chromatin organization may contribute significantly to genome-wide transitional bias, even in noncoding regions. Because nucleosome core structure is highly conserved across eukaryotes, our observations may also help to further explain locally elevated
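
    The transition/transversion distinction that this abstract relies on can be made concrete with a short Python sketch. It only classifies single-base substitutions by purine/pyrimidine class and computes a transition-to-transversion ratio; it is not the energy model described in the record, and the function names and example substitutions are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def substitution_type(ref, alt):
    """Classify a single-base substitution.

    A transition keeps the base class (purine to purine, or pyrimidine
    to pyrimidine); a transversion switches between the two classes.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions in (ref, alt) pairs."""
    kinds = [substitution_type(ref, alt) for ref, alt in substitutions]
    n_ts = kinds.count("transition")
    n_tv = kinds.count("transversion")
    if n_tv == 0:
        raise ZeroDivisionError("no transversions observed")
    return n_ts / n_tv


# Hypothetical substitutions: two transitions and one transversion.
example = [("A", "G"), ("C", "T"), ("A", "T")]
ratio = ts_tv_ratio(example)
```

    Each base has one possible transition and two possible transversions, so if all substitutions were equally likely the ratio would be 0.5; a ratio above that is what the abstract refers to as transitional bias.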

  1. Genome-Wide Association Study Reveals Natural Variations Contributing to Drought Resistance in Crops

    Directory of Open Access Journals (Sweden)

    Hongwei Wang

    2017-06-01

    Full Text Available Crops are often cultivated in regions where they will face environmental adversities, resulting in substantial yield loss which can ultimately lead to food and societal problems. Thus, significant efforts have been made to breed stress tolerant cultivars in an attempt to minimize these problems and to produce more stability with respect to crop yields across broad geographies. Since stress tolerance is a complex and multi-genic trait, advancements with classical breeding approaches have been challenging. On the other hand, molecular breeding, which is based on transgenics, marker-assisted selection and genome editing technologies, holds great promise to enable farmers to better cope with these challenges. However, identification of the key genetic components underlying the trait is critical and will serve as the foundation for future crop genetic improvement. Recently, genome-wide association studies have made significant contributions to facilitate the discovery of natural variation contributing to stress tolerance in crops. From these studies, the identified loci can serve as targets for genomic selection or editing to enable the molecular design of new cultivars. Here, we summarize research progress on this issue and focus on the genetic basis of drought tolerance as revealed by genome-wide association studies and quantitative trait loci mapping. Although many favorable loci have been identified, elucidation of their molecular mechanisms contributing to increased stress tolerance still remains a challenge. Thus, continuous efforts are still required to functionally dissect this complex trait through comprehensive approaches, such as systems biology studies. It is expected that proper application of the acquired knowledge will enable the development of stress tolerant cultivars, allowing agricultural production to become more sustainable under dynamic environmental conditions.

  2. Competence development organizations in project management on the basis of genomic model methodologies

    OpenAIRE

    Бушуев, Сергей Дмитриевич; Рогозина, Виктория Борисовна; Ярошенко, Юрий Федерович

    2013-01-01

    The matrix technology for identification of organisational competencies in project management is presented in the article. Matrix elements are the components of organizational competence in the field of project management and project management methodology represented in the structure of the genome. The matrix model of competence in the framework of the adopted methodologies and the scanning method for identifying organizational competences are formalised. Proposed methods for building effective proj...

  3. Trait variation and genetic diversity in a banana genomic selection training population.

    Directory of Open Access Journals (Sweden)

    Moses Nyine

    Full Text Available Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  4. Trait variation and genetic diversity in a banana genomic selection training population.

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim; Doležel, Jaroslav

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  5. Trait variation and genetic diversity in a banana genomic selection training population

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31–35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents. PMID:28586365

  6. Population genomics of Pacific lamprey: adaptive variation in a highly dispersive species.

    Science.gov (United States)

    Hess, Jon E; Campbell, Nathan R; Close, David A; Docker, Margaret F; Narum, Shawn R

    2013-06-01

    Unlike most anadromous fishes that have evolved strict homing behaviour, Pacific lamprey (Entosphenus tridentatus) seem to lack philopatry as evidenced by minimal population structure across the species range. Yet unexplained findings of within-region population genetic heterogeneity coupled with the morphological and behavioural diversity described for the species suggest that adaptive genetic variation underlying fitness traits may be responsible. We employed restriction site-associated DNA sequencing to genotype 4439 quality filtered single nucleotide polymorphism (SNP) loci for 518 individuals collected across a broad geographical area including British Columbia, Washington, Oregon and California. A subset of putatively neutral markers (N = 4068) identified a significant amount of variation among three broad populations: northern British Columbia, Columbia River/southern coast and 'dwarf' adults (F(CT) = 0.02, P ≪ 0.001). Additionally, 162 SNPs were identified as adaptive through outlier tests, and inclusion of these markers revealed a signal of adaptive variation related to geography and life history. The majority of the 162 adaptive SNPs were not independent and formed four groups of linked loci. Analyses with matsam software found that 42 of these outlier SNPs were significantly associated with geography, run timing and dwarf life history, and 27 of these 42 SNPs aligned with known genes or highly conserved genomic regions using the genome browser available for sea lamprey. This study provides both neutral and adaptive context for observed genetic divergence among collections and thus reconciles previous findings of population genetic heterogeneity within a species that displays extensive gene flow. © 2012 John Wiley & Sons Ltd.

  7. Genomic variation and its impact on gene expression in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Andreas Massouras

    Understanding the relationship between genetic and phenotypic variation is one of the great outstanding challenges in biology. To meet this challenge, comprehensive genomic variation maps of human as well as of model organism populations are required. Here, we present a nucleotide resolution catalog of single-nucleotide, multi-nucleotide, and structural variants in 39 Drosophila melanogaster Genetic Reference Panel inbred lines. Using an integrative, local assembly-based approach for variant discovery, we identify more than 3.6 million distinct variants, among which were more than 800,000 unique insertions, deletions (indels), and complex variants (1 to 6,000 bp). While the SNP density is higher near other variants, we find that variants themselves are not mutagenic, nor are regions with high variant density particularly mutation-prone. Rather, our data suggest that the elevated SNP density around variants is mainly due to population-level processes. We also provide insights into the regulatory architecture of gene expression variation in adult flies by mapping cis-expression quantitative trait loci (cis-eQTLs) for more than 2,000 genes. Indels comprise around 10% of all cis-eQTLs and show larger effects than SNP cis-eQTLs. In addition, we identified two-fold more gene associations in males as compared to females and found that most cis-eQTLs are sex-specific, revealing a partial decoupling of the genomic architecture between the sexes as well as the importance of genetic factors in mediating sex-biased gene expression. Finally, we performed RNA-seq-based allelic expression imbalance analyses in the offspring of crosses between sequenced lines, which revealed that the majority of strong cis-eQTLs can be validated in heterozygous individuals.

  8. Genomics England's implementation of its public engagement strategy: Blurred boundaries between engagement for the United Kingdom's 100,000 Genomes project and the need for public support.

    Science.gov (United States)

    Samuel, Gabrielle Natalie; Farsides, Bobbie

    2018-04-01

    The United Kingdom's 100,000 Genomes Project has the aim of sequencing 100,000 genomes from National Health Service patients such that whole genome sequencing becomes routine clinical practice. It also has a research-focused goal to provide data for scientific discovery. Genomics England is the limited company established by the Department of Health to deliver the project. As an innovative scientific/clinical venture, it is interesting to consider how Genomics England positions itself in relation to public engagement activities. We set out to explore how individuals working at, or associated with, Genomics England enacted public engagement in practice. Our findings show that individuals offered a narrative in which public engagement performed more than one function. On one side, public engagement was seen as 'good practice'. On the other, public engagement was presented as core to the project's success - needed to encourage involvement and ultimately recruitment. We discuss the implications of this in this article.

  9. A global reference for human genetic variation

    DEFF Research Database (Denmark)

    Auton, Adam; Abecasis, Goncalo R.; M. Altshuler, David

    2015-01-01

    The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals ...

  10. Crowdfunding the Azolla fern genome project: a grassroots approach.

    Science.gov (United States)

    Li, Fay-Wei; Pryer, Kathleen M

    2014-01-01

    Much of science progresses within the tight boundaries of what is often seen as a "black box". Though familiar to funding agencies, researchers and the academic journals they publish in, it is an entity that outsiders rarely get to peek into. Crowdfunding is a novel means that allows the public to participate in, as well as to support and witness advancements in science. Here we describe our recent crowdfunding efforts to sequence the Azolla genome, a little fern with massive green potential. Crowdfunding is a worthy platform not only for obtaining seed money for exploratory research, but also for engaging directly with the general public as a rewarding form of outreach.

  11. Brain Genomics Superstruct Project initial data release with structural, functional, and behavioral measures.

    Science.gov (United States)

    Holmes, Avram J; Hollinshead, Marisa O; O'Keefe, Timothy M; Petrov, Victor I; Fariello, Gabriele R; Wald, Lawrence L; Fischl, Bruce; Rosen, Bruce R; Mair, Ross W; Roffman, Joshua L; Smoller, Jordan W; Buckner, Randy L

    2015-01-01

    The goal of the Brain Genomics Superstruct Project (GSP) is to enable large-scale exploration of the links between brain function, behavior, and ultimately genetic variation. To provide the broader scientific community data to probe these associations, a repository of structural and functional magnetic resonance imaging (MRI) scans linked to genetic information was constructed from a sample of healthy individuals. The initial release, detailed in the present manuscript, encompasses quality screened cross-sectional data from 1,570 participants ages 18 to 35 years who were scanned with MRI and completed demographic and health questionnaires. Personality and cognitive measures were obtained on a subset of participants. Each dataset contains a T1-weighted structural MRI scan and either one (n=1,570) or two (n=1,139) resting state functional MRI scans. Test-retest reliability datasets are included from 69 participants scanned within six months of their initial visit. For the majority of participants self-report behavioral and cognitive measures are included (n=926 and n=892 respectively). Analyses of data quality, structure, function, personality, and cognition are presented to demonstrate the dataset's utility.

  12. Genome-wide analysis of wild-type Epstein-Barr virus genomes derived from healthy individuals of the 1,000 Genomes Project.

    Science.gov (United States)

    Santpere, Gabriel; Darre, Fleur; Blanco, Soledad; Alcami, Antonio; Villoslada, Pablo; Mar Albà, M; Navarro, Arcadi

    2014-04-01

    Most people in the world (∼90%) are infected by the Epstein-Barr virus (EBV), which establishes itself permanently in B cells. Infection by EBV is related to a number of diseases including infectious mononucleosis, multiple sclerosis, and different types of cancer. So far, only seven complete EBV strains have been described, all of them coming from donors presenting EBV-related diseases. To perform a detailed comparative genomic analysis of EBV including, for the first time, EBV strains derived from healthy individuals, we reconstructed EBV sequences infecting lymphoblastoid cell lines (LCLs) from the 1000 Genomes Project. As strain B95-8 was used to transform B cells to obtain LCLs, it is always present, but a specific deletion in its genome sets it apart from natural EBV strains. After studying hundreds of individuals, we determined the presence of natural EBV in at least 10 of them and obtained a set of variants specific to wild-type EBV. By mapping the natural EBV reads into the EBV reference genome (NC007605), we constructed nearly complete wild-type viral genomes from three individuals. Adding them to the five disease-derived EBV genomic sequences available in the literature, we performed an in-depth comparative genomic analysis. We found that latency genes harbor more nucleotide diversity than lytic genes and that six out of nine latency-related genes, as well as other genes involved in viral attachment and entry into host cells, packaging, and the capsid, present the molecular signature of accelerated protein evolution rates, suggesting rapid host-parasite coevolution.

  13. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    International Nuclear Information System (INIS)

    Yuhki, Naoya; O'Brien, S.J.

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  14. Genetic and epigenetic variation in 5S ribosomal RNA genes reveals genome dynamics in Arabidopsis thaliana.

    Science.gov (United States)

    Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe; Probst, Aline V

    2018-04-06

    Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization.

  15. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    Energy Technology Data Exchange (ETDEWEB)

    Yuhki, Naoya; O'Brien, S.J. (National Cancer Institute, Frederick, MD (USA))

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  16. Identification of genomic copy number variations associated with specific clinical features of head and neck cancer.

    Science.gov (United States)

    Zagradišnik, Boris; Krgović, Danijela; Herodež, Špela Stangler; Zagorac, Andreja; Ćižmarević, Bogdan; Vokač, Nadja Kokalj

    2018-01-01

    Copy number variations (CNVs) of large genomic regions are an important mechanism implicated in the development of head and neck cancer; however, for most changes their exact role is not well understood. The aim of this study was to find possible associations between gains/losses of genomic regions and clinically distinct subgroups of head and neck cancer patients. Array comparative genomic hybridization (aCGH) analysis was performed on DNA samples from 64 patients with cancer of the oral cavity, oropharynx or hypopharynx. Overlapping genomic regions created from gains and losses were used for statistical analysis. The following regions were overrepresented: in tumors with stage I or II, a gain of 2.98 Mb on 6p21.2-p11 and a gain of 7.4 Mb on 8q11.1-q11.23; in tumors with grade I histology, a gain of 1.1 Mb on 8q24.13, a loss of a large part of the p arm of chromosome 3, a loss of 1.24 Mb on 6q14.3, and a loss of the terminal 32 Mb region of 8p23.3; in cases with affected lymph nodes, a gain of 0.75 Mb on 3q24 and a gain of 0.9 Mb on 3q26.32-q26.33; in cases with unaffected lymph nodes, a gain of 1.1 Mb on 8q23.3; and in patients not treated with surgery, a gain of 12.2 Mb on 7q21.3-q22.3 and a gain of 0.33 Mb on 20q11.22. Our study identified several genomic regions of interest which appear to be associated with various clinically distinct subgroups of head and neck cancer. They represent a potentially important source of biomarkers useful for the clinical management of head and neck cancer. In particular, the PIK3CA and AGTR1 genes could be singled out to predict lymph node involvement.

  17. The Human Genome Project and the social contract: a law policy approach.

    Science.gov (United States)

    Byk, C

    1992-08-01

    For the first time in history, genetics will enable science to completely identify each human as genetically unique. Will this knowledge reinforce the trend for more individual liberties or will it create a 'brave new world'? A law policy approach to the problems raised by the human genome project shows how far our democratic institutions are from being the proper forum to discuss such issues. Because of the fears and anxiety raised in the population, and also because of its wide implications on the everyday life, the human genome analysis more than any other project needs to succeed in setting up such a social assessment.

  18. Genomic Analysis of Hepatitis B Virus Reveals Antigen State and Genotype as Sources of Evolutionary Rate Variation

    Science.gov (United States)

    Harrison, Abby; Lemey, Philippe; Hurles, Matthew; Moyes, Chris; Horn, Susanne; Pryor, Jan; Malani, Joji; Supuri, Mathias; Masta, Andrew; Teriboriki, Burentau; Toatu, Tebuka; Penny, David; Rambaut, Andrew; Shapiro, Beth

    2011-01-01

    Hepatitis B virus (HBV) genomes are small, semi-double-stranded DNA circular genomes that contain alternating overlapping reading frames and replicate through an RNA intermediary phase. This complex biology has presented a challenge to estimating an evolutionary rate for HBV, leading to difficulties resolving the evolutionary and epidemiological history of the virus. Here, we re-examine rates of HBV evolution using a novel data set of 112 within-host, transmission history (pedigree) and among-host genomes isolated over 20 years from the indigenous peoples of the South Pacific, combined with 313 previously published HBV genomes. We employ Bayesian phylogenetic approaches to examine several potential causes and consequences of evolutionary rate variation in HBV. Our results reveal rate variation both between genotypes and across the genome, as well as strikingly slower rates when genomes are sampled in the Hepatitis B e antigen positive state, compared to the e antigen negative state. This Hepatitis B e antigen rate variation was found to be largely attributable to changes during the course of infection in the preCore and Core genes and their regulatory elements. PMID:21765983

  19. Plasticity of the Leishmania genome leading to gene copy number variations and drug resistance [version 1; referees: 5 approved]

    Directory of Open Access Journals (Sweden)

    Marie-Claude N. Laffitte

    2016-09-01

    Leishmania has a plastic genome, and drug pressure can select for gene copy number variation (CNV). CNVs can apply either to whole chromosomes, leading to aneuploidy, or to specific genomic regions. For the latter, the amplification of chromosomal regions occurs at the level of homologous direct or inverted repeated sequences, leading to extrachromosomal circular or linear amplified DNAs. This ability of Leishmania to respond to drug pressure by CNVs has led to the development of genomic screens such as Cos-Seq, which has the potential of expediting the discovery of drug targets for novel promising drug candidates.

  20. The RadGenomics project. Prediction for radio-susceptibility of individuals with genetic predisposition

    International Nuclear Information System (INIS)

    Imai, Takashi

    2003-01-01

    The ultimate goal of our project, named RadGenomics, is to elucidate the heterogeneity of the response to ionizing radiation arising from genetic variation among individuals, for the purpose of developing personalized radiation therapy regimens for cancer patients. Cancer patients exhibit patient-to-patient variability in normal tissue reactions after radiotherapy. Several observations support the hypothesis that the radiosensitivity of normal tissue is influenced by genetic factors. The rapid progression of human genome sequencing and the recent development of new technologies in molecular biology are providing new opportunities for elucidating the genetic basis of individual differences in susceptibility to radiation exposure. The development of a sufficiently robust, predictive assay enabling individual dose adjustment would improve the outcome of radiation therapy in patients. Our strategy for identification of DNA polymorphisms that contribute to the individual radiosensitivity is as follows. First, we have been categorizing DNA samples obtained from cancer patients, who have been kindly introduced to us through many collaborators, according to their clinical characteristics including the method and effect of treatment and side effects as scored by toxicity criteria, and also the result of an in vitro radiosensitivity assay, e.g., the micronuclei assay of their lymphocytes. Second, we have identified candidate genes for genotyping mainly by using our custom-designed oligonucleotide array with RNA samples, in which the probes were obtained from more than 40 cancer and 3 fibroblast cell lines whose radiosensitivity level was quite heterogeneous. We have also been studying the modification of proteins after irradiation of cells which may be caused by mainly phosphorylation or dephosphorylation, using mass spectrometry. Genes encoding the modified proteins and/or other proteins with which they interact such as specific protein kinases and phosphatases are also

  1. Variation in the OC locus of Acinetobacter baumannii genomes predicts extensive structural diversity in the lipooligosaccharide.

    Directory of Open Access Journals (Sweden)

    Johanna J Kenyon

    Lipooligosaccharide (LOS) is a complex surface structure that is linked to many pathogenic properties of Acinetobacter baumannii. In A. baumannii, the genes responsible for the synthesis of the outer core (OC) component of the LOS are located between ilvE and aspS. The content of the OC locus is usually variable within a species, and examination of 6 complete and 227 draft A. baumannii genome sequences available in GenBank non-redundant and Whole Genome Shotgun databases revealed nine distinct new types, OCL4-OCL12, in addition to the three known ones. The twelve gene clusters fell into two distinct groups, designated Group A and Group B, based on similarities in the genes present. OCL6 (Group B) was unique in that it included genes for the synthesis of L-Rhamnosep. Genetic exchange of the different configurations between strains has occurred, as some OC forms were found in several different sequence types (STs). OCL1 (Group A) was the most widely distributed, being present in 18 STs, and OCL6 was found in 16 STs. Variation within clones was also observed, with more than one OC locus type found in the two globally disseminated clones, GC1 and GC2, that include the majority of multiply antibiotic resistant isolates. OCL1 was the most abundant gene cluster in both GC1 and GC2 genomes, but GC1 isolates also carried OCL2, OCL3 or OCL5, and OCL3 was also present in GC2. As replacement of the OC locus in the major global clones indicates the presence of sub-lineages, a PCR typing scheme was developed to rapidly distinguish Group A and Group B types, and to distinguish the specific forms found in GC1 and GC2 isolates.

  2. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
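    As an illustration of the summary statistic this abstract discusses, the following is a minimal sketch of how an unfolded and a "folded" SFS could be tallied from a small 0/1 haplotype matrix. The function names and the toy matrix are hypothetical and are not taken from the paper; the sketch assumes biallelic sites coded 0 (ancestral) and 1 (derived).

```python
from collections import Counter


def sample_frequency_spectrum(haplotypes):
    """Unfolded SFS for n sequences.

    haplotypes: list of n equal-length lists of 0/1 alleles
    (0 = ancestral, 1 = derived; an assumption of this sketch).
    Returns a list of length n - 1 whose entry i - 1 is the number of
    sites at which the derived allele occurs in exactly i sequences.
    Sites with 0 or n derived copies are not segregating and are ignored.
    """
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    derived_counts = Counter(
        sum(seq[site] for seq in haplotypes) for site in range(num_sites)
    )
    return [derived_counts.get(i, 0) for i in range(1, n)]


def fold(sfs):
    """Folded SFS: merge classes i and n - i (minor-allele counts).

    Useful when it is unknown which allele is ancestral.
    """
    n = len(sfs) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i
        # When i == n - i the class is its own mirror image; count it once.
        folded.append(sfs[i - 1] if i == j else sfs[i - 1] + sfs[j - 1])
    return folded


# Toy data (hypothetical): 4 sequences, 5 sites.
toy = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
unfolded = sample_frequency_spectrum(toy)
folded = fold(unfolded)
```

    In this toy matrix the last site carries no derived allele, so it does not contribute to either spectrum.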

  3. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011

  4. Genome-wide copy number variation (CNV) in patients with autoimmune Addison's disease

    Directory of Open Access Journals (Sweden)

    Brønstad Ingeborg

    2011-08-01

    Background Addison's disease (AD) is caused by an autoimmune destruction of the adrenal cortex. The pathogenesis is multi-factorial, involving genetic components and hitherto unknown environmental factors. The aim of the present study was to investigate if gene dosage in the form of copy number variation (CNV) could add to the repertoire of genetic susceptibility to autoimmune AD. Methods A genome-wide study using the Affymetrix GeneChip® Genome-Wide Human SNP Array 6.0 was conducted in 26 patients with AD. CNVs in selected genes were further investigated in a larger material of patients with autoimmune AD (n = 352) and healthy controls (n = 353) by duplex Taqman real-time polymerase chain reaction assays. Results We found that low copy number of UGT2B28 was significantly more frequent in AD patients compared to controls; conversely high copy number of ADAM3A was associated with AD. Conclusions We have identified two novel CNV associations to ADAM3A and UGT2B28 in AD. The mechanism by which this susceptibility is conferred is at present unclear, but may involve steroid inactivation (UGT2B28) and T cell maturation (ADAM3A). Characterization of these proteins may unravel novel information on the pathogenesis of autoimmunity.

  5. Distinct Contributions of Replication and Transcription to Mutation Rate Variation of Human Genomes

    KAUST Repository

    Cui, Peng; Ding, Feng; Lin, Qiang; Zhang, Lingfang; Li, Ang; Zhang, Zhang; Hu, Songnian; Yu, Jun

    2012-01-01

    Here, we evaluate the contribution of two major biological processes, DNA replication and transcription, to mutation rate variation in human genomes. Based on analysis of public human tissue transcriptomics data, a high-resolution replication map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to be located in late-replicating genomic regions, and genes in such regions have a higher SNP density than those in early-replicating regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes.

  6. Genome-wide copy number variation (CNV) in patients with autoimmune Addison's disease

    Science.gov (United States)

    2011-01-01

    Background Addison's disease (AD) is caused by an autoimmune destruction of the adrenal cortex. The pathogenesis is multi-factorial, involving genetic components and hitherto unknown environmental factors. The aim of the present study was to investigate if gene dosage in the form of copy number variation (CNV) could add to the repertoire of genetic susceptibility to autoimmune AD. Methods A genome-wide study using the Affymetrix GeneChip® Genome-Wide Human SNP Array 6.0 was conducted in 26 patients with AD. CNVs in selected genes were further investigated in a larger material of patients with autoimmune AD (n = 352) and healthy controls (n = 353) by duplex Taqman real-time polymerase chain reaction assays. Results We found that low copy number of UGT2B28 was significantly more frequent in AD patients compared to controls; conversely high copy number of ADAM3A was associated with AD. Conclusions We have identified two novel CNV associations to ADAM3A and UGT2B28 in AD. The mechanism by which this susceptibility is conferred is at present unclear, but may involve steroid inactivation (UGT2B28) and T cell maturation (ADAM3A). Characterization of these proteins may unravel novel information on the pathogenesis of autoimmunity. PMID:21851588
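    The abstract above reports that a copy number class was more frequent in patients than in controls but does not say which statistical test was used. As one common way to compare such frequencies, the sketch below runs a two-sided Fisher's exact test on a 2x2 carrier table. The function name is hypothetical, the test is a stand-in rather than the authors' method, and the counts in the usage lines are toy numbers, not study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table

                    carriers   non-carriers
        cases          a            b
        controls       c            d

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    n = a + b + c + d
    row1 = a + b          # number of cases
    col1 = a + c          # number of carriers

    def prob(x):
        # Probability that exactly x of the cases are carriers.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Small relative tolerance so ties are not lost to float rounding.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Toy counts (hypothetical, for illustration only).
p_small_table = fisher_exact_two_sided(3, 1, 1, 3)
p_balanced = fisher_exact_two_sided(5, 5, 5, 5)
```

    With real cohort sizes in the hundreds the same function applies unchanged, because Python integers do not overflow in the binomial coefficients.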

  7. A genome-wide association study of copy number variations with umbilical hernia in swine.

    Science.gov (United States)

    Long, Yi; Su, Ying; Ai, Huashui; Zhang, Zhiyan; Yang, Bin; Ruan, Guorong; Xiao, Shijun; Liao, Xinjun; Ren, Jun; Huang, Lusheng; Ding, Nengshui

    2016-06-01

    Umbilical hernia (UH) is one of the most common congenital defects in pigs, leading to considerable economic loss and serious animal welfare problems. To test whether copy number variations (CNVs) contribute to pig UH, we performed a case-control genome-wide CNV association study on 905 pigs from the Duroc, Landrace and Yorkshire breeds using the Porcine SNP60 BeadChip and the PennCNV algorithm. We first constructed a genomic map comprising 6193 CNVs that pertain to 737 CNV regions. Then, we identified eight CNVs significantly associated with the risk for UH in the three pig breeds. Six of seven significantly associated CNVs were validated using quantitative real-time PCR. Notably, a rare CNV (CNV14:13030843-13059455) encompassing the NUGGC gene was strongly associated with UH (permutation-corrected P = 0.0015) in Duroc pigs. This CNV occurred exclusively in seven Duroc UH-affected individuals. SNPs surrounding the CNV did not show association signals, indicating that rare CNVs may play an important role in complex pig diseases such as UH. The NUGGC gene has been implicated in human omphalocele and inguinal hernia. Our finding supports the view that CNVs, including the NUGGC CNV, contribute to the pathogenesis of pig UH. © 2016 Stichting International Foundation for Animal Genetics.

  8. Analysis of Genetic Variation across the Encapsidated Genome of Microplitis demolitor Bracovirus in Parasitoid Wasps.

    Directory of Open Access Journals (Sweden)

    Gaelen R Burke

    Insect parasitoids must complete part of their life cycle within or on another insect, ultimately resulting in the death of the host insect. One group of parasitoid wasps, the 'microgastroid complex' (Hymenoptera: Braconidae), engage in an association with beneficial symbiotic viruses that are essential for successful parasitism of hosts. These viruses, known as Bracoviruses, persist in an integrated form in the wasp genome, and activate to replicate in wasp ovaries during development to ultimately be delivered into host insects during parasitism. The lethal nature of host-parasitoid interactions, combined with the involvement of viruses in mediating these interactions, has led to the hypothesis that Bracoviruses are engaged in an arms race with hosts, resulting in recurrent adaptation in viral (and host) genes. Deep sequencing was employed to characterize sequence variation across the encapsidated Bracovirus genome within laboratory and field populations of the parasitoid wasp species Microplitis demolitor. Contrary to expectations, there was a paucity of evidence for positive directional selection among virulence genes, which generally exhibited signatures of purifying selection. These data suggest that the dynamics of host-parasite interactions may not result in recurrent rounds of adaptation, and that adaptation may be more variable in time than previously expected.

  9. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    Science.gov (United States)

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by an NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide to relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  10. Distinct Contributions of Replication and Transcription to Mutation Rate Variation of Human Genomes

    KAUST Repository

    Cui, Peng

    2012-03-23

    Here, we evaluate the contribution of two major biological processes, DNA replication and transcription, to mutation rate variation in human genomes. Based on analysis of public human tissue transcriptomics data, a high-resolution replication map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to locate in late-replicating genomic regions and genes in such regions have a higher SNP density compared to those in early-replicating regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes.

  11. A refined model of the genomic basis for phenotypic variation in vertebrate hemostasis.

    Science.gov (United States)

    Ribeiro, Ângela M; Zepeda-Mendoza, M Lisandra; Bertelsen, Mads F; Kristensen, Annemarie T; Jarvis, Erich D; Gilbert, M Thomas P; da Fonseca, Rute R

    2015-06-30

    Hemostasis is a defense mechanism that enhances an organism's survival by minimizing blood loss upon vascular injury. In vertebrates, hemostasis has been evolving with the cardio-vascular and hemodynamic systems over the last 450 million years. Birds and mammals have very similar vascular and hemodynamic systems, thus the mechanism that blocks ruptures in the vasculature is expected to be the same. However, the speed of the process varies across vertebrates, and is particularly slow for birds. Understanding the differences in the hemostasis pathway between birds and mammals, and placing them in perspective to other vertebrates may provide clues to the genetic contribution to variation in blood clotting phenotype in vertebrates. We compiled genomic data corresponding to key elements involved in hemostasis across vertebrates to investigate its genetic basis and understand how it affects fitness. We found that: i) fewer genes are involved in hemostasis in birds compared to mammals; and ii) the largest differences concern platelet membrane receptors and components from the kallikrein-kinin system. We propose that lack of the cytoplasmic domain of the GPIb receptor subunit alpha could be a strong contributor to the prolonged bleeding phenotype in birds. Combined analysis of laboratory assessments of avian hemostasis with the first avian phylogeny based on genomic-scale data revealed that differences in hemostasis within birds are not explained by phylogenetic relationships, but more so by genetic variation underlying components of the hemostatic process, suggestive of natural selection. This work adds to our understanding of the evolution of hemostasis in vertebrates. The overlap with the inflammation, complement and renin-angiotensin (blood pressure regulation) pathways is a potential driver of rapid molecular evolution in the hemostasis network. 
Comparisons between avian species and mammals allowed us to hypothesize that the observed mammalian innovations might have

  12. Alignment of 1000 Genomes Project reads to reference assembly GRCh38.

    Science.gov (United States)

    Zheng-Bradley, Xiangqun; Streeter, Ian; Fairley, Susan; Richardson, David; Clarke, Laura; Flicek, Paul

    2017-07-01

    The 1000 Genomes Project produced more than 100 trillion basepairs of short read sequence from more than 2600 samples in 26 populations over a period of five years. In its final phase, the project released over 85 million genotyped and phased variants on human reference genome assembly GRCh37. An updated reference assembly, GRCh38, was released in late 2013, but there was insufficient time for the final phase of the project analysis to change to the new assembly. Although it is possible to lift the coordinates of the 1000 Genomes Project variants to the new assembly, this is a potentially error-prone process as coordinate remapping is most appropriate only for non-repetitive regions of the genome and those that did not see significant change between the two assemblies. It will also miss variants in any region that was newly added to GRCh38. Thus, to produce the highest quality variants and genotypes on GRCh38, the best strategy is to realign the reads and recall the variants based on the new alignment. As the first step of variant calling for the 1000 Genomes Project data, we have finished remapping all of the 1000 Genomes sequence reads to GRCh38 with alternative scaffold-aware BWA-MEM. The resulting alignments are available as CRAM, a reference-based sequence compression format. The data have been released on our FTP site and are also available from European Nucleotide Archive to facilitate researchers discovering variants on the primary sequences and alternative contigs of GRCh38. © The Authors 2017. Published by Oxford University Press.

  13. Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels

    Directory of Open Access Journals (Sweden)

    Xiaoyi Gao

    2012-06-01

    Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR+CEU+YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.

  14. A genome-wide association study demonstrates significant genetic variation for fracture risk in Thoroughbred racehorses

    Science.gov (United States)

    2014-01-01

    Background: Thoroughbred racehorses are subject to non-traumatic distal limb bone fractures that occur during racing and exercise. Susceptibility to fracture may be due to underlying disturbances in bone metabolism which have a genetic cause. Fracture risk has been shown to be heritable in several species but this study is the first genetic analysis of fracture risk in the horse. Results: Fracture cases (n = 269) were horses that sustained catastrophic distal limb fractures while racing on UK racecourses, necessitating euthanasia. Control horses (n = 253) were over 4 years of age, were racing during the same time period as the cases, and had no history of fracture at the time the study was carried out. The horses sampled were bred for both flat and National Hunt (NH) jump racing. 43,417 SNPs were employed to perform a genome-wide association analysis and to estimate the proportion of genetic variance attributable to the SNPs on each chromosome using restricted maximum likelihood (REML). Significant genetic variation associated with fracture risk was found on chromosomes 9, 18, 22 and 31. Three SNPs on chromosome 18 (62.05 Mb – 62.15 Mb) and one SNP on chromosome 1 (14.17 Mb) reached genome-wide significance (p fracture than cases, p = 1 × 10⁻⁴), while a second haplotype increases fracture risk (cases at 3.39 times higher risk of fracture than controls, p = 0.042). Conclusions: Fracture risk in the Thoroughbred horse is a complex condition with an underlying genetic basis. Multiple genomic regions contribute to susceptibility to fracture risk. This suggests there is the potential to develop SNP-based estimators for genetic risk of fracture in the Thoroughbred racehorse, using methods pioneered in livestock genetics such as genomic selection. This information would be useful to racehorse breeders and owners, enabling them to reduce the risk of injury in their horses. PMID:24559379

  15. Functional food ingredients against colorectal cancer. An example project integrating functional genomics, nutrition and health

    NARCIS (Netherlands)

    Stierum, R.; Burgemeister, R.; Helvoort, van A.; Peijnenburg, A.; Schütze, K.; Seidelin, M.; Vang, O.; Ommen, van B.

    2001-01-01

    Functional Food Ingredients Against Colorectal Cancer is one of the first European Union funded Research Projects at the cross-road of functional genomics [comprising transcriptomics, the measurement of the expression of all messengers RNA (mRNAs) and proteomics, the measurement of expression/state

  16. The Human Genome Project and Eugenics: Identifying the Impact on Individuals with Mental Retardation.

    Science.gov (United States)

    Kuna, Jason

    2001-01-01

    This article explores the impact of the mapping work of the Human Genome Project on individuals with mental retardation and the negative effects of genetic testing. The potential to identify disabilities and the concept of eugenics are discussed, along with ethical issues surrounding potential genetic therapies. (Contains references.) (CR)

  17. Reflections on Mental Retardation and Eugenics, Old and New: Mensa and the Human Genome Project.

    Science.gov (United States)

    Smith, J. David

    1994-01-01

    This article addresses the moral and ethical issues of mental retardation and a continuing legacy of belief in eugenics. It discusses the involuntary sterilization of Carrie Buck in 1927, support for legalized killing of subnormal infants by 47% of respondents to a Mensa survey, and implications of the Human Genome Project for the field of mental…

  18. Democratizing Human Genome Project Information: A Model Program for Education, Information and Debate in Public Libraries.

    Science.gov (United States)

    Pollack, Miriam

    The "Mapping the Human Genome" project demonstrated that librarians can help whomever they serve in accessing information resources in the areas of biological and health information, whether it is the scientists who are developing the information or a member of the public who is using the information. Public libraries can guide library…

  19. From Mendel to the Human Genome Project: The Implications for Nurse Education.

    Science.gov (United States)

    Burton, Hilary; Stewart, Alison

    2003-01-01

    The Human Genome Project is bringing new opportunities to predict and prevent diseases. Although pediatric nurses are the closest to these developments, most nurses will encounter genetic aspects of practice and must understand the basic science and its ethical, legal, and social dimensions. (Includes commentary by Peter Birchenall.) (SK)

  20. Direct linkage of mitochondrial genome variation to risk factors for type 2 diabetes in conplastic strains

    Czech Academy of Sciences Publication Activity Database

    Pravenec, Michal; Hyakukoku, M.; Houštěk, Josef; Zídek, Václav; Landa, Vladimír; Mlejnek, Petr; Mikšík, Ivan; Mothejzíková-Dudová, Kristýna; Pecina, Petr; Vrbacký, Marek; Drahota, Zdeněk; Vojtíšková, Alena; Mráček, Tomáš; Kazdová, L.; Oliyarnyk, O.; Wang, Ji.; Ho, Ch.; Qi, N.; Sugimoto, K.; Kurtz, T.

    2007-01-01

    Vol. 17, No. 9 (2007), pp. 1319-1326 ISSN 1088-9051 R&D Projects: GA MŠk(CZ) 1M0520; GA ČR(CZ) GA301/06/0028; GA ČR GA303/07/0781 Grant - others: GA UK(CZ) 24/2005; GA UK(CZ) 26/2005; National Institutes of Health(US) HL35018; National Institutes of Health(US) HL56028; National Institutes of Health(US) HL63709; EURATOOLS(XE) LSHG-CT-2005-019015 Institutional research plan: CEZ:AV0Z50110509 Source of funding: R - EC framework project Keywords: mitochondrial genome * conplastic strains * risk factors for type 2 diabetes Subject RIV: EB - Genetics; Molecular Biology Impact factor: 11.224, year: 2007

  1. Copy number variation identification and analysis of the chicken genome using a 60K SNP BeadChip.

    Science.gov (United States)

    Rao, Y S; Li, J; Zhang, R; Lin, X R; Xu, J G; Xie, L; Xu, Z Q; Wang, L; Gan, J K; Xie, X J; He, J; Zhang, X Q

    2016-08-01

    Copy number variation (CNV) is an important source of genetic variation in organisms and a main factor that affects phenotypic variation. A comprehensive study of chicken CNV can provide valuable information on genetic diversity and facilitate future analyses of associations between CNV and economically important traits in chickens. In the present study, an F2 full-sib chicken population (554 individuals), established from a cross between Xinghua and White Recessive Rock chickens, was used to explore CNV in the chicken genome. Genotyping was performed using a chicken 60K SNP BeadChip. A total of 1,875 CNV were detected with the PennCNV algorithm, and the average number of CNV was 3.42 per individual. The CNV were distributed across 383 independent CNV regions (CNVR) and covered 41 megabases (3.97%) of the chicken genome. Seven CNVR in 108 individuals were validated by quantitative real-time PCR, and 81 of these individuals (75%) also were detected with the PennCNV algorithm. In total, 274 CNVR (71.54%) identified in the current study were previously reported. Of these, 147 (38.38%) were reported in at least 2 studies. Additionally, 109 of the CNVR (28.46%) discovered here are novel. A total of 709 genes within or overlapping with the CNVR was retrieved. Out of the 2,742 quantitative trait loci (QTL) collected in the chicken QTL database, 43 QTL had confidence intervals overlapping with the CNVR, and 32 CNVR encompassed one or more functional genes. The functional genes located in the CNVR are likely to be the QTG that are associated with underlying economic traits. This study considerably expands our insight into the structural variation in the genome of chickens and provides an important resource for genomic variation, especially for genomic structural variation related to economic traits in chickens. © 2016 Poultry Science Association Inc.

  2. Detecting single DNA copy number variations in complex genomes using one nanogram of starting DNA and BAC-array CGH.

    Science.gov (United States)

    Guillaud-Bataille, Marine; Valent, Alexander; Soularue, Pascal; Perot, Christine; Inda, Maria Mar; Receveur, Aline; Smaïli, Sadek; Roest Crollius, Hugues; Bénard, Jean; Bernheim, Alain; Gidrol, Xavier; Danglot, Gisèle

    2004-07-29

    Comparative genomic hybridization to bacterial artificial chromosome (BAC)-arrays (array-CGH) is a highly efficient technique, allowing the simultaneous measurement of genomic DNA copy number at hundreds or thousands of loci, and the reliable detection of local one-copy-level variations. We report a genome-wide amplification method allowing the same measurement sensitivity, using 1 ng of starting genomic DNA, instead of the classical 1 microg usually necessary. Using a discrete series of DNA fragments, we defined the parameters adapted to the most faithful ligation-mediated PCR amplification and the limits of the technique. The optimized protocol allows a 3000-fold DNA amplification, retaining the quantitative characteristics of the initial genome. Validation of the amplification procedure, using DNA from 10 tumour cell lines hybridized to BAC-arrays of 1500 spots, showed almost perfectly superimposed ratios for the non-amplified and amplified DNAs. Correlation coefficients of 0.96 and 0.99 were observed for regions of low-copy-level variations and all regions, respectively (including in vivo amplified oncogenes). Finally, labelling DNA using two nucleotides bearing the same fluorophore led to a significant increase in reproducibility and to the correct detection of one-copy gain or loss in >90% of the analysed data, even for pseudotriploid tumour genomes.

  3. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  4. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    Directory of Open Access Journals (Sweden)

    Walker M Andrew

    2006-09-01

    Background: The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results: There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On average, 60.33% of the SNPs were of synonymous type while 39.67% were of non-synonymous type. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A sub-set of the strain specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage related functions. Conclusion: INDELs and strain specific genes

  5. Overlap in genomic variation associated with milk fat composition in Holstein Friesian and Dutch native dual-purpose breeds

    NARCIS (Netherlands)

    Maurice - Van Eijndhoven, M.H.T.; Bovenhuis, H.; Veerkamp, R.F.; Calus, M.P.L.

    2015-01-01

    The aim of this study was to identify if genomic variations associated with fatty acid (FA) composition are similar between the Holstein-Friesian (HF) and native dual-purpose breeds used in the Dutch dairy industry. Phenotypic and genotypic information were available for the breeds Meuse-Rhine-Yssel

  6. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber

    NARCIS (Netherlands)

    Zhang, Z.; Mao, L.; Chen, Junshi; Bu, F.; Li, G.; Sun, J.; Li, S.; Sun, H.; Jiao, C.; Blakely, R.; Pan, J.; Cai, R.; Luo, R.; Peer, Van de Y.; Jacobsen, E.; Fei, Z.; Huang, S.

    2015-01-01

    Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep

  7. Genome-wide recombination rate variation in a recombination map of cotton.

    Science.gov (United States)

    Shen, Chao; Li, Ximei; Zhang, Ruiting; Lin, Zhongxu

    2017-01-01

    Recombination is crucial for genetic evolution, which not only provides new allele combinations but also influences the biological evolution and efficacy of natural selection. However, recombination variation is not well understood outside of the complex species' genomes, and it is particularly unclear in Gossypium. Cotton is the most important natural fibre crop and the second largest oil-seed crop. Here, we found that the genetic and physical map distances did not have a simple linear relationship. Recombination rates were unevenly distributed throughout the cotton genome, showing marked changes along the chromosome lengths, and recombination was completely suppressed in the centromeric regions. Recombination rates significantly varied between A-subgenome (At) (range = 1.60 to 3.26 centimorgan/megabase [cM/Mb]) and D-subgenome (Dt) (range = 2.17 to 4.97 cM/Mb), which explained why the genetic maps of At and Dt are similar but the physical map of Dt is only half that of At. The translocation regions between A02 and A03 and between A04 and A05, and the inversion regions on A10, D10, A07 and D07 indicated relatively high recombination rates in the distal regions of the chromosomes. Recombination rates were positively correlated with the densities of genes, markers and the distance from the centromere, and negatively correlated with transposable elements (TEs). The gene ontology (GO) categories showed that genes in high recombination regions may tend to respond to environmental stimuli, and genes in low recombination regions are related to mitosis and meiosis, which suggested that they may provide the primary driving force in adaptive evolution and assure the stability of the basic cell cycle in a rapidly changing environment. Global knowledge of recombination rates will facilitate genetics and breeding in cotton.

  8. Autism genome-wide copy number variation reveals ubiquitin and neuronal genes.

    Science.gov (United States)

    Glessner, Joseph T; Wang, Kai; Cai, Guiqing; Korvatska, Olena; Kim, Cecilia E; Wood, Shawn; Zhang, Haitao; Estes, Annette; Brune, Camille W; Bradfield, Jonathan P; Imielinski, Marcin; Frackelton, Edward C; Reichert, Jennifer; Crawford, Emily L; Munson, Jeffrey; Sleiman, Patrick M A; Chiavacci, Rosetta; Annaiah, Kiran; Thomas, Kelly; Hou, Cuiping; Glaberson, Wendy; Flory, James; Otieno, Frederick; Garris, Maria; Soorya, Latha; Klei, Lambertus; Piven, Joseph; Meyer, Kacie J; Anagnostou, Evdokia; Sakurai, Takeshi; Game, Rachel M; Rudd, Danielle S; Zurawiecki, Danielle; McDougle, Christopher J; Davis, Lea K; Miller, Judith; Posey, David J; Michaels, Shana; Kolevzon, Alexander; Silverman, Jeremy M; Bernier, Raphael; Levy, Susan E; Schultz, Robert T; Dawson, Geraldine; Owley, Thomas; McMahon, William M; Wassink, Thomas H; Sweeney, John A; Nurnberger, John I; Coon, Hilary; Sutcliffe, James S; Minshew, Nancy J; Grant, Struan F A; Bucan, Maja; Cook, Edwin H; Buxbaum, Joseph D; Devlin, Bernie; Schellenberg, Gerard D; Hakonarson, Hakon

    2009-05-28

    Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with approximately 550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10⁻³). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10⁻³). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10⁻⁶). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.

  9. Getting the Word Out on the Human Genome Project: A Course for Physicians

    Energy Technology Data Exchange (ETDEWEB)

    Sara L. Tobin

    2004-09-29

    Our project, "Getting the Word Out on the Human Genome Project: A Course for Physicians," presented educational goals to convey the power and promise of the Human Genome Program to a variety of professional, educational, and public audiences. Our initial goal was to provide practicing physicians with a comprehensive multimedia tool to update their skills in the genomic era. We therefore created the multimedia courseware, "The New Genetics: Courseware for Physicians. Molecular Concepts, Applications, and Ramifications." However, as the project moved forward, several unanticipated audiences found the courseware to be useful for instruction and for self-education, so an additional edition of the courseware, "The New Genetics: Medicine and the Human Genome. Molecular Concepts, Applications, and Ramifications," was published simultaneously with the physician version. At the time that both versions of the courseware were being completed, Stanford's Office of Technology Licensing opted not to commercialize the courseware and offered a license-back agreement if the authors founded a commercial business. The authors thus became closely involved in marketing and sales, and several thousand copies of the courseware have been sold. Surprisingly, the non-physician version has turned out to be more in demand, and this has led us in several new directions, most of which involve undergraduate education. These are discussed in detail in the Report.

  10. Overlap in genomic variation associated with milk fat composition in Holstein Friesian and Dutch native dual-purpose breeds.

    Science.gov (United States)

    Maurice-Van Eijndhoven, M H T; Bovenhuis, H; Veerkamp, R F; Calus, M P L

    2015-09-01

    The aim of this study was to identify if genomic variations associated with fatty acid (FA) composition are similar between the Holstein-Friesian (HF) and native dual-purpose breeds used in the Dutch dairy industry. Phenotypic and genotypic information were available for the breeds Meuse-Rhine-Yssel (MRY), Dutch Friesian (DF), Groningen White Headed (GWH), and HF. First, the reliability of genomic breeding values of the native Dutch dual-purpose cattle breeds MRY, DF, and GWH was evaluated using single nucleotide polymorphism (SNP) effects estimated in HF, including all SNP or subsets with stronger associations in HF. Second, the genomic variation of the regions associated with FA composition in HF (regions on Bos taurus autosome 5, 14, and 26), were studied in the different breeds. Finally, similarities in genotype and allele frequencies between MRY, DF, GWH, and HF breeds were assessed for specific regions associated with FA composition. On average across the traits, the highest reliabilities of genomic prediction were estimated for GWH (0.158) and DF (0.116) when the 8 to 22 SNP with the strongest association in HF were included. With the same set of SNP, GEBV for MRY were the least reliable (0.022). This indicates that on average only 2 (MRY) to 16% (GWH) of the genomic variation in HF is shared with the native Dutch dual-purpose breeds. The comparison of predicted variances of different regions associated with milk and milk fat composition showed that breeds clearly differed in genomic variation within these regions. Finally, the correlations of allele frequencies between breeds across the 8 to 22 SNP with the strongest association in HF were around 0.8 between the Dutch native dual-purpose breeds, whereas the correlations between the native breeds and HF were clearly lower and around 0.5. There was no consistent relationship between the reliabilities of genomic prediction for a specific breed and the correlation between the allele frequencies of this breed

  11. Glaucoma in Asia: regional prevalence variations and future projections.

    Science.gov (United States)

    Chan, Errol Wei'en; Li, Xiang; Tham, Yih-Chung; Liao, Jiemin; Wong, Tien Yin; Aung, Tin; Cheng, Ching-Yu

    2016-01-01

    To evaluate glaucoma prevalence and disease burden across Asian subregions from 2013 to 2040. We conducted a systematic review and meta-analysis of 23 population-based studies of 1318 primary open angle glaucoma (POAG) cases in 66,800 individuals and 691 primary angle closure glaucoma (PACG) cases in 72,767 individuals in Asia. Regions in Asia were defined based on United Nations' (UN) classification of macro-geographic regions. PubMed, Medline and Web of Science databases were searched for population-based glaucoma prevalence studies using standardised criteria published to 31 December 2013. Pooled glaucoma prevalence for individuals aged 40-80 years was calculated using hierarchical Bayesian approaches. Prevalence differences by geographic subregion, subtype and habitation were examined with random effects meta-regression models. Estimates of individuals with glaucoma from 2013 to 2040 were based on the UN World Population Prospects. In 2013, pooled overall glaucoma prevalence was 3.54% (95% credible interval (CrI) 1.83 to 6.28). POAG (2.34%, 95% CrI 0.96 to 4.55) predominated over PACG (0.73%, 95% CrI 0.18 to 1.96). With age and gender adjustment, PACG prevalence was higher in East than South East Asia (OR 5.55, 95% CrI 1.52 to 14.73), and POAG prevalence was higher in urban than rural populations (OR 2.11, 95% CrI 1.57 to 2.38). From 2013 to 2040, South Central Asia will record the steepest increase in number of glaucoma individuals from 17.06 million to 32.90 million compared with other Asian subregions. In 2040, South-Central Asia is also projected to overtake East Asia for highest overall glaucoma and POAG burden, while PACG burden remains highest in East Asia. Across the Asian subregions, there was greater glaucoma burden in South-Central and East Asia. Sustainable public health strategies to combat glaucoma in Asia are needed. Published by the BMJ Publishing Group Limited.

  12. Population-genetic nature of copy number variations in the human genome.

    Science.gov (United States)

    Kato, Mamoru; Kawaguchi, Takahisa; Ishikawa, Shumpei; Umeda, Takayoshi; Nakamichi, Reiichiro; Shapero, Michael H; Jones, Keith W; Nakamura, Yusuke; Aburatani, Hiroyuki; Tsunoda, Tatsuhiko

    2010-03-01

    Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000-4000 CNVs (4-6% of the genomic sequence) per population that included CNVs previously missed because of smaller sizes and residing in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We utilized the probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV-SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV-SNP linkage disequilibrium (LD) for 500-900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP-SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies can be applicable for common deletion CNVs, but direct interrogation is needed for other types of CNVs.
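    The CNV-SNP linkage disequilibrium discussed in this abstract is usually summarised by the r² statistic between two bi-allelic loci. The sketch below is a minimal illustration of that standard formula, not the authors' pipeline: the function name `ld_r_squared` and the haplotype counts are hypothetical, and it assumes phased haplotypes in which a deletion allele and a SNP allele are each coded as 1 or 0.

```python
def ld_r_squared(n11, n10, n01, n00):
    """r^2 between two bi-allelic loci from phased haplotype counts.

    n11: haplotypes carrying allele 1 at both loci (e.g. deletion + SNP alt)
    n10: allele 1 at locus 1 only
    n01: allele 1 at locus 2 only
    n00: allele 0 at both loci
    """
    n = n11 + n10 + n01 + n00
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p1 = (n11 + n10) / n          # frequency of allele 1 at locus 1
    q1 = (n11 + n01) / n          # frequency of allele 1 at locus 2
    d = n11 / n - p1 * q1         # disequilibrium coefficient D
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r^2 is undefined")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical counts for one deletion CNV and one nearby SNP.
    print(ld_r_squared(40, 10, 5, 45))
```

    A value near 1 means the SNP tags the deletion well; a value near 0 means the CNV would need direct interrogation.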

  13. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNP) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp) in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  14. A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

    Science.gov (United States)

    Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin

    2013-08-01

    Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.

  15. De novo Genome Assembly and Single Nucleotide Variations for Soybean Mosaic Virus Using Soybean Seed Transcriptome Data

    Directory of Open Access Journals (Sweden)

    Yeonhwa Jo

    2017-10-01

    Soybean is the most important legume crop in the world. Several diseases in soybean lead to serious yield losses in major soybean-producing countries. Moreover, soybean can be infected by diverse viruses. Recently, we carried out a large-scale screening to identify viruses infecting soybean using available soybean transcriptome data. Of the screened transcriptomes, a soybean transcriptome for soybean seed development analysis contains several virus-associated sequences. In this study, we identified five viruses, including soybean mosaic virus (SMV), infecting soybean by de novo transcriptome assembly followed by blast search. We assembled a nearly complete consensus genome sequence of SMV China using transcriptome data. Based on phylogenetic analysis, the consensus genome sequence of SMV China was closely related to SMV isolates from South Korea. We examined single nucleotide variations (SNVs) for SMVs in the soybean seed transcriptome revealing 780 SNVs, which were evenly distributed on the SMV genome. Four SNVs, C-U, U-C, A-G, and G-A, were frequently identified. This result demonstrated the quasispecies variation of the SMV genome. Taken together, this study carried out bioinformatics analyses to identify viruses using soybean transcriptome data. In addition, we demonstrated the application of soybean transcriptome data for virus genome assembly and SNV analysis.

  16. Patterns of genomic variation in the poplar rust fungus Melampsora larici-populina identify pathogenesis-related factors

    Directory of Open Access Journals (Sweden)

    Antoine Persoons

    2014-09-01

    Melampsora larici-populina is a fungal pathogen responsible for foliar rust disease on poplar trees, which causes damage to forest plantations worldwide, particularly in Northern Europe. The reference genome of the isolate 98AG31 was previously sequenced using a whole genome shotgun strategy, revealing a large genome of 101 megabases containing 16,399 predicted genes, which included secreted protein genes representing poplar rust candidate effectors. In the present study, the genomes of 15 isolates collected over the past 20 years throughout the French territory, representing distinct virulence profiles, were characterized by massively parallel sequencing to assess genetic variation in the poplar rust fungus. Comparison to the reference genome revealed striking structural variations. Analysis of coverage and sequencing depth identified large missing regions between isolates related to the mating type loci. More than 611,824 single-nucleotide polymorphism (SNP) positions were uncovered overall, indicating a remarkable level of polymorphism. Based on the accumulation of non-synonymous substitutions in coding sequences and the relative frequencies of synonymous and non-synonymous polymorphisms (i.e. PN/PS), we identify candidate genes that may be involved in fungal pathogenesis. Correlation between non-synonymous SNPs in genes encoding secreted proteins and pathotypes of the studied isolates revealed candidate genes potentially related to virulences 1, 6 and 8 of the poplar rust fungus.

  17. Earth BioGenome Project: Sequencing life for the future of life.

    Science.gov (United States)

    Lewin, Harris A; Robinson, Gene E; Kress, W John; Baker, William J; Coddington, Jonathan; Crandall, Keith A; Durbin, Richard; Edwards, Scott V; Forest, Félix; Gilbert, M Thomas P; Goldstein, Melissa M; Grigoriev, Igor V; Hackett, Kevin J; Haussler, David; Jarvis, Erich D; Johnson, Warren E; Patrinos, Aristides; Richards, Stephen; Castilla-Rubio, Juan Carlos; van Sluys, Marie-Anne; Soltis, Pamela S; Xu, Xun; Yang, Huanming; Zhang, Guojie

    2018-04-24

    Increasing our understanding of Earth's biodiversity and responsibly stewarding its resources are among the most crucial scientific and social challenges of the new millennium. These challenges require fundamental new knowledge of the organization, evolution, functions, and interactions among millions of the planet's organisms. Herein, we present a perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity over a period of 10 years. The outcomes of the EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services. We describe hurdles that the project faces, including data-sharing policies that ensure a permanent, freely available resource for future scientific discovery while respecting access and benefit sharing guidelines of the Nagoya Protocol. We also describe scientific and organizational challenges in executing such an ambitious project, and the structure proposed to achieve the project's goals. The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort.

  18. Citrus sinensis annotation project (CAP): a comprehensive database for sweet orange genome.

    Science.gov (United States)

    Wang, Jia; Chen, Dijun; Lei, Yang; Chang, Ji-Wei; Hao, Bao-Hai; Xing, Feng; Li, Sen; Xu, Qiang; Deng, Xiu-Xin; Chen, Ling-Ling

    2014-01-01

    Citrus is one of the most important and widely grown fruit crops, with global production ranking first among all the fruit crops in the world. Sweet orange accounts for more than half of Citrus production, both as fresh fruit and as processed juice. We have sequenced the draft genome of a double-haploid sweet orange (C. sinensis cv. Valencia), and constructed the Citrus sinensis annotation project (CAP) to store and visualize the sequenced genomic and transcriptome data. CAP provides GBrowse-based organization of sweet orange genomic data, which integrates ab initio gene prediction, EST, RNA-seq and RNA-paired end tag (RNA-PET) evidence-based gene annotation. Furthermore, we provide a user-friendly web interface to show the predicted protein-protein interactions (PPIs) and metabolic pathways in sweet orange. CAP provides comprehensive information beneficial to researchers of sweet orange and other woody plants and is freely available at http://citrus.hzau.edu.cn/.

  19. Circadian pathway genetic variation and cancer risk: evidence from genome-wide association studies.

    Science.gov (United States)

    Mocellin, Simone; Tropea, Saveria; Benna, Clara; Rossi, Carlo Riccardo

    2018-02-19

    Dysfunction of the circadian clock and single polymorphisms of some circadian genes have been linked to cancer susceptibility, although data are scarce and findings inconsistent. We aimed to investigate the association between circadian pathway genetic variation and risk of developing common cancers based on the findings of genome-wide association studies (GWASs). Single nucleotide polymorphisms (SNPs) of 17 circadian genes reported by three GWAS meta-analyses dedicated to breast (Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) Consortium; cases, n = 15,748; controls, n = 18,084), prostate (Elucidating Loci Involved in Prostate Cancer Susceptibility (ELLIPSE) Consortium; cases, n = 14,160; controls, n = 12,724) and lung carcinoma (Transdisciplinary Research In Cancer of the Lung (TRICL) Consortium; cases, n = 12,160; controls, n = 16,838) in patients of European ancestry were utilized to perform pathway analysis by means of the adaptive rank truncated product (ARTP) method. Data were also available for the following subgroups: estrogen receptor negative breast cancer, aggressive prostate cancer, squamous lung carcinoma and lung adenocarcinoma. We found a highly significant statistical association between circadian pathway genetic variation and the risk of breast (pathway P value = 1.9 × 10^-6; top gene RORA, gene P value = 0.0003), prostate (pathway P value = 4.1 × 10^-6; top gene ARNTL, gene P value = 0.0002) and lung cancer (pathway P value = 6.9 × 10^-7; top gene RORA, gene P value = 2.0 × 10^-6), as well as all their subgroups. Out of 17 genes investigated, 15 were found to be significantly associated with the risk of cancer: four genes were shared by all three malignancies (ARNTL, CLOCK, RORA and RORB), two by breast and lung cancer (CRY1 and CRY2) and three by prostate and lung cancer (NPAS2, NR1D1 and PER3), whereas four genes were specific for lung cancer

  20. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum.

    Science.gov (United States)

    Amaradasa, B Sajeewa; Everhart, Sydney E

    2016-01-01

    when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms.

  1. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum.

    Directory of Open Access Journals (Sweden)

    B Sajeewa Amaradasa

    experiment, and when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms.

  2. Effects of Sublethal Fungicides on Mutation Rates and Genomic Variation in Fungal Plant Pathogen, Sclerotinia sclerotiorum

    Science.gov (United States)

    Amaradasa, B. Sajeewa

    2016-01-01

    , and when repeated, only one isolate had higher EC50 while most isolates showed no difference. Results of this support the hypothesis that sublethal fungicide stress increases mutation rates in a largely clonal plant pathogen under in vitro conditions. Collectively, this work will aid our understanding how non-lethal fungicide exposure may affect genomic variation, which may be an important mechanism of novel trait emergence, adaptation, and evolution for clonal organisms. PMID:27959950

  3. Continuous Morphological Variation Correlated with Genome Size Indicates Frequent Introgressive Hybridization among Diphasiastrum Species (Lycopodiaceae) in Central Europe

    Czech Academy of Sciences Publication Activity Database

    Hanušová, K.; Ekrt, L.; Vít, Petr; Kolář, Filip; Urfus, Tomáš

    2014-01-01

    Vol. 9, No. 6 (2014), article no. e99552. E-ISSN 1932-6203. R&D Projects: GA ČR GB14-36079G. Institutional support: RVO:67985939. Keywords: genome size * morphometrics * Diphasiastrum. Subject RIV: EF - Botanics. Impact factor: 3.234, year: 2014

  4. Convergence rates in constrained Tikhonov regularization: equivalence of projected source conditions and variational inequalities

    International Nuclear Information System (INIS)

    Flemming, Jens; Hofmann, Bernd

    2011-01-01

    In this paper, we illuminate the role of variational inequalities for obtaining convergence rates in Tikhonov regularization of nonlinear ill-posed problems with convex penalty functionals under convexity constraints in Banach spaces. Variational inequalities are able to cover solution smoothness and the structure of nonlinearity in a uniform manner, not only for unconstrained but, as we indicate, also for constrained Tikhonov regularization. In this context, we extend the concept of projected source conditions already known in Hilbert spaces to Banach spaces, and we show in the main theorem that such projected source conditions are to some extent equivalent to certain variational inequalities. The derived variational inequalities immediately yield convergence rates measured by Bregman distances.
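    For orientation, the objects named in this abstract can be written out as follows. This is a sketch under assumed notation (forward operator $F$, noisy data $y^\delta$, convex penalty $\Omega$, convex constraint set $C$, exact solution $x^\dagger$, subgradient $\xi \in \partial\Omega(x^\dagger)$, constants $\beta_1, \beta_2$, exponent $p$); none of these symbols appear in the record itself, and the inequality shown is one form commonly used in the literature, not necessarily the exact one the paper derives.

```latex
% Constrained Tikhonov functional (notation assumed):
T_\alpha^\delta(x) = \| F(x) - y^\delta \|^p + \alpha\,\Omega(x)
  \;\to\; \min_{x \in C}

% Bregman distance of \Omega at x^\dagger with subgradient \xi:
D_\xi(x, x^\dagger) = \Omega(x) - \Omega(x^\dagger)
  - \langle \xi,\, x - x^\dagger \rangle

% One common variational inequality, with 0 \le \beta_1 < 1, \beta_2 \ge 0:
\langle \xi,\, x^\dagger - x \rangle
  \le \beta_1\, D_\xi(x, x^\dagger)
  + \beta_2\, \| F(x) - F(x^\dagger) \|
```

    In this setting, convergence rates are statements about how fast $D_\xi(x_\alpha^\delta, x^\dagger)$ tends to zero as the noise level $\delta$ does.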

  5. IW-Scoring: an Integrative Weighted Scoring framework for annotating and prioritizing genetic variations in the noncoding genome.

    Science.gov (United States)

    Wang, Jun; Dayem Ullah, Abu Z; Chelala, Claude

    2018-01-30

    The vast majority of germline and somatic variations occur in the noncoding part of the genome, only a small fraction of which are believed to be functional. From the tens of thousands of noncoding variations detectable in each genome, identifying and prioritizing driver candidates with putative functional significance is challenging. To address this, we implemented IW-Scoring, a new Integrative Weighted Scoring model to annotate and prioritise functionally relevant noncoding variations. We evaluate 11 scoring methods, and apply an unsupervised spectral approach for subsequent selective integration into two linear weighted functional scoring schemas for known and novel variations. IW-Scoring produces stable high-quality performance as the best predictors for three independent data sets. We demonstrate the robustness of IW-Scoring in identifying recurrent functional mutations in the TERT promoter, as well as disease SNPs in proximity to consensus motifs and with gene regulatory effects. Using follicular lymphoma as a paradigmatic cancer model, we apply IW-Scoring to locate 11 recurrently mutated noncoding regions in 14 follicular lymphoma genomes, and validate 9 of these regions in an extension cohort, including the promoter and enhancer regions of PAX5. Overall, IW-Scoring demonstrates greater versatility in identifying trait- and disease-associated noncoding variants. Scores from IW-Scoring as well as other methods are freely available from http://www.snp-nexus.org/IW-Scoring/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Perspectives from the Avian Phylogenomics Project: Questions that Can Be Answered with Sequencing All Genomes of a Vertebrate Class.

    Science.gov (United States)

    Jarvis, Erich D

    2016-01-01

    The rapid pace of advances in genome technology, with concomitant reductions in cost, makes it feasible that one day in our lifetime we will have available extant genomes of entire classes of species, including vertebrates. I recently helped cocoordinate the large-scale Avian Phylogenomics Project, which collected and sequenced genomes of 48 bird species representing most currently classified orders to address a range of questions in phylogenomics and comparative genomics. The consortium was able to answer questions not previously possible with just a few genomes. This success spurred on the creation of a project to sequence the genomes of at least one individual of all extant ∼10,500 bird species. The initiation of this project has led us to consider what questions now impossible to answer could be answered with all genomes, and could drive new questions now unimaginable. These include the generation of a highly resolved family tree of extant species, genome-wide association studies across species to identify genetic substrates of many complex traits, redefinition of species and the species concept, reconstruction of the genomes of common ancestors, and generation of new computational tools to address these questions. Here I present visions for the future by posing and answering questions regarding what scientists could potentially do with available genomes of an entire vertebrate class.

  7. Field of genes: the politics of science and identity in the Estonian Genome Project.

    Science.gov (United States)

    Fletcher, Amy L

    2004-04-01

    This case study of the Estonian Genome Project (EGP) analyses the Estonian policy decision to construct a national human gene bank. Drawing upon qualitative data from newspaper articles and public policy documents, it focuses on how proponents use discourse to link the EGP to the broader political goal of securing Estonia's position within the Western/European scientific and cultural space. This dominant narrative is then situated within the analytical notion of the "brand state", which raises potentially negative political consequences for this type of market-driven genomic research. Considered against the increasing number of countries engaging in gene bank and/or gene database projects, this analysis of Estonia elucidates issues that cross national boundaries, while also illuminating factors specific to this small, post-Soviet state as it enters the global biocybernetic economy.

  8. A recurrent neural network based on projection operator for extended general variational inequalities.

    Science.gov (United States)

    Liu, Qingshan; Cao, Jinde

    2010-06-01

    Based on the projection operator, a recurrent neural network is proposed for solving extended general variational inequalities (EGVIs). Sufficient conditions are provided to ensure the global convergence of the proposed neural network based on Lyapunov methods. Compared with the existing neural networks for variational inequalities, the proposed neural network is a modified version of the general projection neural network existing in the literature and capable of solving the EGVI problems. In addition, simulation results on numerical examples show the effectiveness and performance of the proposed neural network.
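    To make the projection idea concrete, the sketch below simulates the classic projection network for an ordinary variational inequality, dx/dt = P(x - a F(x)) - x, using a forward-Euler step. It is not the modified network for EGVIs that this record proposes: the affine map `F`, the box constraint, and the parameters `alpha`, `step`, and `iters` are all hypothetical choices made for illustration.

```python
def project_box(x, lo, hi):
    """Project each coordinate of x onto the interval [lo, hi]."""
    return [min(max(xi, lo), hi) for xi in x]


def F(x):
    """Hypothetical monotone affine map F(x) = M x + q (M positive definite)."""
    M = [[2.0, 0.5], [0.5, 1.0]]
    q = [-1.0, 1.0]
    return [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]


def projection_network(F, x0, lo=0.0, hi=1.0, alpha=0.2, step=0.5, iters=2000):
    """Forward-Euler simulation of dx/dt = P(x - alpha * F(x)) - x."""
    x = list(x0)
    for _ in range(iters):
        fx = F(x)
        target = project_box([xi - alpha * fi for xi, fi in zip(x, fx)], lo, hi)
        x = [xi + step * (ti - xi) for xi, ti in zip(x, target)]
    return x


def fixed_point_residual(F, x, lo=0.0, hi=1.0, alpha=0.2):
    """Max-norm of P(x - alpha * F(x)) - x; zero exactly at a VI solution."""
    fx = F(x)
    p = project_box([xi - alpha * fi for xi, fi in zip(x, fx)], lo, hi)
    return max(abs(pi - xi) for pi, xi in zip(p, x))


if __name__ == "__main__":
    x_star = projection_network(F, [1.0, 1.0])
    print(x_star, fixed_point_residual(F, x_star))
```

    An equilibrium of the dynamics is a fixed point of the projected map, which is why the residual function doubles as a convergence check.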

  9. aCNViewer: Comprehensive genome-wide visualization of absolute copy number and copy neutral variations.

    Directory of Open Access Journals (Sweden)

    Victor Renault

    Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases and particularly in cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations leading to the loss of potentially important information. To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publicly available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B). aCNViewer is implemented in Python and R and is available with a GNU GPLv3 license on GitHub https://github.com/FJD-CEPH/aCNViewer and Docker https://hub.docker.com/r/fjdceph/acnviewer/. Contact: aCNViewer@cephb.fr.
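    The comparison of gain categories between 1q and 8q relies on a Fisher exact test. The sketch below shows how a two-sided test on a 2x2 table can be computed from the hypergeometric distribution using only the standard library. It assumes a 2x2 layout (region by low-level versus high-level gain), which the abstract does not specify, and the counts in the usage line are hypothetical rather than taken from the HCC data.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: rows = regions (1q, 8q),
    # columns = samples with low-level gain, samples with high-level gain.
    print(fisher_exact_2x2(30, 10, 18, 22))
```

    With more than two gain categories the same idea extends to an r x c table, which needs a more general enumeration than this sketch provides.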

  10. aCNViewer: Comprehensive genome-wide visualization of absolute copy number and copy neutral variations.

    Science.gov (United States)

    Renault, Victor; Tost, Jörg; Pichon, Fabien; Wang-Renault, Shu-Fang; Letouzé, Eric; Imbeaud, Sandrine; Zucman-Rossi, Jessica; Deleuze, Jean-François; How-Kit, Alexandre

    2017-01-01

    Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases and particularly in cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations leading to the loss of potentially important information. To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publicly available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B). aCNViewer is implemented in Python and R and is available with a GNU GPLv3 license on GitHub https://github.com/FJD-CEPH/aCNViewer and Docker https

  11. The Human Genome Project and Mental Retardation: An Educational Program. Final Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Sharon

    1999-05-03

    The Arc, a national organization on mental retardation, conducted an educational program for members, many of whom have a family member with a genetic condition causing mental retardation. The project informed members about the Human Genome scientific efforts, conducted training regarding ethical, legal and social implications (ELSI), and involved members in issue discussions. Short reports and fact sheets on genetic and ELSI topics were disseminated to 2,200 of the Arc's leaders across the country and to other interested individuals. Materials produced by the project can be found on the Arc's web site, TheArc.org.

  12. The MedSeq Project: a randomized trial of integrating whole genome sequencing into clinical medicine.

    Science.gov (United States)

    Vassy, Jason L; Lautenbach, Denise M; McLaughlin, Heather M; Kong, Sek Won; Christensen, Kurt D; Krier, Joel; Kohane, Isaac S; Feuerman, Lindsay Z; Blumenthal-Barby, Jennifer; Roberts, J Scott; Lehmann, Lisa Soleymani; Ho, Carolyn Y; Ubel, Peter A; MacRae, Calum A; Seidman, Christine E; Murray, Michael F; McGuire, Amy L; Rehm, Heidi L; Green, Robert C

    2014-03-20

    Whole genome sequencing (WGS) is already being used in certain clinical and research settings, but its impact on patient well-being, health-care utilization, and clinical decision-making remains largely unstudied. It is also unknown how best to communicate sequencing results to physicians and patients to improve health. We describe the design of the MedSeq Project: the first randomized trials of WGS in clinical care. This pair of randomized controlled trials compares WGS to standard of care in two clinical contexts: (a) disease-specific genomic medicine in a cardiomyopathy clinic and (b) general genomic medicine in primary care. We are recruiting 8 to 12 cardiologists, 8 to 12 primary care physicians, and approximately 200 of their patients. Patient participants in both the cardiology and primary care trials are randomly assigned to receive a family history assessment with or without WGS. Our laboratory delivers a genome report to physician participants that balances the needs to enhance understandability of genomic information and to convey its complexity. We provide an educational curriculum for physician participants and offer them a hotline to genetics professionals for guidance in interpreting and managing their patients' genome reports. Using varied data sources, including surveys, semi-structured interviews, and review of clinical data, we measure the attitudes, behaviors and outcomes of physician and patient participants at multiple time points before and after the disclosure of these results. The impact of emerging sequencing technologies on patient care is unclear. We have designed a process of interpreting WGS results and delivering them to physicians in a way that anticipates how we envision genomic medicine will evolve in the near future. That is, our WGS report provides clinically relevant information while communicating the complexity and uncertainty of WGS results to physicians and, through physicians, to their patients. 
This project will not only

  13. Public trust and 'ethics review' as a commodity: the case of Genomics England Limited and the UK's 100,000 genomes project.

    Science.gov (United States)

    Samuel, Gabrielle Natalie; Farsides, Bobbie

    2018-06-01

    The UK Chief Medical Officer's 2016 Annual Report, Generation Genome, focused on a vision to fully integrate genomics into all aspects of the UK's National Health Service (NHS). This process of integration, which has already begun, raises a wide range of social and ethical concerns, many of which were discussed in the final chapter of the report. This paper explores how the UK's 100,000 Genomes Project (100 kGP), the catalyst for Generation Genome and for bringing genomics into the NHS, is negotiating these ethical concerns. The UK's 100 kGP, promoted and delivered by Genomics England Limited (GEL), is an innovative venture aiming to sequence 100,000 genomes from NHS patients who have a rare disease, cancer, or an infectious disease. GEL has emphasised the importance of ethical governance and decision-making. However, some sociological critique argues that biomedical/technological organisations presenting themselves as 'ethical' entities do not necessarily reflect a space within which moral thinking occurs. Rather, the 'ethical work' conducted (and displayed) by organisations is more strategic, relating to the politics of the organisation and the need to build public confidence. We set out to explore whether GEL's ethical framework was reflective of this critique, and what this tells us more broadly about how genomics is being integrated into the NHS in response to the ethical and social concerns raised in Generation Genome. We do this by drawing on a series of 20 interviews with individuals associated with or working at GEL.

  14. Whole-genome copy number variation analysis in anophthalmia and microphthalmia.

    Science.gov (United States)

    Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V

    2013-11-01

    Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. A genome-wide investigation of copy number variation in patients with sporadic brain arteriovenous malformation.

    Directory of Open Access Journals (Sweden)

    Nasrine Bendjilali

    Brain arteriovenous malformations (BAVM) are clusters of abnormal blood vessels, with shunting of blood from the arterial to venous circulation and a high risk of rupture and intracranial hemorrhage. Most BAVMs are sporadic, but they also occur in patients with Hereditary Hemorrhagic Telangiectasia, a Mendelian disorder caused by mutations in genes in the transforming growth factor beta (TGFβ) signaling pathway. To investigate whether copy number variations (CNVs) contribute to risk of sporadic BAVM, we performed a genome-wide association study in 371 sporadic BAVM cases and 563 healthy controls, all Caucasian. Cases and controls were genotyped using the Affymetrix 6.0 array. CNVs were called using the PennCNV and Birdsuite algorithms and analyzed via segment-based and gene-based approaches. Common and rare CNVs were evaluated for association with BAVM. A CNV region on 1p36.13, containing the neuroblastoma breakpoint family, member 1 gene (NBPF1), was significantly enriched with duplications in BAVM cases compared to controls (P = 2.2×10^-9); NBPF1 was also significantly associated with BAVM in gene-based analysis using both PennCNV and Birdsuite. We experimentally validated the 1p36.13 duplication; however, the association did not replicate in an independent cohort of 184 sporadic BAVM cases and 182 controls (OR = 0.81, P = 0.8). Rare CNV analysis did not identify genes significantly associated with BAVM. We did not identify common CNVs associated with sporadic BAVM that replicated in an independent cohort. Replication in larger cohorts is required to elucidate the possible role of common or rare CNVs in BAVM pathogenesis.

  16. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  17. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  18. Ensembl Genomes 2016: more genomes, more complexity.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Ethical challenges and innovations in the dissemination of genomic data: the experience of the PERSPECTIVE project

    Directory of Open Access Journals (Sweden)

    Lévesque E

    2015-08-01

    Emmanuelle Lévesque (1), Bartha Maria Knoppers (1), Jacques Simard (2); (1) Department of Human Genetics, Centre for Genomics and Policy, McGill University, Montréal; (2) Genomics Centre, CHU de Québec Research Center, Department of Molecular Medicine, Laval University, Québec City, QC, Canada. Abstract: The importance of making genomic data available for future research is now widely recognized among the scientific community and policymakers. In this era of shared responsibility for data dissemination, improved patient care through research depends on the development of powerful and secure data-sharing systems. As part of the concerted effort to share research resources, the project entitled Personalized Risk Stratification for Prevention and Early Detection of Breast Cancer (PERSPECTIVE) makes effective data sharing, through the development of a data-sharing framework, one of its goals. The secondary uses of data from PERSPECTIVE for future research promise to enhance our knowledge of breast cancer etiologies without duplicating data-gathering efforts. Despite its benefit for research, we recognize the ethical challenges of data sharing on the local, national, and international levels. The effective management of ethical approvals for projects spanning across jurisdictions, the return of results to research participants, and research incentives and recognition for data production are but a few pressing issues that need to be properly addressed. We discuss how we managed these issues and suggest how ongoing innovations might help to facilitate data sharing in future genomic research projects. Keywords: data sharing, research ethics, cancer

  20. Variational formulation and projectional methods for the second order transport equation

    International Nuclear Information System (INIS)

    Borysiewicz, M.; Stankiewicz, R.

    1979-01-01

    Herein the variational problem for a second-order boundary value problem for the neutron transport equation is formulated. The projectional methods for solving the problem are examined. The approach is compared with that based on the original untransformed form of the neutron transport equation.

  1. The Variation Theorem Applied to H-2+: A Simple Quantum Chemistry Computer Project

    Science.gov (United States)

    Robiette, Alan G.

    1975-01-01

    Describes a student project which requires limited knowledge of Fortran and only minimal computing resources. The results illustrate such important principles of quantum mechanics as the variation theorem and the virial theorem. Presents sample calculations and the subprogram for energy calculations. (GS)

  2. Standard Terminology for Phenotypic Variations: The Elements of Morphology Project, Its Current Progress, and Future Directions

    NARCIS (Netherlands)

    Carey, John C.; Allanson, Judith E.; Hennekam, Raoul C. M.; Biesecker, Leslie G.

    2012-01-01

    In 2005, the authors of this article formed an international working group to develop standardized definitions and terms to describe the physical variations used in human phenotypic analyses. This project, which came to be known as the Elements of Morphology, resulted in six articles proposing

  3. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome.

    Science.gov (United States)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E

    2017-03-06

    Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

  4. The human genome project: Information management, access, and regulation. Technical progress report, 1 April--31 August 1993

    Energy Technology Data Exchange (ETDEWEB)

    McInerney, J.D.; Micikas, L.B.

    1993-09-10

    Efforts are described to prepare educational materials, including computer-based as well as conventional teaching materials, for training interested high school and elementary students in aspects of the Human Genome Project.

  5. Defining the role of common variation in the genomic and biological architecture of adult human height

    NARCIS (Netherlands)

    A.R. Wood (Andrew); T. Esko (Tõnu); J. Yang (Jian); S. Vedantam (Sailaja); T.H. Pers (Tune); S. Gustafsson (Stefan); A.Y. Chu (Audrey Y); K. Estrada Gil (Karol); J. Luan; Z. Kutalik; N. Amin (Najaf); M.L. Buchkovich (Martin); D.C. Croteau-Chonka (Damien); F.R. Day (Felix); Y. Duan (Yanan); M. Fall (Magnus); R.S.N. Fehrmann (Rudolf); T. Ferreira (Teresa); A.U. Jackson (Anne); J. Karjalainen (Juha); K.S. Lo (Ken Sin); A. Locke (Adam); R. Mägi (Reedik); E. Mihailov (Evelin); E. Porcu (Eleonora); J.C. Randall (Joshua); A. Scherag (Andre); A.A.E. Vinkhuyzen (Anna A.); H.J. Westra (Harm-Jan); T.W. Winkler (Thomas W.); T. Workalemahu (Tsegaselassie); J.H. Zhao (Jing Hua); D. Absher (Devin); E. Albrecht (Eva); J. Baron (Jeffrey); M. Beekman (Marian); A. Demirkan (Ayşe); G.B. Ehret (Georg); B. Feenstra; M.F. Feitosa (Mary Furlan); K. Fischer (Krista); R.M. Fraser (Ross); A. Goel (Anuj); J. Gong (Jian); A.E. Justice (Anne); S. Kanoni (Stavroula); M.E. Kleber (Marcus); K. Kristiansson (Kati); U. Lim (Unhee); V. Lotay (Vaneet); J.C. Lui (Julian C); M. Mangino (Massimo); I.M. Leach (Irene Mateo); M.C. Medina-Gomez (Carolina); M.A. Nalls (Michael); A.S. Dimas (Antigone); C. Palmer (Cameron); D. Pasko (Dorota); S. Pechlivanis (Sonali); I. Prokopenko (Inga); J.S. Ried (Janina); S. Ripke (Stephan); D. Shungin (Dmitry); A. Stancáková (Alena); R.J. Strawbridge (Rona); Y.J. Sung (Yun Ju); T. Tanaka (Toshiko); A. Teumer (Alexander); S. Trompet (Stella); S.W. Van Der Laan (Sander W.); J. van Setten (Jessica); J.V. van Vliet-Ostaptchouk (Jana); Z. Wang (Zhaoming); L. Yengo (Loic); W. Zhang (Weihua); U. Afzal (Uzma); J. Ärnlöv (Johan); G.M. Arscott (Gillian M.); S. Bandinelli (Stefania); A. Barrett (Angela); C. Bellis (Claire); A.J. Bennett (Amanda); C. Berne (Christian); M. Blüher (Matthias); J.L. Bolton (Jennifer); Y. Böttcher (Yvonne); H.A. Boyd; M. Bruinenberg (M.); B.M. Buckley (Brendan M.); S. Buyske (Steven); I.H. Caspersen (Ida H.); P.S. Chines (Peter); R. Clarke (Robert); S. 
Claudi-Boehm (Simone); M.N. Cooper (Matthew); E.W. Daw (E Warwick); P.A. De Jong (Pim A); J. Deelen (Joris); G. Delgado; J.C. Denny (Josh C); R.A.M. Dhonukshe-Rutten (Rosalie); M. Dimitriou (Maria); A.S.F. Doney (Alex); M. Dörr (Marcus); N. Eklund (Niina); E. Eury (Elodie); L. Folkersen (Lasse); M. Garcia (Melissa); F. Geller (Frank); V. Giedraitis (Vilmantas); A. Go (Attie); H. Grallert (Harald); T.B. Grammer (Tanja B); J. Gräßler (Jürgen); H. Grönberg (Henrik); L.C.P.G.M. de Groot (Lisette); C.J. Groves (Christopher J.); J. Haessler (Jeff); P. Hall (Per); T. Haller (Toomas); G. Hallmans (Göran); M. Hannemann (Mario); C.A. Hartman (Catharina); M. Hassinen (Maija); C. Hayward (Caroline); N.L. Heard-Costa (Nancy); Q. Helmer (Quinta); G. Hemani; A.K. Henders (Anjali); H.L. Hillege (Hans); M.A. Hlatky (Mark); W. Hoffmann (Wolfgang); P. Hoffmann (Per); O.L. Holmen (Oddgeir); J.J. Houwing-Duistermaat (Jeanine); T. Illig (Thomas); A. Isaacs (Aaron); A.L. James (Alan); J. Jeff (Janina); B. Johansen (Berit); A. Johansson (Åsa); G.J. Jolley (Jason); T. Juliusdottir (Thorhildur); M.J. Junttila (Juhani); M.M.L. Kho (Marcia); L. Kinnunen (Leena); N. Klopp (Norman); T. Kocher; W. Kratzer (Wolfgang); P. Lichtner (Peter); L. Lind (Lars); J. Lindström (Jaana); S. Lobbens (Stéphane); M. Lorentzon (Mattias); Y. Lu (Yingchang); V. Lyssenko (Valeriya); P.K. Magnusson (Patrik); A. Mahajan (Anubha); M. Maillard (Marc); W.L. McArdle (Wendy); C.A. McKenzie (Colin A.); S. McLachlan (Stela); P.J. McLaren (Paul J); C. Menni (Cristina); S. Merger (Sigrun); L. Milani (Lili); A. Moayyeri (Alireza); K.L. Monda (Keri); M.A. Morken (Mario); G. Müller (Gabriele); M. Müller-Nurasyid (Martina); A.W. Musk (Arthur); N. Narisu (Narisu); M. Nauck (Matthias); I.M. Nolte (Ilja M.); M.M. Nöthen (Markus); L. Oozageer (Laticia); S. Pilz (Stefan); N.W. Rayner (Nigel William); F. Renström (Frida); N.R. Robertson (Neil R.); L.M. Rose (Lynda M.); R. Roussel (Ronan); S. Sanna (Serena); H. Scharnagl (Hubert); S. 
Scholtens (Salome); F.R. Schumacher (Fredrick R); H. Schunkert (Heribert); R.A. Scott (Robert); J.S. Sehmi (Joban); T. Seufferlein (Thomas); J. Shi (Jianxin); K. Silventoinen (Karri); J.H. Smit (Johannes); G.D. Smith; J. Smolonska (Joanna); A. Stanton (Alice); K. Stirrups (Kathy); D.J. Stott (David J); H.M. Stringham (Heather); J. Sundstrom (Johan); M. Swertz (Morris); A.C. Syvanen; B. Tayo (Bamidele); G. Thorleifsson (Gudmar); J.P. Tyrer (Jonathan); S. Van Dijk (Suzanne); N.M. van Schoor (Natasja); N. van der Velde (Nathalie); D. van Heemst (Diana); F.V.A. Van Oort (Floor V A); S.H.H.M. Vermeulen (Sita); N. Verweij (Niek); J.M. Vonk (Judith M); L. Waite (Lindsay); M. Waldenberger (Melanie); R. Wennauer (Roman); L.R. Wilkens (Lynne R.); C. Willenborg (Christina); T. Wilsgaard (Tom); M.K. Wojczynski (Mary ); A. Wong (Andrew); A. Wright (Alan); Q. Zhang (Qunyuan); D. Arveiler (Dominique); S.J.L. Bakker (Stephan); J. Beilby (John); R.N. Bergman (Richard); S.M. Bergmann (Sven); R. Biffar; J. Blangero (John); D.I. Boomsma (Dorret); S.R. Bornstein (Stefan R.); P. Bovet (Pascal); P. Brambilla (Paolo); M.J. Brown (Morris); H. Campbell (Harry); M. Caulfield (Mark); A. Chakravarti (Aravinda); F.S. Collins (Francis); D.C. Crawford (Dana); L.A. Cupples (Adrienne); J. Danesh (John); U. de Faire (Ulf); H.M. den Ruijter (Hester ); R. Erbel (Raimund); J. Erdmann (Jeanette); J. Eriksson; M. Farrall (Martin); E. Ferrannini (Ele); J. Ferrieres (Jean); I. Ford; N.G. Forouhi (Nita); T. Forrester (Terrence); R.T. Gansevoort (Ron); P.V. Gejman (Pablo); C. Gieger (Christian); A. Golay (Alain); R.F. Gottesman (Rebecca); V. Gudnason (Vilmundur); U. Gyllensten (Ulf); D.W. Haas (David W); A.S. Hall (Alistair); T.B. Harris (Tamara); A.T. Hattersley (Andrew); A.C. Heath (Andrew C); C. Hengstenberg (Christian); A.A. Hicks (Andrew); L.A. Hindorff (Lucia A); A. Hingorani (Aroon); A. Hofman (Albert); G.K. Hovingh (Kees); S.E. Humphries (Steve E.); S.C. Hunt (Steven); E. Hypponen (Elina); K.B. 
Jacobs (Kevin); M.-R. Jarvelin (Marjo-Riitta); P. Jousilahti (Pekka); A. Jula (Antti); J. Kaprio (Jaakko); J.J.P. Kastelein (John); M.H. Kayser (Manfred); F. Kee (Frank); S. Keinanen-Kiukaanniemi (Sirkka); L.A.L.M. Kiemeney (Bart); J.S. Kooner (Jaspal S.); C. Kooperberg (Charles); S. Koskinen (Seppo); P. Kovacs (Peter); A. Kraja (Aldi); M. Kumari (Meena); J. Kuusisto (Johanna); T.A. Lakka (Timo); C. Langenberg (Claudia); L. Le Marchand (Loic); T. Lehtimäki (Terho); S. Lupoli (Sara); P.A. Madden; S. Männistö (Satu); P. Manunta (Paolo); A. Marette (Andre'); T.C. Matise (Tara C.); B. McKnight (Barbara); T. Meitinger (Thomas); F.L. Moll (Frans); G.W. Montgomery (Grant W.); A.D. Morris (Andrew); A.P. Morris (Andrew); J.C. Murray (Jeffrey); M. Nelis (Mari); C. Ohlsson (Claes); A.J. Oldehinkel (Albertine); K.K. Ong (Ken K.); W.H. Ouwehand (Willem); G. Pasterkamp (Gerard); A. Peters (Annette); P.P. Pramstaller (Peter Paul); J.F. Price (Jackie F.); L. Qi (Lu); O. Raitakari (Olli); T. Rankinen (Tuomo); D.C. Rao (Dabeeru C.); T.K. Rice (Treva K.); M.D. Ritchie (Marylyn D.); I. Rudan (Igor); V. Salomaa (Veikko); N.J. Samani (Nilesh); J. Saramies (Jouko); M.A. Sarzynski (Mark A.); P.E.H. Schwarz (Peter E. H.); S. Sebert (Sylvain); P. Sever (Peter); A.R. Shuldiner (Alan); J. Sinisalo (Juha); V. Steinthorsdottir (Valgerdur); R.P. Stolk; J.-C. Tardif (Jean-Claude); A. Tönjes (Anke); A. Tremblay (Angelo); E. Tremoli (Elena); J. Virtamo (Jarmo); M.-C. Vohl (Marie-Claude); P. Amouyel (Philippe); F.W. Asselbergs (Folkert W.); T.L. Assimes (Themistocles); M. Bochud (Murielle); B.O. Boehm (Bernhard); E.A. Boerwinkle (Eric); E.P. Bottinger (Erwin P.); C. Bouchard (Claude); S. Cauchi (Stéphane); J.C. Chambers (John C.); S.J. Chanock (Stephen); R.S. Cooper (Richard S.); P.I.W. de Bakker (Paul); G.V. Dedoussis (George); L. Ferrucci (Luigi); P.W. Franks; P. Froguel (Philippe); L. Groop (Leif); C.A. Haiman (Christopher); A. Hamsten (Anders); M.G. Hayes (M. Geoffrey); J. Hui (Jennie); D. 
Hunter (David); K. Hveem (Kristian); J.W. Jukema (Jan Wouter); R.C. Kaplan (Robert); M. Kivimaki (Mika); D. Kuh (Diana); M. Laakso (Markku); Y. Liu (YongMei); N.G. Martin (Nicholas); W. März (Winfried); M. Melbye (Mads); S. Moebus (Susanne); P. Munroe (Patricia); I. Njølstad (Inger); B.A. Oostra (Ben); C.N.A. Palmer (Colin); N.L. Pedersen (Nancy L.); M. Perola (Markus); L. Perusse (Louis); U. Peters (Ulrike); J.E. Powell (Joseph); C. Power (Christine); T. Quertermous (Thomas); R. Rauramaa (Rainer); E. Reinmaa (Eva); P.M. Ridker (Paul); F. Rivadeneira Ramirez (Fernando); J.I. Rotter (Jerome I.); T. Saaristo (Timo); D. Saleheen; D. Schlessinger (David); P.E. Slagboom (P Eline); H. Snieder (Harold); T.D. Spector (Timothy); K. Strauch (Konstantin); M. Stumvoll (Michael); J. Tuomilehto (Jaakko); M. Uusitupa (Matti); P. van der Harst (Pim); H. Völzke (Henry); M. Walker (Mark); N.J. Wareham (Nick); H. Watkins (Hugh); H.E. Wichmann (Heinz Erich); J.F. Wilson (James F); P. Zanen (Pieter); P. Deloukas (Panagiotis); I.M. Heid (Iris); C.M. Lindgren (Cecilia); K.L. Mohlke (Karen); E.K. Speliotes (Elizabeth); U. Thorsteinsdottir (Unnur); I.E. Barroso (Inês); C.S. Fox (Caroline S.); K.E. North (Kari); D.P. Strachan (David P.); J.S. Beckmann (Jacques); S.I. Berndt (Sonja); M. Boehnke (Michael); I.B. Borecki (Ingrid); M.I. McCarthy (Mark); A. Metspalu (Andres); J-A. Zwart (John-Anker); A.G. Uitterlinden (André); C.M. van Duijn (Cornelia); L. Franke (Lude); C.J. Willer (Cristen); A. Price (Alkes); G. Lettre (Guillaume); R.J.F. Loos (Ruth); M.N. Weedon (Michael); E. Ingelsson (Erik); J.R. O´Connell; G.R. Abecasis (Gonçalo); D.I. Chasman (Daniel); D. Anderson (Denise); M.E. Goddard (Michael); P.M. Visscher (Peter); J.N. Hirschhorn (Joel); T.M. Frayling (Timothy)

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated

  6. Defining the role of common variation in the genomic and biological architecture of adult human height

    NARCIS (Netherlands)

    Wood, Andrew R.; Esko, Tonu; Yang, Jian; Vedantam, Sailaja; Pers, Tune H.; Gustafsson, Stefan; Chu, Audrey Y.; Estrada, Karol; Luan, Jian'an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L.; Croteau-Chonka, Damien C.; Day, Felix R.; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U.; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E.; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C.; Scherag, André; Vinkhuyzen, Anna A. E.; Westra, Harm-Jan; Winkler, Thomas W.; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B.; Feenstra, Bjarke; Feitosa, Mary F.; Fischer, Krista; Fraser, Ross M.; Goel, Anuj; Gong, Jian; Justice, Anne E.; Kanoni, Stavroula; Kleber, Marcus E.; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C.; Mangino, Massimo; Mateo Leach, Irene; Medina-Gomez, Carolina; Nalls, Michael A.; Nyholt, Dale R.; Palmer, Cameron D.; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S.; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J.; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W.; van Setten, Jessica; van Vliet-Ostaptchouk, Jana V.; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Arnlöv, Johan; Arscott, Gillian M.; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J.; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L.; Böttcher, Yvonne; Boyd, Heather A.; Bruinenberg, Marcel; Buckley, Brendan M.; Buyske, Steven; Caspersen, Ida H.; Chines, Peter S.; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E. Warwick; de Jong, Pim A.; Deelen, Joris; Delgado, Graciela; Denny, Josh C.; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex S. 
F.; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E.; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S.; Grallert, Harald; Grammer, Tanja B.; Gräßler, Jürgen; Grönberg, Henrik; de Groot, Lisette C. P. G. M.; Groves, Christopher J.; Haessler, Jeffrey; Hall, Per; Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A.; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L.; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K.; Hillege, Hans L.; Hlatky, Mark A.; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J.; Illig, Thomas; Isaacs, Aaron; James, Alan L.; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N.; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik K. E.; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L.; McKenzie, Colin A.; McLachlan, Stela; McLaren, Paul J.; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L.; Morken, Mario A.; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W.; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M.; Nöthen, Markus M.; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W.; Renstrom, Frida; Robertson, Neil R.; Rose, Lynda M.; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R.; Schunkert, Heribert; Scott, Robert A.; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H.; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V.; Stirrups, Kathleen; Stott, David J.; Stringham, Heather M.; Sundström, Johan; Swertz, Morris A.; Syvänen, Ann-Christine; Tayo, Bamidele O.; Thorleifsson, Gudmar; Tyrer, Jonathan P.; van Dijk, Suzanne; van Schoor, Natasja M.; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor V. 
A.; Vermeulen, Sita H.; Verweij, Niek; Vonk, Judith M.; Waite, Lindsay L.; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R.; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K.; Wong, Andrew; Wright, Alan F.; Zhang, Qunyuan; Arveiler, Dominique; Bakker, Stephan J. L.; Beilby, John; Bergman, Richard N.; Bergmann, Sven; Biffar, Reiner; Blangero, John; Boomsma, Dorret I.; Bornstein, Stefan R.; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J.; Campbell, Harry; Caulfield, Mark J.; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S.; Crawford, Dana C.; Cupples, L. Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M.; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G.; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G.; Forrester, Terrence; Gansevoort, Ron T.; Gejman, Pablo V.; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W.; Hall, Alistair S.; Harris, Tamara B.; Hattersley, Andrew T.; Heath, Andrew C.; Hengstenberg, Christian; Hicks, Andrew A.; Hindorff, Lucia A.; Hingorani, Aroon D.; Hofman, Albert; Hovingh, G. Kees; Humphries, Steve E.; Hunt, Steven C.; Hypponen, Elina; Jacobs, Kevin B.; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M.; Kaprio, Jaakko; Kastelein, John J. P.; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M.; Kiemeney, Lambertus A.; Kooner, Jaspal S.; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T.; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A.; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela A. 
F.; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C.; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L.; Montgomery, Grant W.; Morris, Andrew D.; Morris, Andrew P.; Murray, Jeffrey C.; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J.; Ong, Ken K.; Ouwehand, Willem H.; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P.; Price, Jackie F.; Qi, Lu; Raitakari, Olli T.; Rankinen, Tuomo; Rao, D. C.; Rice, Treva K.; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J.; Saramies, Jouko; Sarzynski, Mark A.; Schwarz, Peter E. H.; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R.; Sinisalo, Juha; Steinthorsdottir, Valgerdur; Stolk, Ronald P.; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W.; Assimes, Themistocles L.; Bochud, Murielle; Boehm, Bernhard O.; Boerwinkle, Eric; Bottinger, Erwin P.; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C.; Chanock, Stephen J.; Cooper, Richard S.; de Bakker, Paul I. W.; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W.; Froguel, Philippe; Groop, Leif C.; Haiman, Christopher A.; Hamsten, Anders; Hayes, M. Geoffrey; Hui, Jennie; Hunter, David J.; Hveem, Kristian; Jukema, J. Wouter; Kaplan, Robert C.; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G.; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B.; Njølstad, Inger; Oostra, Ben A.; Palmer, Colin N. A.; Pedersen, Nancy L.; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E.; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M.; Rivadeneira, Fernando; Rotter, Jerome I.; Saaristo, Timo E.; Saleheen, Danish; Schlessinger, David; Slagboom, P. 
Eline; Snieder, Harold; Spector, Tim D.; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J.; Watkins, Hugh; Wichmann, H.-Erich; Wilson, James F.; Zanen, Pieter; Deloukas, Panos; Heid, Iris M.; Lindgren, Cecilia M.; Mohlke, Karen L.; Speliotes, Elizabeth K.; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S.; North, Kari E.; Strachan, David P.; Beckmann, Jacques S.; Berndt, Sonja I.; Boehnke, Michael; Borecki, Ingrid B.; McCarthy, Mark I.; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G.; van Duijn, Cornelia M.; Franke, Lude; Willer, Cristen J.; Price, Alkes L.; Lettre, Guillaume; Loos, Ruth J. F.; Weedon, Michael N.; Ingelsson, Erik; O'Connell, Jeffrey R.; Abecasis, Goncalo R.; Chasman, Daniel I.; Goddard, Michael E.; Visscher, Peter M.; Hirschhorn, Joel N.; Frayling, Timothy M.; McCarty, Catherine A.; Starren, Justin; Peissig, Peggy; Berg, Richard; Rasmussen, Luke; Linneman, James; Miller, Aaron; Choudary, Vidhu; Chen, Lin; Waudby, Carol; Kitchner, Terrie; Reeser, Jonathan; Fost, Norman; Wilke, Russell A.; Chisholm, Rex L.; Avila, Pedro C.; Greenland, Philip; Hayes, M. 
Geoff; Kho, Abel; Kibbe, Warren A.; Lemke, Amy A.; Lowe, William L.; Smith, Maureen E.; Wolf, Wendy A.; Pacheco, Jennifer A.; Thompson, William K.; Humowiecki, Joel; Law, May; Chute, Christopher; Kullo, Iftikar; Koenig, Barbara; de Andrade, Mariza; Bielinski, Suzette; Pathak, Jyotishman; Savova, Guergana; Wu, Joel; Henriksen, Joan; Ding, Keyue; Hart, Lacey; Palbicki, Jeremy; Larson, Eric B.; Newton, Katherine; Ludman, Evette; Spangler, Leslie; Hart, Gene; Carrell, David; Jarvik, Gail; Crane, Paul; Burke, Wylie; Fullerton, Stephanie Malia; Trinidad, Susan Brown; Carlson, Chris; Hutchinson, Fred; McDavid, Andrew; Roden, Dan M.; Clayton, Ellen; Haines, Jonathan L.; Masys, Daniel R.; Churchill, Larry R.; Cornfield, Daniel; Crawford, Dana; Darbar, Dawood; Denny, Joshua C.; Malin, Bradley A.; Ritchie, Marylyn D.; Schildcrout, Jonathan S.; Xu, Hua; Ramirez, Andrea Havens; Basford, Melissa; Pulley, Jill; Alizadeh, Behrooz Z.; de Boer, Rudolf A.; Boezen, H. Marike; van der Klauw, Melanie M.; Navis, Gerjan; Ormel, Johan; Postma, Dirkje S.; Rosmalen, Judith G. M.; Slaets, Joris P.; Wolffenbuttel, Bruce H. 
R.; Wijmenga, Cisca; Kathiresan, Sekar; Voight, Benjamin F.; Purcell, Shaun; Musunuru, Kiran; Ardissino, Diego; Mannucci, Pier M.; Anand, Sonia; Engert, James C.; Reilly, Muredach P.; Rader, Daniel J.; Morgan, Thomas; Spertus, John A.; Stoll, Monika; Girelli, Domenico; McKeown, Pascal P.; Patterson, Chris C.; Siscovick, David S.; O'Donnell, Christopher J.; Elosua, Roberto; Peltonen, Leena; Schwartz, Stephen M.; Melander, Olle; Altshuler, David; Merlini, Pier Angelica; Berzuini, Carlo; Bernardinelli, Luisa; Peyvandi, Flora; Tubaro, Marco; Celli, Patrizia; Ferrario, Maurizio; Fetiveau, Raffaela; Marziliano, Nicola; Casari, Giorgio; Galli, Michele; Ribichini, Flavio; Rossi, Marco; Bernardi, Francesco; Zonzin, Pietro; Piazza, Alberto; Yee, Jean; Friedlander, Yechiel; Marrugat, Jaume; Lucas, Gavin; Subirana, Isaac; Sala, Joan; Ramos, Rafael; Meigs, James B.; Williams, Gordon; Nathan, David M.; MacRae, Calum A.; Havulinna, Aki S.; Berglund, Goran; Asselta, Rosanna; Duga, Stefano; Spreafico, Marta; Daly, Mark J.; Nemesh, James; Korn, Joshua M.; McCarroll, Steven A.; Surti, Aarti; Guiducci, Candace; Gianniny, Lauren; Mirel, Daniel; Parkin, Melissa; Burtt, Noel; Gabriel, Stacey B.; Thompson, John R.; Braund, Peter S.; Wright, Benjamin J.; Balmforth, Anthony J.; Ball, Stephen G.; Schunkert, I. 
Heribert; Linsel-Nitschke, Patrick; Lieb, Wolfgang; Ziegler, Andreas; König, Inke R.; Fischer, Marcus; Stark, Klaus; Grosshennig, Anika; Preuss, Michael; Schreiber, Stefan; Ouwehand, Willem; Scholz, Michael; Cambien, Francois; Goodall, Alison; Li, Mingyao; Chen, Zhen; Wilensky, Robert; Matthai, William; Qasim, Atif; Hakonarson, Hakon H.; Devaney, Joe; Burnett, Mary-Susan; Pichard, Augusto D.; Kent, Kenneth M.; Satler, Lowell; Lindsay, Joseph M.; Waksman, Ron; Knouff, Christopher W.; Waterworth, Dawn M.; Walker, Max C.; Mooser, Vincent; Epstein, Stephen E.; Scheffold, Thomas; Berger, Klaus; Huge, Andreas; Martinelli, Nicola; Olivieri, Oliviero; Corrocher, Roberto; Hólm, Hilma; Do, Ron; Xie, Changchun; Siscovick, David; Matise, Tara; Buyske, Steve; Higashio, Julia; Williams, Rasheeda; Nato, Andrew; Ambite, Jose Luis; Deelman, Ewa; Manolio, Teri; Hindorff, Lucia; Heiss, Gerardo; Taylor, Kira; Franceschini, Nora; Avery, Christy; Graff, Misa; Lin, Danyu; Quibrera, Miguel; Cochran, Barbara; Kao, Linda; Umans, Jason; Cole, Shelley; MacCluer, Jean; Person, Sharina; Pankow, James; Gross, Myron; Fornage, Myriam; Durda, Peter; Jenny, Nancy; Patsy, Bruce; Arnold, Alice; Buzkova, Petra; Haines, Jonathan; Murdock, Deborah; Glenn, Kim; Brown-Gentry, Kristin; Thornton-Wells, Tricia; Dumitrescu, Logan; Bush, William S.; Mitchell, Sabrina L.; Goodloe, Robert; Wilson, Sarah; Boston, Jonathan; Malinowski, Jennifer; Restrepo, Nicole; Oetjens, Matthew; Fowke, Jay; Zheng, Wei; Spencer, Kylee; Pendergrass, Sarah; Le Marchand, Loïc; Wilkens, Lynne; Park, Lani; Tiirikainen, Maarit; Kolonel, Laurence; Cheng, Iona; Wang, Hansong; Shohet, Ralph; Haiman, Christopher; Stram, Daniel; Henderson, Brian; Monroe, Kristine; Schumacher, Fredrick; Anderson, Garnet; Prentice, Ross; LaCroix, Andrea; Wu, Chunyuan; Carty, Cara; Rosse, Stephanie; Young, Alicia; Haessler, Jeff; Kocarnik, Jonathan; Lin, Yi; Jackson, Rebecca; Duggan, David; Kuller, Lew

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700

  7. Hunting for genes for hypertension: the Millennium Genome Project for Hypertension.

    Science.gov (United States)

    Tabara, Yasuharu; Kohara, Katsuhiko; Miki, Tetsuro

    2012-06-01

    The Millennium Genome Project for Hypertension was started in 2000 to identify genetic variants conferring susceptibility to hypertension, with the aim of furthering the understanding of the pathogenesis of this condition and realizing genome-based personalized medical care. Two different approaches were launched under the hypothesis that common variants have an important role in the etiology of common diseases: genome-wide association analysis using single-nucleotide polymorphisms (SNPs) and microsatellite markers, and systematic candidate gene analysis. These multilateral approaches identified ATP2B1 as a gene responsible for hypertension not only in Japanese but also in Caucasians. The high blood pressure susceptibility conferred by certain alleles of ATP2B1 has been widely replicated in various populations. Ex vivo mRNA expression analysis in umbilical artery smooth muscle cells indicated that reduced expression of this gene associated with the risk allele may be an underlying mechanism relating the ATP2B1 variant to hypertension. However, the effect size of a SNP was too small to clarify the entire picture of the genetic basis of hypertension. Further, dense genome analysis with accurate phenotype data may be required.

  8. Variational equations for the solution of the Hartree-Fock problem with angular momentum projection before the variation

    International Nuclear Information System (INIS)

    Schmid, K.W.; Gruemmer, F.

    1979-01-01

    A variational principle is used to determine the optimal angular-momentum-projected one-determinant approach to the N-nucleon yrast wave function for a given total spin value. The solution is given in terms of a set of coupled nonlinear equations. Besides an orthonormality constraint for the occupied orbits and a normalization condition for the total wave function, this set consists of a matrix equation ensuring that the spin-projected wave function does not depend on the orientation of the intrinsic determinant it is based on, and a second subset of equations, which can be considered as a Thouless theorem for the spin-projected N-nucleon state and describes the diagonalization of the total Hamiltonian in the subspace of linearly independent N-nucleon shell model configurations contained in the test determinant. Furthermore, a numerical method for the solution of these equations is proposed and an extension of the theory for the description of excited bands is given. Finally, the consistency of the equations is checked by solving them analytically for a simple example. (orig.)

  9. GENOMICS SYMPOSIUM: Using genomic approaches to uncover sources of variation in age at puberty and reproductive longevity in sows

    Science.gov (United States)

    Genetic variants associated with traits such as age at puberty and litter size could provide insight into the underlying genetic sources of variation impacting sow reproductive longevity and productivity. Genome-wide characterization and gene expression profiling were performed using gilts from the Univer...

  10. Characterization of apparently balanced chromosomal rearrangements from the developmental genome anatomy project.

    Science.gov (United States)

    Higgins, Anne W; Alkuraya, Fowzan S; Bosco, Amy F; Brown, Kerry K; Bruns, Gail A P; Donovan, Diana J; Eisenman, Robert; Fan, Yanli; Farra, Chantal G; Ferguson, Heather L; Gusella, James F; Harris, David J; Herrick, Steven R; Kelly, Chantal; Kim, Hyung-Goo; Kishikawa, Shotaro; Korf, Bruce R; Kulkarni, Shashikant; Lally, Eric; Leach, Natalia T; Lemyre, Emma; Lewis, Janine; Ligon, Azra H; Lu, Weining; Maas, Richard L; MacDonald, Marcy E; Moore, Steven D P; Peters, Roxanna E; Quade, Bradley J; Quintero-Rivera, Fabiola; Saadi, Irfan; Shen, Yiping; Shendure, Jay; Williamson, Robin E; Morton, Cynthia C

    2008-03-01

    Apparently balanced chromosomal rearrangements in individuals with major congenital anomalies represent natural experiments of gene disruption and dysregulation. These individuals can be studied to identify novel genes critical in human development and to annotate further the function of known genes. Identification and characterization of these genes is the goal of the Developmental Genome Anatomy Project (DGAP). DGAP is a multidisciplinary effort that leverages the recent advances resulting from the Human Genome Project to increase our understanding of birth defects and the process of human development. Clinically significant phenotypes of individuals enrolled in DGAP are varied and, in most cases, involve multiple organ systems. Study of these individuals' chromosomal rearrangements has resulted in the mapping of 77 breakpoints from 40 chromosomal rearrangements by FISH with BACs and fosmids, array CGH, Southern-blot hybridization, MLPA, RT-PCR, and suppression PCR. Eighteen chromosomal breakpoints have been cloned and sequenced. Unsuspected genomic imbalances and cryptic rearrangements were detected, but less frequently than has been reported previously. Chromosomal rearrangements, both balanced and unbalanced, in individuals with multiple congenital anomalies continue to be a valuable resource for gene discovery and annotation.

  11. Symmetry-projected variational approach to the one-dimensional Hubbard model

    International Nuclear Information System (INIS)

    Schmid, K.W.; Dahm, T.; Margueron, J.; Muether, H.

    2005-01-01

    We apply a variational method devised for the nuclear many-body problem to the one-dimensional Hubbard model with nearest neighbor hopping and periodic boundary conditions. For each state, the test wave function consists of a single Hartree-Fock determinant mixing all the sites (or momenta) as well as the spin projections of the electrons. Total spin and linear momentum are restored by projection methods before the variation. It is demonstrated that this approach reproduces the results of exact diagonalizations for half-filled N=12 and N=14 lattices rather well, not only for the energies and occupation numbers of the ground state but also for those of the lowest excited states. Furthermore, a system of ten electrons in an N=12 lattice is investigated and, finally, an N=30 lattice is studied. In addition to energies and occupation numbers we present the spectral functions computed with the help of the symmetry-projected wave functions as well

  12. Variation, Evolution, and Correlation Analysis of C+G Content and Genome or Chromosome Size in Different Kingdoms and Phyla

    Science.gov (United States)

    Li, Xiu-Qing; Du, Donglei

    2014-01-01

    C+G content (GC content or G+C content) is known to be correlated with genome/chromosome size in bacteria but the relationship for other kingdoms remains unclear. This study analyzed genome size, chromosome size, and base composition in most of the available sequenced genomes in various kingdoms. Genome size tends to increase during evolution in plants and animals, and the same is likely true for bacteria. The genomic C+G contents were found to vary greatly in microorganisms but were quite similar within each animal or plant subkingdom. In animals and plants, the C+G contents are ranked as follows: monocot plants>mammals>non-mammalian animals>dicot plants. The variation in C+G content between chromosomes within species is greater in animals than in plants. The correlation between average chromosome C+G content and chromosome length was found to be positive in Proteobacteria, Actinobacteria (but not in other analyzed bacterial phyla), Ascomycota fungi, and likely also in some plants; negative in some animals, insignificant in two protist phyla, and likely very weak in Archaea. Clearly, correlations between C+G content and chromosome size can be positive, negative, or not significant depending on the kingdoms/groups or species. Different phyla or species exhibit different patterns of correlation between chromosome-size and C+G content. Most chromosomes within a species have a similar pattern of variation in C+G content but outliers are common. The data presented in this study suggest that the C+G content is under genetic control by both trans- and cis- factors and that the correlation between C+G content and chromosome length can be positive, negative, or not significant in different phyla. PMID:24551092
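    The kind of analysis described in this abstract can be illustrated with a minimal sketch, which is not the authors' pipeline: it computes the C+G fraction of each chromosome sequence and the Pearson correlation between C+G content and chromosome length. The function names and the toy sequences below are hypothetical and chosen only for illustration; ambiguous bases such as N are ignored by assumption.

```python
from math import sqrt


def gc_content(seq: str) -> float:
    """Return the C+G fraction among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy "chromosomes" (made-up sequences, not real data).
chromosomes = {
    "chrA": "ATGCGC",
    "chrB": "ATATATGC",
    "chrC": "GCGCGCGCGCAT",
}
lengths = [len(s) for s in chromosomes.values()]
gc_fractions = [gc_content(s) for s in chromosomes.values()]
r = pearson_r(lengths, gc_fractions)
print(f"Pearson r between chromosome length and C+G content: {r:.3f}")
```

    With real genomes the sequences would be read from assembly files, and the sign of r would be expected to vary by phylum, as the abstract reports.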

  13. Chromosomal Copy Number Variation in Saccharomyces pastorianus Is Evidence for Extensive Genome Dynamics in Industrial Lager Brewing Strains.

    Science.gov (United States)

    van den Broek, M; Bolat, I; Nijkamp, J F; Ramos, E; Luttik, M A H; Koopman, F; Geertman, J M; de Ridder, D; Pronk, J T; Daran, J-M

    2015-09-01

    Lager brewing strains of Saccharomyces pastorianus are natural interspecific hybrids originating from the spontaneous hybridization of Saccharomyces cerevisiae and Saccharomyces eubayanus. Over the past 500 years, S. pastorianus has been domesticated to become one of the most important industrial microorganisms. Production of lager-type beers requires a set of essential phenotypes, including the ability to ferment maltose and maltotriose at low temperature, the production of flavors and aromas, and the ability to flocculate. Understanding of the molecular basis of complex brewing-related phenotypic traits is a prerequisite for rational strain improvement. While genome sequences have been reported, the variability and dynamics of S. pastorianus genomes have not been investigated in detail. Here, using deep sequencing and chromosome copy number analysis, we showed that S. pastorianus strain CBS1483 exhibited extensive aneuploidy. This was confirmed by quantitative PCR and by flow cytometry. As a direct consequence of this aneuploidy, a massive number of sequence variants was identified, leading to at least 1,800 additional protein variants in S. pastorianus CBS1483. Analysis of eight additional S. pastorianus strains revealed that the previously defined group I strains showed comparable karyotypes, while group II strains showed large interstrain karyotypic variability. Comparison of three strains with nearly identical genome sequences revealed substantial chromosome copy number variation, which may contribute to strain-specific phenotypic traits. The observed variability of lager yeast genomes demonstrates that systematic linking of genotype to phenotype requires a three-dimensional genome analysis encompassing physical chromosomal structures, the copy number of individual chromosomes or chromosomal regions, and the allelic variation of copies of individual genes. Copyright © 2015, van den Broek et al.

  14. Spatial variation in the parasite communities and genomic structure of urban rats in New York City.

    Science.gov (United States)

    Angley, L P; Combs, M; Firth, C; Frye, M J; Lipkin, I; Richardson, J L; Munshi-South, J

    2018-02-01

    Brown rats (Rattus norvegicus) are a globally distributed pest. Urban habitats can support large infestations of rats, posing a potential risk to public health from the parasites and pathogens they carry. Despite the potential influence of rodent-borne zoonotic diseases on human health, it is unclear how urban habitats affect the structure and transmission dynamics of ectoparasite and microbial communities (all referred to as "parasites" hereafter) among rat colonies. In this study, we use ecological data on parasites and genomic sequencing of their rat hosts to examine associations between spatial proximity, genetic relatedness and the parasite communities associated with 133 rats at five sites in sections of New York City with persistent rat infestations. We build on previous work showing that rats in New York carry a wide variety of parasites and report that these communities differ significantly among sites, even across small geographical distances. Ectoparasite community similarity was positively associated with geographical proximity; however, there was no general association between distance and microbial communities of rats. Sites with greater overall parasite diversity also had rats with greater infection levels and parasite species richness. Parasite community similarity among sites was not linked to genetic relatedness of rats, suggesting that these communities are not associated with genetic similarity among host individuals or host dispersal among sites. Discriminant analysis identified site-specific associations of several parasite species, suggesting that the presence of some species within parasite communities may allow researchers to determine the sites of origin for newly sampled rats. The results of our study help clarify the roles that colony structure and geographical proximity play in determining the ecology of R. norvegicus as a significant urban reservoir of zoonotic diseases. Our study also highlights the spatial variation present in urban

  15. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Science.gov (United States)

    Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

    2015-01-01

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features, including the implementation of a four-level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402

  16. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, Tatiparthi B. K. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Thomas, Alex D. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Stamatis, Dimitri [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Bertsch, Jon [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Isbandi, Michelle [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Jansson, Jakob [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Mallajosyula, Jyothi [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Pagani, Ioanna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Lobos, Elizabeth A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)]; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); King Abdulaziz Univ., Jeddah (Saudi Arabia)]

    2014-10-27

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features, including the implementation of a four-level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.

  17. Ensembl variation resources

    Directory of Open Access Journals (Sweden)

    Marin-Garcia Pablo

    2010-05-01

    Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.

  18. Genetic basis for spontaneous hybrid genome doubling during allopolyploid speciation of common wheat shown by natural variation analyses of the paternal species.

    Directory of Open Access Journals (Sweden)

    Yoshihiro Matsuoka

    The complex process of allopolyploid speciation includes various mechanisms ranging from species crosses and hybrid genome doubling to genome alterations and the establishment of new allopolyploids as persisting natural entities. Currently, little is known about the genetic mechanisms that underlie hybrid genome doubling, despite the fact that natural allopolyploid formation is highly dependent on this phenomenon. We examined the genetic basis for the spontaneous genome doubling of triploid F1 hybrids between the direct ancestors of allohexaploid common wheat (Triticum aestivum L., AABBDD genome), namely Triticum turgidum L. (AABB genome) and Aegilops tauschii Coss. (DD genome). An Ae. tauschii intraspecific lineage that is closely related to the D genome of common wheat was identified by population-based analysis. Two representative accessions, one that produces a high-genome-doubling-frequency hybrid when crossed with a T. turgidum cultivar and the other that produces a low-genome-doubling-frequency hybrid with the same cultivar, were chosen from that lineage for further analyses. A series of investigations including fertility analysis, immunostaining, and quantitative trait locus (QTL) analysis showed that (1) production of functional unreduced gametes through nonreductional meiosis is an early step key to successful hybrid genome doubling, (2) first division restitution is one of the cytological mechanisms that cause meiotic nonreduction during the production of functional male unreduced gametes, and (3) six QTLs in the Ae. tauschii genome, most of which likely regulate nonreductional meiosis and its subsequent gamete production processes, are involved in hybrid genome doubling. Interlineage comparisons of Ae. tauschii's ability to cause hybrid genome doubling suggested an evolutionary model for the natural variation pattern of the trait in which non-deleterious mutations in six QTLs may have important roles. The findings of this study demonstrated

  19. Genomic Heterogeneity of Methicillin Resistant Staphylococcus aureus Associated with Variation in Severity of Illness among Children with Acute Hematogenous Osteomyelitis.

    Directory of Open Access Journals (Sweden)

    Claudia Gaviria-Agudelo

    Full Text Available The association between severity of illness of children with osteomyelitis caused by Methicillin-resistant Staphylococcus aureus (MRSA) and genomic variation of the causative organism has not been previously investigated. The purpose of this study is to assess genomic heterogeneity among MRSA isolates from children with osteomyelitis who have diverse severity of illness. Children with osteomyelitis were prospectively studied between 2010 and 2011. Severity of illness of the affected children was determined from clinical and laboratory parameters. MRSA isolates were analyzed with next generation sequencing (NGS) and optical mapping. Sequence data were used for multi-locus sequence typing (MLST), phylogenetic analysis by maximum likelihood (PAML), and identification of virulence genes and single nucleotide polymorphisms (SNP) relative to reference strains. The twelve children studied demonstrated severity of illness scores ranging from 0 (mild) to 9 (severe). All isolates were USA300, ST 8, SCCmec IVa MRSA by MLST. The isolates differed from reference strains by 2 insertions (40 Kb each) and 2 deletions (10 and 25 Kb) but had no rearrangements or copy number variations. There was a higher occurrence of virulence genes among study isolates when compared to the reference strains (p = 0.0124). There was an average of 11 nonsynonymous SNPs per strain. PAML demonstrated heterogeneity of study isolates from each other and from the reference strains. Genomic heterogeneity exists among MRSA isolates causing osteomyelitis among children in a single community. These variations may play a role in the pathogenesis of variation in clinical severity among these children.

  20. Cloning, production, and purification of proteins for a medium-scale structural genomics project.

    Science.gov (United States)

    Quevillon-Cheruel, Sophie; Collinet, Bruno; Trésaugues, Lionel; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Zhou, Cong-Zhao; Liger, Dominique; Bettache, Nabila; Poupon, Anne; Aboulfath, Ilham; Leulliot, Nicolas; Janin, Joël; van Tilbeurgh, Herman

    2007-01-01

    The South-Paris Yeast Structural Genomics Pilot Project (http://www.genomics.eu.org) aims at systematically expressing, purifying, and determining the three-dimensional structures of Saccharomyces cerevisiae proteins. We have already cloned 240 yeast open reading frames in the Escherichia coli pET system. Eighty-two percent of the targets can be expressed in E. coli, and 61% yield soluble protein. We have currently purified 58 proteins. Twelve X-ray structures have been solved, six are in progress, and six other proteins gave crystals. In this chapter, we present the general experimental flowchart applied for this project. One of the main difficulties encountered in this pilot project was the low solubility of a great number of target proteins. We have developed parallel strategies to recover these proteins from inclusion bodies, including refolding, coexpression with chaperones, and an in vitro expression system. A limited proteolysis protocol, developed to localize flexible regions in proteins that could hinder crystallization, is also described.

  1. Projection after variation in the finite-temperature Hartree-Fock-Bogoliubov approximation

    Science.gov (United States)

    Fanto, P.

    2017-11-01

    The finite-temperature Hartree-Fock-Bogoliubov (HFB) approximation often breaks symmetries of the underlying many-body Hamiltonian. Restricting the calculation of the HFB partition function to a subspace with good quantum numbers through projection after variation restores some of the correlations lost in breaking these symmetries, although effects of the broken symmetries such as sharp kinks at phase transitions remain. However, the most general projection after variation formula in the finite-temperature HFB approximation is limited by a sign ambiguity. Here, I extend the Pfaffian formula for the many-body traces of HFB density operators introduced by Robledo [L. M. Robledo, Phys. Rev. C. 79, 021302(R) (2009), 10.1103/PhysRevC.79.021302] to eliminate this sign ambiguity and evaluate the more complicated many-body traces required in projection after variation in the most general HFB case. The method is validated through a proof-of-principle calculation of the particle-number-projected HFB thermal energy in a simple model.

  2. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

    Science.gov (United States)

    Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

    2013-01-01

    Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

  3. Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: Evidence for differences and commonalities in size distributions and size restrictions

    NARCIS (Netherlands)

    M. Schaap (Michiel); R.J.L.F. Lemmers (Richard); R. Maassen (Roel); P.J. van der Vliet (Patrick); L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman); N. Basturk (Nalan); P. de Knijff (Peter); S.M. van der Maarel (Silvère)

    2013-01-01

    Background: Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and

  4. Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: evidence for differences and commonalities in size distributions and size restrictions

    NARCIS (Netherlands)

    Schaap, M.; Lemmers, R.J.L.F.; Maassen, R.; van der Vliet, P.J.; Hoogerheide, L.F.; van Dijk, H.K.; Basturk, N.; de Knijff, P.; van der Maarel, S.M.

    2013-01-01

    Background: Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and function is largely

  5. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  6. Dynamics of chromosome number and genome size variation in a cytogenetically variable sedge (Carex scoparia var. scoparia, Cyperaceae).

    Science.gov (United States)

    Chung, Kyong-Sook; Weber, Jaime A; Hipp, Andrew L

    2011-01-01

    High intraspecific cytogenetic variation in the sedge genus Carex (Cyperaceae) is hypothesized to be due to the "diffuse" or non-localized centromeres, which facilitate chromosome fission and fusion. If chromosome number changes are dominated by fission and fusion, then chromosome evolution will result primarily in changes in the potential for recombination among populations. Chromosome duplications, on the other hand, entail consequent opportunities for divergent evolution of paralogs. In this study, we evaluate whether genome size and chromosome number covary within species. We used flow cytometry to estimate genome sizes in Carex scoparia var. scoparia, sampling 99 plants (23 populations) in the Chicago region, and we used meiotic chromosome observations to document chromosome numbers and chromosome pairing relations. Chromosome numbers range from 2n = 62 to 2n = 68, and nuclear DNA 1C content from 0.342 to 0.361 pg DNA. Regressions of DNA content on chromosome number are nonsignificant for data analyzed by individual or population, and a regression model that excludes slope is favored over a model in which chromosome number predicts genome size. Chromosome rearrangements within cytogenetically variable Carex species are more likely a consequence of fission and fusion than of duplication and deletion. Moreover, neither genome size nor chromosome number is spatially autocorrelated, which suggests the potential for rapid chromosome evolution by fission and fusion at a relatively fine geographic scale (<350 km). These findings have important implications for ecological restoration and speciation within the largest angiosperm genus of the temperate zone.

  7. Convergence Properties of Projection and Contraction Methods for Variational Inequality Problems

    International Nuclear Information System (INIS)

    Xiu, N.; Wang, C.; Zhang, J.

    2001-01-01

    In this paper we develop the convergence theory of a general class of projection and contraction algorithms (PC method), where an extended stepsize rule is used, for solving variational inequality (VI) problems. It is shown that, by defining a scaled projection residue, the PC method forces the sequence of the residues to zero. It is also shown that, by defining a projected function, the PC method forces the sequence of projected functions to zero. A consequence of this result is that if the PC method converges to a nondegenerate solution of the VI problem, then after a finite number of iterations, the optimal face is identified. Finally, we study the local convergence behavior of the extragradient algorithm for solving the KKT system of the inequality-constrained VI problem.
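
    As a rough illustration of the kind of iteration this abstract discusses, the sketch below runs the classical fixed-step extragradient method on a tiny VI over a box, and monitors the projection residue, which vanishes exactly at a solution. It is not the paper's PC method or its extended stepsize rule. The map F, the box [0, 1]^2, the step size, and all function names are assumptions chosen for illustration.

```python
# Minimal sketch: fixed-step extragradient method for the VI
#   find x* in C such that F(x*) . (x - x*) >= 0 for all x in C,
# with C a box. F, the box, and the step size are illustrative assumptions.

def project_box(x, lo, hi):
    """Euclidean projection of x onto the box [lo, hi]^n."""
    return [min(max(xi, lo), hi) for xi in x]

def F(x):
    """Affine, strongly monotone map F(x) = M x + q (made-up M and q)."""
    M = [[2.0, 1.0], [-1.0, 2.0]]
    q = [-1.0, 1.0]
    return [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]

def residual(x, lo, hi):
    """Norm of the projection residue x - P_C(x - F(x))."""
    fx = F(x)
    p = project_box([xi - fi for xi, fi in zip(x, fx)], lo, hi)
    return sum((xi - pi) ** 2 for xi, pi in zip(x, p)) ** 0.5

def extragradient(x, lo=0.0, hi=1.0, step=0.2, tol=1e-10, max_iter=10_000):
    """Predictor-corrector projection steps until the residue is below tol."""
    for _ in range(max_iter):
        if residual(x, lo, hi) < tol:
            break
        fx = F(x)
        y = project_box([xi - step * fi for xi, fi in zip(x, fx)], lo, hi)  # predictor
        fy = F(y)
        x = project_box([xi - step * fi for xi, fi in zip(x, fy)], lo, hi)  # corrector
    return x

solution = extragradient([1.0, 1.0])
```

    For this made-up problem the second coordinate ends on the lower bound of the box, a small-scale picture of the "optimal face" being identified after finitely many iterations. The step 0.2 is below 1/L for this F (L = sqrt(5)), which is the usual sufficient condition for the fixed-step variant.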

  8. Consortium biology in immunology: the perspective from the Immunological Genome Project.

    Science.gov (United States)

    Benoist, Christophe; Lanier, Lewis; Merad, Miriam; Mathis, Diane

    2012-10-01

    Although the field has a long collaborative tradition, immunology has made less use than genetics of 'consortium biology', wherein groups of investigators together tackle large integrated questions or problems. However, immunology is naturally suited to large-scale integrative and systems-level approaches, owing to the multicellular and adaptive nature of the cells it encompasses. Here, we discuss the value and drawbacks of this organization of research, in the context of the long-running 'big science' debate, and consider the opportunities that may exist for the immunology community. We position this analysis in light of our own experience, both positive and negative, as participants of the Immunological Genome Project.

  9. ELSI Bibliography: Ethical legal and social implications of the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Yesley, M.S. [comp.]

    1993-11-01

    This second edition of the ELSI Bibliography provides a current and comprehensive resource for identifying publications on the major topics related to the ethical, legal and social issues (ELSI) of the Human Genome Project. Since the first edition of the ELSI Bibliography was printed last year, new publications and earlier ones identified by additional searching have doubled our computer database of ELSI publications to over 5600 entries. The second edition of the ELSI Bibliography reflects this growth of the underlying computer database. Researchers should note that an extensive collection of publications in the database is available for public use at the General Law Library of Los Alamos National Laboratory (LANL).

  10. Patterns of Genome-Wide Variation in Glossina fuscipes fuscipes Tsetse Flies from Uganda

    Directory of Open Access Journals (Sweden)

    Andrea Gloria-Soria

    2016-06-01

    Full Text Available The tsetse fly Glossina fuscipes fuscipes (Gff) is the insect vector of the two forms of Human African Trypanosomiasis (HAT) that exist in Uganda. Understanding Gff population dynamics and the underlying genetics of epidemiologically relevant phenotypes is key to reducing disease transmission. Using ddRAD sequence technology, complemented with whole-genome sequencing, we developed a panel of ∼73,000 single-nucleotide polymorphisms (SNPs) distributed across the Gff genome that can be used for population genomics and to perform genome-wide-association studies. We used these markers to estimate genomic patterns of linkage disequilibrium (LD) in Gff, and used the information, in combination with outlier-locus detection tests, to identify candidate regions of the genome under selection. LD in individual populations decays to half of its maximum value (r2max/2) between 1359 and 2429 bp. The overall LD estimated for the species reaches r2max/2 at 708 bp, an order of magnitude slower than in Drosophila. Using 53 infected (Trypanosoma spp.) and uninfected flies from four genetically distinct Ugandan populations adapted to different environmental conditions, we were able to identify SNPs associated with the infection status of the fly and local environmental adaptation. The extent of LD in Gff likely facilitated the detection of loci under selection, despite the small sample size. Furthermore, it is probable that LD in the regions identified is much higher than the average genomic LD due to strong selection. Our results show that even modest sample sizes can reveal significant genetic associations in this species, which has implications for future studies given the difficulties of collecting field specimens with contrasting phenotypes for association analysis.
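
    The half-decay distance quoted in this abstract (the distance at which LD falls to r2max/2) can be read off a binned decay curve. The sketch below is one simple way to do that by linear interpolation. It is not the authors' procedure, and the function name and the sample curve are made-up assumptions.

```python
# Minimal sketch: distance at which mean r^2 first drops to half its maximum.
# Assumes `points` is a list of (distance_bp, mean_r2) pairs sorted by distance
# and that r^2 decays roughly monotonically; the data below are invented.

def half_decay_distance(points):
    r2_max = max(r2 for _, r2 in points)
    target = r2_max / 2
    # Scan consecutive bins and interpolate linearly where the curve crosses target.
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 > target >= r1:
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None  # curve never reaches half of its maximum

curve = [(0, 0.8), (1000, 0.6), (2000, 0.2), (3000, 0.1)]
half_distance = half_decay_distance(curve)  # crossing lies between 1000 and 2000 bp
```

    In practice a fitted decay model is often used instead of raw bin means; the interpolation here only illustrates what the r2max/2 statistic measures.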

  11. Concepts, Operations, and Feasibility of a Projection-Based Variation Control System

    DEFF Research Database (Denmark)

    Stanciulescu, Stefan; Berger, Thorsten; Walkingshaw, Eric

    2016-01-01

    Highly configurable software often uses preprocessor annotations to handle variability. However, understanding, maintaining, and evolving code with such annotations is difficult, mainly because a developer has to work with all variants at a time. Dedicated methods and tools that allow working on a subset of all variants could ease the engineering of highly configurable software. We investigate the potential of one kind of such tools: projection-based variation control systems. For such systems we aim to understand: (i) what end-user operations they need to support, and (ii) whether they can realize the actual evolution of real-world, highly configurable software. We conduct an experiment that investigates variability-related evolution patterns and that evaluates the feasibility of a projection-based variation control system by replaying parts of the history of a highly configurable real...

  12. Importing statistical measures into Artemis enhances gene identification in the Leishmania genome project

    Directory of Open Access Journals (Sweden)

    McDonagh Paul D

    2003-06-01

    Full Text Available Background: Seattle Biomedical Research Institute (SBRI), as part of the Leishmania Genome Network (LGN), is sequencing chromosomes of the trypanosomatid protozoan species Leishmania major. At SBRI, chromosomal sequence is annotated using a combination of trained and untrained non-consensus gene-prediction algorithms with ARTEMIS, an annotation platform with rich and user-friendly interfaces. Results: Here we describe a methodology used to import results from three different protein-coding gene-prediction algorithms (GLIMMER, TESTCODE and GENESCAN) into the ARTEMIS sequence viewer and annotation tool. Comparison of these methods, along with the CODONUSAGE algorithm built into ARTEMIS, shows the importance of combining methods to more accurately annotate the L. major genomic sequence. Conclusion: An improvised and powerful tool for gene prediction has been developed by importing data from widely-used algorithms into an existing annotation platform. This approach is especially fruitful in the Leishmania genome project, where there is a large proportion of novel genes requiring manual annotation.

  13. Human Genome Project discoveries: Dialectics and rhetoric in the science of genetics

    Science.gov (United States)

    Robidoux, Charlotte A.

    The Human Genome Project (HGP), a $437 million effort that began in 1990 to chart the chemical sequence of our three billion base pairs of DNA, was completed in 2003, marking the 50th anniversary that proved the definitive structure of the molecule. This study considered how dialectical and rhetorical arguments functioned in the science, political, and public forums over a 20-year period, from 1980 to 2000, to advance human genome research and to establish the official project. I argue that Aristotle's continuum of knowledge--which ranges from the probable on one end to certified or demonstrated knowledge on the other--provides useful distinctions for analyzing scientific reasoning. While contemporary scientific research seeks to discover certified knowledge, investigators generally employ the hypothetico-deductive or scientific method, which often yields probable rather than certain findings, making these dialectical in nature. Analysis of the discourse describing human genome research revealed the use of numerous rhetorical figures and topics. Persuasive and probable reasoning were necessary for scientists to characterize unknown genetic phenomena, to secure interest in and funding for large-scale human genome research, to solve scientific problems, to issue probable findings, to convince colleagues and government officials that the findings were sound and to disseminate information to the public. Both government and private venture scientists drew on these tools of reasoning to promote their methods of mapping and sequencing the genome. The debate over how to carry out sequencing was rooted in conflicting values. Scientists representing the academic tradition valued a more conservative method that would establish high quality results, and those supporting private industry valued an unconventional approach that would yield products and profits more quickly. Values in turn influenced political and public forum arguments. 
Agency representatives and investors sided

  14. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

    Science.gov (United States)

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le 
Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-01-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip-based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P = 9.89 × 10⁻⁶), and rs7700147, an intergenic variant (P = 2.93 × 10⁻⁵). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802

  15. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation

    Directory of Open Access Journals (Sweden)

    Anubhav Jain

    2013-07-01

    Full Text Available Accelerating the discovery of advanced materials is essential for human welfare and sustainable, clean energy. In this paper, we introduce the Materials Project (www.materialsproject.org), a core program of the Materials Genome Initiative that uses high-throughput computing to uncover the properties of all known inorganic materials. This open dataset can be accessed through multiple channels for both interactive exploration and data mining. The Materials Project also seeks to create open-source platforms for developing robust, sophisticated materials analyses. Future efforts will enable users to perform "rapid-prototyping" of new materials in silico, and provide researchers with new avenues for cost-effective, data-driven materials design.

  16. Single-Nucleotide Variations in Cardiac Arrhythmias: Prospects for Genomics and Proteomics Based Biomarker Discovery and Diagnostics

    Directory of Open Access Journals (Sweden)

    Ayman Abunimer

    2014-03-01

    Full Text Available Cardiovascular diseases are a large contributor to causes of early death in developed countries. Some of these conditions, such as sudden cardiac death and atrial fibrillation, stem from arrhythmias, a spectrum of conditions with abnormal electrical activity in the heart. Genome-wide association studies can identify single nucleotide variations (SNVs) that may predispose individuals to developing acquired forms of arrhythmias. Through manual curation of published genome-wide association studies, we have collected a comprehensive list of 75 SNVs associated with cardiac arrhythmias. Ten of the SNVs result in amino acid changes and can be used in proteomic-based detection methods. In an effort to identify additional non-synonymous mutations that affect the proteome, we analyzed the post-translational modification S-nitrosylation, which is known to affect cardiac arrhythmias. We identified loss of seven known S-nitrosylation sites due to non-synonymous single nucleotide variations (nsSNVs). For predicted nitrosylation sites we found 1429 proteins where the sites are modified due to nsSNV. Analysis of the predicted S-nitrosylation dataset for over- or under-representation (compared to the complete human proteome) of pathways and functional elements shows significant statistical over-representation of the blood coagulation pathway. Gene Ontology (GO) analysis displays statistically over-represented terms related to muscle contraction, receptor activity, motor activity, cytoskeleton components, and microtubule activity. Through the genomic and proteomic context of SNVs and S-nitrosylation sites presented in this study, researchers can look for variation that can predispose individuals to cardiac arrhythmias. Such attempts to elucidate mechanisms of arrhythmia thereby add yet another useful parameter in predicting susceptibility for cardiac diseases.

  17. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    Science.gov (United States)

    Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-07-22

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains.
Thus, the most likely explanation for the observed patterns of

  18. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    Directory of Open Access Journals (Sweden)

    Sheri L Simmons

    2008-07-01

    Full Text Available Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. Thus, the most likely explanation for the

  19. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    International Nuclear Information System (INIS)

    Lakhssassi, K.; González-Recio, O.

    2017-01-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations, as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project, and second on incorporating them into the whole-genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering out variants with low minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for haplotype extraction with findhap.f90. The haploblocks contained on average 924 SNPs (166,552 bp). Unique haplotypes made up around 97% of haplotypes in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering, for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD), and 1 SD above the mean, of the haplotype effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture genetic variance additional to the polygenic effect for traits undergoing lower selection intensity, like fertility and health traits.
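
    The two filtering steps described in this abstract (dropping variants below a MAF cut-off, then keeping haplotypes whose estimated effect lies far from the mean) can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use findhap.f90; the function names, the 0/1/2 genotype coding and the symmetric reading of "mean plus/minus k SD" are assumptions made for illustration.

```python
from statistics import mean, pstdev


def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1 or 2) of diploid animals."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(snps, threshold=0.025):
    """Keep SNPs whose MAF is at least `threshold` (hypothetical helper)."""
    return {name: g for name, g in snps.items()
            if minor_allele_frequency(g) >= threshold}


def select_large_effects(effects, k):
    """Return haplotype ids whose effect is more than k SD from the mean.

    Assumption: the cut-off is applied symmetrically around the mean.
    """
    mu = mean(effects.values())
    sd = pstdev(effects.values())
    return {hap for hap, e in effects.items() if abs(e - mu) > k * sd}


if __name__ == "__main__":
    snps = {"snp_a": [0, 0, 0, 0], "snp_b": [0, 1, 2, 1], "snp_c": [2, 2, 2, 2]}
    print(sorted(filter_by_maf(snps)))  # only the polymorphic SNP remains

    effects = {"h1": 0.0, "h2": 0.1, "h3": -0.1, "h4": 5.0}
    print(select_large_effects(effects, k=1))
    print(select_large_effects(effects, k=3))
```

    In this toy data set the 1 SD cut-off retains one haplotype while the 3 SD cut-off retains none, which mirrors the abstract's remark that a 3 SD filter may capture too little variance.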

  20. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    Energy Technology Data Exchange (ETDEWEB)

    Lakhssassi, K.; González-Recio, O.

    2017-07-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations, as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses first on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project, and second on incorporating them into the whole-genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering out variants with low minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for haplotype extraction with findhap.f90. The haploblocks contained on average 924 SNPs (166,552 bp). Unique haplotypes made up around 97% of haplotypes in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering, for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD), and 1 SD above the mean, of the haplotype effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture genetic variance additional to the polygenic effect for traits undergoing lower selection intensity, like fertility and health traits.

  1. Population-Genomic Insights into Variation in Prevotella intermedia and Prevotella nigrescens Isolates and Its Association with Periodontal Disease

    Directory of Open Access Journals (Sweden)

    Yifei Zhang

    2017-09-01

    High-throughput sequencing has helped to reveal the close relationship between Prevotella and periodontal disease, but the roles of subspecies diversity and genomic variation within this genus in periodontal diseases still need to be investigated. We performed a comparative genome analysis of 48 Prevotella intermedia and Prevotella nigrescens isolates from the same cohort of subjects to identify the main drivers of their pathogenicity and adaptation to different environments. The comparisons were done between two species and between disease and health based on pooled sequences. The results showed that both P. intermedia and P. nigrescens have highly dynamic genomes and can take up various exogenous factors through horizontal gene transfer. The major differences between disease-derived and health-derived samples of P. intermedia and P. nigrescens were factors related to genome modification and recombination, indicating that the Prevotella isolates from disease sites may be more capable of genomic reconstruction. We also identified genetic elements specific to each sample, and found that disease groups had more unique virulence factors related to capsule and lipopolysaccharide synthesis, secretion systems, proteinases, and toxins, suggesting that strains from disease sites may have more specific virulence, particularly for P. intermedia. The differentially represented pathways between samples from disease and health were related to energy metabolism, carbohydrate and lipid metabolism, and amino acid metabolism, consistent with data from the whole subgingival microbiome in periodontal disease and health. Disease-derived samples had gained or lost several metabolic genes compared to healthy-derived samples, which could be linked with the difference in virulence performance between diseased and healthy sample groups. Our findings suggest that P. intermedia and P. nigrescens may serve as “crucial substances” in subgingival plaque, which may

  2. Population-Genomic Insights into Variation in Prevotella intermedia and Prevotella nigrescens Isolates and Its Association with Periodontal Disease.

    Science.gov (United States)

    Zhang, Yifei; Zhen, Min; Zhan, Yalin; Song, Yeqing; Zhang, Qian; Wang, Jinfeng

    2017-01-01

    High-throughput sequencing has helped to reveal the close relationship between Prevotella and periodontal disease, but the roles of subspecies diversity and genomic variation within this genus in periodontal diseases still need to be investigated. We performed a comparative genome analysis of 48 Prevotella intermedia and Prevotella nigrescens isolates from the same cohort of subjects to identify the main drivers of their pathogenicity and adaptation to different environments. The comparisons were done between two species and between disease and health based on pooled sequences. The results showed that both P. intermedia and P. nigrescens have highly dynamic genomes and can take up various exogenous factors through horizontal gene transfer. The major differences between disease-derived and health-derived samples of P. intermedia and P. nigrescens were factors related to genome modification and recombination, indicating that the Prevotella isolates from disease sites may be more capable of genomic reconstruction. We also identified genetic elements specific to each sample, and found that disease groups had more unique virulence factors related to capsule and lipopolysaccharide synthesis, secretion systems, proteinases, and toxins, suggesting that strains from disease sites may have more specific virulence, particularly for P. intermedia. The differentially represented pathways between samples from disease and health were related to energy metabolism, carbohydrate and lipid metabolism, and amino acid metabolism, consistent with data from the whole subgingival microbiome in periodontal disease and health. Disease-derived samples had gained or lost several metabolic genes compared to healthy-derived samples, which could be linked with the difference in virulence performance between diseased and healthy sample groups. Our findings suggest that P. intermedia and P. nigrescens may serve as "crucial substances" in subgingival plaque, which may reflect changes in

  3. High-confidence assessment of functional impact of human mitochondrial non-synonymous genome variations by APOGEE.

    Directory of Open Access Journals (Sweden)

    Stefano Castellana

    2017-06-01

    There are 24,189 possible non-synonymous amino acid changes potentially affecting human mitochondrial DNA. Only a tiny subset has been functionally evaluated with certainty so far, while the pathogenicity of the vast majority has only been assessed in silico by software predictors. Since these tools proved to be rather incongruent, we have designed and implemented APOGEE, a machine-learning algorithm that outperforms all existing prediction methods in estimating the harmfulness of mitochondrial non-synonymous genome variations. We provide a detailed description of the underlying algorithm, of the selected and manually curated training and test sets of variants, as well as of its classification ability.

  4. TcruziDB, an Integrated Database, and the WWW Information Server for the Trypanosoma cruzi Genome Project

    Directory of Open Access Journals (Sweden)

    Degrave Wim

    1997-01-01

    Data analysis, presentation and distribution are of utmost importance to a genome project. A public-domain software package, ACeDB, has been chosen as the common basis for parasite genome databases, and a first release of TcruziDB, the Trypanosoma cruzi genome database, is available by ftp from ftp://iris.dbbm.fiocruz.br/pub/genomedb/TcruziDB as well as versions of the software for different operating systems (ftp://iris.dbbm.fiocruz.br/pub/unixsoft/). Moreover, data originating from the project are available from the WWW server at http://www.dbbm.fiocruz.br. It contains biological and parasitological data on CL Brener, its karyotype, all available T. cruzi sequences from Genbank, data on the EST-sequencing project and on available libraries, a T. cruzi codon table and a listing of activities and participating groups in the genome project, as well as meeting reports. T. cruzi discussion lists (tcruzi-l@iris.dbbm.fiocruz.br and tcgenics@iris.dbbm.fiocruz.br) are being maintained for communication and to promote collaboration in the genome project.

  5. Ancestry variation and footprints of natural selection along the genome in Latin American populations.

    Science.gov (United States)

    Deng, Lian; Ruiz-Linares, Andrés; Xu, Shuhua; Wang, Sijia

    2016-02-18

    Latin American populations stem from the admixture of Europeans, Africans and Native Americans, which started over 400 years ago and lasted for several centuries. Extreme deviation from the genome-wide average in ancestry estimations at certain genomic locations could reflect recent natural selection. We evaluated the distribution of ancestry estimations using 678 genome-wide microsatellite markers in 249 individuals from 13 admixed populations across Latin America. We found significant deviations in ancestry estimations, including three locations deviating by more than 3.5 standard deviations from the genome-wide average: an excess of European ancestry at 1p36 and 14q32, and an excess of African ancestry at 6p22. Using simulations, we could show that at least the deviation at 6p22 was unlikely to result from genetic drift alone. By applying different linguistic groups as well as the most likely ancestral Native American populations as the ancestry, we showed that the choice of Native American ancestry could affect the local ancestry estimation. However, the signal at 6p22 consistently appeared in most of the analyses using various ancestral groups. This study provided important insights into recent natural selection in the context of the unique history of the New World and implications for disease mapping.

  6. Natural variation of histone modification and its impact on gene expression in the rat genome

    NARCIS (Netherlands)

    Rintisch, Carola; Heinig, Matthias; Bauerfeind, Anja; Schafer, Sebastian; Mieth, Christin; Patone, Giannino; Hummel, Oliver; Chen, Wei; Cook, Stuart; Cuppen, Edwin; Colomé-Tatché, Maria; Johannes, Frank; Jansen, Ritsert C; Neil, Helen; Werner, Michel; Pravenec, Michal; Vingron, Martin; Hubner, Norbert

    Histone modifications are epigenetic marks that play fundamental roles in many biological processes including the control of chromatin-mediated regulation of gene expression. Little is known about interindividual variability of histone modification levels across the genome and to what extent they

  7. Using an online genome resource to identify myostatin variation in U.S. sheep

    Science.gov (United States)

    We created a public, searchable DNA sequence resource for sheep that contained approximately 14x whole genome sequence of 96 rams. The animals represent 10 popular U.S. breeds and share minimal pedigree relationships, making the resource suitable for viewing gene variants in the user-friendly Integ...

  8. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods

    NARCIS (Netherlands)

    Heidaritabar, M.; Vereijken, A.; Muir, W.M.; Meuwissen, T.H.E.; Cheng, H.; Megens, H.J.W.C.; Groenen, M.; Bastiaansen, J.W.M.

    2014-01-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the

  9. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Background: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results: The Bacterial Isolate Genome Sequence Database (BIGSdb) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSdb are illustrated with the genera Neisseria and Streptococcus. The BIGSdb source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions: Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSdb

  10. Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes.

    Science.gov (United States)

    Sabir, Jamal; Schwarz, Erika; Ellison, Nicholas; Zhang, Jin; Baeshen, Nabih A; Mutwakil, Muhammed; Jansen, Robert; Ruhlman, Tracey

    2014-08-01

    Land plant plastid genomes (plastomes) provide a tractable model for evolutionary study in that they are relatively compact and gene dense. Among the groups that display an appropriate level of variation for structural features, the inverted-repeat-lacking clade (IRLC) of papilionoid legumes presents the potential to advance general understanding of the mechanisms of genomic evolution. Here, six complete plastome sequences from economically important species of the IRLC, a lineage previously represented by only five completed plastomes, are presented. A number of characters are compared across the IRLC including gene retention and divergence, synteny, repeat structure and functional gene transfer to the nucleus. The loss of clpP intron 2 was identified in one newly sequenced member of IRLC, Glycyrrhiza glabra. Using deeply sequenced nuclear transcriptomes from two species helped clarify the nature of the functional transfer of accD to the nucleus in Trifolium, which likely occurred in the lineage leading to subgenus Trifolium. Legumes are second only to cereal crops in agricultural importance based on area harvested and total production. Genetic improvement via plastid transformation of IRLC crop species is an appealing proposition. Comparative analyses of intergenic spacer regions emphasize the need for complete genome sequences for developing transformation vectors for plastid genetic engineering of legume crops. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  11. Single-Cell-Based Platform for Copy Number Variation Profiling through Digital Counting of Amplified Genomic DNA Fragments.

    Science.gov (United States)

    Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi

    2017-04-26

    We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate an mfA platform through multiple displacement amplification (MDA) chemistry. With the mfA platform, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with the mfA platform, which is impossible for conventional single-cell amplification methods.
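
    The idea of "digital counting" across fractions can be illustrated with a toy occupancy model in Python. This is not the segment-merge maximum likelihood algorithm of the paper, whose details the abstract does not give. The sketch assumes that the single strands of one fragment are distributed uniformly and independently over n fractions, and that a fraction is simply scored as positive or negative for that fragment; all function names are hypothetical.

```python
from math import comb


def surjections(m, k):
    """Ways to assign m distinct strands to k fractions so none stays empty."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** m for j in range(k + 1))


def occupancy_likelihood(m, k, n):
    """P(exactly k of n fractions are positive | m strands placed at random)."""
    return comb(n, k) * surjections(m, k) / n ** m


def ml_strand_count(k, n, max_strands=40):
    """Strand count in [k, max_strands] that maximises the likelihood.

    Assumption: max_strands is a cap chosen by the caller; when every
    fraction is positive (k == n) the estimate simply hits this cap.
    """
    if not 0 < k <= n or max_strands < k:
        raise ValueError("need 0 < k <= n and max_strands >= k")
    return max(range(k, max_strands + 1),
               key=lambda m: occupancy_likelihood(m, k, n))


if __name__ == "__main__":
    # 4 positive fractions out of 24: most likely 4 strands,
    # i.e. two double-stranded copies after denaturation.
    print(ml_strand_count(4, 24))
    # With few fractions, collisions matter: 3 positives out of 4.
    print(ml_strand_count(3, 4))
```

    Under these assumptions the estimate depends only on how many fractions contain the fragment, not on read depth within a fraction, which is the sense in which counting is less sensitive to amplification bias.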

  12. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods.

    Science.gov (United States)

    Heidaritabar, M; Vereijken, A; Muir, W M; Meuwissen, T; Cheng, H; Megens, H-J; Groenen, M A M; Bastiaansen, J W M

    2014-12-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the entire chicken genome, we compared the impact of GS and traditional BLUP (best linear unbiased prediction) selection methods applied side-by-side in three different lines of egg-laying chickens. Differences were demonstrated between methods, both at the level and genomic distribution of allele frequency changes. In all three lines, the average allele frequency changes were larger with GS, 0.056, 0.064 and 0.066, compared with BLUP, 0.044, 0.045 and 0.036 for lines B1, B2 and W1, respectively. With BLUP, 35 selected regions (empirical P selected regions were identified. Empirical thresholds for local allele frequency changes were determined from gene dropping, and differed considerably between GS (0.167-0.198) and BLUP (0.105-0.126). Between lines, the genomic regions with large changes in allele frequencies showed limited overlap. Our results show that GS applies selection pressure much more locally than BLUP, resulting in larger allele frequency changes. With these results, novel insights into the nature of selection on quantitative traits have been gained and important questions regarding the long-term impact of GS are raised. The rapid changes to a part of the genetic architecture, while another part may not be selected, at least in the short term, require careful consideration, especially when selection occurs before phenotypes are observed.
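
    Gene dropping, mentioned above as the source of the empirical thresholds, simulates neutral Mendelian transmission of founder alleles through a known pedigree to obtain the distribution of allele frequency change expected without selection. The Python sketch below shows the principle for a single biallelic locus. The pedigree layout, the founder allele frequency, the 95% quantile and the function names are illustrative assumptions, not details taken from the study, which worked with local (regional) allele frequency changes.

```python
import random


def gene_drop(pedigree, founder_freq, rng):
    """One replicate: drop alleles (0/1) through the pedigree.

    pedigree: list of (animal, sire, dam) with parents listed before
    offspring; founders have sire and dam set to None.
    """
    genotypes = {}
    for animal, sire, dam in pedigree:
        if sire is None and dam is None:
            genotypes[animal] = (int(rng.random() < founder_freq),
                                 int(rng.random() < founder_freq))
        else:
            # Each parent passes on one of its two alleles at random.
            genotypes[animal] = (rng.choice(genotypes[sire]),
                                 rng.choice(genotypes[dam]))
    return genotypes


def allele_freq(genotypes, animals):
    """Frequency of allele 1 among the listed animals."""
    return sum(sum(genotypes[a]) for a in animals) / (2 * len(animals))


def null_threshold(pedigree, founders, final_generation, founder_freq=0.5,
                   replicates=1000, quantile=0.95, seed=1):
    """Empirical quantile of |frequency change| under drift alone."""
    rng = random.Random(seed)
    changes = []
    for _ in range(replicates):
        g = gene_drop(pedigree, founder_freq, rng)
        changes.append(abs(allele_freq(g, final_generation)
                           - allele_freq(g, founders)))
    changes.sort()
    return changes[int(quantile * (replicates - 1))]


if __name__ == "__main__":
    pedigree = [("s1", None, None), ("d1", None, None),
                ("s2", None, None), ("d2", None, None),
                ("o1", "s1", "d1"), ("o2", "s2", "d2")]
    print(null_threshold(pedigree, ["s1", "d1", "s2", "d2"], ["o1", "o2"]))
```

    An observed allele frequency change larger than this threshold would be unlikely under drift alone in the simulated pedigree; a small pedigree such as the one above gives a wide null distribution and therefore a high threshold.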

  13. Analyzing the genomic variation of microbial cell factories in the era of “New Biotechnology”

    DEFF Research Database (Denmark)

    Herrgard, Markus; Panagiotou, Gianni

    2012-01-01

    The application of genome-scale technologies, both experimental and in silico, to industrial biotechnology has allowed improving the conversion of biomass-derived feedstocks to chemicals, materials and fuels through microbial fermentation. In particular, due to rapidly decreasing costs and its...... technologies for finding the underlying molecular mechanisms for (a) improved carbon source utilization, (b) increased product formation, and (c) stress tolerance. We also discuss the strengths and weaknesses of different strategies for mapping industrially relevant genotype-to-phenotype links including...

  14. Demographic history and biologically relevant genetic variation of Native Mexicans inferred from whole-genome sequencing

    OpenAIRE

    Romero-Hidalgo, Sandra; Ochoa-Leyva, Adrián; Garcíarrubio, Alejandro; Acuña-Alonzo, Victor; Antúnez-Argüelles, Erika; Balcazar-Quintero, Martha; Barquera-Lozano, Rodrigo; Carnevale, Alessandra; Cornejo-Granados, Fernanda; Fernández-López, Juan Carlos; García-Herrera, Rodrigo; García-Ortíz, Humberto; Granados-Silvestre, Ángeles; Granados, Julio; Guerrero-Romero, Fernando

    2017-01-01

    Understanding the genetic structure of Native American populations is important to clarify their diversity, demographic history, and to identify genetic factors relevant for biomedical traits. Here, we show a demographic history reconstruction from 12 Native American whole genomes belonging to six distinct ethnic groups representing the three main described genetic clusters of Mexico (Northern, Southern, and Maya). Effective population size estimates of all Native American groups remained bel...

  15. Understanding our genetic inheritance: The US Human Genome Project, The first five years FY 1991--1995

    Energy Technology Data Exchange (ETDEWEB)

    None

    1990-04-01

    The Human Genome Initiative is a worldwide research effort with the goal of analyzing the structure of human DNA and determining the location of the estimated 100,000 human genes. In parallel with this effort, the DNA of a set of model organisms will be studied to provide the comparative information necessary for understanding the functioning of the human genome. The information generated by the human genome project is expected to be the source book for biomedical science in the 21st century and will be of immense benefit to the field of medicine. It will help us to understand and eventually treat many of the more than 4000 genetic diseases that affect mankind, as well as the many multifactorial diseases in which genetic predisposition plays an important role. A centrally coordinated project focused on specific objectives is believed to be the most efficient and least expensive way of obtaining this information. The basic data produced will be collected in electronic databases that will make the information readily accessible in convenient form to all who need it. This report describes the plans for the U.S. human genome project and updates those originally prepared by the Office of Technology Assessment (OTA) and the National Research Council (NRC) in 1988. In the intervening two years, improvements in technology for almost every aspect of genomics research have taken place. As a result, more specific goals can now be set for the project.

  16. Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis.

    Science.gov (United States)

    Marzo, Mar; Bello, Xabier; Puig, Marta; Maside, Xulio; Ruiz, Alfredo

    2013-02-04

    Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome.

  17. Spectrum of mitochondrial genomic variation and associated clinical presentation of prostate cancer in South African men.

    Science.gov (United States)

    McCrow, John P; Petersen, Desiree C; Louw, Melanie; Chan, Eva K F; Harmeyer, Katherine; Vecchiarelli, Stefano; Lyons, Ruth J; Bornman, M S Riana; Hayes, Vanessa M

    2016-03-01

    Prostate cancer incidence and mortality rates are significantly increased in African-American men, but limited studies have been performed within Sub-Saharan African populations. As mitochondria control energy metabolism and apoptosis we speculate that somatic mutations within mitochondrial genomes are candidate drivers of aggressive prostate carcinogenesis. We used matched blood and prostate tissue samples from 87 South African men (77 with African ancestry) to perform deep sequencing of complete mitochondrial genomes. Clinical presentation was biased toward aggressive disease (Gleason score >7, 64%), and compared with men without prostate cancer either with or without benign prostatic hyperplasia. We identified 144 somatic mtDNA single nucleotide variants (SNVs), of which 80 were observed in 39 men presenting with aggressive disease. Both the number and frequency of somatic mtDNA SNVs were associated with higher pathological stage. Besides doubling the total number of somatic PCa-associated mitochondrial genome mutations identified to date, we associate mutational load with aggressive prostate cancer status in men of African ancestry. © 2015 The Authors. The Prostate published by Wiley Periodicals, Inc.

  18. O admirável Projeto Genoma Humano (The brave new Human Genome Project)

    Directory of Open Access Journals (Sweden)

    Marilena V. Corrêa

    2002-12-01

    This article presents an overview of the social, ethical, and legal implications of the Human Genome Project. The benefits of this mega-project, expressed as promises of a therapeutic revolution in medicine, will not be achieved without conflict. The process of technological innovation in genetics poses problems of various orders: on the one hand, consortium-based research, gene patenting, and genomic products point to commercial interests and to difficulties in managing the results of such research. These problems pose challenges in terms of possible inequality in access to the benefits of research. On the other hand, there is the issue of genetic information and the protection of individual data on risks of and susceptibilities to diseases and human attributes. The problem of defining men and women in terms of genetic traits carries a clear threat of discrimination, and becomes acute because of the genetic reductionism that the media helps to propagate. Answers to these problems cannot be expected from bioethics alone. The bioethical approach must be able to combine with political analyses of reproduction, sexuality, health, and medicine. Such a vast spectrum of problems cannot be discussed in depth in a single article. We chose to map them in order to emphasize the extent to which reflection on the genome project, genomics, and post-genomics faces the challenge of articulating such diverse aspects.

  19. The GenABEL Project for statistical genomics [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Lennart C. Karssen

    2016-05-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination.

  20. Human Genome Diversity Project. Summary of planning workshop 3(B): Ethical and human-rights implications

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1993-12-31

    The third planning workshop of the Human Genome Diversity Project was held on the campus of the US National Institutes of Health in Bethesda, Maryland, from February 16 through February 18, 1993. The second day of the workshop was devoted to an exploration of the ethical and human-rights implications of the Project. This open meeting centered on three roundtables, involving 12 invited participants, and the resulting discussions among all those present. Attendees and their affiliations are listed in the attached Appendix A. The discussion was guided by a schedule and list of possible issues, distributed to all present and attached as Appendix B. This is a relatively complete, and thus lengthy, summary of the comments at the meeting. The beginning of the summary sets out as conclusions some issues on which there appeared to be widespread agreement, but those conclusions are not intended to serve as a set of detailed recommendations. The meeting organizer is distributing his recommendations in a separate memorandum; recommendations from others who attended the meeting are welcome and will be distributed by the meeting organizer to the participants and to the Project committee.

  1. High-throughput crystal-optimization strategies in the South Paris Yeast Structural Genomics Project: one size fits all?

    Science.gov (United States)

    Leulliot, Nicolas; Trésaugues, Lionel; Bremang, Michael; Sorel, Isabelle; Ulryck, Nathalie; Graille, Marc; Aboulfath, Ilham; Poupon, Anne; Liger, Dominique; Quevillon-Cheruel, Sophie; Janin, Joël; van Tilbeurgh, Herman

    2005-06-01

    Crystallization has long been regarded as one of the major bottlenecks in high-throughput structural determination by X-ray crystallography. Structural genomics projects have addressed this issue by using robots to set up automated crystal screens using nanodrop technology. This has moved the bottleneck from obtaining the first crystal hit to obtaining diffraction-quality crystals, as crystal optimization is a notoriously slow process that is difficult to automate. This article describes the high-throughput optimization strategies used in the Yeast Structural Genomics project, with selected successful examples.

  2. BIOETHICS METHODS IN THE ETHICAL, LEGAL, AND SOCIAL IMPLICATIONS OF THE HUMAN GENOME PROJECT LITERATURE

    Science.gov (United States)

    Walker, Rebecca; Morrissey, Clair

    2013-01-01

    While bioethics as a field has concerned itself with methodological issues since the early years, there has been no systematic examination of how ethics is incorporated into research on the Ethical, Legal and Social Implications (ELSI) of the Human Genome Project. Yet ELSI research may bear a particular burden of investigating and substantiating its methods given public funding, an explicitly cross-disciplinary approach, and the perceived significance of adequate responsiveness to advances in genomics. We undertook a qualitative content analysis of a sample of ELSI publications appearing between 2003 and 2008 with the aim of better understanding the methods, aims, and approaches to ethics that ELSI researchers employ. We found that the aims of ethics within ELSI are largely prescriptive and address multiple groups. We also found that the bioethics methods used in the ELSI literature are both diverse between publications and multiple within publications, but are usually not themselves discussed or employed as suggested by bioethics method proponents. Ethics in ELSI is also sometimes undistinguished from related inquiries (such as social, legal, or political investigations). PMID:23796275

  3. The human genome project and novel aspects of cytochrome P450 research

    International Nuclear Information System (INIS)

    Ingelman-Sundberg, Magnus

    2005-01-01

    Currently, 57 active cytochrome P450 (CYP) genes and 58 pseudogenes are known to be present in the human genome. Among the genes discovered by initiatives in the human genome project are CYP2R1, CYP2W1, CYP2S1, CYP2U1 and CYP3A43, the latter apparently encoding a pseudoenzyme. The function, polymorphism and regulation of these genes are still to be discovered to a great extent. The polymorphism of drug-metabolizing CYPs is extensive and influences the outcome of drug therapy, causing lack of response or adverse drug reactions. The basis for the differences in the global distribution of the polymorphic variants is inactivating gene mutations and subsequent genetic drift. However, polymorphic alleles carrying multiple active gene copies also exist and are suggested, in the case of CYP2D6, to be caused by positive selection due to development of alkaloid resistance in North East Africa about 10,000-5,000 BC. The knowledge about the CYP genes and their polymorphisms is of fundamental importance for effective drug therapy and for drug development as well as for understanding metabolic activation of carcinogens and other xenobiotics. Here, a short review of the current knowledge is given.

  4. Comparative Genomics in Homo sapiens.

    Science.gov (United States)

    Oti, Martin; Sammeth, Michael

    2018-01-01

    Genomes can be compared at different levels of divergence, either between species or within species. Within species genomes can be compared between different subpopulations, such as human subpopulations from different continents. Investigating the genomic differences between different human subpopulations is important when studying complex diseases that are affected by many genetic variants, as the variants involved can differ between populations. The 1000 Genomes Project collected genome-scale variation data for 2504 human individuals from 26 different populations, enabling a systematic comparison of variation between human subpopulations. In this chapter, we present step-by-step a basic protocol for the identification of population-specific variants employing the 1000 Genomes data. These variants are subsequently further investigated for those that affect the proteome or RNA splice sites, to investigate potentially biologically relevant differences between the populations.
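
    The idea of a population-specific variant in the abstract above can be illustrated with a minimal sketch. It is not the chapter's protocol: the genotype table, the sample IDs, the variant IDs and both thresholds are made-up assumptions, and only the population labels (YRI, CEU) are borrowed from the 1000 Genomes naming. The sketch computes the alternate-allele frequency per population and keeps variants seen in exactly one population.

```python
from collections import defaultdict

# Hypothetical toy data: alternate-allele counts (0, 1 or 2) per diploid sample.
SAMPLE_POP = {"s1": "YRI", "s2": "YRI", "s3": "CEU", "s4": "CEU"}
GENOTYPES = {
    "v1": {"s1": 1, "s2": 0, "s3": 0, "s4": 0},
    "v2": {"s1": 1, "s2": 1, "s3": 1, "s4": 0},
    "v3": {"s1": 0, "s2": 0, "s3": 0, "s4": 0},
    "v4": {"s1": 0, "s2": 0, "s3": 2, "s4": 1},
}


def allele_frequencies(genotypes, sample_pop):
    """Return {variant: {population: alternate-allele frequency}}."""
    freqs = {}
    for variant, calls in genotypes.items():
        alt = defaultdict(int)
        total = defaultdict(int)
        for sample, count in calls.items():
            pop = sample_pop[sample]
            alt[pop] += count
            total[pop] += 2  # two chromosomes per diploid sample
        freqs[variant] = {pop: alt[pop] / total[pop] for pop in total}
    return freqs


def population_specific(freqs, min_in=0.05, max_out=0.0):
    """Map each variant to the single population that carries it.

    A variant qualifies when its frequency is >= min_in in exactly one
    population and <= max_out in every other population. Both thresholds
    are illustrative assumptions.
    """
    result = {}
    for variant, by_pop in freqs.items():
        carriers = [pop for pop, f in by_pop.items() if f >= min_in]
        if len(carriers) != 1:
            continue
        target = carriers[0]
        if all(f <= max_out for pop, f in by_pop.items() if pop != target):
            result[variant] = target
    return result


specific = population_specific(allele_frequencies(GENOTYPES, SAMPLE_POP))
```

    With the toy table, v1 is private to YRI and v4 to CEU, while v2 is shared and v3 is absent everywhere. Real data would be read from VCF files and would need far larger samples for the frequency thresholds to be meaningful.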

  5. Variational calculations in gauge theories with approximate projection on gauge invariant states

    International Nuclear Information System (INIS)

    Heinemann, C.; Martin, C.; Vautherin, D.; Iancu, E.

    1999-01-01

    Variational calculations using Gaussian wave functionals combined with an approximate projection on gauge invariant states are presented. The minimization with respect to the kernel and center of the Gaussian leads to a gap type equation which is free of the difficulties generally encountered with negative modes. We show that the divergences in the expectation value of the energy density are only logarithmic and can be removed by a renormalization of the coupling constant. The renormalized energy density has a minimum which corresponds to a vanishing background magnetic field. We obtain an estimate for the gluon condensate. (authors)

  6. A Domain Specific Embedded Language in C++ for Automatic Differentiation, Projection, Integration and Variational Formulations

    Directory of Open Access Journals (Sweden)

    Christophe Prud'homme

    2006-01-01

    Full Text Available In this article, we present a domain specific embedded language in C++ that can be used in various contexts such as numerical projection onto a functional space, numerical integration, variational formulations and automatic differentiation. Albeit these tools operate in different ways, the language overcomes this difficulty by decoupling expression constructions from evaluation. The language is implemented using expression templates and meta-programming techniques and uses various Boost libraries. The language is exercised on a number of non-trivial examples and a benchmark presents the performance behavior on a few test problems.
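
    The phrase "decoupling expression constructions from evaluation" in the abstract above can be sketched as follows. This is only an analogy in Python, not the article's C++ language or its API: the class names, the midpoint quadrature and the example function are all assumptions made for illustration. Operators build an expression tree, and separate routines later evaluate, differentiate or integrate that same tree.

```python
class Expr:
    """Base class: operators build a tree instead of computing a value."""

    def __add__(self, other):
        return Add(self, wrap(other))

    def __radd__(self, other):
        return Add(wrap(other), self)

    def __mul__(self, other):
        return Mul(self, wrap(other))

    def __rmul__(self, other):
        return Mul(wrap(other), self)


class Const(Expr):
    def __init__(self, value):
        self.value = value

    def eval(self, x):
        return self.value

    def diff(self):
        return Const(0.0)


class Var(Expr):
    def eval(self, x):
        return x

    def diff(self):
        return Const(1.0)


class Add(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, x):
        return self.left.eval(x) + self.right.eval(x)

    def diff(self):
        return Add(self.left.diff(), self.right.diff())


class Mul(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, x):
        return self.left.eval(x) * self.right.eval(x)

    def diff(self):
        # Product rule: (l * r)' = l' * r + l * r'
        return Add(Mul(self.left.diff(), self.right),
                   Mul(self.left, self.right.diff()))


def wrap(value):
    """Turn plain numbers into Const nodes; leave expressions untouched."""
    return value if isinstance(value, Expr) else Const(float(value))


def integrate(expr, a, b, n=1000):
    """Composite midpoint rule applied to an already-built expression."""
    h = (b - a) / n
    return h * sum(expr.eval(a + (i + 0.5) * h) for i in range(n))


x = Var()
f = x * x + 3 * x        # construction only: nothing is evaluated here
value = f.eval(2.0)      # evaluation of x^2 + 3x at x = 2
slope = f.diff().eval(2.0)   # derivative 2x + 3 at x = 2
area = integrate(f, 0.0, 1.0)  # approximates the integral over [0, 1]
```

    The same tree `f` serves three different consumers, which is the design point; a C++ expression-template implementation would resolve the tree at compile time rather than through run-time objects as done here.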

  7. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Cardoso, Joao; Andersen, Mikael Rørdam; Herrgard, Markus

    2015-01-01

    Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic... scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function...

  8. Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

    Science.gov (United States)

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

    2018-02-15

    Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.

  9. Genetic Architecture of Natural Variation in Rice Chlorophyll Content Revealed by a Genome-Wide Association Study.

    Science.gov (United States)

    Wang, Quanxiu; Xie, Weibo; Xing, Hongkun; Yan, Ju; Meng, Xiangzhou; Li, Xinglei; Fu, Xiangkui; Xu, Jiuyue; Lian, Xingming; Yu, Sibin; Xing, Yongzhong; Wang, Gongwei

    2015-06-01

    Chlorophyll content is one of the most important physiological traits as it is closely related to leaf photosynthesis and crop yield potential. So far, few genes have been reported to be involved in natural variation of chlorophyll content in rice (Oryza sativa) and the extent of variations explored is very limited. We conducted a genome-wide association study (GWAS) using a diverse worldwide collection of 529 O. sativa accessions. A total of 46 significant association loci were identified. Three F2 mapping populations with parents selected from the association panel were tested for validation of GWAS signals. We clearly demonstrated that Grain number, plant height, and heading date7 (Ghd7) was a major locus for natural variation of chlorophyll content at the heading stage by combining evidence from near-isogenic lines and transgenic plants. The enhanced expression of Ghd7 decreased the chlorophyll content, mainly through down-regulating the expression of genes involved in the biosynthesis of chlorophyll and chloroplast. In addition, Narrow leaf1 (NAL1) corresponded to one significant association region repeatedly detected over two years. We revealed a high degree of polymorphism in the 5' UTR and four non-synonymous SNPs in the coding region of NAL1, and observed diverse effects of the major haplotypes. The loci or candidate genes identified would help to fine-tune and optimize the antenna size of canopies in rice breeding. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.

  10. A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging

    Science.gov (United States)

    Logsdon, Benjamin A.; Carty, Cara L.; Reiner, Alexander P.; Dai, James Y.; Kooperberg, Charles

    2012-01-01

    Motivation: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. Results: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort. Availability: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html. Contact: blogsdon@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22563072

  11. Epigenetic Variation in Monozygotic Twins: A Genome-Wide Analysis of DNA Methylation in Buccal Cells

    NARCIS (Netherlands)

    van Dongen, J.; Ehli, E.A.; Slieker, R.C.; Bartels, M.; Weber, Z.M.; Davies, G.E.; Slagboom, P.E.; Heijmans, B.T.; Boomsma, D.I.

    2014-01-01

    DNA methylation is one of the most extensively studied epigenetic marks in humans. Yet, it is largely unknown what causes variation in DNA methylation between individuals. The comparison of DNA methylation profiles of monozygotic (MZ) twins offers a unique experimental design to examine the extent

  12. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

    NARCIS (Netherlands)

    Aflitos, S.A.; Schijlen, E.G.W.M.; Jong, de J.H.S.G.M.; Ridder, de D.; Smit, S.; Finkers, H.J.; Bakker, F.T.; Geest, van de H.C.; Lintel Hekkert, te B.; Haarst, van J.C.; Smits, L.W.M.; Koops, A.J.; Sanchez-Perez, M.J.; Heusden, van A.W.; Visser, R.G.F.; Schranz, M.E.; Peters, S.A.

    2014-01-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative for the Lycopersicon, Arcanum, Eriopersicon, and Neolycopersicon groups which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new

  13. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

    NARCIS (Netherlands)

    Aflitos, S.; Schijlen, E.; de Jong, H.; de Ridder, D.; Smit, S.; Finkers, R.; Wang, J.; Zhang, G.; Li, N.; Mao, L.; Bakker, F.; Dirks, R.; Breit, T.; Gravendeel, B.; Huits, H.; Struss, D.; Swanson-Wagner, R.; van Leeuwen, H.; van Ham, R.C.H.J.; Fito, L.; Guignier, L.; Sevilla, M.; Ellul, P.; Ganko, E.; Kapur, A.; Reclus, E.; de Geus, B.; van de Geest, H.; te Lintel Hekkert, B.; van Haarst, J.; Smits, L.; Koops, A.; Sanchez-Perez, G.; van Heusden, A.W.; Visser, R.; Quan, Z.; Min, J.; Liao, L.; Wang, X.; Wang, G.; Yue, Z.; Yang, X.; Xu, N.; Schranz, E.; Smets, E.; Vos, R.; Rauwerda, J.; Ursem, R.; Schuit, C.; Kerns, M.; van den Berg, J.; Vriezen, W.; Janssen, A.; Datema, E.; Jahrman, T.; Moquet, F.; Bonnet, J.; Peters, S.

    2014-01-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new

  14. Defining the role of common variation in the genomic and biological architecture of adult human height.

    Science.gov (United States)

    Wood, Andrew R; Esko, Tonu; Yang, Jian; Vedantam, Sailaja; Pers, Tune H; Gustafsson, Stefan; Chu, Audrey Y; Estrada, Karol; Luan, Jian'an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna A E; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Mateo Leach, Irene; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Arnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex S F; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; 
Grönberg, Henrik; de Groot, Lisette C P G M; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik K E; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor V A; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; 
Arveiler, Dominique; Bakker, Stephan J L; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John J P; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela A F; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, D C; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter E H; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; 
Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul I W; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin N A; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L; Lettre, Guillaume; Loos, Ruth J F; Weedon, Michael N; Ingelsson, Erik; O'Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E; Visscher, Peter M; Hirschhorn, Joel N; Frayling, Timothy M

    2014-11-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

  15. Defining the role of common variation in the genomic and biological architecture of adult human height

    Science.gov (United States)

    Chu, Audrey Y; Estrada, Karol; Luan, Jian’an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna AE; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Leach, Irene Mateo; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Ärnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex SF; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; Grönberg, Henrik; de Groot, Lisette C.P.G.M.; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; 
Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik KE; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor VA; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; Arveiler, Dominique; Bakker, Stephan JL; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, 
Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John JP; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela AF; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, DC; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter EH; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; 
Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul IW; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J.; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin NA; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S.; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L.; Lettre, Guillaume; Loos, Ruth JF; Weedon, Michael N; Ingelsson, Erik; O’Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explain one-fifth of heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured the majority (60%) of heritability. The 697 variants clustered in 423 loci enriched for genes, pathways, and tissue-types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/beta-catenin, and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants. PMID:25282103

  16. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  17. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  18. Strong convergence with a modified iterative projection method for hierarchical fixed point problems and variational inequalities

    Directory of Open Access Journals (Sweden)

    Ibrahim Karahan

    2016-04-01

    Let C be a nonempty closed convex subset of a real Hilbert space H. Let {T_{n}}: C → H be a sequence of nearly nonexpansive mappings such that F := ∩_{i=1}^{∞} F(T_{i}) ≠ ∅. Let V: C → H be a Lipschitzian mapping and F: C → H be an L-Lipschitzian and strongly monotone operator. This paper deals with a modified iterative projection method for approximating a solution of the hierarchical fixed point problem. It is shown that under certain appropriate assumptions on the operators and parameters, the modified iterative sequence {x_{n}} converges strongly to x^{*} ∈ F, which is also the unique solution of a variational inequality over F. As a special case, this projection method can be used to find the minimum norm solution of that variational inequality; namely, the unique solution x^{*} to the quadratic minimization problem x^{*} = argmin_{x ∈ F} ‖x‖². The results here improve and extend some recent corresponding results of other authors.

  19. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

    DEFF Research Database (Denmark)

    Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya

    2007-01-01

    We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses...

  20. Metabolome-genome-wide association study dissects genetic architecture for generating natural variation in rice secondary metabolism

    Science.gov (United States)

    Matsuda, Fumio; Nakabayashi, Ryo; Yang, Zhigang; Okazaki, Yozo; Yonemaru, Jun-ichi; Ebana, Kaworu; Yano, Masahiro; Saito, Kazuki

    2015-01-01

    Plants produce structurally diverse secondary (specialized) metabolites to increase their fitness for survival under adverse environments. Several bioactive compounds for new drugs have been identified through screening of plant extracts. In this study, genome-wide association studies (GWAS) were conducted to investigate the genetic architecture behind the natural variation of rice secondary metabolites. GWAS using the metabolome data of 175 rice accessions successfully identified 323 associations among 143 single nucleotide polymorphisms (SNPs) and 89 metabolites. The data analysis highlighted that levels of many metabolites are tightly associated with a small number of strong quantitative trait loci (QTLs). The tight association may be a mechanism generating strains with distinct metabolic composition through the crossing of two different strains. The results indicate that one plant species produces more diverse phytochemicals than previously expected, and plants still contain many useful compounds for human applications. PMID:25267402

  1. Bayesian Nonparametric Hidden Markov Models with application to the analysis of copy-number-variation in mammalian genomes.

    Science.gov (United States)

    Yau, C; Papaspiliopoulos, O; Roberts, G O; Holmes, C

    2011-01-01

    We consider the development of Bayesian Nonparametric methods for product partition models such as Hidden Markov Models and change point models. Our approach uses a Mixture of Dirichlet Process (MDP) model for the unknown sampling distribution (likelihood) for the observations arising in each state and a computationally efficient data augmentation scheme to aid inference. The method uses novel MCMC methodology which combines recent retrospective sampling methods with the use of slice sampler variables. The methodology is computationally efficient, both in terms of MCMC mixing properties, and robustness to the length of the time series being investigated. Moreover, the method is easy to implement requiring little or no user-interaction. We apply our methodology to the analysis of genomic copy number variation.

  2. Rare Genome-Wide Copy Number Variation and Expression of Schizophrenia in 22q11.2 Deletion Syndrome.

    Science.gov (United States)

    Bassett, Anne S; Lowther, Chelsea; Merico, Daniele; Costain, Gregory; Chow, Eva W C; van Amelsvoort, Therese; McDonald-McGinn, Donna; Gur, Raquel E; Swillen, Ann; Van den Bree, Marianne; Murphy, Kieran; Gothelf, Doron; Bearden, Carrie E; Eliez, Stephan; Kates, Wendy; Philip, Nicole; Sashi, Vandana; Campbell, Linda; Vorstman, Jacob; Cubells, Joseph; Repetto, Gabriela M; Simon, Tony; Boot, Erik; Heung, Tracy; Evers, Rens; Vingerhoets, Claudia; van Duin, Esther; Zackai, Elaine; Vergaelen, Elfi; Devriendt, Koen; Vermeesch, Joris R; Owen, Michael; Murphy, Clodagh; Michaelovosky, Elena; Kushan, Leila; Schneider, Maude; Fremont, Wanda; Busa, Tiffany; Hooper, Stephen; McCabe, Kathryn; Duijff, Sasja; Isaev, Karin; Pellecchia, Giovanna; Wei, John; Gazzellone, Matthew J; Scherer, Stephen W; Emanuel, Beverly S; Guo, Tingwei; Morrow, Bernice E; Marshall, Christian R

    2017-11-01

    Chromosome 22q11.2 deletion syndrome (22q11.2DS) is associated with a more than 20-fold increased risk for developing schizophrenia. The aim of this study was to identify additional genetic factors (i.e., "second hits") that may contribute to schizophrenia expression. Through an international consortium, the authors obtained DNA samples from 329 psychiatrically phenotyped subjects with 22q11.2DS. Using a high-resolution microarray platform and established methods to assess copy number variation (CNV), the authors compared the genome-wide burden of rare autosomal CNV, outside of the 22q11.2 deletion region, between two groups: a schizophrenia group and those with no psychotic disorder at age ≥25 years. The authors assessed whether genes overlapped by rare CNVs were overrepresented in functional pathways relevant to schizophrenia. Rare CNVs overlapping one or more protein-coding genes revealed significant between-group differences. For rare exonic duplications, six of 19 gene sets tested were enriched in the schizophrenia group; genes associated with abnormal nervous system phenotypes remained significant in a stepwise logistic regression model and showed significant interactions with 22q11.2 deletion region genes in a connectivity analysis. For rare exonic deletions, the schizophrenia group had, on average, more genes overlapped. The additional rare CNVs implicated known (e.g., GRM7, 15q13.3, 16p12.2) and novel schizophrenia risk genes and loci. The results suggest that additional rare CNVs overlapping genes outside of the 22q11.2 deletion region contribute to schizophrenia risk in 22q11.2DS, supporting a multigenic hypothesis for schizophrenia. The findings have implications for understanding expression of psychotic illness and herald the importance of whole-genome sequencing to appreciate the overall genomic architecture of schizophrenia.

  3. Simultaneous inference of selection and population growth from patterns of variation in the human genome

    DEFF Research Database (Denmark)

    Williamson, Scott H.; Hernandez, Ryan; Fledel-Alon, Adi

    2005-01-01

    Natural selection and demographic forces can have similar effects on patterns of DNA polymorphism. Therefore, to infer selection from samples of DNA sequences, one must simultaneously account for demographic effects. Here we take a model-based approach to this problem by developing predictions for patterns of polymorphism in the presence of both population size change and natural selection. If data are available from different functional classes of variation, and a priori information suggests that mutations in one of those classes are selectively neutral, then the putatively neutral class can ... this method to a large polymorphism data set from 301 human genes and find (i) widespread negative selection acting on standing nonsynonymous variation, (ii) that the fitness effects of nonsynonymous mutations are well predicted by several measures of amino acid exchangeability, especially site-specific methods, and (iii) strong evidence for very recent population growth.

  4. Genomic and proteomic analysis of soybean heritable variations induced by space flight

    Institute of Scientific and Technical Information of China (English)

    HE Jie; GAO Yong; SUN Ye-qing

    2009-01-01

    To analyze the biological effects of the space environment, the genomic DNA diversity between the space-flight soybean 194(4126), which shows a phenotype of good yield and good fruit quality induced by space flight, and a ground-control soybean was studied by the amplified fragment length polymorphism (AFLP) method; the polymorphism of space-flight soybean 194(4126) was 3.56%. The differences in protein expression in seeds and leaves between the two kinds of soybeans were analysed by two-dimensional electrophoresis, PDQuest software and MALDI-TOF mass spectrometry. The results show that the loss and decrease of protein expression in 194(4126) soybean are attributable to the space flight of seeds, and three special proteins, including Dehydrin, MAT1 and ceQORH, were identified. It is concluded that the space environment changes the phenotype and genotype of soybeans through the space flight of seeds.

  5. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Hua Dong

    2015-12-01

    Interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process, and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We report human genome copy number changes and frequent transcriptome variations (e.g. TP53 and CTNNB1 mutations, and especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including the basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and is almost exclusive to HBV-MLL4 integration.

  6. Variation in the complex carbohydrate biosynthesis loci of Acinetobacter baumannii genomes.

    Directory of Open Access Journals (Sweden)

    Johanna J Kenyon

    Extracellular polysaccharides are major immunogenic components of the bacterial cell envelope. However, little is known about their biosynthesis in the genus Acinetobacter, which includes A. baumannii, an important nosocomial pathogen. Whether Acinetobacter sp. produce a capsule or a lipopolysaccharide carrying an O antigen or both is not resolved. To explore these issues, genes involved in the synthesis of complex polysaccharides were located in 10 complete A. baumannii genome sequences, and the function of each of their products was predicted via comparison to enzymes with a known function. The absence of a gene encoding a WaaL ligase, required to link the carbohydrate polymer to the lipid A-core oligosaccharide (lipooligosaccharide) forming lipopolysaccharide, suggests that only a capsule is produced. Nine distinct arrangements of a large capsule biosynthesis locus, designated KL1 to KL9, were found in the genomes. Three forms of a second, smaller variable locus, likely to be required for synthesis of the outer core of the lipid A-core moiety, were designated OCL1 to OCL3 and also annotated. Each K locus includes genes for capsule export as well as genes for synthesis of activated sugar precursors, and for glycosyltransfer, glycan modification and oligosaccharide repeat-unit processing. The K loci all include the export genes at one end and genes for synthesis of common sugar precursors at the other, with a highly variable region that includes the remaining genes in between. Five different capsule loci, KL2, KL6, KL7, KL8 and KL9, were detected in multiply antibiotic resistant isolates belonging to global clone 2, and two other loci, KL1 and KL4, in global clone 1. This indicates that this region is being substituted repeatedly in multiply antibiotic resistant isolates from these clones.

  7. Rhinovirus genome variation during chronic upper and lower respiratory tract infections.

    Directory of Open Access Journals (Sweden)

    Caroline Tapparel

    Routine screening of lung transplant recipients and hospital patients for respiratory virus infections allowed the identification of human rhinovirus (HRV) in the upper and lower respiratory tracts, including in immunocompromised hosts chronically infected with the same strain over weeks or months. Phylogenetic analysis of 144 HRV-positive samples showed no apparent correlation between a given viral genotype or species and its ability to invade the lower respiratory tract or lead to protracted infection. By contrast, protracted infections were found almost exclusively in immunocompromised patients, thus suggesting that host factors, in particular the immune response, rather than the virus genotype modulate disease outcome. Complete genome sequencing of five chronic cases to study rhinovirus genome adaptation showed that the calculated mutation frequency was in the range observed during acute human infections. Analysis of mutation hot spot regions between specimens collected at different times or in different body sites revealed that non-synonymous changes were mostly concentrated in the viral capsid genes VP1, VP2 and VP3, independent of the HRV type. In an immunosuppressed lung transplant recipient infected with the same HRV strain for more than two years, both classical and ultra-deep sequencing of samples collected at different time points in the upper and lower respiratory tracts showed that these virus populations were phylogenetically indistinguishable over the course of infection, except for the last month. Specific signatures were found in the last two lower respiratory tract populations, including changes in the 5'UTR polypyrimidine tract and the VP2 immunogenic site 2. These results highlight for the first time the ability of a given rhinovirus to evolve in the course of a natural infection in immunocompromised patients and complement data obtained from previous experimental inoculation studies in immunocompetent volunteers.

  8. Sequencing and characterizing the genome of Estrella lausannensis as an undergraduate project: training students and biological insights

    Directory of Open Access Journals (Sweden)

    Claire Bertelli

    2015-02-01

    With the widespread availability of high-throughput sequencing technologies, sequencing projects have become pervasive in the molecular life sciences. The huge bulk of data generated daily must be analyzed further by biologists with skills in bioinformatics and by embedded bioinformaticians, i.e., bioinformaticians integrated in wet lab research groups. Thus, students interested in molecular life sciences must be trained in the main steps of genomics: sequencing, assembly, annotation and analysis. To reach that goal, a practical course has been set up for master students at the University of Lausanne: the "Sequence a genome" class. At the beginning of the academic year, a few bacterial species whose genome is unknown are provided to the students, who sequence and assemble the genome(s) and perform manual annotation. Here, we report the progress of the first class from September 2010 to June 2011 and the results obtained by seven master students who specifically assembled and annotated the genome of Estrella lausannensis, an obligate intracellular bacterium related to Chlamydia. The draft genome of Estrella is composed of 29 scaffolds encompassing 2,819,825 bp that encode 2,233 putative proteins. Estrella also possesses a 9,136 bp plasmid that encodes 14 genes, among which we found an integrase and a toxin/antitoxin module. Like all other members of the Chlamydiales order, Estrella possesses a highly conserved type III secretion system, considered a key virulence factor. The annotation of the Estrella genome also allowed the characterization of the metabolic abilities of this strictly intracellular bacterium. Altogether, the students provided the scientific community with the Estrella genome sequence and a preliminary understanding of the biology of this recently discovered bacterial genus, while learning to use cutting-edge technologies for sequencing and to perform bioinformatics analyses.

  9. Chromosome and genome size variation in Luzula (Juncaceae), a genus with holocentric chromosomes

    Czech Academy of Sciences Publication Activity Database

    Bozek, M.; Leitch, A. R.; Leitch, I. J.; Záveská Drábková, Lenka; Kuta, E.

    2012-01-01

    Vol. 170, No. 4 (2012), pp. 529-541 ISSN 0024-4074 R&D Projects: GA ČR GP206/07/P147 Institutional support: RVO:67985939 Keywords : chromosomal evolution * endopolyploidy * holokinetic chromosome * karyotype evolution * tetraploides * centromeres * TRNF intergenic spacer Subject RIV: EF - Botanics Impact factor: 2.589, year: 2012

  10. Trait variation and genetic diversity in a banana genomic selection training population

    Czech Academy of Sciences Publication Activity Database

    Nyine, Moses; Uwimana, B.; Swennen, R.; Batte, M.; Brown, A.; Christelová, Pavla; Hřibová, Eva; Lorenzen, J.; Doležel, Jaroslav

    2017-01-01

    Vol. 12, No. 6 (2017), article no. e0178734. E-ISSN 1932-6203 R&D Projects: GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : PLANTAIN MUSA * AAB GROUP * IMPROVEMENT Subject RIV: EB - Genetics ; Molecular Biology OECD field: Plant sciences, botany Impact factor: 2.806, year: 2016

  11. Geographical parthenogenesis, genome size variation and pollen production in the arctic-alpine species Hieracium alpinum

    Czech Academy of Sciences Publication Activity Database

    Mráz, P.; Chrtek, Jindřich; Šingliarová, B.

    2009-01-01

    Vol. 119, No. 1 (2009), pp. 41-51 ISSN 0253-1453 R&D Projects: GA ČR GA206/05/0657 Institutional research plan: CEZ:AV0Z60050516 Keywords : apomixis * Compositae * polyploidy Subject RIV: EF - Botanics Impact factor: 0.900, year: 2009

  12. Athlome Project Consortium: a concerted effort to discover genomic and other "omic" markers of athletic performance.

    Science.gov (United States)

    Pitsiladis, Yannis P; Tanaka, Masashi; Eynon, Nir; Bouchard, Claude; North, Kathryn N; Williams, Alun G; Collins, Malcolm; Moran, Colin N; Britton, Steven L; Fuku, Noriyuki; Ashley, Euan A; Klissouras, Vassilis; Lucia, Alejandro; Ahmetov, Ildus I; de Geus, Eco; Alsayrafi, Mohammed

    2016-03-01

    Despite numerous attempts to discover genetic variants associated with elite athletic performance, injury predisposition, and elite/world-class athletic status, there has been limited progress to date. Past reliance on candidate gene studies predominantly focusing on genotyping a limited number of single nucleotide polymorphisms or the insertion/deletion variants in small, often heterogeneous cohorts (i.e., made up of athletes of quite different sport specialties) have not generated the kind of results that could offer solid opportunities to bridge the gap between basic research in exercise sciences and deliverables in biomedicine. A retrospective view of genetic association studies with complex disease traits indicates that transition to hypothesis-free genome-wide approaches will be more fruitful. In studies of complex disease, it is well recognized that the magnitude of genetic association is often smaller than initially anticipated, and, as such, large sample sizes are required to identify the gene effects robustly. A symposium was held in Athens and on the Greek island of Santorini from 14-17 May 2015 to review the main findings in exercise genetics and genomics and to explore promising trends and possibilities. The symposium also offered a forum for the development of a position stand (the Santorini Declaration). Among the participants, many were involved in ongoing collaborative studies (e.g., ELITE, GAMES, Gene SMART, GENESIS, and POWERGENE). A consensus emerged among participants that it would be advantageous to bring together all current studies and those recently launched into one new large collaborative initiative, which was subsequently named the Athlome Project Consortium. Copyright © 2016 the American Physiological Society.

  13. Deep brain stimulation, brain maps and personalized medicine: lessons from the human genome project.

    Science.gov (United States)

    Fins, Joseph J; Shapiro, Zachary E

    2014-01-01

    Although the appellation of personalized medicine is generally attributed to advanced therapeutics in molecular medicine, deep brain stimulation (DBS) can also be so categorized. Like its medical counterpart, DBS is a highly personalized intervention that needs to be tailored to a patient's individual anatomy. And because of this, DBS, like more conventional personalized medicine, can be highly specific where the object of care is an N = 1. But that is where the similarities end. Besides their differing medical and surgical provenances, these two varieties of personalized medicine have had strikingly different impacts. The molecular variant, though of a more recent vintage, has thrived and is experiencing explosive growth, while DBS still struggles to find a sustainable therapeutic niche. Despite its promise, and success as a vetted treatment for drug-resistant Parkinson's Disease, DBS has lagged in broadening its development, often encountering regulatory hurdles and financial barriers to mounting an adequate number of quality trials. In this paper we will consider why DBS, or better yet neuromodulation, has encountered these challenges and contrast this experience with the more successful advance of personalized medicine. We will suggest that the differential performance of personalized medicine and DBS can be explained as a matter of timing and complexity. We believe that DBS has struggled because it has been a journey of scientific exploration conducted without a map. In contrast to molecular personalized medicine, which followed the mapping of the human genome and the Human Genome Project, DBS preceded plans for the mapping of the human brain. We believe that this sequence has given personalized medicine a distinct advantage and that the fullest potential of DBS will be realized both as a cartographical or electrophysiological probe and as a modality of personalized medicine.

  14. Total Variation-Based Reduction of Streak Artifacts, Ring Artifacts and Noise in 3D Reconstruction from Optical Projection Tomography

    Czech Academy of Sciences Publication Activity Database

    Michálek, Jan

    2015-01-01

    Vol. 21, No. 6 (2015), pp. 1602-1615 ISSN 1431-9276 R&D Projects: GA MŠk(CZ) LH13028; GA ČR(CZ) GA13-12412S Institutional support: RVO:67985823 Keywords : optical projection tomography * microscopy * artifacts * total variation * data mismatch Subject RIV: EA - Cell Biology Impact factor: 1.730, year: 2015

  15. Rhipicephalus microplus strain Deutsch, whole genome shotgun sequencing project Version 2

    Science.gov (United States)

    The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. Cot filtration/selection techniques were used ...

  16. A variational principle for computing nonequilibrium fluxes and potentials in genome-scale biochemical networks.

    Science.gov (United States)

    Fleming, R M T; Maes, C M; Saunders, M A; Ye, Y; Palsson, B Ø

    2012-01-07

    We derive a convex optimization problem on a steady-state nonequilibrium network of biochemical reactions, with the property that energy conservation and the second law of thermodynamics both hold at the problem solution. This suggests a new variational principle for biochemical networks that can be implemented in a computationally tractable manner. We derive the Lagrange dual of the optimization problem and use strong duality to demonstrate that a biochemical analogue of Tellegen's theorem holds at optimality. Each optimal flux is dependent on a free parameter that we relate to an elementary kinetic parameter when mass action kinetics is assumed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Genomics: The Science and Technology Behind the Human Genome Project (by Charles R. Cantor and Cassandra L. Smith)

    Science.gov (United States)

    Serra, Martin J. (reviewer)

    2000-01-01

    Genomics is one of the most rapidly expanding areas of science. This book is an outgrowth of a series of lectures given by one of the former heads (CRC) of the Human Genome Initiative. The book is designed to reach a wide audience, from biologists with little chemical or physical science background through engineers, computer scientists, and physicists with little current exposure to the chemical or biological principles of genetics. The text starts with a basic review of the chemical and biological properties of DNA. However, without either a biochemistry background or a supplemental biochemistry text, this chapter and much of the rest of the text would be difficult to digest. The second chapter is designed to put DNA into the context of the larger chromosomal unit. Specialized chromosomal structures and sequences (centromeres, telomeres) are introduced, leading to a section on chromosome organization and purification. The next 4 chapters cover the physical (hybridization, electrophoresis), chemical (polymerase chain reaction), and biological (genetic) techniques that provide the backbone of genomic analysis. These chapters cover in significant detail the fundamental principles underlying each technique and provide a firm background for the remainder of the text. Chapters 7-9 consider the need and methods for the development of physical maps. Chapter 7 primarily discusses chromosomal localization techniques, including in situ hybridization, FISH, and chromosome paintings. The next two chapters focus on the development of libraries and clones. In particular, Chapter 9 considers the limitations of current mapping and clone production. The current state and future of DNA sequencing is covered in the next three chapters. The first considers the current methods of DNA sequencing - especially gel-based methods of analysis, although other possible approaches (mass spectrometry) are introduced. Much of the chapter addresses the limitations of current methods, including

  18. Genome size variation in Orchidaceae subfamily Apostasioideae: filling the phylogenetic gap

    Czech Academy of Sciences Publication Activity Database

    Jersáková, Jana; Trávníček, Pavel; Kubátová, B.; Krejčíková, Jana; Urfus, Tomáš; Liu, Z.-J.; Lamb, A.; Ponert, J.; Schulte, K.; Čurn, V.; Vrána, Jan; Leitch, I. J.; Suda, Jan

    2013-01-01

    Vol. 172, No. 1 (2013), pp. 95-105 ISSN 0024-4074 R&D Projects: GA ČR GAP506/12/1320 Institutional support: RVO:67179843 ; RVO:67985939 ; RVO:61389030 Keywords : DNA base content * flow cytometry * nuclear C-value * phylogeny * orchids Subject RIV: EH - Ecology, Behaviour; EF - Botanics (BU-J); EF - Botanics (UEB-Q) Impact factor: 2.699, year: 2013

  19. Natural variation of histone modification and its impact on gene expression in the rat genome

    Czech Academy of Sciences Publication Activity Database

    Rintisch, C.; Heinig, M.; Bauerfeind, A.; Schafer, S.; Mieth, Ch.; Patone, G.; Hummel, O.; Chen, W.; Cook, S.; Cuppen, E.; Colomé-Tatché, M.; Johannes, F.; Jansen, R. C.; Neil, H.; Werner, M.; Pravenec, Michal; Vingron, M.; Hubner, N.

    2014-01-01

    Vol. 24, JUN (2014), pp. 942-953 ISSN 1088-9051 R&D Projects: GA MŠk(CZ) 7E10067; GA ČR(CZ) GAP301/10/0290; GA MŠk(CZ) LL1204 Institutional support: RVO:67985823 Keywords : ChIP-seq * histone modification * gene expression * genetic linkage analysis Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 14.630, year: 2014

  20. Aboriginal Australian mitochondrial genome variation - an increased understanding of population antiquity and diversity

    Science.gov (United States)

    Nagle, Nano; van Oven, Mannis; Wilcox, Stephen; van Holst Pellekaan, Sheila; Tyler-Smith, Chris; Xue, Yali; Ballantyne, Kaye N.; Wilcox, Leah; Papac, Luka; Cooke, Karen; van Oorschot, Roland A. H.; McAllister, Peter; Williams, Lesley; Kayser, Manfred; Mitchell, R. John; Adhikarla, Syama; Adler, Christina J.; Balanovska, Elena; Balanovsky, Oleg; Bertranpetit, Jaume; Clarke, Andrew C.; Comas, David; Cooper, Alan; der Sarkissian, Clio S. I.; Dulik, Matthew C.; Gaieski, Jill B.; Ganeshprasad, Arunkumar; Haak, Wolfgang; Haber, Marc; Hobbs, Angela; Javed, Asif; Jin, Li; Kaplan, Matthew E.; Li, Shilin; Martínez-Cruz, Begoña; Matisoo-Smith, Elizabeth A.; Melé, Marta; Merchant, Nirav C.; Owings, Amanda C.; Parida, Laxmi; Pitchappan, Ramasamy; Platt, Daniel E.; Quintana-Murci, Lluis; Renfrew, Colin; Royyuru, Ajay K.; Santhakumari, Arun Varatharajan; Santos, Fabrício R.; Schurr, Theodore G.; Soodyall, Himla; Soria Hernanz, David F.; Swamikrishnan, Pandikumar; Vilar, Miguel G.; Wells, R. Spencer; Zalloua, Pierre A.; Ziegle, Janet S.

    2017-03-01

    Aboriginal Australians represent one of the oldest continuous cultures outside Africa, with evidence indicating that their ancestors arrived in the ancient landmass of Sahul (present-day New Guinea and Australia) ~55 thousand years ago. Genetic studies, though limited, have demonstrated both the uniqueness and antiquity of Aboriginal Australian genomes. We have further resolved known Aboriginal Australian mitochondrial haplogroups and discovered novel indigenous lineages by sequencing the mitogenomes of 127 contemporary Aboriginal Australians. In particular, the more common haplogroups observed in our dataset included M42a, M42c, S, P5 and P12, followed by rarer haplogroups M15, M16, N13, O, P3, P6 and P8. We propose some major phylogenetic rearrangements, such as in haplogroup P where we delinked P4a and P4b and redefined them as P4 (New Guinean) and P11 (Australian), respectively. Haplogroup P2b was identified as a novel clade potentially restricted to Torres Strait Islanders. Nearly all Aboriginal Australian mitochondrial haplogroups detected appear to be ancient, with no evidence of later introgression during the Holocene. Our findings greatly increase knowledge about the geographic distribution and phylogenetic structure of mitochondrial lineages that have survived in contemporary descendants of Australia’s first settlers.

  1. Genome-Wide DNA Methylation Analysis and Epigenetic Variations Associated with Congenital Aortic Valve Stenosis (AVS).

    Directory of Open Access Journals (Sweden)

    Uppala Radhakrishna

    Congenital heart defect (CHD) is the most common cause of death from congenital anomaly. Among several candidate epigenetic mechanisms, DNA methylation may play an important role in the etiology of CHDs. We conducted a genome-wide DNA methylation analysis using an Illumina Infinium 450k human methylation assay in a cohort of 24 newborns who had aortic valve stenosis (AVS), with gestational-age matched controls. The study identified significantly altered CpG methylation at 59 sites in 52 genes in AVS subjects as compared to controls (either hypermethylated or demethylated). Gene Ontology analysis identified biological processes and functions for these genes including positive regulation of receptor-mediated endocytosis. Consistent with prior clinical data, the molecular function categories as determined using DAVID identified low-density lipoprotein receptor binding, lipoprotein receptor binding and identical protein binding to be over-represented in the AVS group. A significant epigenetic change in the APOA5 and PCSK9 genes known to be involved in AVS was also observed. A large number of CpG methylation sites individually demonstrated good to excellent diagnostic accuracy for the prediction of AVS status, thus raising the possibility of molecular screening markers for this disorder. Using epigenetic analysis we were able to identify genes significantly involved in the pathogenesis of AVS.

  2. Genome-Wide DNA Methylation Analysis and Epigenetic Variations Associated with Congenital Aortic Valve Stenosis (AVS).

    Science.gov (United States)

    Radhakrishna, Uppala; Albayrak, Samet; Alpay-Savasan, Zeynep; Zeb, Amna; Turkoglu, Onur; Sobolewski, Paul; Bahado-Singh, Ray O

    2016-01-01

    Congenital heart defect (CHD) is the most common cause of death from congenital anomaly. Among several candidate epigenetic mechanisms, DNA methylation may play an important role in the etiology of CHDs. We conducted a genome-wide DNA methylation analysis using an Illumina Infinium 450k human methylation assay in a cohort of 24 newborns who had aortic valve stenosis (AVS), with gestational-age matched controls. The study identified significantly altered CpG methylation at 59 sites in 52 genes in AVS subjects as compared to controls (either hypermethylated or demethylated). Gene Ontology analysis identified biological processes and functions for these genes including positive regulation of receptor-mediated endocytosis. Consistent with prior clinical data, the molecular function categories as determined using DAVID identified low-density lipoprotein receptor binding, lipoprotein receptor binding and identical protein binding to be over-represented in the AVS group. A significant epigenetic change in the APOA5 and PCSK9 genes known to be involved in AVS was also observed. A large number of CpG methylation sites individually demonstrated good to excellent diagnostic accuracy for the prediction of AVS status, thus raising the possibility of molecular screening markers for this disorder. Using epigenetic analysis we were able to identify genes significantly involved in the pathogenesis of AVS.

  3. Lawrence Livermore National Laboratory- Completing the Human Genome Project and Triggering Nearly $1 Trillion in U.S. Economic Activity

    Energy Technology Data Exchange (ETDEWEB)

    Stewart, Jeffrey S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-07-28

    The Human Genome Project has already generated nearly $1 trillion in U.S. economic activity. Lawrence Livermore National Laboratory (LLNL) was a co-leader in the Human Genome Project, one of the biggest biological research efforts in history. This ambitious research effort set out to sequence the approximately 3 billion nucleotides in the human genome, an effort many thought was nearly impossible. Deoxyribonucleic acid (DNA) was discovered in 1869, and by 1943 came the discovery that DNA was a molecule that encodes the genetic instructions used in the development and functioning of living organisms and many viruses. To make full use of the information, scientists first needed to sequence the billions of nucleotides so they could begin linking them to genetic traits and illnesses, and eventually to more effective treatments. New medical discoveries and improved agricultural productivity were some of the expected benefits. While the potential benefits were vast, the timeline (over a decade) and cost ($3.8 billion) exceeded what the private sector would normally attempt, especially when this would only be the first phase on the path to new discoveries and market opportunities. The Department of Energy believed its best research laboratories could meet this Grand Challenge and soon convinced the National Institutes of Health to formally propose the Human Genome Project to the federal government. The U.S. government accepted the risk and challenge to potentially create new healthcare and food discoveries that could benefit the world and U.S. industry.

  4. The role of OMICS research in understanding phenotype variation in thalassaemia: the THALAMOSS project

    Directory of Open Access Journals (Sweden)

    Roberto Gambari

    2014-12-01

    BCL11A contribute to high HbF production. Pharmacogenomic analysis of the effects of hydroxyurea (HU) on HbF production in a collection of β-thalassemia and sickle cell disease (SCD) patients allowed the identification of genomic signatures associated with high HbF. Therefore, it can be hypothesized that genomic studies might predict the response of patients to treatments based on hydroxyurea, which is at present the most used HbF inducer in pharmacological therapy of β-thalassaemia. Transcriptomic/proteomic studies allowed the identification of the zinc finger transcription factor B-cell lymphoma/leukemia 11A (BCL11A) as the major repressor of HbF expression. The field of research on γ-globin gene repressors (including BCL11A) is of top interest, since several approaches can lead to pharmacologically mediated inhibition of the expression of γ-globin gene repressors, leading to γ-globin gene activation. Among these strategies, we underline direct targeting of the transcription factors by aptamers or decoy molecules, as well as inhibition of the mRNA coding γ-globin gene repressors with shRNAs, antisense molecules, peptide nucleic acids (PNAs) and microRNAs. In this respect, the THALAMOSS FP7 Project (THALAssaemia MOdular Stratification System for personalized therapy of β-thalassemia, www.thalamoss.eu) aims to develop a universal set of markers and techniques for stratification of β-thalassaemia patients into treatment subgroups for (a) onset and frequency of blood transfusions, (b) choice of iron chelation, (c) induction of fetal hemoglobin, (d) prospective efficacy of gene-therapy. The impact of THALAMOSS is the provision of novel biomarkers for distinct treatment subgroups in β-thalassaemia (500–1000 samples from participating medical centres), identified by combined genomics, proteomics, transcriptomics and tissue culture assays, the development of new or improved products for the cell isolation, characterization and treatment of β-thalassaemia patients and the establishment of

  5. Insights into mechanisms of bacterial antigenic variation derived from the complete genome sequence of Anaplasma marginale.

    Science.gov (United States)

    Palmer, Guy H; Futse, James E; Knowles, Donald P; Brayton, Kelly A

    2006-10-01

    Persistence of Anaplasma spp. in the animal reservoir host is required for efficient tick-borne transmission of these pathogens to animals and humans. Using A. marginale infection of its natural reservoir host as a model, persistent infection has been shown to reflect sequential cycles in which antigenic variants emerge, replicate, and are controlled by the immune system. Variation in the immunodominant outer-membrane protein MSP2 is generated by a process of gene conversion, in which unique hypervariable region sequences (HVRs) located in pseudogenes are recombined into a single operon-linked msp2 expression site. Although organisms expressing whole HVRs derived from pseudogenes emerge early in infection, long-term persistent infection is dependent on the generation of complex mosaics in which segments from different HVRs recombine into the expression site. The resulting combinatorial diversity generates the number of variants both predicted and shown to emerge during persistence.

  6. Variation in genome-wide levels of meiotic recombination is established at the onset of prophase in mammalian males.

    Directory of Open Access Journals (Sweden)

    Brian Baier

    2014-01-01

    Segregation of chromosomes during the first meiotic division relies on crossovers established during prophase. Although crossovers are strictly regulated so that at least one occurs per chromosome, individual variation in crossover levels is not uncommon. In an analysis of different inbred strains of male mice, we identified among-strain variation in the number of foci for the crossover-associated protein MLH1. We report studies of strains with "low" (CAST/EiJ), "medium" (C3H/HeJ), and "high" (C57BL/6J) genome-wide MLH1 values to define factors responsible for this variation. We utilized immunofluorescence to analyze the number and distribution of proteins that function at different stages in the recombination pathway: RAD51 and DMC1, strand invasion proteins acting shortly after double-strand break (DSB) formation; MSH4, part of the complex stabilizing double Holliday junctions; and the Bloom helicase BLM, thought to have anti-crossover activity. For each protein, we identified strain-specific differences that mirrored the results for MLH1; i.e., CAST/EiJ mice had the lowest values, C3H/HeJ mice intermediate values, and C57BL/6J mice the highest values. This indicates that differences in the numbers of DSBs (as identified by RAD51 and DMC1) are translated into differences in the number of crossovers, suggesting that variation in crossover levels is established by the time of DSB formation. However, DSBs per se are unlikely to be the primary determinant, since allelic variation for the DSB-inducing locus Spo11 resulted in differences in the numbers of DSBs but not the number of MLH1 foci. Instead, chromatin conformation appears to be a more important contributor, since analysis of synaptonemal complex length and DNA loop size also identified consistent strain-specific differences; i.e., crossover frequency increased with synaptonemal complex length and was inversely related to chromatin loop size. This indicates a relationship between recombination

  7. Segmentation of teeth in CT volumetric dataset by panoramic projection and variational level set

    International Nuclear Information System (INIS)

    Hosntalab, Mohammad; Aghaeizadeh Zoroofi, Reza; Abbaspour Tehrani-Fard, Ali; Shirani, Gholamreza

    2008-01-01

    Quantification of teeth is of clinical importance for various computer-assisted procedures such as dental implants, orthodontic planning, and face, jaw and cosmetic surgeries. In this regard, segmentation is a major step. In this paper, we propose a method for segmentation of teeth in volumetric computed tomography (CT) data using panoramic re-sampling of the dataset in the coronal view and a variational level set. The proposed method consists of five steps as follows: first, we extract a mask in the CT images using Otsu thresholding. Second, the teeth are segmented from other bony tissues by utilizing anatomical knowledge of teeth in the jaws. Third, the proposed method is followed by estimating the arc of the upper and lower jaws and panoramic re-sampling of the dataset. Separation of upper and lower jaws and initial segmentation of teeth are performed by employing the horizontal and vertical projections of the panoramic dataset, respectively. Based on the above-mentioned procedures, an initial mask for each tooth is obtained. Finally, we utilize the initial mask of teeth and apply a variational level set to refine initial teeth boundaries to final contours. The proposed algorithm was evaluated on 30 multi-slice CT datasets including 3,600 images. Experimental results reveal the effectiveness of the proposed method. In the proposed algorithm, the variational level set technique was utilized to trace the contour of the teeth. Because this technique is based on the characteristics of the overall region of the teeth image, it is possible to extract a very smooth and accurate tooth contour using this technique. On the available datasets, the proposed technique was successful in teeth segmentation compared to previous techniques. (orig.)
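    The thresholding and projection steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2D array stands in for one panoramic slice, and the function names `otsu_threshold` and `nonzero_runs`, the image size, and the intensity values are all assumptions made for illustration. It uses only NumPy.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Return the histogram bin edge that maximizes between-class variance."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist.astype(float) / hist.sum()
    w0 = np.cumsum(weights)              # probability of the "dark" class
    w1 = 1.0 - w0                        # probability of the "bright" class
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    denom = w0 * w1
    between = np.zeros_like(denom)
    valid = denom > 1e-12                # skip splits that leave a class empty
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]
    k = int(np.argmax(between))
    return edges[k + 1]                  # upper edge of the last "dark" bin

def nonzero_runs(profile):
    """Return (start, stop) index pairs for each run of non-zero values."""
    active = np.concatenate(([0], (np.asarray(profile) > 0).astype(int), [0]))
    changes = np.diff(active)
    starts = np.flatnonzero(changes == 1)
    stops = np.flatnonzero(changes == -1)
    return list(zip(starts.tolist(), stops.tolist()))

# Hypothetical toy slice: two bright "teeth" in each of two "jaws".
image = np.zeros((12, 8))
for rows in (slice(2, 5), slice(7, 10)):
    for cols in (slice(1, 3), slice(5, 7)):
        image[rows, cols] = 1000.0

threshold = otsu_threshold(image)
mask = image > threshold

# Horizontal projection (sum along each row) separates the two jaws.
jaw_rows = nonzero_runs(mask.sum(axis=1))

# Vertical projection (sum along each column) within the upper jaw separates teeth.
upper_start, upper_stop = jaw_rows[0]
upper_teeth_cols = nonzero_runs(mask[upper_start:upper_stop].sum(axis=0))
```

    Real CT data are noisy, so projection profiles rarely fall to exactly zero between structures. A practical version would look for local minima of the profile instead of zero-valued gaps.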

  8. Segmentation of teeth in CT volumetric dataset by panoramic projection and variational level set

    Energy Technology Data Exchange (ETDEWEB)

    Hosntalab, Mohammad [Islamic Azad University, Faculty of Engineering, Science and Research Branch, Tehran (Iran); Aghaeizadeh Zoroofi, Reza [University of Tehran, Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, College of Engineering, Tehran (Iran); Abbaspour Tehrani-Fard, Ali [Islamic Azad University, Faculty of Engineering, Science and Research Branch, Tehran (Iran); Sharif University of Technology, Department of Electrical Engineering, Tehran (Iran); Shirani, Gholamreza [Faculty of Dentistry Medical Science of Tehran University, Oral and Maxillofacial Surgery Department, Tehran (Iran)

    2008-09-15

    Quantification of teeth is of clinical importance for various computer-assisted procedures such as dental implants, orthodontic planning, and face, jaw and cosmetic surgeries. In this regard, segmentation is a major step. In this paper, we propose a method for segmentation of teeth in volumetric computed tomography (CT) data using panoramic re-sampling of the dataset in the coronal view and a variational level set. The proposed method consists of five steps as follows: first, we extract a mask in the CT images using Otsu thresholding. Second, the teeth are segmented from other bony tissues by utilizing anatomical knowledge of teeth in the jaws. Third, the proposed method is followed by estimating the arc of the upper and lower jaws and panoramic re-sampling of the dataset. Separation of upper and lower jaws and initial segmentation of teeth are performed by employing the horizontal and vertical projections of the panoramic dataset, respectively. Based on the above-mentioned procedures, an initial mask for each tooth is obtained. Finally, we utilize the initial mask of teeth and apply a variational level set to refine initial teeth boundaries to final contours. The proposed algorithm was evaluated on 30 multi-slice CT datasets including 3,600 images. Experimental results reveal the effectiveness of the proposed method. In the proposed algorithm, the variational level set technique was utilized to trace the contour of the teeth. Because this technique is based on the characteristics of the overall region of the teeth image, it is possible to extract a very smooth and accurate tooth contour using this technique. On the available datasets, the proposed technique was successful in teeth segmentation compared to previous techniques. (orig.)

  9. Northeast African genomic variation shaped by the continuity of indigenous groups and Eurasian migrations.

    Directory of Open Access Journals (Sweden)

    Nina Hollfelder

    2017-08-01

    Northeast Africa has a long history of human habitation, with fossil finds from the earliest anatomically modern humans, and housing ancient civilizations. The region is also the gateway out of Africa, as well as a portal for migration into Africa from Eurasia via the Middle East and the Arabian Peninsula. We investigate the population history of northeast Africa by genotyping ~3.9 million SNPs in 221 individuals from 18 populations sampled in Sudan and South Sudan and combine this data with published genome-wide data from surrounding areas. We find a strong genetic divide between the populations from the northeastern parts of the region (Nubians, central Arab populations, and the Beja) and populations towards the west and south (Nilotes, Darfur and Kordofan populations). This differentiation is mainly caused by a large Eurasian ancestry component of the northeast populations likely driven by migration of Middle Eastern groups followed by admixture that affected the local populations in a north-to-south succession of events. Genetic evidence points to an early admixture event in the Nubians, concurrent with historical contact between North Sudanese and Arab groups. We estimate the admixture in current-day Sudanese Arab populations to about 700 years ago, coinciding with the fall of Dongola in 1315/1316 AD, a wave of admixture that reached the Darfurian/Kordofanian populations some 400-200 years ago. In contrast to the northeastern populations, the current-day Nilotic populations from the south of the region display little or no admixture from Eurasian groups indicating long-term isolation and population continuity in these areas of northeast Africa.

  10. Northeast African genomic variation shaped by the continuity of indigenous groups and Eurasian migrations.

    Science.gov (United States)

    Hollfelder, Nina; Schlebusch, Carina M; Günther, Torsten; Babiker, Hiba; Hassan, Hisham Y; Jakobsson, Mattias

    2017-08-01

    Northeast Africa has a long history of human habitation, with fossil finds from the earliest anatomically modern humans, and housing ancient civilizations. The region is also the gateway out of Africa, as well as a portal for migration into Africa from Eurasia via the Middle East and the Arabian Peninsula. We investigate the population history of northeast Africa by genotyping ~3.9 million SNPs in 221 individuals from 18 populations sampled in Sudan and South Sudan and combine this data with published genome-wide data from surrounding areas. We find a strong genetic divide between the populations from the northeastern parts of the region (Nubians, central Arab populations, and the Beja) and populations towards the west and south (Nilotes, Darfur and Kordofan populations). This differentiation is mainly caused by a large Eurasian ancestry component of the northeast populations likely driven by migration of Middle Eastern groups followed by admixture that affected the local populations in a north-to-south succession of events. Genetic evidence points to an early admixture event in the Nubians, concurrent with historical contact between North Sudanese and Arab groups. We estimate the admixture in current-day Sudanese Arab populations to about 700 years ago, coinciding with the fall of Dongola in 1315/1316 AD, a wave of admixture that reached the Darfurian/Kordofanian populations some 400-200 years ago. In contrast to the northeastern populations, the current-day Nilotic populations from the south of the region display little or no admixture from Eurasian groups indicating long-term isolation and population continuity in these areas of northeast Africa.

  11. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    Science.gov (United States)

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  12. The post-Human Genome Project mindset: race, reliability, and health care.

    Science.gov (United States)

    Kimmelman, J

    2006-11-01

    The following essay reports on the first session of a 2-day workshop on genetic diversity and science communication, organized by the Institute of Genetics. I argue that the four talks in this session reflected two different facets of a 'post-Human Genome Project (HGP)' view of human genetics. The first is characterized by an increasing interest in genetic differences. Two speakers - Troy Duster and Jasber Singh - expressed skepticism about one aspect of this trend: an emphasis on race in medicine and genetics. The other two speakers - Kenneth Weiss and Gustavo Turecki - spoke to a second facet of the post-HGP view: a recognition of the difficulty in translating genetic discovery into medical or public health applications. Though both sets of talks were highly critical of current trends in genetic research, they pulled in opposite directions: one warned about the role of genetics in stabilizing racial categories, while the other lamented the failure of any genetic claims or categories to stabilize at all. I argue that the use of racial categories in medicine seems likely to encounter scientific, medical, and social challenges.