WorldWideScience

Sample records for haplotype-specific genomic diversity

  1. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity.

    Science.gov (United States)

    Schwessinger, Benjamin; Sperschneider, Jana; Cuddy, William S; Garnica, Diana P; Miller, Marisa E; Taylor, Jennifer M; Dodds, Peter N; Figueroa, Melania; Park, Robert F; Rathjen, John P

    2018-02-20

    A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. IMPORTANCE Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism's evolutionary potential and ability to adapt to stress conditions. Yet, it is

  2. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.

    Science.gov (United States)

    Peng, Ting; Wang, Li; Li, Guisen

    2017-08-11

    APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races, using data from the 1000 Genomes Project. Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory or coding regions were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and admixed populations. Tag SNPs were selected to evaluate haplotype diversity in the four groups using HaploStats software. The APOL1 gene is surrounded by some of the most polymorphic genes in the human genome, and variation in APOL1 was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR, of which 12 SNPs in the URR and 24 SNPs in the 3'UTR were common variants with MAF ≥ 1%. Notably, URR-1 was present at lower frequencies in European populations, while the other three haplotypes showed the opposite pattern. The 3'UTR contained several high-frequency variant sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 were much more frequent in the African population, while UTR-2, UTR-3 and UTR-4 were much less frequent. In the coding region, the two higher-frequency G1 SNPs lowered the frequency of haplotype H-1 when all populations were pooled, and the diversity among the four populations was widened by the two G1 mutations (P1 = 3.33E-4 vs P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes differed significantly among populations, in both regulatory and coding regions. It could provide
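
    The MAF ≥ 1% threshold used in the abstract above to separate common from rare variants can be illustrated with a short sketch. It assumes biallelic SNPs in diploid individuals, with each genotype coded as the number of alternate-allele copies (0, 1 or 2) and `None` for a missing call. The function names and this encoding are illustrative assumptions, not taken from the study or from HaploStats.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic SNP.

    genotypes: alternate-allele counts per diploid individual (0, 1 or 2),
    with None marking a missing call (hypothetical encoding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def common_variants(snps, threshold=0.01):
    """Keep SNPs whose MAF is at or above the threshold (1% by default)."""
    result = {}
    for name, genotypes in snps.items():
        maf = minor_allele_frequency(genotypes)
        if maf >= threshold:
            result[name] = maf
    return result
```

    For example, a SNP seen as a single heterozygote among 100 individuals has a MAF of 1/200 = 0.5% and would be excluded by the default threshold.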

  3. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity

    Directory of Open Access Journals (Sweden)

    Benjamin Schwessinger

    2018-02-01

    Full Text Available A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies.

  4. Genomic sequence of 'Candidatus Liberibacter solanacearum' haplotype C and its comparison with haplotype A and B genomes.

    Directory of Open Access Journals (Sweden)

    Jinhui Wang

    Full Text Available Haplotypes A and B of 'Candidatus Liberibacter solanacearum' (CLso) are associated with diseases of solanaceous plants, especially Zebra chip disease of potato, and haplotypes C, D and E are associated with symptoms on apiaceous plants. To date, one complete genome of haplotype B and two high quality draft genomes of haplotype A have been obtained for these unculturable bacteria using metagenomics from the psyllid vector Bactericera cockerelli. Here, we present the first genomic sequences obtained for the carrot-associated CLso. These two genomic sequences of haplotype C, FIN114 (1.24 Mbp) and FIN111 (1.20 Mbp), were obtained from carrot psyllids (Trioza apicalis) harboring CLso. Genomic comparisons between the haplotypes A, B and C revealed that the genome organization differs between these haplotypes, due to large inversions and other recombinations. Comparison of protein-coding genes indicated that the core genome of CLso consists of 885 ortholog groups, with the pan-genome consisting of 1327 ortholog groups. Twenty-seven ortholog groups are unique to CLso haplotype C, whilst 11 ortholog groups shared by the haplotypes A and B are not found in haplotype C. Some of these ortholog groups that are not part of the core genome may encode functions related to interactions with the different host plant and psyllid species.

  5. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed in-house software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP) datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L.) Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  6. HLA diversity in the 1000 genomes dataset.

    Directory of Open Access Journals (Sweden)

    Pierre-Antoine Gourraud

    Full Text Available The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.

  7. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
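
    The N50 figures quoted in this listing (for example, the haplotype N50 of 484 kb above) are contiguity statistics. Under the usual definition, N50 is the largest length L such that contigs (or phase blocks) of length at least L together cover at least half of the total assembled length. The sketch below assumes that standard definition; the abstracts do not state how their values were computed, and the function name is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or block lengths.

    Sort lengths from longest to shortest and accumulate them; N50 is the
    length of the contig at which the running total first reaches half of
    the total assembled length.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example: total length 20, and 6 + 5 = 11 reaches half, so N50 is 5.
print(n50([2, 3, 4, 5, 6]))
```

    Because N50 depends only on the distribution of lengths, two assemblies of very different total size can share the same value, which is why the abstracts report it alongside total size and contig count.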

  8. The effect of using genealogy-based haplotypes for genomic prediction.

    Science.gov (United States)

    Edriss, Vahid; Fernando, Rohan L; Su, Guosheng; Lund, Mogens S; Guldbrandtsen, Bernt

    2013-03-06

    Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.

  9. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella.

    Directory of Open Access Journals (Sweden)

    Yaniv Brandvain

    Full Text Available The shift from outcrossing to self-fertilization is among the most common evolutionary transitions in flowering plants. Until recently, however, a genome-wide view of this transition has been obscured by both a dearth of appropriate data and the lack of appropriate population genomic methods to interpret such data. Here, we present a novel population genomic analysis detailing the origin of the selfing species, Capsella rubella, which recently split from its outcrossing sister, Capsella grandiflora. Due to the recency of the split, much of the variation within C. rubella is also found within C. grandiflora. We can therefore identify genomic regions where two C. rubella individuals have inherited the same or different segments of ancestral diversity (i.e. founding haplotypes) present in C. rubella's founder(s). Based on this analysis, we show that C. rubella was founded by multiple individuals drawn from a diverse ancestral population closely related to extant C. grandiflora, that drift and selection have rapidly homogenized most of this ancestral variation since C. rubella's founding, and that little novel variation has accumulated within this time. Despite the extensive loss of ancestral variation, the approximately 25% of the genome for which two C. rubella individuals have inherited different founding haplotypes makes up roughly 90% of the genetic variation between them. To extend these findings, we develop a coalescent model that utilizes the inferred frequency of founding haplotypes and variation within founding haplotypes to estimate that C. rubella was founded by a potentially large number of individuals between 50 and 100 kya, and has subsequently experienced a twenty-fold reduction in its effective population size. As population genomic data from an increasing number of outcrossing/selfing pairs are generated, analyses like the one developed here will facilitate a fine-scaled view of the evolutionary and demographic impact of the

  10. The effect of genealogy-based haplotypes on genomic prediction

    DEFF Research Database (Denmark)

    Edriss, Vahid; Fernando, Rohan L.; Su, Guosheng

    2013-01-01

    on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods: A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using...... local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (pi) of the haplotype covariates had zero effect......, i.e. a Bayesian mixture method. Results: About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some...

  11. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication

    Science.gov (United States)

    vonHoldt, Bridgett M.; Pollinger, John P.; Lohmueller, Kirk E.; Han, Eunjung; Parker, Heidi G.; Quignon, Pascale; Degenhardt, Jeremiah D.; Boyko, Adam R.; Earl, Dent A.; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C.; Mosher, Dana S.; Spady, Tyrone C.; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G.; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-ping; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.

    2010-01-01

    Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication1,2. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data3. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475

  12. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    Science.gov (United States)

    2014-01-01

    Background: Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results: We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions: Phylogenetic analysis shows that the HIV-1 population undergoes selection-driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  13. Haplotype assembly in polyploid genomes and identical by descent shared tracts.

    Science.gov (United States)

    Aguiar, Derek; Istrail, Sorin

    2013-07-01

    Genome-wide haplotype reconstruction from sequence data, or haplotype assembly, is at the center of major challenges in molecular biology and life sciences. For complex eukaryotic organisms like humans, the genome is vast and the population samples are growing so rapidly that algorithms processing high-throughput sequencing data must scale favorably in terms of both accuracy and computational efficiency. Furthermore, current models and methodologies for haplotype assembly (i) do not consider individuals sharing haplotypes jointly, which reduces the size and accuracy of assembled haplotypes, and (ii) are unable to model genomes having more than two sets of homologous chromosomes (polyploidy). Polyploid organisms are increasingly becoming the target of many research groups interested in the genomics of disease, phylogenetics, botany and evolution but there is an absence of theory and methods for polyploid haplotype reconstruction. In this work, we present a number of results, extensions and generalizations of compass graphs and our HapCompass framework. We prove the theoretical complexity of two haplotype assembly optimizations, thereby motivating the use of heuristics. Furthermore, we present graph theory-based algorithms for the problem of haplotype assembly using our previously developed HapCompass framework for (i) novel implementations of haplotype assembly optimizations (minimum error correction), (ii) assembly of a pair of individuals sharing a haplotype tract identical by descent and (iii) assembly of polyploid genomes. We evaluate our methods on 1000 Genomes Project, Pacific Biosciences and simulated sequence data. HapCompass is available for download at http://www.brown.edu/Research/Istrail_Lab/. Supplementary data are available at Bioinformatics online.
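
    The minimum error correction (MEC) optimization mentioned in the abstract above can be made concrete with a toy diploid example. The sketch below is not the HapCompass algorithm. It is a brute-force illustration of the MEC objective only, assuming biallelic sites coded 0/1, reads given as dictionaries mapping site index to observed allele, and a pair of complementary haplotypes. All names are hypothetical.

```python
from itertools import product


def mismatches(fragment, haplotype):
    """Count sites where a read fragment disagrees with a haplotype."""
    return sum(1 for site, allele in fragment.items() if haplotype[site] != allele)


def mec_score(fragments, haplotype):
    """MEC cost of a diploid phasing given as one haplotype of 0/1 alleles.

    The second haplotype is the complement at every heterozygous site. Each
    fragment is charged the number of allele corrections needed to match the
    closer of the two haplotypes.
    """
    complement = [1 - allele for allele in haplotype]
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement)) for f in fragments
    )


def brute_force_mec(fragments, n_sites):
    """Try every phasing of n_sites heterozygous sites; return (cost, haplotype)."""
    best = None
    for candidate in product((0, 1), repeat=n_sites):
        if candidate[0] == 1:
            continue  # a haplotype and its complement describe the same phasing
        score = mec_score(fragments, list(candidate))
        if best is None or score < best[0]:
            best = (score, list(candidate))
    return best


# Five toy reads over three sites; the last read conflicts with the others.
reads = [{0: 0, 1: 0}, {1: 0, 2: 1}, {0: 1, 1: 1}, {1: 1, 2: 0}, {0: 0, 2: 0}]
print(brute_force_mec(reads, 3))
```

    The exhaustive search visits 2^(n-1) candidate phasings for n sites, so it is only usable on tiny instances. That exponential growth is the practical reason heuristics of the kind the abstract motivates are needed.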

  14. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops

    Directory of Open Access Journals (Sweden)

    Lunwen Qian

    2017-09-01

    Full Text Available In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increased yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP–trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately

  15. Exploring and Harnessing Haplotype Diversity to Improve Yield Stability in Crops.

    Science.gov (United States)

    Qian, Lunwen; Hickey, Lee T; Stahl, Andreas; Werner, Christian R; Hayes, Ben; Snowdon, Rod J; Voss-Fels, Kai P

    2017-01-01

    In order to meet future food, feed, fiber, and bioenergy demands, global yields of all major crops need to be increased significantly. At the same time, the increasing frequency of extreme weather events such as heat and drought necessitates improvements in the environmental resilience of modern crop cultivars. Achieving sustainably increased yields implies rapid improvement of quantitative traits with a very complex genetic architecture and strong environmental interaction. Latest advances in genome analysis technologies today provide molecular information at an ultrahigh resolution, revolutionizing crop genomic research, and paving the way for advanced quantitative genetic approaches. These include highly detailed assessment of population structure and genotypic diversity, facilitating the identification of selective sweeps and signatures of directional selection, dissection of genetic variants that underlie important agronomic traits, and genomic selection (GS) strategies that not only consider major-effect genes. Single-nucleotide polymorphism (SNP) markers today represent the genotyping system of choice for crop genetic studies because they occur abundantly in plant genomes and are easy to detect. SNPs are typically biallelic, however, hence their information content compared to multiallelic markers is low, limiting the resolution at which SNP-trait relationships can be delineated. An efficient way to overcome this limitation is to construct haplotypes based on linkage disequilibrium, one of the most important features influencing genetic analyses of crop genomes. Here, we give an overview of the latest advances in genomics-based haplotype analyses in crops, highlighting their importance in the context of polyploidy and genome evolution, linkage drag, and co-selection. We provide examples of how haplotype analyses can complement well-established quantitative genetics frameworks, such as quantitative trait analysis and GS, ultimately providing an effective tool

  16. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Directory of Open Access Journals (Sweden)

    McGuire Patrick E

    2010-12-01

    Full Text Available Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large
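
    Nucleotide diversity, the quantity mapped in the abstract above, is commonly computed as the average number of differences per site between all pairs of aligned sequences (often written π). The abstract does not say which estimator the authors used, so the sketch below simply assumes this common pairwise definition, applied to equal-length, gap-free aligned sequences from a single genome. The function name is illustrative.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences.

    sequences: equal-length strings taken from one alignment (assumed to
    contain no gaps or ambiguity codes in this simplified sketch).
    """
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Three 4-bp sequences: two of the three pairs differ at one site each,
# giving 2 differences over 3 pairs * 4 sites.
print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

    In a polyploid such as wheat, a calculation like this is meaningful only once sequences have been assigned to the correct genome (A, B or D), which is the allocation problem the abstract describes. Mixing homoeologous copies would inflate the pairwise differences.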

  17. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  18. Dense and accurate whole-chromosome haplotyping of individual genomes

    NARCIS (Netherlands)

    Porubsky, David; Garg, Shilpa; Sanders, Ashley D.; Korbel, Jan O.; Guryev, Victor; Lansdorp, Peter M.; Marschall, Tobias

    2017-01-01

    The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate

  19. A genomic portrait of haplotype diversity and signatures of selection in indigenous southern African populations.

    Directory of Open Access Journals (Sweden)

    Emile R Chimusa

    2015-03-01

    We report a study of genome-wide, dense SNP (∼ 900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian, and unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations, and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.

  20. A genomic portrait of haplotype diversity and signatures of selection in indigenous southern African populations.

    Science.gov (United States)

    Chimusa, Emile R; Meintjies, Ayton; Tchanga, Milaine; Mulder, Nicola; Seoighe, Cathal; Soodyall, Himla; Ramesar, Rajkumar

    2015-03-01

    We report a study of genome-wide, dense SNP (∼ 900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian, and unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations, and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.

  1. Haplotype-Based Genotyping in Polyploids

    Directory of Open Access Journals (Sweden)

    Josh P. Clevenger

    2018-04-01

    Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to accurately identify in polyploid crops due to the duplicative nature of polyploid genomes leading to low confidence in the true alignment of short reads. Implementing a haplotype-based method in contrasting subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes as input BAM files and a VCF file and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.

  2. Haplotype diversity and linkage disequilibrium at DRD2 locus--a study on four population groups of Andhra Pradesh, India.

    Science.gov (United States)

    Saraswathy, Kallur Nava; Mukhopadhyay, Rupak; Shukla, Deepti; Kaur, Harpreet; Sachdeva, Mohinder Pal; Rao, A P; Saksena, Deepti; Kalla, Aloke Kumar

    2009-02-01

    Dopamine receptor D2 (DRD2) is expressed in the central nervous system and has a high affinity for many antipsychotic drugs. Besides several epidemiological investigations on association of DRD2 locus polymorphism(s) with neuropsychiatric problems and addictive behavior, a few polymorphisms in this locus have also been used to understand genomic diversity and population migratory histories globally. The present study attempts to understand the genomic diversity/affinity among four endogamous groups of Andhra Pradesh (India) against the backdrop of diversity studies from other parts of India and the rest of the world, with special reference to DRD2 locus. The four population groups from Adilabad District of Andhra Pradesh, namely, Brahmin (n=50), Nayakpod (n=49), Thoti (n=52), and Kolam (n=53), were included in the study. The DRD2 markers typed for the present study are three biallelic restriction fragments, that is, TaqI A (rs1800497), TaqI B (rs1079597), and TaqI D (rs1800498). Scoring of DRD2 haplotypes with respect to the three TaqI sites shows that five out of eight possible haplotypes are shared by the four populations. Ancestral haplotype B2D2A1 is most frequent among Thotis (0.359). The results of the present study indicate a differential gene flow into South India followed by certain important demographic events resulting in diversified peopling of India.
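
    The count of "eight possible haplotypes" in the abstract above follows from three biallelic sites, since 2 × 2 × 2 = 8. A minimal sketch of that enumeration is given here. The site names follow the abstract's B2D2A1 label, but the "1"/"2" allele labels for each TaqI site and the function name are assumptions made for illustration, not details taken from the study.

```python
from itertools import product

# Allele labels are assumed for illustration; only "B2D2A1" appears in the abstract.
SITES = {"B": ("B1", "B2"), "D": ("D1", "D2"), "A": ("A1", "A2")}

def enumerate_haplotypes(sites):
    """Return every combination of one allele per site, joined into a label."""
    return ["".join(combo) for combo in product(*sites.values())]

haplotypes = enumerate_haplotypes(SITES)
print(len(haplotypes))  # three biallelic sites give 2 ** 3 combinations
print(haplotypes)
```

    Observing five of these eight combinations in all four populations, as the abstract reports, means three combinations were not shared by all groups.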

  3. Insights into HLA-G genetics provided by worldwide haplotype diversity

    Directory of Open Access Journals (Sweden)

    Erick C Castelli

    2014-10-01

    Human Leucocyte Antigen G (HLA-G) belongs to the family of nonclassical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I nonclassical genes. The main features that distinguish HLA-G from classical class I genes are: (a) limited protein variability; (b) alternative splicing generating several membrane bound and soluble isoforms; (c) short cytoplasmic tail; (d) modulation of immune response (immune tolerance); (e) restricted expression to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments (promoter, coding and 3’ untranslated regions). For this purpose, we developed a pipeline to reevaluate the 1000 Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variation sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding and 3’ untranslated regions are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability has originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures.

  4. Mapping the genetic diversity of HLA haplotypes in the Japanese populations

    Science.gov (United States)

    Saw, Woei-Yuh; Liu, Xuanyao; Khor, Chiea-Chuen; Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Akiyama, Koichi; Asano, Hiroyuki; Asayama, Kei; Haga, Toshikazu; Hara, Azusa; Hirose, Takuo; Hosaka, Miki; Ichihara, Sahoko; Imai, Yutaka; Inoue, Ryusuke; Ishiguro, Aya; Isomura, Minoru; Isono, Masato; Kamide, Kei; Kato, Norihiro; Katsuya, Tomohiro; Kikuya, Masahiro; Kohara, Katsuhiko; Matsubara, Tatsuaki; Matsuda, Ayako; Metoki, Hirohito; Miki, Tetsuro; Murakami, Keiko; Nabika, Toru; Nakatochi, Masahiro; Ogihara, Toshio; Ohnaka, Keizo; Ohkubo, Takayoshi; Rakugi, Hiromi; Satoh, Michihiro; Shiwaku, Kunihiro; Sugimoto, Ken; Tabara, Yasuharu; Takami, Yoichi; Takayanagi, Ryoichi; Takeuchi, Fumihiko; Tsubota-Utsugi, Megumi; Yamamoto, Ken; Yamamoto, Koichi; Yamasaki, Masayuki; Yasui, Daisaku; Yokota, Mitsuhiro; Teo, Yik-Ying; Kato, Norihiro

    2015-01-01

    Japan has often been viewed as an Asian country that possesses a genetically homogenous community. The basis for partitioning the country into prefectures has largely been geographical, although cultural and linguistic differences still exist between some of the districts/prefectures, especially between Okinawa and the mainland prefectures. The Major Histocompatibility Complex (MHC) region has consistently emerged as the most polymorphic region in the human genome, harbouring numerous biologically important variants; nevertheless the presence of population-specific long haplotypes hinders the imputation of SNPs and classical HLA alleles. Here, we examined the extent of genetic variation at the MHC between eight Japanese populations sampled from Okinawa, and six other prefectures located in or close to the mainland of Japan, specifically focusing on the haplotypes observed within each population, and what impact any variation has on imputation. Our results indicated that Okinawa was genetically farther from the mainland Japanese than Gujarati Indians were from Tamil Indians, while the mainland Japanese from six prefectures were more homogeneous than northern and southern Han Chinese are with each other. The distribution of haplotypes across Japan was similar, although imputation was most accurate for Okinawa and several mainland prefectures when population-specific panels were used as reference. PMID:26648100

  5. Nucleotide polymorphisms and haplotype diversity of RTCS gene in China elite maize inbred lines.

    Directory of Open Access Journals (Sweden)

    Enying Zhang

    The maize RTCS gene, encoding a LOB domain transcription factor, plays important roles in the initiation of embryonic seminal and postembryonic shoot-borne roots. In this study, the genomic sequences of this gene in 73 China elite inbred lines, including 63 lines from 5 temperate heterotic groups and 10 tropical germplasms, were obtained, and the nucleotide polymorphisms and haplotype diversity were detected. A total of 63 sequence variants, including 44 SNPs and 19 indels, were identified at this locus, and most of them were found to be located in the UTR and intron regions. The coding region of this gene in all tested inbred lines carried 14 haplotypes, which encode 7 differing RTCS proteins. Analysis of the polymorphism sites revealed that at least 6 recombination events have occurred. Among all 6 groups tested, only the P heterotic group had a much lower nucleotide diversity than the whole set, and selection analysis also revealed that only this group was under strong negative selection. However, the set of Huangzaosi and its derived lines possessed a higher nucleotide diversity than the whole set, and no selection signals were identified.

  6. De novo assembly of a haplotype-resolved human genome

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang

    2015-01-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-...

  7. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    International Nuclear Information System (INIS)

    Lakhssassi, K.; González-Recio, O.

    2017-01-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses, first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants with minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for haplotype extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNPs (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotype effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.
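
    The SD-based filtering step described above can be sketched as follows. This is a hedged illustration only: the function name, the toy effect values and the haplotype IDs are hypothetical, the filter is applied symmetrically around the mean for both thresholds, and the population SD is used because the abstract does not say which estimator the authors applied.

```python
from statistics import mean, pstdev

def select_large_effects(effects, k):
    """Return IDs of haplotypes whose effect lies more than k SDs from the mean.

    `effects` maps a (hypothetical) haplotype ID to its estimated effect.
    """
    values = list(effects.values())
    mu = mean(values)
    sd = pstdev(values)  # population SD; an assumption, not stated in the source
    return [hap for hap, e in effects.items() if abs(e - mu) > k * sd]

# Toy data invented for illustration: mostly null effects plus a few outliers.
toy_effects = {f"hap{i}": 0.0 for i in range(17)}
toy_effects.update({"hapA": 1.5, "hapB": -1.5, "hapC": 4.0})

strict = select_large_effects(toy_effects, k=3)  # keeps only the extreme outlier
loose = select_large_effects(toy_effects, k=1)   # also keeps moderate effects
print(strict, loose)
```

    On this toy input the 3 SD filter retains far fewer haplotypes than the 1 SD filter, which mirrors the abstract's observation that the stricter threshold captures less of the genetic variance.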

  8. A haplotype regression approach for genetic evaluation using sequences from the 1000 bull genomes Project

    Energy Technology Data Exchange (ETDEWEB)

    Lakhssassi, K.; González-Recio, O.

    2017-07-01

    Haplotypes from sequencing data may improve the prediction accuracy in genomic evaluations as haplotypes are in stronger linkage disequilibrium with quantitative trait loci than markers from SNP chips. This study focuses, first, on the creation of haplotypes in a population sample of 450 Holstein animals, with full-sequence data from the 1000 bull genomes project; and second, on incorporating them into the whole genome prediction model. In total, 38,319,258 SNPs (and indels) from Next Generation Sequencing were included in the analysis. After filtering variants with minor allele frequency (MAF < 0.025), 13,912,326 SNPs were available for haplotype extraction with findhap.f90. The number of SNPs in the haploblocks was on average 924 SNPs (166,552 bp). Unique haplotypes were around 97% in all chromosomes and were ignored, leaving 153,428 haplotypes. Estimated haplotypes had a large contribution to the total variance of genomic estimated breeding values for kilogram of protein, Global Type Index, Somatic Cell Score and Days Open (between 32 and 99.9%). Haploblocks containing haplotypes with large effects were selected by filtering for each trait, haplotypes whose effect was larger/lower than the mean plus/minus 3 times the standard deviation (SD) and 1 SD above the mean of the haplotype effect distribution. Results showed that filtering by 3 SD would not be enough to capture a large proportion of genetic variance, whereas filtering by 1 SD could be useful but model convergence should be considered. Additionally, sequence haplotypes were able to capture additional genetic variance to the polygenic effect for traits undergoing lower selection intensity like fertility and health traits.

  9. Genome-wide haplotype analysis of cis expression quantitative trait loci in monocytes.

    Directory of Open Access Journals (Sweden)

    Sophie Garnier

    In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ~2.1 × 10^9 haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10^4-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2 × 10^-4 (~0.05/412), 193 haplotypic signals replicated. 1000 G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000 G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.
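
    The Bonferroni level quoted in the abstract above is the family-wise error rate divided by the number of tests. A minimal sketch of that arithmetic follows; the function names and the example p-values are hypothetical and only illustrate how a replication threshold of this kind might be applied.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

def count_replicated(p_values, alpha=0.05):
    """Count p-values that fall below the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sum(p < threshold for p in p_values)

# 412 probes were carried forward for replication in the abstract.
threshold = bonferroni_threshold(0.05, 412)
print(f"{threshold:.2e}")

# Invented p-values for three tests: corrected threshold is 0.05 / 3.
example_hits = count_replicated([0.001, 0.02, 0.5])
print(example_hits)
```

    The computed threshold rounds to the 1.2 × 10^-4 level reported in the abstract.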

  10. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

    Science.gov (United States)

    Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

    2010-04-27

    To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions decreasing genetic diversity which might be

  11. Combinatorial aspects of genome rearrangements and haplotype networks

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing t...
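
    The edit-distance framing in the abstract above (fewest operations from a fixed set) can be illustrated with a brute-force search. The sketch below assumes segment reversals as the allowed operation, chosen only as a familiar example; the abstract does not state which operations the dissertation studies, and the function name is hypothetical. The search is exhaustive and only practical for short permutations.

```python
from collections import deque

def reversal_distance(perm):
    """Minimum number of segment reversals needed to sort `perm` (breadth-first search)."""
    start = tuple(perm)
    target = tuple(sorted(perm))
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        n = len(state)
        # Try every reversal of a contiguous segment of length >= 2.
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                nxt = state[:i] + state[i:j][::-1] + state[j:]
                if nxt == target:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("target not reachable")  # not expected for permutations

print(reversal_distance([3, 1, 2]))
```

    Because breadth-first search explores states in order of increasing operation count, the first time the sorted arrangement is reached gives the minimum number of operations, which is the distance the abstract refers to.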

  12. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    Science.gov (United States)

    2012-01-01

    study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey. PMID:22891612

  13. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    Directory of Open Access Journals (Sweden)

    Aslam Muhammad L

    2012-08-01

    whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.

  14. Genome patterns of selection and introgression of haplotypes in natural populations of the house mouse (Mus musculus).

    Directory of Open Access Journals (Sweden)

    Fabian Staubach

    General parameters of selection, such as the frequency and strength of positive selection in natural populations or the role of introgression, are still insufficiently understood. The house mouse (Mus musculus) is a particularly well-suited model system to approach such questions, since it has a defined history of splits into subspecies and populations and since extensive genome information is available. We have used high-density single-nucleotide polymorphism (SNP) typing arrays to assess genomic patterns of positive selection and introgression of alleles in two natural populations of each of the subspecies M. m. domesticus and M. m. musculus. Applying different statistical procedures, we find a large number of regions subject to apparent selective sweeps, indicating frequent positive selection on rare alleles or novel mutations. Genes in the regions include well-studied imprinted loci (e.g. Plagl1/Zac1), homologues of human genes involved in adaptations (e.g. alpha-amylase genes) or in genetic diseases (e.g. Huntingtin and Parkin). Haplotype matching between the two subspecies reveals a large number of haplotypes that show patterns of introgression from specific populations of the respective other subspecies, with at least 10% of the genome being affected by partial or full introgression. Using neutral simulations for comparison, we find that the size and the fraction of introgressed haplotypes are not compatible with a pure migration or incomplete lineage sorting model. Hence, it appears that introgressed haplotypes can rise in frequency due to positive selection and thus can contribute to the adaptive genomic landscape of natural populations. Our data support the notion that natural genomes are subject to complex adaptive processes, including the introgression of haplotypes from other differentiated populations or species at a larger scale than previously assumed for animals. This implies that some of the admixture found in inbred strains of mice

  15. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    NARCIS (Netherlands)

    Calus, M.P.L.; Meuwissen, T.H.E.; Windig, J.J.; Knol, E.F.; Schrooten, C.; Vereijken, A.L.J.; Veerkamp, R.F.

    2009-01-01

    The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested i.e.

  16. A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

    Science.gov (United States)

    Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin

    2013-08-01

    Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.

  17. Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values

    Directory of Open Access Journals (Sweden)

    Schrooten Chris

    2009-01-01

    The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested, i.e. including 2, 6, 12 or 20 marker alleles and clustering base haplotypes related with an IBD probability of > 0.55, 0.75 or 0.95. Simulated data contained 1100 animals with known genotypes and phenotypes and 1000 animals with known genotypes and unknown phenotypes. Genomes comprising 3 Morgan were simulated and contained 74 polymorphic QTL and 383 polymorphic SNP markers with an average r² value of 0.14 between adjacent markers. The total number of haplotypes decreased up to 50% when the window size was increased from two to 20 markers and decreased by at least 50% when haplotypes related with an IBD probability of > 0.55 instead of > 0.95 were clustered. An intermediate window size led to more precise QTL mapping. Window size and clustering had a limited effect on the accuracy of predicted total breeding values, ranging from 0.79 to 0.81. Our conclusion is that different optimal window sizes should be used in QTL-mapping versus genome-wide breeding value prediction.

  18. Prion gene haplotypes of U.S. cattle

    Directory of Open Access Journals (Sweden)

    Harhay Gregory P

    2006-11-01

    Background: Bovine spongiform encephalopathy (BSE) is a fatal neurological disorder characterized by abnormal deposits of a protease-resistant isoform of the prion protein. Characterizing linkage disequilibrium (LD) and haplotype networks within the bovine prion gene (PRNP) is important for (1) testing rare or common PRNP variation for an association with BSE and (2) interpreting any association of PRNP alleles with BSE susceptibility. The objective of this study was to identify polymorphisms and haplotypes within PRNP from the promoter region through the 3'UTR in a diverse sample of U.S. cattle genomes. Results: A 25.2-kb genomic region containing PRNP was sequenced from 192 diverse U.S. beef and dairy cattle. Sequence analyses identified 388 total polymorphisms, of which 287 have not previously been reported. The polymorphism alleles define PRNP by regions of high and low LD. High LD is present between alleles in the promoter region through exon 2 (6.7 kb). PRNP alleles within the majority of intron 2, the entire coding sequence and the untranslated region of exon 3 are in low LD (18.0 kb). Two haplotype networks, one representing the region of high LD and the other the region of low LD, yielded nineteen different combinations that represent haplotypes spanning PRNP. The haplotype combinations are tagged by 19 polymorphisms (htSNPs), which characterize variation within and across PRNP. Conclusion: The number of polymorphisms in the prion gene region of U.S. cattle is nearly four times greater than previously described. These polymorphisms define PRNP haplotypes that may influence BSE susceptibility in cattle.

  19. Geographical distribution of a specific mitochondrial haplotype of Zymoseptoria tritici

    Directory of Open Access Journals (Sweden)

    Sameh BOUKEF

    2014-01-01

    Severity of disease caused by the fungus Zymoseptoria tritici throughout the world's cereal-growing regions has elicited much debate on the potential evolutionary mechanism conferring high adaptability of the pathogen to diverse climate conditions and different wheat hosts (Triticum durum and T. aestivum). A specific mitochondrial DNA sequence was used to investigate geographic distribution of the type 4 haplotype (mtRFLP4) within 1363 isolates of Z. tritici originating from 21 countries. The mtRFLP4 haplotype was detected from both durum and bread wheat hosts with greater frequency on durum wheat. The distribution of mtRFLP4 was limited to populations sampled from the Mediterranean and the Red Sea region. Greater frequencies of mtRFLP4 were found in Tunisia (87%) and Algeria (60%). The haplotype was absent within European, Australian, North and South American populations except Argentina. While alternative hypotheses such as climatic adaptation could not be ruled out, it is postulated that mtRFLP4 originated in North Africa (e.g. Tunisia or Algeria) as an adaptation to durum wheat as the prevailing cereal crop. The specialized haplotype has subsequently spread as indicated by lower frequency of occurrence in the surrounding Mediterranean countries and on bread wheat hosts.

  20. Bacillus subtilis genome diversity.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2007-02-01

    Microarray-based comparative genomic hybridization (M-CGH) is a powerful method for rapidly identifying regions of genome diversity among closely related organisms. We used M-CGH to examine the genome diversity of 17 strains belonging to the nonpathogenic species Bacillus subtilis. Our M-CGH results indicate that there is considerable genetic heterogeneity among members of this species; nearly one-third of Bsu168-specific genes exhibited variability, as measured by the microarray hybridization intensities. The variable loci include those encoding proteins involved in antibiotic production, cell wall synthesis, sporulation, and germination. The diversity in these genes may reflect this organism's ability to survive in diverse natural settings.

  1. Congruence as a measurement of extended haplotype structure across the genome

    Science.gov (United States)

    2012-01-01

    Background Historically, extended haplotypes have been defined using only a few data points, such as alleles for several HLA genes in the MHC. High-density SNP data, and the increasing affordability of whole genome SNP typing, creates the opportunity to define higher resolution extended haplotypes. This drives the need for new tools that support quantification and visualization of extended haplotypes as defined by as many as 2000 SNPs. Confronted with high-density SNP data across the major histocompatibility complex (MHC) for 2,300 complete families, compiled by the Type 1 Diabetes Genetics Consortium (T1DGC), we developed software for studying extended haplotypes. Methods The software, called ExHap (Extended Haplotype), uses a similarity measurement we term congruence to identify and quantify long-range allele identity. Using ExHap, we analyzed congruence in both the T1DGC data and family-phased data from the International HapMap Project. Results Congruent chromosomes from the T1DGC data have between 96.5% and 99.9% allele identity over 1,818 SNPs spanning 2.64 megabases of the MHC (HLA-DRB1 to HLA-A). Thirty-three of 132 DQ-DR-B-A defined haplotype groups have > 50% congruent chromosomes in this region. For example, 92% of chromosomes within the DR3-B8-A1 haplotype are congruent from HLA-DRB1 to HLA-A (99.8% allele identity). We also applied ExHap to all 22 autosomes for both CEU and YRI cohorts from the International HapMap Project, identifying multiple candidate extended haplotypes. Conclusions Long-range congruence is not unique to the MHC region. Patterns of allele identity on phased chromosomes provide a simple, straightforward approach to visually and quantitatively inspect complex long-range structural patterns in the genome. Such patterns aid the biologist in appreciating genetic similarities and differences across cohorts, and can lead to hypothesis generation for subsequent studies. PMID:22369243
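
    The allele-identity percentages quoted above can be pictured with a short sketch. This is not the ExHap implementation: the function names, the use of "N" for a missing call and the 0.965 cutoff (borrowed from the lower end of the identity range reported in the abstract) are assumptions for illustration only.

```python
def allele_identity(chrom_a, chrom_b, missing="N"):
    """Fraction of SNP positions at which two phased chromosomes carry the same allele.

    Positions where either chromosome has a missing call are skipped.
    """
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must cover the same SNP positions")
    compared = [(a, b) for a, b in zip(chrom_a, chrom_b)
                if a != missing and b != missing]
    if not compared:
        return float("nan")
    return sum(a == b for a, b in compared) / len(compared)


def is_congruent(chrom_a, chrom_b, threshold=0.965):
    """Hypothetical congruence call: allele identity at or above an assumed cutoff."""
    return allele_identity(chrom_a, chrom_b) >= threshold


print(allele_identity("ACGT", "ACGA"))   # three of four positions match
print(is_congruent("ACGT", "ACGA"))
```

    In practice the comparison would run over the roughly 1,800 SNPs spanning the region rather than over four positions, and the cutoff would be chosen from the data.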

  2. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    Science.gov (United States)

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A

  3. Power laws for heavy-tailed distributions: modeling allele and haplotype diversity for the national marrow donor program.

    Directory of Open Access Journals (Sweden)

    Noa Slater

    2015-04-01

    Full Text Available Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy-tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching and diversity measures in broader fields such as species richness. We, therefore, have developed a power-law based estimator to measure allele and haplotype diversity that accommodates heavy tails using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy-tailed distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4% given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse, suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law versus classical diversity estimators such as Capture recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity
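
    For orientation, one of the classical estimators named above can be sketched in a few lines. The code below shows the textbook Chao1 richness estimator, which extrapolates the number of unseen types from the counts of singletons and doubletons; it is not the power-law estimator proposed in the study, and the function name and toy counts are assumptions.

```python
def chao1(counts):
    """Classical Chao1 estimate of the total number of distinct types.

    `counts` holds how many sampled individuals carry each observed
    haplotype (hypothetical toy input, one entry per haplotype).
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                       # haplotypes actually observed
    f1 = sum(c == 1 for c in counts)          # singletons
    f2 = sum(c == 2 for c in counts)          # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form used when no doubletons are present.
    return s_obs + f1 * (f1 - 1) / 2


print(chao1([5, 3, 1, 1, 1, 2]))
```

    Because the estimate depends only on the rarest classes, it is sensitive to exactly the heavy tail of singletons that the abstract describes, which is the stated motivation for an estimator that models the tail directly.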

  4. Bayesian genomic selection: the effect of haplotype lengths and priors

    DEFF Research Database (Denmark)

    Villumsen, Trine Michelle; Janss, Luc

    2009-01-01

    Breeding values for animals with marker data are estimated using a genomic selection approach where data is analyzed using Bayesian multi-marker association models. Fourteen model scenarios with varying haplotype lengths, hyper parameter and prior distributions were compared to find the scenario ...

  5. [Construction of haplotype and haplotype block based on tag single nucleotide polymorphisms and their applications in association studies].

    Science.gov (United States)

    Gu, Ming-liang; Chu, Jia-you

    2007-12-01

    The human genome has a haplotype and haplotype-block structure that provides valuable information on human evolutionary history and may lead to more efficient strategies for identifying genetic variants that increase susceptibility to complex diseases. The genome can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs are potentially useful for constructing haplotypes and haplotype blocks and for association studies in complex diseases. There are two general classes of methods for constructing haplotypes and haplotype blocks, based respectively on genotypes from large pedigrees and on statistical algorithms. The authors evaluate several construction methods to assess the power of different association tests under a variety of disease models and block-partitioning criteria. The advantages, limitations and applications of each method, and its use in association studies, are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, haplotype construction approaches that combine mathematics, physics and computer science will have a profound impact on population genetics, on the mapping and cloning of susceptibility genes in complex diseases, and on related fields of the life sciences.

  6. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR.

    Science.gov (United States)

    Tyson, Jess; Armour, John A L

    2012-12-11

    Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  7. Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR

    Directory of Open Access Journals (Sweden)

    Tyson Jess

    2012-12-01

    Full Text Available Abstract Background Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. Results In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. By using emulsion systems linking flanking regions to amplicons within the CNV, this led to the reconstruction of a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. Conclusion This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.

  8. Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern human diversity.

    Science.gov (United States)

    Zietkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E C; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-11-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one T(n) microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences reveals three lineages accounting for present-day diversity. The proportion of new recombinants and the diversity of the T(n) microsatellite were used to estimate the age of haplotype lineages and the time of colonization events. The lineage that underwent the great expansion originated in Africa prior to the Upper Paleolithic (27,000-56,000 years ago). A second group, of structurally distinct haplotypes that occupy a central position on the tree, has never left Africa. The third lineage is represented by the haplotype that lies closest to the root, is virtually absent in Africa, and appears older than the recent out-of-Africa expansion. We propose that this lineage could have left Africa before the expansion (as early as 160,000 years ago) and admixed, outside of Africa, with the expanding lineage. Contemporary human diversity, although dominated by the recently expanded African lineage, thus represents a mosaic of different contributions.

  9. Consequences for diversity when animals are prioritized for conservation of the whole genome or of one specific allele

    NARCIS (Netherlands)

    Engelsma, K.A.; Veerkamp, R.F.; Calus, M.P.L.; Windig, J.J.

    2014-01-01

    When animals are selected for one specific allele, for example for inclusion in a gene bank, this may result in the loss of diversity in other parts of the genome. The aim of this study was to quantify the risk of losing diversity across the genome when targeting a single allele for conservation

  10. Mapping Haplotype-haplotype Interactions with Adaptive LASSO

    Directory of Open Access Journals (Sweden)

    Li Ming

    2010-08-01

    Full Text Available Abstract Background The genetic etiology of complex diseases in humans has been commonly viewed as a complex process involving both genetic and environmental factors functioning in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods for modeling interactions underlying complex diseases between single genetic variants (e.g. single nucleotide polymorphisms or SNPs) have been extensively studied. Recently, haplotype-based analysis has gained popularity among genetic association studies. When multiple sequence or haplotype interactions are involved in determining an individual's susceptibility to a disease, it presents daunting challenges in statistical modeling and testing of the interaction effects, largely due to the complicated higher order epistatic complexity. Results In this article, we propose a new strategy in modeling haplotype-haplotype interactions under the penalized logistic regression framework with adaptive L1-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L1-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method which estimates and selects parameters by the modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has a low false positive rate and reasonable power in detecting haplotype interactions. The method is applied to test haplotype interactions involved in mother and offspring genome in a small for gestational age (SGA) neonates data set, and significant interactions between different genomes are detected. Conclusions As demonstrated by the simulation studies and real data analysis, the approach developed provides an efficient tool for the modeling and testing of haplotype interactions. The implementation of the method in R code can be

  11. Genetic differences in the two main groups of the Japanese population based on autosomal SNPs and haplotypes.

    Science.gov (United States)

    Yamaguchi-Kabata, Yumi; Tsunoda, Tatsuhiko; Kumasaka, Natsuhiko; Takahashi, Atsushi; Hosono, Naoya; Kubo, Michiaki; Nakamura, Yusuke; Kamatani, Naoyuki

    2012-05-01

    Although the Japanese population has a rather low genetic diversity, we recently confirmed the presence of two main clusters (the Hondo and Ryukyu clusters) through principal component analysis of genome-wide single-nucleotide polymorphism (SNP) genotypes. Understanding the genetic differences between the two main clusters requires further genome-wide analyses based on a dense SNP set and comparison of haplotype frequencies. In the present study, we determined haplotypes for the Hondo cluster of the Japanese population by detecting SNP homozygotes with 388,591 autosomal SNPs from 18,379 individuals and estimated the haplotype frequencies. Haplotypes for the Ryukyu cluster were inferred by a statistical approach using the genotype data from 504 individuals. We then compared the haplotype frequencies between the Hondo and Ryukyu clusters. In most genomic regions, the haplotype frequencies in the Hondo and Ryukyu clusters were very similar. However, in addition to the human leukocyte antigen region on chromosome 6, other genomic regions (chromosomes 3, 4, 5, 7, 10 and 12) showed dissimilarities in haplotype frequency. These regions were enriched for genes involved in the immune system, cell-cell adhesion and the intracellular signaling cascade. These differentiated genomic regions between the Hondo and Ryukyu clusters are of interest because they (1) should be examined carefully in association studies and (2) likely contain genes responsible for morphological or physiological differences between the two groups.

  12. Transcription profiles of mitochondrial genes correlate with mitochondrial DNA haplotypes in a natural population of Silene vulgaris

    Directory of Open Access Journals (Sweden)

    Olson Matthew S

    2010-01-01

    Full Text Available Abstract Background Although rapid changes in copy number and gene order are common within plant mitochondrial genomes, associated patterns of gene transcription are underinvestigated. Previous studies have shown that the gynodioecious plant species Silene vulgaris exhibits high mitochondrial diversity and occasional paternal inheritance of mitochondrial markers. Here we address whether variation in DNA molecular markers is correlated with variation in transcription of mitochondrial genes in S. vulgaris collected from natural populations. Results We analyzed RFLP variation in two mitochondrial genes, cox1 and atp1, in offspring of ten plants from a natural population of S. vulgaris in Central Europe. We also investigated transcription profiles of the atp1 and cox1 genes. Most DNA haplotypes and transcription profiles were maternally inherited; for these, transcription profiles were associated with specific mitochondrial DNA haplotypes. One individual exhibited a pattern consistent with paternal inheritance of mitochondrial DNA; this individual exhibited a transcription profile suggestive of paternal but inconsistent with maternal inheritance. We found no associations between gender and transcript profiles. Conclusions Specific transcription profiles of mitochondrial genes were associated with specific mitochondrial DNA haplotypes in a natural population of a gynodioecious species S. vulgaris. Our findings suggest the potential for a causal association between rearrangements in the plant mt genome and transcription product variation.

  13. Population Structure of Pseudocercospora fijiensis in Costa Rica Reveals Shared Haplotype Diversity with Southeast Asian Populations.

    Science.gov (United States)

    Saville, Amanda; Charles, Melodi; Chavan, Suchitra; Muñoz, Miguel; Gómez-Alpizar, Luis; Ristaino, Jean Beagle

    2017-12-01

    Pseudocercospora fijiensis is the causal pathogen of black Sigatoka, a devastating disease of banana that can cause 20 to 80% yield loss in the absence of fungicides in banana crops. The genetic structure of populations of P. fijiensis in Costa Rica was examined and compared with Honduran and global populations to better understand migration patterns and inform management strategies. In total, 118 isolates of P. fijiensis collected from Costa Rica and Honduras from 2010 to 2014 were analyzed using multilocus genotyping of six loci and compared with a previously published global dataset of populations of P. fijiensis. The Costa Rican and Honduran populations shared haplotype diversity with haplotypes from Southeast Asia, Oceania, and the Americas but not Africa for all but one of the six loci studied. Gene flow and shared haplotype diversity were found in Honduran and Costa Rican populations of the pathogen. The data indicate that the haplotypic diversity observed in Costa Rican populations of P. fijiensis is derived from dispersal from initial outbreak sources in Honduras and admixtures between genetically differentiated sources from Southeast Asia, Oceania, and the Americas.

  14. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Full Text Available Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.

  15. Characterization of genomic sequence showing strong association with polyembryony among diverse Citrus species and cultivars, and its synteny with Vitis and Populus.

    Science.gov (United States)

    Nakano, Michiharu; Shimada, Takehiko; Endo, Tomoko; Fujii, Hiroshi; Nesumi, Hirohisa; Kita, Masayuki; Ebina, Masumi; Shimizu, Tokurou; Omura, Mitsuo

    2012-02-01

    Polyembryony, in which multiple somatic nucellar cell-derived embryos develop in addition to the zygotic embryo in a seed, is common in the genus Citrus. Previous genetic studies indicated polyembryony is mainly determined by a single locus, but the underlying molecular mechanism is still unclear. As a step towards identification and characterization of the gene or genes responsible for nucellar embryogenesis in Citrus, haplotype-specific physical maps around the polyembryony locus were constructed. By sequencing three BAC clones aligned on the polyembryony haplotype, a single contiguous draft sequence consisting of 380 kb containing 70 predicted open reading frames (ORFs) was reconstructed. Single nucleotide polymorphism genotypes detected in the sequenced genomic region showed strong association with embryo type in Citrus, indicating a common polyembryony locus is shared among widely diverse Citrus cultivars and species. The arrangement of the predicted ORFs in the characterized genomic region showed high collinearity to the genomic sequence of chromosome 4 of Vitis vinifera and linkage group VI of Populus trichocarpa, suggesting that the syntenic relationship among these species is conserved even though V. vinifera and P. trichocarpa are non-apomictic species. This is the first study to characterize in detail the genomic structure of an apomixis locus determining adventitious embryony. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  16. The impact of sample size and marker selection on the study of haplotype structures

    Directory of Open Access Journals (Sweden)

    Sun Xiao

    2004-03-01

    Full Text Available Abstract Several studies of haplotype structures in the human genome in various populations have found that the human chromosomes are structured such that each chromosome can be divided into many blocks, within which there is limited haplotype diversity. In addition, only a few genetic markers in a putative block are needed to capture most of the diversity within a block. There has been no systematic empirical study of the effects of sample size and marker set on the identified block structures and representative marker sets, however. The purpose of this study was to conduct a detailed empirical study to examine such impacts. Towards this goal, we have analysed three representative autosomal regions from a large genome-wide study of haplotypes with samples consisting of African-Americans and samples consisting of Japanese and Chinese individuals. For both populations, we have found that the sample size and marker set have significant impact on the number of blocks and the total number of representative markers identified. The marker set in particular has very strong impacts, and our results indicate that the marker density in the original datasets may not be adequate to allow a meaningful characterisation of haplotype structures. In general, we conclude that we need a relatively large sample size and a very dense marker panel in the study of haplotype structures in human populations.

  17. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Science.gov (United States)

    Curry, Caitlin J; White, Paula A; Derr, James N

    2015-01-01

    Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.
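
    A figure such as "overall gene diversity 0.7319" is commonly obtained with Nei's unbiased gene (haplotype) diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not state which formula or software was used, so the sketch below is an assumption, and the haplotype labels in the example are toy data rather than the study's sample.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity for a list of haplotype labels,
    one label per sampled individual."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Toy sample: two haplotypes, each carried by two individuals.
print(gene_diversity(["H1", "H1", "H2", "H2"]))
```

    The statistic is 0 when every individual carries the same haplotype and approaches 1 when nearly every individual carries a different one.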

  18. Mitochondrial Haplotype Diversity in Zambian Lions: Bridging a Gap in the Biogeography of an Iconic Species.

    Directory of Open Access Journals (Sweden)

    Caitlin J Curry

    Full Text Available Analysis of DNA sequence diversity at the 12S to 16S mitochondrial genes of 165 African lions (Panthera leo) from five main areas in Zambia has uncovered haplotypes which link Southern Africa with East Africa. Phylogenetic analysis suggests Zambia may serve as a bridge connecting the lion populations in southern Africa to eastern Africa, supporting earlier hypotheses that eastern-southern Africa may represent the evolutionary cradle for the species. Overall gene diversity throughout the Zambian lion population was 0.7319 +/- 0.0174 with eight haplotypes found; three haplotypes previously described and the remaining five novel. The addition of these five novel haplotypes, so far only found within Zambia, nearly doubles the number of haplotypes previously reported for any given geographic location of wild lions. However, based on an AMOVA analysis of these haplotypes, there is little to no matrilineal gene flow (Fst = 0.47) when the eastern and western regions of Zambia are considered as two regional sub-populations. Crossover haplotypes (H9, H11, and Z1) appear in both populations as rare in one but common in the other. This pattern is a possible result of the lion mating system in which predominately males disperse, as all individuals with crossover haplotypes were male. The determination and characterization of lion sub-populations, such as done in this study for Zambia, represent a higher-resolution of knowledge regarding both the genetic health and connectivity of lion populations, which can serve to inform conservation and management of this iconic species.

  19. Haplotype reconstruction error as a classical misclassification problem: introducing sensitivity and specificity as error measures.

    Directory of Open Access Journals (Sweden)

    Claudia Lamina

    BACKGROUND: Statistically reconstructing haplotypes from single nucleotide polymorphism (SNP) genotypes can lead to falsely classified haplotypes. This can be an issue when interpreting haplotype association results or when selecting subjects with certain haplotypes for subsequent functional studies. It was our aim to quantify haplotype reconstruction error and to provide tools for it. METHODS AND RESULTS: By numerous simulation scenarios, we systematically investigated several error measures, including discrepancy, error rate, and R^2, and introduced the sensitivity and specificity to this context. We exemplified several measures in the KORA study, a large population-based study from Southern Germany. We find that the specificity is slightly reduced only for common haplotypes, while the sensitivity was decreased for some, but not all rare haplotypes. The overall error rate was generally increasing with increasing number of loci, increasing minor allele frequency of SNPs, decreasing correlation between the alleles and increasing ambiguity. CONCLUSIONS: We conclude that, with the analytical approach presented here, haplotype-specific error measures can be computed to gain insight into the haplotype uncertainty. This method indicates whether a specific risk haplotype can be expected to be reconstructed with little or with high misclassification, and thus the magnitude of expected bias in association estimates. We also illustrate that sensitivity and specificity separate two dimensions of the haplotype reconstruction error, which completely describe the misclassification matrix and thus provide the prerequisite for methods accounting for misclassification.
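
    The misclassification framing in this abstract can be illustrated with a small sketch: for one haplotype of interest, every chromosome is either truly that haplotype or not, and is either reconstructed as that haplotype or not, which yields a 2x2 table. The function name and the toy labels below are hypothetical. The sketch assumes the true haplotypes are known (as in a simulation) and is not the authors' analytical method.

```python
def sensitivity_specificity(true_haps, inferred_haps, target):
    """Per-haplotype sensitivity and specificity of a reconstruction.

    true_haps and inferred_haps are equal-length lists of haplotype labels,
    one entry per chromosome; target is the haplotype being evaluated.
    """
    if len(true_haps) != len(inferred_haps):
        raise ValueError("label lists must have the same length")
    tp = fn = fp = tn = 0
    for truth, guess in zip(true_haps, inferred_haps):
        if truth == target:
            if guess == target:
                tp += 1  # target haplotype correctly reconstructed
            else:
                fn += 1  # target haplotype missed
        else:
            if guess == target:
                fp += 1  # another haplotype wrongly called as target
            else:
                tn += 1  # non-target correctly kept as non-target
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data: ten chromosomes, two of them misclassified.
truth = ["A", "A", "A", "B", "B", "C", "C", "C", "C", "C"]
guess = ["A", "A", "B", "B", "B", "C", "C", "C", "C", "A"]
sens_a, spec_a = sensitivity_specificity(truth, guess, "A")
print(sens_a, spec_a)
```

    Reporting the two numbers separately, rather than a single error rate, shows whether a haplotype tends to be missed or to be over-called.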

  20. Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions

    Directory of Open Access Journals (Sweden)

    Balding David J

    2008-12-01

    Background: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference, is well-established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences. Results: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate. Conclusion: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.
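
    The switch error rate mentioned in this abstract can be sketched for the simplest, diploid case. The function below is a hypothetical illustration, not polyHap code: it assumes the inferred pair of haplotypes has the same genotype as the true pair at every site, and it counts how often the phase orientation flips between consecutive heterozygous sites.

```python
def switch_error_rate(true_h1, true_h2, inf_h1, inf_h2):
    """Diploid switch error rate between a true and an inferred haplotype pair.

    All four arguments are equal-length allele strings such as "0101".
    inf_h2 is used only for the length check, because at a heterozygous site
    it is determined by inf_h1 once the genotypes agree (assumed here).
    """
    length = len(true_h1)
    if not (len(true_h2) == len(inf_h1) == len(inf_h2) == length):
        raise ValueError("haplotypes must have the same length")
    # Phase is only defined at heterozygous sites.
    het_sites = [k for k in range(length) if true_h1[k] != true_h2[k]]
    if len(het_sites) < 2:
        return 0.0
    # True where the first inferred haplotype carries the first true haplotype's allele.
    orientation = [inf_h1[k] == true_h1[k] for k in het_sites]
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(het_sites) - 1)


# Toy example: the inferred phase flips once, between the 2nd and 3rd sites.
rate = switch_error_rate("00000", "11111", "00111", "11000")
print(rate)
```

    Swapping the two inferred haplotypes does not change the result, since only changes in orientation are counted. Extending the idea to triploids and tetraploids requires matching more than two haplotypes per individual and is not attempted here.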

  1. A Haplotype Information Theory Method Reveals Genes of Evolutionary Interest in European vs. Asian Pigs.

    Science.gov (United States)

    Hudson, Nicholas J; Naval-Sánchez, Marina; Porto-Neto, Laercio; Pérez-Enciso, Miguel; Reverter, Antonio

    2018-06-05

    Asian and European wild boars were independently domesticated ca. 10,000 years ago. Since the 17th century, Chinese breeds have been imported to Europe to improve the genetics of European animals by introgression of favourable alleles, resulting in a complex mosaic of haplotypes. To interrogate the structure of these haplotypes further, we have run a new haplotype segregation analysis based on information theory, namely compression efficiency (CE). We applied the approach to sequence data from individuals from each phylogeographic region (n = 23 from Asia and Europe) including a number of major pig breeds. Our genome-wide CE is able to discriminate the breeds in a manner reflecting phylogeography. Furthermore, 24,956 non-overlapping sliding windows (each comprising 1,000 consecutive SNPs) were quantified for extent of haplotype sharing within and between Asia and Europe. The genome-wide distribution of extent of haplotype sharing was quite different between groups. Unlike that of European pigs, haplotype sharing in Asian pigs approximates a normal distribution. In line with this, we found the European breeds possessed a number of genomic windows of dramatically higher haplotype sharing than the Asian breeds. Our CE analysis of sliding windows captures some of the genomic regions reported to contain signatures of selection in domestic pigs. Prominent among these regions, we highlight the role of a gene encoding the mitochondrial enzyme LACTB, which has been associated with obesity, and the gene encoding MYOG, a fundamental transcriptional regulator of myogenesis. The origin of these regions likely reflects either a population bottleneck in European animals, or selective targets on commercial phenotypes reducing allelic diversity in particular genes and/or regulatory regions.

  2. A Comprehensive, Ethnically Diverse Library of Sickle Cell Disease-Specific Induced Pluripotent Stem Cells

    Directory of Open Access Journals (Sweden)

    Seonmi Park

    2017-04-01

    Summary: Sickle cell anemia affects millions of people worldwide and is an emerging global health burden. As part of a large NIH-funded NextGen Consortium, we generated a diverse, comprehensive, and fully characterized library of sickle-cell-disease-specific induced pluripotent stem cells (iPSCs) from patients of different ethnicities, β-globin gene (HBB) haplotypes, and fetal hemoglobin (HbF) levels. iPSCs stand to revolutionize the way we study human development, model disease, and perhaps eventually, treat patients. Here, we describe this unique resource for the study of sickle cell disease, including novel haplotype-specific polymorphisms that affect disease severity, as well as for the development of patient-specific therapeutics for this phenotypically diverse disorder. As a complement to this library, and as proof of principle for future cell- and gene-based therapies, we also designed and employed CRISPR/Cas gene editing tools to correct the sickle hemoglobin (HbS) mutation. In this resource article, Mostoslavsky, Murphy, and colleagues of the NextGen consortium describe a diverse, comprehensive, and characterized library of sickle cell disease-specific induced pluripotent stem cells (iPSCs) from patients of different ethnicities, β-globin gene (HBB) haplotypes and fetal hemoglobin (HbF) levels. This bank is readily available and accessible to all investigators. Keywords: induced pluripotent stem cells, iPSCs, sickle cell disease, disease modeling, directed differentiation, gene correction

  3. The draft genome of watermelon (Citrullus lanatus) and resequencing of 20 diverse accessions.

    Science.gov (United States)

    Guo, Shaogui; Zhang, Jianguo; Sun, Honghe; Salse, Jerome; Lucas, William J; Zhang, Haiying; Zheng, Yi; Mao, Linyong; Ren, Yi; Wang, Zhiwen; Min, Jiumeng; Guo, Xiaosen; Murat, Florent; Ham, Byung-Kook; Zhang, Zhaoliang; Gao, Shan; Huang, Mingyun; Xu, Yimin; Zhong, Silin; Bombarely, Aureliano; Mueller, Lukas A; Zhao, Hong; He, Hongju; Zhang, Yan; Zhang, Zhonghua; Huang, Sanwen; Tan, Tao; Pang, Erli; Lin, Kui; Hu, Qun; Kuang, Hanhui; Ni, Peixiang; Wang, Bo; Liu, Jingan; Kou, Qinghe; Hou, Wenju; Zou, Xiaohua; Jiang, Jiao; Gong, Guoyi; Klee, Kathrin; Schoof, Heiko; Huang, Ying; Hu, Xuesong; Dong, Shanshan; Liang, Dequan; Wang, Juan; Wu, Kui; Xia, Yang; Zhao, Xiang; Zheng, Zequn; Xing, Miao; Liang, Xinming; Huang, Bangqing; Lv, Tian; Wang, Junyi; Yin, Ye; Yi, Hongping; Li, Ruiqiang; Wu, Mingzhu; Levi, Amnon; Zhang, Xingping; Giovannoni, James J; Wang, Jun; Li, Yunfu; Fei, Zhangjun; Xu, Yong

    2013-01-01

    Watermelon, Citrullus lanatus, is an important cucurbit crop grown throughout the world. Here we report a high-quality draft genome sequence of the east Asia watermelon cultivar 97103 (2n = 2× = 22) containing 23,440 predicted protein-coding genes. Comparative genomics analysis provided an evolutionary scenario for the origin of the 11 watermelon chromosomes derived from a 7-chromosome paleohexaploid eudicot ancestor. Resequencing of 20 watermelon accessions representing three different C. lanatus subspecies produced numerous haplotypes and identified the extent of genetic diversity and population structure of watermelon germplasm. Genomic regions that were preferentially selected during domestication were identified. Many disease-resistance genes were also found to be lost during domestication. In addition, integrative genomic and transcriptomic analyses yielded important insights into aspects of phloem-based vascular signaling in common between watermelon and cucumber and identified genes crucial to valuable fruit-quality traits, including sugar accumulation and citrulline metabolism.

  4. Genomic diversity of Lactobacillus salivarius

    OpenAIRE

    Raftis, Emma J.

    2015-01-01

    Lactobacillus salivarius is unusual among the lactobacilli due to its multireplicon genome architecture. The circular megaplasmids harboured by L. salivarius strains encode strain-specific traits for intestinal survival and probiotic activity. L. salivarius strains are increasingly being exploited for their probiotic properties in humans and animals. In terms of probiotic strain selection, it is important to have an understanding of the level of genomic diversity present in this species. Comp...

  5. A genomic insight into diversity among tribal and nontribal population groups of Manipur, India.

    Science.gov (United States)

    Saraswathy, K N; Kiranmala, Naorem; Murry, Benrithung; Sinha, Ekata; Saksena, Deepti; Kaur, Harpreet; Sachdeva, M P; Kalla, A K

    2009-10-01

    Twenty autosomal markers, including linked markers at two gene markers, are used to understand the genomic similarity and diversity among three tribal (Paite, Thadou, and Kom) and one nontribal communities of Manipur (Northeast India). Two of the markers (CD4 and HB9) are monomorphic in Paite and one (the CD4 marker) in Kom. Data suggest the Meitei (nontribal groups) stand apart from the three tribal groups with respect to higher heterozygosity (0.366) and presence of the highest ancestor haplotypes of DRD2 markers (0.228); this is also supported by principal co-ordinate analysis. These populations are found to be genomically closer to the Chinese population than to other Indian populations.

  6. Recovery of native genetic background in admixed populations using haplotypes, phenotypes, and pedigree information--using Cika cattle as a case breed.

    Directory of Open Access Journals (Sweden)

    Mojca Simčič

    The aim of this study was to obtain unbiased estimates of the diversity parameters, the population history, and the degree of admixture in Cika cattle which represents the local admixed breeds at risk of extinction undergoing challenging conservation programs. Genetic analyses were performed on the genome-wide Single Nucleotide Polymorphism (SNP) Illumina BovineSNP50 array data of 76 Cika animals and 531 animals from 14 reference populations. To obtain unbiased estimates we used short haplotypes spanning four markers instead of single SNPs to avoid an ascertainment bias of the BovineSNP50 array. Genome-wide haplotypes combined with partial pedigree and type trait classification show the potential to improve identification of purebred animals with a low degree of admixture. Phylogenetic analyses demonstrated unique genetic identity of Cika animals. Genetic distance matrix presented by rooted Neighbour-Net suggested long and broad phylogenetic connection between Cika and Pinzgauer. Unsupervised clustering performed by the admixture analysis and two-dimensional presentation of the genetic distances between individuals also suggest Cika is a distinct breed despite being similar in appearance to Pinzgauer. Animals identified as the most purebred could be used as a nucleus for a recovery of the native genetic background in the current admixed population. The results show that local well-adapted strains, which have never been intensively managed and differentiated into specific breeds, exhibit large haplotype diversity. They suggest a conservation and recovery approach that does not rely exclusively on the search for the original native genetic background but rather on the identification and removal of common introgressed haplotypes would be more powerful. Successful implementation of such an approach should be based on combining phenotype, pedigree, and genome-wide haplotype data of the breed of interest and a spectrum of reference breeds which

  7. Inclusion of Population-specific Reference Panel from India to the 1000 Genomes Phase 3 Panel Improves Imputation Accuracy.

    Science.gov (United States)

    Ahmad, Meraj; Sinha, Anubhav; Ghosh, Sreya; Kumar, Vikrant; Davila, Sonia; Yajnik, Chittaranjan S; Chandak, Giriraj R

    2017-07-27

    Imputation is a computational method based on the principle of haplotype sharing allowing enrichment of genome-wide association study datasets. It depends on the haplotype structure of the population and density of the genotype data. The 1000 Genomes Project led to the generation of imputation reference panels which have been used globally. However, recent studies have shown that population-specific panels provide better enrichment of genome-wide variants. We compared the imputation accuracy using the 1000 Genomes phase 3 reference panel and a panel generated from genome-wide data on 407 individuals from Western India (WIP). The concordance of imputed variants was cross-checked with next-generation re-sequencing data on a subset of genomic regions. Further, using the genome-wide data from 1880 individuals, we demonstrate that WIP works better than the 1000 Genomes phase 3 panel and, when merged with it, significantly improves the imputation accuracy throughout the minor allele frequency range. We also show that imputation using only the South Asian component of the 1000 Genomes phase 3 panel works as well as the merged panel, making it a computationally less intensive job. Thus, our study stresses that imputation accuracy using the 1000 Genomes phase 3 panel can be further improved by including population-specific reference panels from South Asia.

  8. Haplotypes in the Dystrophin DNA Segment Point to a Mosaic Origin of Modern Human Diversity

    OpenAIRE

    Ziętkiewicz, Ewa; Yotova, Vania; Gehl, Dominik; Wambach, Tina; Arrieta, Isabel; Batzer, Mark; Cole, David E.C.; Hechtman, Peter; Kaplan, Feige; Modiano, David; Moisan, Jean-Paul; Michalski, Roman; Labuda, Damian

    2003-01-01

    Although Africa has played a central role in human evolutionary history, certain studies have suggested that not all contemporary human genetic diversity is of recent African origin. We investigated 35 simple polymorphic sites and one Tn microsatellite in an 8-kb segment of the dystrophin gene. We found 86 haplotypes in 1,343 chromosomes from around the world. Although a classical out-of-Africa topology was observed in trees based on the variant frequencies, the tree of haplotype sequences re...

  9. Integrated genetic and epigenetic analysis identifies haplotype-specific methylation in the FTO type 2 diabetes and obesity susceptibility locus.

    Directory of Open Access Journals (Sweden)

    Christopher G Bell

    2010-11-01

    Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40×10^-4, permutation p = 1.0×10^-3). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13×10^-7). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases.

  10. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants. Copyright © 2015 Jun et al.

  11. Haplotype-based stratification of Huntington's disease.

    Science.gov (United States)

    Chao, Michael J; Gillis, Tammy; Atwal, Ranjit S; Mysore, Jayalakshmi Srinidhi; Arjomand, Jamshid; Harold, Denise; Holmans, Peter; Jones, Lesley; Orth, Michael; Myers, Richard H; Kwak, Seung; Wheeler, Vanessa C; MacDonald, Marcy E; Gusella, James F; Lee, Jong-Min

    2017-11-01

    Huntington's disease (HD) is an autosomal dominant neurodegenerative disease caused by expansion of a CAG trinucleotide repeat in HTT, resulting in an extended polyglutamine tract in huntingtin. We and others have previously determined that the HD-causing expansion occurs on multiple different haplotype backbones, reflecting more than one ancestral origin of the same type of mutation. In view of the therapeutic potential of mutant allele-specific gene silencing, we have compared and integrated two major systems of HTT haplotype definition, combining data from 74 sequence variants to identify the most frequent disease-associated and control chromosome backbones and revealing that there is potential for additional resolution of HD haplotypes. We have used the large collection of 4078 heterozygous HD subjects analyzed in our recent genome-wide association study of HD age at onset to estimate the frequency of these haplotypes in European subjects, finding that common genetic variation at HTT can distinguish the normal and CAG-expanded chromosomes for more than 95% of European HD individuals. As a resource for the HD research community, we have also determined the haplotypes present in a series of publicly available HD subject-derived fibroblasts, induced pluripotent cells, and embryonic stem cells in order to facilitate efforts to develop inclusive methods of allele-specific HTT silencing applicable to most HD patients. Our data providing genetic guidance for therapeutic gene-based targeting will significantly contribute to the developments of rational treatments and implementation of precision medicine in HD.

  12. The iSelect 9 K SNP analysis revealed polyploidization induced revolutionary changes and intense human selection causing strong haplotype blocks in wheat.

    Science.gov (United States)

    Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong

    2017-01-30

    A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. Total 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, which formed 878 haplotype blocks. There were more blocks in the B genome, but the average block size was significantly (P polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.

  13. Are molecular haplotypes worth the time and expense? A cost-effective method for applying molecular haplotypes.

    Directory of Open Access Journals (Sweden)

    Mark A Levenstien

    2006-08-01

    Because current molecular haplotyping methods are expensive and not amenable to automation, many researchers rely on statistical methods to infer haplotype pairs from multilocus genotypes, and subsequently treat these inferred haplotype pairs as observations. These procedures are prone to haplotype misclassification. We examine the effect of these misclassification errors on the false-positive rate and power for two association tests. These tests include the standard likelihood ratio test (LRTstd) and a likelihood ratio test that employs a double-sampling approach to allow for the misclassification inherent in the haplotype inference procedure (LRTae). We aim to determine the cost-benefit relationship of increasing the proportion of individuals with molecular haplotype measurements in addition to genotypes to raise the power gain of the LRTae over the LRTstd. This analysis should provide a guideline for determining the minimum number of molecular haplotypes required for desired power. Our simulations under the null hypothesis of equal haplotype frequencies in cases and controls indicate that (1) for each statistic, permutation methods maintain the correct type I error; (2) specific multilocus genotypes that are misclassified as the incorrect haplotype pair are consistently misclassified throughout each entire dataset; and (3) our simulations under the alternative hypothesis showed a significant power gain for the LRTae over the LRTstd for a subset of the parameter settings. Permutation methods should be used exclusively to determine significance for each statistic. For fixed cost, the power gain of the LRTae over the LRTstd varied depending on the relative costs of genotyping, molecular haplotyping, and phenotyping. The LRTae showed the greatest benefit over the LRTstd when the cost of phenotyping was very high relative to the cost of genotyping. This situation is likely to occur in a replication study as opposed to a whole-genome association study.

  14. A Comprehensive, Ethnically Diverse Library of Sickle Cell Disease-Specific Induced Pluripotent Stem Cells.

    Science.gov (United States)

    Park, Seonmi; Gianotti-Sommer, Andreia; Molina-Estevez, Francisco Javier; Vanuytsel, Kim; Skvir, Nick; Leung, Amy; Rozelle, Sarah S; Shaikho, Elmutaz Mohammed; Weir, Isabelle; Jiang, Zhihua; Luo, Hong-Yuan; Chui, David H K; Figueiredo, Maria Stella; Alsultan, Abdulraham; Al-Ali, Amein; Sebastiani, Paola; Steinberg, Martin H; Mostoslavsky, Gustavo; Murphy, George J

    2017-04-11

    Sickle cell anemia affects millions of people worldwide and is an emerging global health burden. As part of a large NIH-funded NextGen Consortium, we generated a diverse, comprehensive, and fully characterized library of sickle-cell-disease-specific induced pluripotent stem cells (iPSCs) from patients of different ethnicities, β-globin gene (HBB) haplotypes, and fetal hemoglobin (HbF) levels. iPSCs stand to revolutionize the way we study human development, model disease, and perhaps eventually, treat patients. Here, we describe this unique resource for the study of sickle cell disease, including novel haplotype-specific polymorphisms that affect disease severity, as well as for the development of patient-specific therapeutics for this phenotypically diverse disorder. As a complement to this library, and as proof of principle for future cell- and gene-based therapies, we also designed and employed CRISPR/Cas gene editing tools to correct the sickle hemoglobin (HbS) mutation. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  15. Association of specific haplotype of TNFα with Helicobacter pylori ...

    Indian Academy of Sciences (India)

    Association of specific haplotype of TNFα with Helicobacter pylori-mediated duodenal ulcer in eastern Indian population. Meenakshi Chakravorty, Dipanjana Datta De, Abhijit Choudhury, Amal Santra, Susanta Roychoudhury. Journal of Genetics, Research Note, Volume 87, Issue 3 ...

  16. Comparative genomic and functional analyses: unearthing the diversity and specificity of nematicidal factors in Pseudomonas putida strain 1A00316

    Science.gov (United States)

    Guo, Jing; Jing, Xueping; Peng, Wen-Lei; Nie, Qiyu; Zhai, Yile; Shao, Zongze; Zheng, Longyu; Cai, Minmin; Li, Guangyu; Zuo, Huaiyu; Zhang, Zhitao; Wang, Rui-Ru; Huang, Dian; Cheng, Wanli; Yu, Ziniu; Chen, Ling-Ling; Zhang, Jibin

    2016-01-01

    We isolated Pseudomonas putida (P. putida) strain 1A00316 from Antarctica. This bacterium has a high efficiency against Meloidogyne incognita (M. incognita) in vitro and under greenhouse conditions. The complete genome of P. putida 1A00316 was sequenced using PacBio single molecule real-time (SMRT) technology. A comparative genomic analysis of 16 Pseudomonas strains revealed that although P. putida 1A00316 belonged to P. putida, it was phenotypically more similar to nematicidal Pseudomonas fluorescens (P. fluorescens) strains. We characterized the diversity and specificity of nematicidal factors in P. putida 1A00316 with comparative genomics and functional analysis, and found that P. putida 1A00316 has diverse nematicidal factors including protein alkaline metalloproteinase AprA and two secondary metabolites, hydrogen cyanide and cyclo-(l-isoleucyl-l-proline). We show for the first time that cyclo-(l-isoleucyl-l-proline) exhibits nematicidal activity in P. putida. Interestingly, our study did not detect common nematicidal factors such as 2,4-diacetylphloroglucinol (2,4-DAPG) and pyrrolnitrin in P. putida 1A00316. The results of the present study reveal the diversity and specificity of nematicidal factors in P. putida strain 1A00316. PMID: 27384076

  17. Vitamin K epoxide reductase complex subunit 1 (Vkorc1) haplotype diversity in mouse priority strains

    Directory of Open Access Journals (Sweden)

    Kohn Michael H

    2008-12-01

    Background: Polymorphisms in the vitamin K-epoxide reductase complex subunit 1 gene, Vkorc1, could affect blood coagulation and other vitamin K-dependent proteins, such as osteocalcin (bone Gla protein, BGP). Here we sequenced the Vkorc1 gene in 40 mouse priority strains. We analyzed Vkorc1 haplotypes with respect to prothrombin time (PT) and bone mineral density and composition (BMD and BMC); phenotypes expected to be vitamin K-dependent and represented by data in the Mouse Phenome Database (MPD). Findings: In the commonly used laboratory strains of Mus musculus domesticus we identified only four haplotypes differing in the intron or 5' region sequence of the Vkorc1. Six haplotypes differing by coding and non-coding polymorphisms were identified in the other subspecies of Mus. We detected no significant association of Vkorc1 haplotypes with PT, BMD and BMC within each subspecies of Mus. Vkorc1 haplotype sequence divergence between subspecies was associated with PT, BMD and BMC. Conclusion: Phenotypic variation in PT, BMD and BMC within subspecies of Mus, while substantial, appears to be dominated by genetic variation in genes other than the Vkorc1. This was particularly evident for M. m. domesticus, where a single haplotype was observed in conjunction with virtually the entire range of PT, BMD and BMC values of all 5 subspecies of Mus included in this study. Differences in these phenotypes between subspecies also should not be attributed to Vkorc1 variants, but should be viewed as a result of genome wide genetic divergence.

  18. Worldwide distribution of the MYH9 kidney disease susceptibility alleles and haplotypes: evidence of historical selection in Africa.

    Directory of Open Access Journals (Sweden)

    Taras K Oleksyk

    2010-07-01

    MYH9 was recently identified as a renal susceptibility gene (OR 3-8, p or = 60% than in European Americans (< 4%), revealing a genetic basis for a major health disparity. The population distributions of MYH9 risk alleles and the E-1 risk haplotype and the demographic and selective forces acting on the MYH9 region are not well explored. We reconstructed MYH9 haplotypes from 4 tagging single nucleotide polymorphisms (SNPs) spanning introns 12-23 using available data from HapMap Phase II, and by genotyping 938 DNAs from the Human Genome Diversity Panel (HGDP). The E-1 risk haplotype followed a cline, being most frequent within sub-Saharan African populations (range 50-80%), less frequent in populations from the Middle East (9-27%) and Europe (0-9%), and rare or absent in Asia, the Americas, and Oceania. The fixation indexes (F_ST) for pairwise comparisons between the risk haplotypes for continental populations were calculated for MYH9 haplotypes; F_ST ranged from 0.27-0.40 for Africa compared to other continental populations, possibly due to selection. Uniquely in Africa, the Yoruba population showed high frequency extended haplotype length around the core risk allele (C) compared to the alternative allele (T) at the same locus (rs4821481, iHs = 2.67), as well as high population differentiation (F_ST (CEU vs. YRI) = 0.51) in HapMap Phase II data, also observable only in the Yoruba population from HGDP (F_ST = 0.49), pointing to an instance of recent selection in the genomic region. The population-specific divergence in MYH9 risk allele frequencies among the world's populations may prove important in risk assessment and public health policies to mitigate the burden of kidney disease in vulnerable populations.
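
    The pairwise fixation index reported in this abstract measures how much of the total allelic diversity lies between populations. The abstract does not state which estimator was used, so the sketch below assumes the textbook definition for one biallelic locus and two equally weighted populations, F_ST = (H_T - H_S) / H_T, with hypothetical allele frequencies.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST at one biallelic locus for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is monomorphic overall, no differentiation to measure
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical frequencies of a risk allele in two populations.
fst = fst_two_populations(0.8, 0.1)
print(round(fst, 3))
```

    Identical frequencies give 0 and fixed alternative alleles give 1. Estimators used in practice also correct for sample size, which this sketch ignores.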

  19. A spatial haplotype copying model with applications to genotype imputation.

    Science.gov (United States)

    Yang, Wen-Yun; Hormozdiari, Farhad; Eskin, Eleazar; Pasaniuc, Bogdan

    2015-05-01

    Ever since its introduction, the haplotype copy model has proven to be one of the most successful approaches for modeling genetic variation in human populations, with applications ranging from ancestry inference to genotype phasing and imputation. Motivated by coalescent theory, this approach assumes that any chromosome (haplotype) can be modeled as a mosaic of segments copied from a set of chromosomes sampled from the same population. At the core of the model is the assumption that any chromosome from the sample is equally likely to contribute a priori to the copying process. Motivated by recent works that model genetic variation in a geographic continuum, we propose a new spatial-aware haplotype copy model that jointly models geography and the haplotype copying process. We extend hidden Markov models of haplotype diversity such that at any given location, haplotypes that are closest in the genetic-geographic continuum map are a priori more likely to contribute to the copying process than distant ones. Through simulations starting from the 1000 Genomes data, we show that our model achieves superior accuracy in genotype imputation over the standard spatial-unaware haplotype copy model. In addition, we show the utility of our model in selecting a small personalized reference panel for imputation that leads to both improved accuracy as well as to a lower computational runtime than the standard approach. Finally, we show our proposed model can be used to localize individuals on the genetic-geographical map on the basis of their genotype data.
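
    The copying process described above can be sketched as a small hidden Markov model. This is an illustration only, not the authors' implementation: the switch rate `rho`, the mismatch rate `eps`, and the exponential distance weighting are assumptions introduced here. The hidden state is the reference haplotype being copied; a uniform `prior` gives the standard model and a distance-weighted `prior` gives a spatially aware variant.

```python
import numpy as np


def copying_log_likelihood(target, panel, prior, rho=0.05, eps=0.01):
    """Forward-algorithm log-likelihood of `target` as a mosaic of `panel` rows.

    panel : (K, L) array of 0/1 alleles for K reference haplotypes.
    prior : length-K weights; a switch lands on haplotype k with prob. prior[k].
    rho   : per-site probability of switching the copied haplotype (assumed).
    eps   : per-site probability of a copying mismatch (assumed).
    """
    panel = np.asarray(panel)
    target = np.asarray(target)
    prior = np.asarray(prior, dtype=float)
    prior = prior / prior.sum()

    emit = np.where(panel[:, 0] == target[0], 1.0 - eps, eps)
    alpha = prior * emit
    log_lik = 0.0
    for site in range(1, len(target)):
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
        # Stay on the same haplotype, or switch according to the prior.
        alpha = (1.0 - rho) * alpha + rho * prior
        emit = np.where(panel[:, site] == target[site], 1.0 - eps, eps)
        alpha = alpha * emit
    return log_lik + np.log(alpha.sum())


# Toy data: the target is identical to reference haplotype 0.
panel = [[0, 1, 0, 1, 1],
         [1, 1, 0, 0, 1],
         [0, 0, 1, 1, 0]]
target = [0, 1, 0, 1, 1]

# Hypothetical genetic-geographic distances from the target to each reference.
distances = np.array([0.1, 5.0, 5.0])
spatial_prior = np.exp(-distances)   # assumed weighting, not from the paper
uniform_prior = np.ones(3)

ll_uniform = copying_log_likelihood(target, panel, uniform_prior)
ll_spatial = copying_log_likelihood(target, panel, spatial_prior)
print(ll_uniform, ll_spatial)
```

    In this toy case the nearby reference haplotype is also the best match, so the distance-weighted prior yields the higher likelihood; with real data the benefit depends on how well geography tracks haplotype sharing.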

  20. A reduced number of mtSNPs saturates mitochondrial DNA haplotype diversity of worldwide population groups.

    Science.gov (United States)

    Salas, Antonio; Amigo, Jorge

    2010-05-03

    The high levels of variation characterising the mitochondrial DNA (mtDNA) molecule are due ultimately to its high average mutation rate; moreover, mtDNA variation is deeply structured in different populations and ethnic groups. There is growing interest in selecting a reduced number of mtDNA single nucleotide polymorphisms (mtSNPs) that account for the maximum level of discrimination power in a given population. Applications of the selected mtSNP panel range from anthropologic and medical studies to forensic genetic casework. This study proposes a new simulation-based method that explores the ability of different mtSNP panels to yield the maximum levels of discrimination power. The method explores subsets of mtSNPs of different sizes randomly chosen from a preselected panel of mtSNPs based on frequency. More than 2,000 complete genomes representing three main continental human population groups (Africa, Europe, and Asia) and two admixed populations ("African-Americans" and "Hispanics") were collected from GenBank and the literature, and were used as training sets. Haplotype diversity was measured for each combination of mtSNP and compared with existing mtSNP panels available in the literature. The data indicates that only a reduced number of mtSNPs ranging from six to 22 are needed to account for 95% of the maximum haplotype diversity of a given population sample. However, only a small proportion of the best mtSNPs are shared between populations, indicating that there is not a perfect set of "universal" mtSNPs suitable for all population contexts. The discrimination power provided by these mtSNPs is much higher than the power of the mtSNP panels proposed in the literature to date. Some mtSNP combinations also yield high diversity values in admixed populations. The proposed computational approach for exploring combinations of mtSNPs that optimise the discrimination power of a given set of mtSNPs is more efficient than previous empirical approaches. In contrast to

  1. Trends in genome-wide and region-specific genetic diversity in the Dutch-Flemish Holstein-Friesian breeding program from 1986 to 2015.

    Science.gov (United States)

    Doekes, Harmen P; Veerkamp, Roel F; Bijma, Piter; Hiemstra, Sipke J; Windig, Jack J

    2018-04-11

    In recent decades, Holstein-Friesian (HF) selection schemes have undergone profound changes, including the introduction of optimal contribution selection (OCS; around 2000), a major shift in breeding goal composition (around 2000) and the implementation of genomic selection (GS; around 2010). These changes are expected to have influenced genetic diversity trends. Our aim was to evaluate genome-wide and region-specific diversity in HF artificial insemination (AI) bulls in the Dutch-Flemish breeding program from 1986 to 2015. Pedigree and genotype data (~ 75.5 k) of 6280 AI-bulls were used to estimate rates of genome-wide inbreeding and kinship and corresponding effective population sizes. Region-specific inbreeding trends were evaluated using regions of homozygosity (ROH). Changes in observed allele frequencies were compared to those expected under pure drift to identify putative regions under selection. We also investigated the direction of changes in allele frequency over time. Effective population size estimates for the 1986-2015 period ranged from 69 to 102. Two major breakpoints were observed in genome-wide inbreeding and kinship trends. Around 2000, inbreeding and kinship levels temporarily dropped. From 2010 onwards, they steeply increased, with pedigree-based, ROH-based and marker-based inbreeding rates as high as 1.8, 2.1 and 2.8% per generation, respectively. Accumulation of inbreeding varied substantially across the genome. A considerable fraction of markers showed changes in allele frequency that were greater than expected under pure drift. Putative selected regions harboured many quantitative trait loci (QTL) associated to a wide range of traits. In consecutive 5-year periods, allele frequencies changed more often in the same direction than in opposite directions, except when comparing the 1996-2000 and 2001-2005 periods. Genome-wide and region-specific diversity trends reflect major changes in the Dutch-Flemish HF breeding program. Introduction of
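
    To relate the inbreeding rates quoted above to effective population size, the following sketch applies the standard relation Ne = 1 / (2 * deltaF), where deltaF is the rate of inbreeding per generation. The abstract does not state which formula the authors used, so treat this as an assumption; the three rates are the post-2010 values quoted in the abstract.

```python
def effective_population_size(delta_f: float) -> float:
    """Effective population size from the per-generation inbreeding rate."""
    if not 0.0 < delta_f < 1.0:
        raise ValueError("delta_f must be a proportion between 0 and 1")
    return 1.0 / (2.0 * delta_f)


# Inbreeding rates per generation from 2010 onwards, as quoted in the abstract.
rates = {"pedigree-based": 0.018, "ROH-based": 0.021, "marker-based": 0.028}
for label, rate in rates.items():
    print(f"{label}: Ne = {effective_population_size(rate):.1f}")
# pedigree-based: Ne = 27.8
# ROH-based: Ne = 23.8
# marker-based: Ne = 17.9
```

    Under this relation the recent rates correspond to a much smaller effective size than the 69 to 102 estimated for the whole 1986-2015 period.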

  2. How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome

    Directory of Open Access Journals (Sweden)

    José Fabián Reyes Román

    2016-12-01

    The goal of this work is to describe the advantages of the application of Conceptual Modeling (CM) in complex domains, such as genomics. Nowadays, the study and comprehension of the human genome is a major challenge due to its high level of complexity. The constant evolution in the genomic domain contributes to the generation of ever larger amounts of new data, which means that if we do not manage it correctly data quality could be compromised (i.e., problems related with heterogeneity and inconsistent data). In this paper, we propose the use of a Conceptual Schema of the Human Genome (CSHG), designed to understand and improve our ontological commitment to the domain, and also extend (enrich) this schema with the integration of a novel concept: Haplotypes. Our focus is on improving the understanding of the relationship between genotype and phenotype, since new findings show that this question is more complex than was originally thought. Here we present the first steps in our data management approach with haplotypes (variations, frequencies and populations) and discuss the database evolution to support this data. Each new version in our conceptual schema (CS) introduces changes to the underlying database structure that have essential and practical implications for better understanding and managing the relevant information. A solution based on conceptual models gives a clear definition of the domain with direct implications in the medical field (Precision Medicine), in which Genomic Information Systems (GeIS) play a very important role.

  3. Along for the ride or missing it altogether: exploring the host specificity and diversity of haemogregarines in the Canary Islands.

    Science.gov (United States)

    Tomé, Beatriz; Pereira, Ana; Jorge, Fátima; Carretero, Miguel A; Harris, D James; Perera, Ana

    2018-03-19

    Host-parasite relationships are expected to be strongly shaped by host specificity, a crucial factor in parasite adaptability and diversification. Because whole host communities have to be considered to assess host specificity, oceanic islands are ideal study systems given their simplified biotic assemblages. Previous studies on insular parasites suggest host range broadening during colonization. Here, we investigate the association between one parasite group (haemogregarines) and multiple sympatric hosts (of three lizard genera: Gallotia, Chalcides and Tarentola) in the Canary Islands. Given haemogregarine characteristics and insular conditions, we hypothesized low host specificity and/or occurrence of host-switching events. A total of 825 samples were collected from the three host taxa inhabiting the seven main islands of the Canarian Archipelago, including locations where the different lizards occurred in sympatry. Blood slides were screened to assess prevalence and parasitaemia, while parasite genetic diversity and phylogenetic relationships were inferred from 18S rRNA gene sequences. Infection levels and diversity of haplotypes varied geographically and across host groups. Infections were found in all species of Gallotia across the seven islands, in Tarentola from Tenerife, La Gomera and La Palma, and in Chalcides from Tenerife, La Gomera and El Hierro. Gallotia lizards presented the highest parasite prevalence, parasitaemia and diversity (seven haplotypes), while the other two host groups (Chalcides and Tarentola) harbored one haplotype each, with low prevalence and parasitaemia levels, and very restricted geographical ranges. Host-sharing of the same haemogregarine haplotype was only detected twice, but these rare instances likely represent occasional cross-infections. Our results suggest that: (i) Canarian haemogregarine haplotypes are highly host-specific, which might have restricted parasite host expansion; (ii) haemogregarines most probably reached the

  4. Y-chromosome STR haplotypes in Somalis

    DEFF Research Database (Denmark)

    Hallenberg, Charlotte; Simonsen, Bo; Sanchez Sanchez, Juan Jose

    2005-01-01

    A total of 201 males from Somalia were typed for the Y-chromosome STRs DYS19, DYS385a/b, DYS389-I, DYS389-II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438 and DYS439 with the PowerPlex Y kit (Promega). A total of 96 different haplotypes were observed and the haplotype diversity was 0.9715. The ...
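
    Haplotype diversity of the kind reported above is commonly computed with Nei's unbiased estimator, h = n / (n - 1) * (1 - sum(p_i^2)), where n is the sample size and p_i the frequency of haplotype i. The sketch below assumes that estimator (the record does not name one) and uses made-up haplotype labels, since the per-haplotype counts of the Somali sample are not given here.

```python
from collections import Counter


def haplotype_diversity(haplotypes) -> float:
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Hypothetical sample: each string stands for one individual's STR haplotype.
sample = ["h1", "h1", "h2", "h2"]
print(round(haplotype_diversity(sample), 4))  # 0.6667
```

    The estimator is 0 when every individual carries the same haplotype and 1 when every haplotype in the sample is unique.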

  5. Polymorphism at Expressed DQ and DR Loci in Five Common Equine MHC Haplotypes

    Science.gov (United States)

    Miller, Donald; Tallmadge, Rebecca L.; Binns, Matthew; Zhu, Baoli; Mohamoud, Yasmin Ali; Ahmed, Ayeda; Brooks, Samantha A.; Antczak, Douglas F.

    2016-01-01

    The polymorphism of Major Histocompatibility Complex (MHC) class II DQ and DR genes in five common Equine Leukocyte Antigen (ELA) haplotypes was determined through sequencing of mRNA transcripts isolated from lymphocytes of eight ELA homozygous horses. Ten expressed MHC class II genes were detected in horses of the ELA-A3 haplotype carried by the donor horses of the equine Bacterial Artificial Chromosome (BAC) library and the reference genome sequence: four DR genes and six DQ genes. The other four ELA haplotypes contained at least eight expressed polymorphic MHC class II loci. Next Generation Sequencing (NGS) of genomic DNA of these four MHC haplotypes revealed stop codons in the DQA3 gene in the ELA-A2, ELA-A5, and ELA-A9 haplotypes. Few NGS reads were obtained for the other MHC class II genes that were not amplified in these horses. The amino acid sequences across haplotypes contained locus-specific residues, and the locus clusters produced by phylogenetic analysis were well supported. The MHC class II alleles within the five tested haplotypes were largely non-overlapping between haplotypes. The complement of equine MHC class II DQ and DR genes appears to be well conserved between haplotypes, in contrast to the recently described variation in class I gene loci between equine MHC haplotypes. The identification of allelic series of equine MHC class II loci will aid comparative studies of mammalian MHC conservation and evolution and may also help to interpret associations between the equine MHC class II region and diseases of the horse. PMID:27889800

  6. A haplotype specific to North European wheat (Triticum aestivum L.)

    Czech Academy of Sciences Publication Activity Database

    Tsombalova, J.; Karafiátová, Miroslava; Vrána, Jan; Kubaláková, Marie; Peusa, H.; Jakobson, I.; Jarve, M.; Valárik, Miroslav; Doležel, Jaroslav; Jarve, K.

    2017-01-01

    Vol. 64, No. 4 (2017), pp. 653-664 ISSN 0925-9864 R&D Projects: GA MŠk(CZ) LO1204; GA ČR(CZ) GA14-07164S Institutional support: RVO:61389030 Keywords: bread wheat * genetic diversity * polyploid wheat * introgression lines * molecular analysis * tetraploid wheat * hexaploid wheat * powdery mildew * spelta l. * map * Common wheat * Triticum aestivum L * Spelt * Triticum spelta L * Chromosome 4A * Zero alleles * Haplotype * Linkage disequilibrium Subject RIV: EB - Genetics; Molecular Biology OECD field: Plant sciences, botany Impact factor: 1.294, year: 2016

  7. Comparative Study on the Genetic Diversity of GHR Gene in Tibetan Cattle and Holstein Cows.

    Science.gov (United States)

    Deng, Feilong; Xia, Chenyang; Jia, Xianbo; Song, Tianzeng; Liu, Jianzhi; Lai, Song-Jia; Chen, Shi-Yi

    2015-01-01

    Due to phenotype-based artificial selection in domestic cattle, the underlying functional genes may be indirectly selected and, in theory, show decreasing diversity. The growth hormone receptor (GHR) gene has been widely proposed to be significantly associated with critical economic traits in cattle. In the present study, we compared the genetic diversity of GHR in Tibetan cattle (a traditional unselected breed, n = 93) and Chinese Holstein cows (an intensively selected breed, n = 94). The Tibetan yak (n = 38) was also included as an outgroup breed. A total of 21 variants were detected by sequencing 1279 bp genomic fragments encompassing the largest exon, exon 9. Twelve haplotypes (H1∼H12) constructed from 15 coding SNPs formed a star-like network, in which haplotype H2 was located at the central position and was occupied almost exclusively by Tibetan yaks. Furthermore, H2 was also identical to the formerly reported sequence specific to African cattle. Only haplotype H5 was shared by all three breeds. Tibetan cattle showed higher nucleotide diversity (0.00215 ± 0.00015) and haplotype diversity (0.678 ± 0.026) than Holstein cows. In conclusion, we found that Tibetan cattle have retained relatively high genetic variation of GHR. The predominant presence of the African cattle-specific H2 in the outgroup yak breed would highlight its ancestral relationship, which may be used as an informative molecular marker in phylogenetic studies.
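
    Nucleotide diversity, as quoted above, is usually defined as the average number of pairwise differences per site across all pairs of sampled sequences. The following sketch assumes that definition and equal-length, gap-free aligned sequences; the toy sequences are invented and unrelated to the GHR data.

```python
from itertools import combinations


def nucleotide_diversity(sequences) -> float:
    """Average pairwise differences per site for aligned, equal-length sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        # Count the sites at which the two sequences differ.
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


# Toy alignment: two of the three pairs differ at exactly one of four sites.
toy = ["ACGT", "ACGA", "ACGT"]
print(round(nucleotide_diversity(toy), 4))  # 0.1667
```

    Unlike haplotype diversity, this measure weights haplotypes by how different they are, which is why a sample can show many distinct haplotypes yet low nucleotide diversity.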

  8. Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex.

    Directory of Open Access Journals (Sweden)

    Daniel Garrido-Sanz

    The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains was enough to explain most of the CDS diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as

  9. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene.

    Science.gov (United States)

    Guo, Zhiai; Song, Yanxia; Zhou, Ronghua; Ren, Zhenglong; Jia, Jizeng

    2010-02-01

    Ppd-D1 is one of the most potent genes affecting the photoperiod response of wheat (Triticum aestivum). Only two alleles, insensitive Ppd-D1a and sensitive Ppd-D1b, were known previously, and these did not adequately explain the broad adaptation of wheat to photoperiod variation. In this study, five diagnostic molecular markers were employed to identify Ppd-D1 haplotypes in 492 wheat varieties from diverse geographic locations and 55 accessions of Aegilops tauschii, the D genome donor species of wheat. Six Ppd-D1 haplotypes, designated I-VI, were identified. Types II, V and VI were considered to be more ancient and types I, III and IV were considered to be derived from type II. The transcript abundances of the Ppd-D1 haplotypes showed continuous variation, being highest for haplotype I, lowest for haplotype III, and correlating negatively with varietal differences in heading time. These haplotypes also significantly affected other agronomic traits. The distribution frequency of Ppd-D1 haplotypes showed partial correlations with both latitudes and altitudes of wheat cultivation regions. The evolution, expression and distribution of Ppd-D1 haplotypes were consistent evidentially with each other. What was regarded as a pair of alleles in the past can now be considered a series of alleles leading to continuous variation.

  10. Genome Surfing As Driver of Microbial Genomic Diversity.

    Science.gov (United States)

    Choudoir, Mallory J; Panke-Buisse, Kevin; Andam, Cheryl P; Buckley, Daniel H

    2017-08-01

    Historical changes in population size, such as those caused by demographic range expansions, can produce nonadaptive changes in genomic diversity through mechanisms such as gene surfing. We propose that demographic range expansion of a microbial population capable of horizontal gene exchange can result in genome surfing, a mechanism that can cause widespread increase in the pan-genome frequency of genes acquired by horizontal gene exchange. We explain that patterns of genetic diversity within Streptomyces are consistent with genome surfing, and we describe several predictions for testing this hypothesis both in Streptomyces and in other microorganisms. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods.

    Directory of Open Access Journals (Sweden)

    Rui Martiniano

    2017-07-01

    We analyse new genomic data (0.05-2.95x) from 14 ancient individuals from Portugal distributed from the Middle Neolithic (4200-3500 BC) to the Middle Bronze Age (1740-1430 BC) and impute genomewide diploid genotypes in these together with published ancient Eurasians. While discontinuity is evident in the transition to agriculture across the region, sensitive haplotype-based analyses suggest a significant degree of local hunter-gatherer contribution to later Iberian Neolithic populations. A more subtle genetic influx is also apparent in the Bronze Age, detectable from analyses including haplotype sharing with both ancient and modern genomes, D-statistics and Y-chromosome lineages. However, the limited nature of this introgression contrasts with the major Steppe migration turnovers within third Millennium northern Europe and echoes the survival of non-Indo-European language in Iberia. Changes in genomic estimates of individual height across Europe are also associated with these major cultural transitions, and ancestral components continue to correlate with modern differences in stature.

  12. Global biogeography of Prochlorococcus genome diversity in the surface ocean.

    Science.gov (United States)

    Kent, Alyssa G; Dupont, Chris L; Yooseph, Shibu; Martiny, Adam C

    2016-08-01

    Prochlorococcus, the smallest known photosynthetic bacterium, is abundant in the ocean's surface layer despite large variation in environmental conditions. There are several genetically divergent lineages within Prochlorococcus and superimposed on this phylogenetic diversity is extensive gene gain and loss. The environmental role in shaping the global ocean distribution of genome diversity in Prochlorococcus is largely unknown, particularly in a framework that considers the vertical and lateral mechanisms of evolution. Here we show that Prochlorococcus field populations from a global circumnavigation harbor extensive genome diversity across the surface ocean, but this diversity is not randomly distributed. We observed a significant correspondence between phylogenetic and gene content diversity, including regional differences in both phylogenetic composition and gene content that were related to environmental factors. Several gene families were strongly associated with specific regions and environmental factors, including the identification of a set of genes related to lower nutrient and temperature regions. Metagenomic assemblies of natural Prochlorococcus genomes reinforced this association by providing linkage of genes across genomic backbones. Overall, our results show that the phylogeography in Prochlorococcus taxonomy is echoed in its genome content. Thus environmental variation shapes the functional capabilities and associated ecosystem role of the globally abundant Prochlorococcus.

  13. Mitochondrial genome diversity and population structure of two western honey bee subspecies in the Republic of South Africa.

    Science.gov (United States)

    Eimanifar, Amin; Kimball, Rebecca T; Braun, Edward L; Ellis, James D

    2018-01-22

    Apis mellifera capensis Eschscholtz and A.m. scutellata Lepeletier are subspecies of western honey bees that are indigenous to the Republic of South Africa (RSA). Both subspecies have invasive potential and are organisms of concern for areas outside their native range, though they are important bees to beekeepers, agriculture, and the environment where they are native. The aim of the present study was to examine genetic differentiation among these subspecies and estimate their phylogenetic relationships using complete mitochondrial genomes sequences. We used 25 individuals that were either assigned to one of the subspecies or designated hybrids using morphometric analyses. Phylogenetic analyses of mitogenome sequences by maximum likelihood (ML) and Bayesian inference identified a monophyletic RSA clade, subdivided into two clades. A haplotype network was consistent with the phylogenetic trees. However, members of both subspecies occurred in both clades, indicating that A.m. capensis and A.m. scutellata are neither reciprocally monophyletic nor do they exhibit paraphyly with one subspecies nested within the other subspecies. Furthermore, no mitogenomic features were diagnostic to either subspecies. All bees analyzed from the RSA expressed a substantial level of haplotype diversity (most samples had unique haplotypes) but limited nucleotide diversity. The number of variable codons across protein-coding genes (PCGs) differed among loci, with CO3 exhibiting the most variation and ATP6 the least.

  14. Genetic diversity of sago palm in Indonesia based on chloroplast DNA (cpDNA) markers

    Directory of Open Access Journals (Sweden)

    MEMEN SURAHMAN

    2010-07-01

    Abbas B, Renwarin Y, Bintoro MH, Sudarsono, Surahman M, Ehara H (2010) Genetic diversity of sago palm in Indonesia based on chloroplast DNA (cpDNA) markers. Biodiversitas 11: 112-117. Sago palm (Metroxylon sagu Rottb.) is believed to be capable of accumulating high carbohydrate content in its trunk. This capacity makes it an appropriate candidate when defining alternative crops in anticipation of a food crisis. The objective of this research was to study the genetic diversity of sago palm in Indonesia based on cpDNA markers. Total genome extraction was done following the Qiagen DNA isolation protocols 2003. Single Nucleotide Fragments (SNF) analyses were performed by using ABI Prism GeneScanR 3.7. SNF analyses detected polymorphism, revealing eleven alleles and ten haplotypes from a total of 97 individual samples of sago palm. Specific haplotypes were found in the populations from Papua, Sulawesi, and Kalimantan. Therefore, the three islands are considered origins of sago palm diversity in Indonesia. The highest number of haplotypes and the highest number of specific haplotypes were found in the population from Papua, suggesting this island as the centre and the origin of sago palm diversity in Indonesia. The research did not, however, yet have sufficient data to conclude a Papuan origin of sago palm. Genetic differentiation of sago palm samples was significant within populations (P=0.04574), among populations (P=0.04772), and among populations within the island (P=0.03366), but no significant differentiation was observed among islands (P=0.63069).

  15. Alternative haplotypes of antigen processing genes in zebrafish diverged early in vertebrate evolution

    Science.gov (United States)

    McConnell, Sean C.; Hernandez, Kyle M.; Wcisel, Dustin J.; Kettleborough, Ross N.; Stemple, Derek L.; Andrade, Jorge; de Jong, Jill L. O.

    2016-01-01

    Antigen processing and presentation genes found within the MHC are among the most highly polymorphic genes of vertebrate genomes, providing populations with diverse immune responses to a wide array of pathogens. Here, we describe transcriptome, exome, and whole-genome sequencing of clonal zebrafish, uncovering the most extensive diversity within the antigen processing and presentation genes of any species yet examined. Our CG2 clonal zebrafish assembly provides genomic context within a remarkably divergent haplotype of the core MHC region on chromosome 19 for six expressed genes not found in the zebrafish reference genome: mhc1uga, proteasome-β 9b (psmb9b), psmb8f, and previously unknown genes psmb13b, tap2d, and tap2e. We identify ancient lineages for Psmb13 within a proteasome branch previously thought to be monomorphic and provide evidence of substantial lineage diversity within each of three major trifurcations of catalytic-type proteasome subunits in vertebrates: Psmb5/Psmb8/Psmb11, Psmb6/Psmb9/Psmb12, and Psmb7/Psmb10/Psmb13. Strikingly, nearby tap2 and MHC class I genes also retain ancient sequence lineages, indicating that alternative lineages may have been preserved throughout the entire MHC pathway since early diversification of the adaptive immune system ∼500 Mya. Furthermore, polymorphisms within the three MHC pathway steps (antigen cleavage, transport, and presentation) are each predicted to alter peptide specificity. Lastly, comparative analysis shows that antigen processing gene diversity is far more extensive than previously realized (with ancient coelacanth psmb8 lineages, shark psmb13, and tap2t and psmb10 outside the teleost MHC), implying distinct immune functions and conserved roles in shaping MHC pathway evolution throughout vertebrates. PMID:27493218

  16. Haplotypes in the APOA1-C3-A4-A5 gene cluster affect plasma lipids in both humans and baboons

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Qian-fei; Liu, Xin; O'Connell, Jeff; Peng, Ze; Krauss, Ronald M.; Rainwater, David L.; VandeBerg, John L.; Rubin, Edward M.; Cheng, Jan-Fang; Pennacchio, Len A.

    2003-09-15

    Genetic studies in non-human primates serve as a potential strategy for identifying genomic intervals where polymorphisms impact upon human disease-related phenotypes. It remains unclear, however, whether independently arising polymorphisms in orthologous regions of non-human primates lead to similar variation in a quantitative trait found in both species. To explore this paradigm, we studied a baboon apolipoprotein gene cluster (APOA1/C3/A4/A5) for which the human gene orthologs have well established roles in influencing plasma HDL-cholesterol and triglyceride concentrations. Our extensive polymorphism analysis of this 68 kb gene cluster in 96 pedigreed baboons identified several haplotype blocks each with limited diversity, consistent with haplotype findings in humans. To determine whether baboons, like humans, also have particular haplotypes associated with lipid phenotypes, we genotyped 634 well characterized baboons using 16 haplotype tagging SNPs. Genetic analysis of single SNPs, as well as haplotypes, revealed an association of APOA5 and APOC3 variants with HDL cholesterol and triglyceride concentrations, respectively. Thus, independent variation in orthologous genomic intervals does associate with similar quantitative lipid traits in both species, supporting the possibility of uncovering human QTL genes in a highly controlled non-human primate model.

  17. Intraclonal genome diversity of Pseudomonas aeruginosa clones CHA and TB

    Science.gov (United States)

    2013-01-01

    Background Adaptation of Pseudomonas aeruginosa to different living conditions is accompanied by microevolution resulting in genomic diversity between strains of the same clonal lineage. In order to detect the impact of colonized habitats on P. aeruginosa microevolution we determined the genomic diversity between the highly virulent cystic fibrosis (CF) isolate CHA and two temporally and geographically unrelated clonal variants. The outcome was compared with the intraclonal genome diversity between three more closely related isolates of another clonal complex. Results The three clone CHA isolates differed in their core genome in several dozen strain specific nucleotide exchanges and small deletions from each other. Loss of function mutations and non-conservative amino acid replacements affected several habitat- and lifestyle-associated traits, for example, the key regulator GacS of the switch between acute and chronic disease phenotypes was disrupted in strain CHA. Intraclonal genome diversity manifested in an individual composition of the respective accessory genome whereby the highest number of accessory DNA elements was observed for isolate PT22 from a polluted aquatic habitat. Little intraclonal diversity was observed between three spatiotemporally related outbreak isolates of clone TB. Although phenotypically different, only a few individual SNPs and deletions were detected in the clone TB isolates. Their accessory genome mainly differed in prophage-like DNA elements taken up by one of the strains. Conclusions The higher geographical and temporal distance of the clone CHA isolates was associated with an increased intraclonal genome diversity compared to the more closely related clone TB isolates derived from a common source demonstrating the impact of habitat adaptation on the microevolution of P. aeruginosa. 
    However, even short-term habitat differentiation can cause major phenotypic diversification driven by single genomic variation events and uptake of phage

  18. Y chromosome haplotype diversity of domestic sheep (Ovis aries) in northern Eurasia.

    Science.gov (United States)

    Zhang, Min; Peng, Wei-Feng; Yang, Guang-Li; Lv, Feng-Hua; Liu, Ming-Jun; Li, Wen-Rong; Liu, Yong-Gang; Li, Jin-Quan; Wang, Feng; Shen, Zhi-Qiang; Zhao, Sheng-Guo; Hehua, Eer; Marzanov, Nurbiy; Murawski, Maziek; Kantanen, Juha; Li, Meng-Hua

    2014-12-01

    Variation in two SNPs and one microsatellite on the Y chromosome was analyzed in a total of 663 rams representing 59 breeds from a large geographic range in northern Eurasia. SNPA-oY1 showed the highest allele frequency (91.55%) across the breeds, whereas SNPG-oY1 was present in only 56 samples. Combined genotypes established seven haplotypes (H4, H5, H6, H7, H8, H12 and H19). H6 dominated in northern Eurasia, and H8 showed the second-highest frequency. H4, which had been earlier reported to be absent in European breeds, was detected in one European breed (Swiniarka), whereas H7, which had been previously identified to be unique to European breeds, was present in two Chinese breeds (Ninglang Black and Large-tailed Han), one Buryatian (Transbaikal Finewool) and two Russian breeds (North Caucasus Mutton-Wool and Kuibyshev). H12, which had been detected only in Turkish breeds, was also found in Chinese breeds in this work. An overall low level of haplotype diversity (median h = 0.1288) was observed across the breeds with relatively higher median values in breeds from the regions neighboring the Near Eastern domestication center of sheep. H6 is the dominant haplotype in northwestern and eastern China, in which the haplotype distribution could be explained by the historical translocations of the H4 and H8 Y chromosomes to China via the Mongol invasions followed by expansions to northwestern and eastern China. Our findings extend previous results of sheep Y chromosomal genetic variability and indicate probably recent paternal gene flows between sheep breeds from distinct major geographic regions. © 2014 Stichting International Foundation for Animal Genetics.
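
    The haplotype diversity value h quoted in this abstract is commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator was used, so the sketch below simply assumes that one. The haplotype counts are invented for a single hypothetical breed and are not data from the study.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumed estimator: h = n / (n - 1) * (1 - sum(p_i ** 2)),
    where p_i is the sample frequency of haplotype i.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical breed sample: 9 rams carrying H6 and 1 ram carrying H8.
sample = ["H6"] * 9 + ["H8"]
print(round(haplotype_diversity(sample), 4))
```

    A breed in which one haplotype dominates gives a value near 0, while a sample in which every ram carries a different haplotype gives 1.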

  19. Genetic Diversity, Natural Selection and Haplotype Grouping of Plasmodium knowlesi Gamma Protein Region II (PkγRII): Comparison with the Duffy Binding Protein (PkDBPαRII).

    Science.gov (United States)

    Fong, Mun Yik; Rashdi, Sarah A A; Yusof, Ruhani; Lau, Yee Ling

    2016-01-01

    Plasmodium knowlesi is a simian malaria parasite that has been reported to cause malaria in humans in Southeast Asia. This parasite invades the erythrocytes of humans and of its natural host, the macaque Macaca fascicularis, via interaction between the Duffy binding protein region II (PkDBPαRII) and the Duffy antigen receptor on the host erythrocytes. In contrast, the P. knowlesi gamma protein region II (PkγRII) is not involved in the invasion of P. knowlesi into humans. PkγRII, however, mediates the invasion of P. knowlesi into the erythrocytes of M. mulatta, a non-natural host of P. knowlesi via a hitherto unknown receptor. The haplotypes of PkDBPαRII in P. knowlesi isolates from Peninsular Malaysia and North Borneo have been shown to be genetically distinct and geographically clustered. Also, the PkDBPαRII was observed to be undergoing purifying (negative) selection. The present study aimed to determine whether similar phenomena occur in PkγRII. Blood samples from 78 knowlesi malaria patients were used. Forty-eight of the samples were from Peninsular Malaysia, and 30 were from Malaysian Borneo. The genomic DNA of the samples was extracted and used as template for the PCR amplification of the PkγRII. The PCR product was cloned and sequenced. The sequences obtained were analysed for genetic diversity and natural selection using MEGA6 and DnaSP (version 5.10.00) programmes. Genetic differentiation between the PkγRII of Peninsular Malaysia and North Borneo isolates was estimated using the Wright's FST fixation index in DnaSP (version 5.10.00). Haplotype analysis was carried out using the Median-Joining approach in NETWORK (version 4.6.1.3). A total of 78 PkγRII sequences was obtained. Comparative analysis showed that the PkγRII has a similar range of haplotype diversity (Hd) and nucleotide diversity (π) to that of PkDBPαRII. Other similarities between PkγRII and PkDBPαRII include undergoing purifying (negative) selection, geographical clustering of haplotypes

  20. Genetic Diversity, Natural Selection and Haplotype Grouping of Plasmodium knowlesi Gamma Protein Region II (PkγRII): Comparison with the Duffy Binding Protein (PkDBPαRII).

    Directory of Open Access Journals (Sweden)

    Mun Yik Fong

    Plasmodium knowlesi is a simian malaria parasite that has been reported to cause malaria in humans in Southeast Asia. This parasite invades the erythrocytes of humans and of its natural host, the macaque Macaca fascicularis, via interaction between the Duffy binding protein region II (PkDBPαRII) and the Duffy antigen receptor on the host erythrocytes. In contrast, the P. knowlesi gamma protein region II (PkγRII) is not involved in the invasion of P. knowlesi into humans. PkγRII, however, mediates the invasion of P. knowlesi into the erythrocytes of M. mulatta, a non-natural host of P. knowlesi via a hitherto unknown receptor. The haplotypes of PkDBPαRII in P. knowlesi isolates from Peninsular Malaysia and North Borneo have been shown to be genetically distinct and geographically clustered. Also, the PkDBPαRII was observed to be undergoing purifying (negative) selection. The present study aimed to determine whether similar phenomena occur in PkγRII. Blood samples from 78 knowlesi malaria patients were used. Forty-eight of the samples were from Peninsular Malaysia, and 30 were from Malaysian Borneo. The genomic DNA of the samples was extracted and used as template for the PCR amplification of the PkγRII. The PCR product was cloned and sequenced. The sequences obtained were analysed for genetic diversity and natural selection using MEGA6 and DnaSP (version 5.10.00) programmes. Genetic differentiation between the PkγRII of Peninsular Malaysia and North Borneo isolates was estimated using the Wright's FST fixation index in DnaSP (version 5.10.00). Haplotype analysis was carried out using the Median-Joining approach in NETWORK (version 4.6.1.3). A total of 78 PkγRII sequences was obtained. Comparative analysis showed that the PkγRII has a similar range of haplotype diversity (Hd) and nucleotide diversity (π) to that of PkDBPαRII. Other similarities between PkγRII and PkDBPαRII include undergoing purifying (negative) selection, geographical clustering of

  1. Haplotype mapping of a diploid non-meiotic organism using existing and induced aneuploidies.

    Directory of Open Access Journals (Sweden)

    Melanie Legrand

    2008-01-01

    Haplotype maps (HapMaps) reveal underlying sequence variation and facilitate the study of recombination and genetic diversity. In general, HapMaps are produced by analysis of Single-Nucleotide Polymorphism (SNP) segregation in large numbers of meiotic progeny. Candida albicans, the most common human fungal pathogen, is an obligate diploid that does not appear to undergo meiosis. Thus, standard methods for haplotype mapping cannot be used. We exploited naturally occurring aneuploid strains to determine the haplotypes of the eight chromosome pairs in the C. albicans laboratory strain SC5314 and in a clinical isolate. Comparison of the maps revealed that the clinical strain had undergone a significant amount of genome rearrangement, consisting primarily of crossover or gene conversion recombination events. SNP map haplotyping revealed that insertion and activation of the UAU1 cassette in essential and non-essential genes can result in whole chromosome aneuploidy. UAU1 is often used to construct homozygous deletions of targeted genes in C. albicans; the exact mechanism (trisomy followed by chromosome loss versus gene conversion) has not been determined. UAU1 insertion into the essential ORC1 gene resulted in a large proportion of trisomic strains, while gene conversion events predominated when UAU1 was inserted into the non-essential LRO1 gene. Therefore, induced aneuploidies can be used to generate HapMaps, which are essential for analyzing genome alterations and mitotic recombination events in this clonal organism.

  2. PRDM9 drives evolutionary erosion of hotspots in Mus musculus through haplotype-specific initiation of meiotic recombination.

    Science.gov (United States)

    Baker, Christopher L; Kajita, Shimpei; Walker, Michael; Saxl, Ruth L; Raghupathy, Narayanan; Choi, Kwangbom; Petkov, Petko M; Paigen, Kenneth

    2015-01-01

    Meiotic recombination generates new genetic variation and assures the proper segregation of chromosomes in gametes. PRDM9, a zinc finger protein with histone methyltransferase activity, initiates meiotic recombination by binding DNA at recombination hotspots and directing the position of DNA double-strand breaks (DSB). The DSB repair mechanism suggests that hotspots should eventually self-destruct, yet genome-wide recombination levels remain constant, a conundrum known as the hotspot paradox. To test if PRDM9 drives this evolutionary erosion, we measured activity of the Prdm9Cst allele in two Mus musculus subspecies, M.m. castaneus, in which Prdm9Cst arose, and M.m. domesticus, into which Prdm9Cst was introduced experimentally. Comparing these two strains, we find that haplotype differences at hotspots lead to qualitative and quantitative changes in PRDM9 binding and activity. Using Mus spretus as an outlier, we found most variants affecting PRDM9Cst binding arose and were fixed in M.m. castaneus, suppressing hotspot activity. Furthermore, M.m. castaneus×M.m. domesticus F1 hybrids exhibit novel hotspots, with large haplotype biases in both PRDM9 binding and chromatin modification. These novel hotspots represent sites of historic evolutionary erosion that become activated in hybrids due to crosstalk between one parent's Prdm9 allele and the opposite parent's chromosome. Together these data support a model where haplotype-specific PRDM9 binding directs biased gene conversion at hotspots, ultimately leading to hotspot erosion.

  3. Genetic diversity and haplotype structure of 21 Y-STRs, including nine noncore loci, in South Tunisian Population: Forensic relevance.

    Science.gov (United States)

    Makki-Rmida, Faten; Kammoun, Arwa; Mahfoudh, Nadia; Ayadi, Adnene; Gibriel, Abdullah Ahmed; Mallek, Bakhta; Maalej, Leila; Hammami, Zouheir; Maatoug, Samir; Makni, Hafedh; Masmoudi, Saber

    2015-12-01

    Y chromosome STRs (Y-STRs) are being used frequently in forensic laboratories. Previous studies of Y-STR polymorphisms in different groups of the Tunisian population identified low levels of diversity and discrimination capacity (DC) using various commercial marker sets. This definitely limits the use of such systems for Y-STR genotyping in Tunisia. In our investigation on South Tunisia, 200 unrelated males were typed for the 12 conventional Y-STRs included in the PowerPlex® Y System. An additional set of nine noncore Y-STRs, comprising the DYS446, DYS456, DYS458, DYS388, DYS444, DYS445, DYS449, DYS710, and DYS464 markers, was genotyped and evaluated for its potential in improving DC. Allele frequency, gene diversity, haplotype diversity (HD), and DC calculation revealed that DYS464 was the most diverse marker followed by DYS710 and DYS449 markers. The standard panel of 12 Y-STRs (DC = 80.5%) and the nine markers were combined to obtain DC of 99%. Among the 198 different haplotypes observed, 196 haplotypes were unique (HD = 99.999). Out of the nine noncore set, six Y-STRs (DYS458, DYS456, DYS449, DYS710, DYS444, and DYS464) had the greatest impact on enhancing DC. Our data provided putative Y-STRs combination to be used for genetic and forensic applications. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
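
    The 99% discrimination capacity follows directly from the counts in this abstract if DC is taken as the number of distinct haplotypes divided by the number of typed individuals. That definition is the conventional one but is assumed here, since the abstract does not spell it out. A minimal check:

```python
def discrimination_capacity(n_distinct_haplotypes, n_samples):
    """Assumed definition: distinct haplotypes observed / individuals typed."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return n_distinct_haplotypes / n_samples

# Counts taken from the abstract: 198 different haplotypes in 200 males.
dc = discrimination_capacity(198, 200)
print(f"{dc:.1%}")
```

    Under the same assumed definition, the 80.5% reported for the 12-marker panel would correspond to 161 distinct haplotypes among the 200 males.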

  4. Selection on Optimal Haploid Value Increases Genetic Gain and Preserves More Genetic Diversity Relative to Genomic Selection.

    Science.gov (United States)

    Daetwyler, Hans D; Hayden, Matthew J; Spangenberg, German C; Hayes, Ben J

    2015-08-01

    Doubled haploids are routinely created and phenotypically selected in plant breeding programs to accelerate the breeding cycle. Genomic selection, which makes use of both phenotypes and genotypes, has been shown to further improve genetic gain through prediction of performance before or without phenotypic characterization of novel germplasm. Additional opportunities exist to combine genomic prediction methods with the creation of doubled haploids. Here we propose an extension to genomic selection, optimal haploid value (OHV) selection, which predicts the best doubled haploid that can be produced from a segregating plant. This method focuses selection on the haplotype and optimizes the breeding program toward its end goal of generating an elite fixed line. We rigorously tested OHV selection breeding programs, using computer simulation, and show that it results in up to 0.6 standard deviations more genetic gain than genomic selection. At the same time, OHV selection preserved a substantially greater amount of genetic diversity in the population than genomic selection, which is important to achieve long-term genetic gain in breeding populations. Copyright © 2015 by the Genetics Society of America.
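
    As an illustration of the idea of predicting the best doubled haploid from a segregating plant, the sketch below uses one possible formulation: split the genome into haplotype segments, score each of the plant's two haplotypes within a segment as the sum of marker effects, keep the better one per segment, and double the total because a doubled haploid is homozygous. The abstract does not give the formula, so this formulation, the function name, the segment size and all numbers are assumptions for illustration only.

```python
def optimal_haploid_value(hap_a, hap_b, effects, segment_size):
    """Sketch of an optimal haploid value (OHV) under assumed definitions.

    hap_a, hap_b : 0/1 allele codes for the plant's two haplotypes
    effects      : estimated marker effects, same length as the haplotypes
    segment_size : number of consecutive markers per haplotype segment
    """
    if not (len(hap_a) == len(hap_b) == len(effects)):
        raise ValueError("haplotypes and effects must have equal length")
    total = 0.0
    for start in range(0, len(effects), segment_size):
        end = start + segment_size
        value_a = sum(a * e for a, e in zip(hap_a[start:end], effects[start:end]))
        value_b = sum(b * e for b, e in zip(hap_b[start:end], effects[start:end]))
        total += max(value_a, value_b)  # keep the better haplotype per segment
    return 2.0 * total  # a doubled haploid carries two copies of each segment

# Hypothetical plant with four markers split into two segments.
hap_a = [1, 0, 0, 1]
hap_b = [0, 1, 1, 0]
effects = [0.5, -0.2, 0.3, 0.1]
print(optimal_haploid_value(hap_a, hap_b, effects, segment_size=2))
```

    In this toy case the plant's own additive value, summed over both haplotypes, is 0.7, while the best achievable doubled haploid scores higher because it combines the stronger segment from each haplotype.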

  5. Common ataxia telangiectasia mutated haplotypes and risk of breast cancer: a nested case–control study

    International Nuclear Information System (INIS)

    Tamimi, Rulla M; Hankinson, Susan E; Spiegelman, Donna; Kraft, Peter; Colditz, Graham A; Hunter, David J

    2004-01-01

    The ataxia telangiectasia mutated (ATM) gene is a tumor suppressor gene with functions in cell cycle arrest, apoptosis, and repair of DNA double-strand breaks. Based on family studies, women heterozygous for mutations in the ATM gene are reported to have a fourfold to fivefold increased risk of breast cancer compared with noncarriers of the mutations, although not all studies have confirmed this association. Haplotype analysis has been suggested as an efficient method for investigating the role of common variation in the ATM gene and breast cancer. Five biallelic haplotype tagging single nucleotide polymorphisms are estimated to capture 99% of the haplotype diversity in Caucasian populations. We conducted a nested case–control study of breast cancer within the Nurses' Health Study cohort to address the role of common ATM haplotypes and breast cancer. Cases and controls were genotyped for five haplotype tagging single nucleotide polymorphisms. Haplotypes were predicted for 1309 cases and 1761 controls for which genotype information was available. Six unique haplotypes were predicted in this study, five of which occur at a frequency of 5% or greater. The overall distribution of haplotypes was not significantly different between cases and controls (χ² = 3.43, five degrees of freedom, P = 0.63). There was no evidence that common haplotypes of ATM are associated with breast cancer risk. Extensive single nucleotide polymorphism detection using the entire genomic sequence of ATM will be necessary to rule out less common variation in ATM and sporadic breast cancer risk.
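
    The reported P value can be reproduced from the test statistic alone. Six haplotype categories compared between two groups give five degrees of freedom, and for odd degrees of freedom the chi-square upper tail has a closed form that needs only the standard library. The helper name below is illustrative; this is a sketch for checking the arithmetic, not the authors' analysis code.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper-tail probability of a chi-square variable with odd df.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * S, where
    S = sum over j = 0 .. (df-3)/2 of x**j / (1*3*5*...*(2j+1)).
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this helper only handles odd, positive df")
    if x <= 0:
        return 1.0
    series, term = 0.0, 1.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)  # builds x**j / (1*3*...*(2j+1))
        series += term
    tail = math.erfc(math.sqrt(x / 2.0))
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * series

# Statistic and degrees of freedom as quoted in the abstract.
p_value = chi2_sf_odd_df(3.43, 5)
print(round(p_value, 2))
```

    The result agrees with the P = 0.63 quoted in the abstract to two decimal places.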

  6. Genetic diversity in Breonadia salicina based on intra-species sequence variation of chloroplast DNA spacer sequence

    International Nuclear Information System (INIS)

    Qurainy, F.A.; Gaafar, A.R.Z.

    2014-01-01

    Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants is very important for effective conservation. Sequence variation in the psbA-trnH intergenic spacer of the chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered plant species endemic to the southwestern part of the Kingdom of Saudi Arabia. The sequence data obtained from 19 individuals in three populations revealed nine haplotypes. The aligned sequences from the Saudi accessions overall extended to 355 bp. A high level of haplotype diversity (Hd = 0.842) and a low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and the constructed neighbor-joining tree indicated null genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to effects of the abundance of ancestral haplotype sharing and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity in comparison with Yemeni accessions, with the Saudi accessions sharing three of the four haplotypes found in Yemeni accessions. (author)

  7. Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia.

    Science.gov (United States)

    Metspalu, Mait; Romero, Irene Gallego; Yunusbayev, Bayazit; Chaubey, Gyaneshwer; Mallick, Chandana Basu; Hudjashov, Georgi; Nelis, Mari; Mägi, Reedik; Metspalu, Ene; Remm, Maido; Pitchappan, Ramasamy; Singh, Lalji; Thangaraj, Kumarasamy; Villems, Richard; Kivisild, Toomas

    2011-12-09

    South Asia harbors one of the highest levels of genetic diversity in Eurasia, which could be interpreted as a result of its long-term large effective population size and of admixture during its complex demographic history. In contrast to Pakistani populations, populations of Indian origin have been underrepresented in previous genomic scans of positive selection and population structure. Here we report data for more than 600,000 SNP markers genotyped in 142 samples from 30 ethnic groups in India. Combining our results with other available genome-wide data, we show that Indian populations are characterized by two major ancestry components, one of which is spread at comparable frequency and haplotype diversity in populations of South and West Asia and the Caucasus. The second component is more restricted to South Asia and accounts for more than 50% of the ancestry in Indian populations. Haplotype diversity associated with these South Asian ancestry components is significantly higher than that of the components dominating the West Eurasian ancestry palette. Modeling of the observed haplotype diversities suggests that both Indian ancestry components are older than the purported Indo-Aryan invasion 3,500 YBP. Consistent with the results of pairwise genetic distances among world regions, Indians share more ancestry signals with West than with East Eurasians. However, compared to Pakistani populations, a higher proportion of their genes show regionally specific signals of high haplotype homozygosity. Among such candidates of positive selection in India are MSTN and DOK5, both of which have potential implications in lipid metabolism and the etiology of type 2 diabetes. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  8. Evolution and Diversity in Human Herpes Simplex Virus Genomes

    Science.gov (United States)

    Gatherer, Derek; Ochoa, Alejandro; Greenbaum, Benjamin; Dolan, Aidan; Bowden, Rory J.; Enquist, Lynn W.; Legendre, Matthieu; Davison, Andrew J.

    2014-01-01

    Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 diversity better, we comprehensively compared 20 newly sequenced viral genomes from China, Japan, Kenya, and South Korea with six previously sequenced genomes from the United States, Europe, and Japan. In this diverse collection of passaged strains, we found that one-fifth of the newly sequenced members share a gene deletion and one-third exhibit homopolymeric frameshift mutations (HFMs). Individual strains exhibit genotypic and potential phenotypic variation via HFMs, deletions, short sequence repeats, and single-nucleotide polymorphisms, although the protein sequence identity between strains exceeds 90% on average. In the first genome-scale analysis of positive selection in HSV-1, we found signs of selection in specific proteins and residues, including the fusion protein glycoprotein H. We also confirmed previous results suggesting that recombination has occurred with high frequency throughout the HSV-1 genome. Despite this, the HSV-1 strains analyzed clustered by geographic origin during whole-genome distance analysis. These data shed light on likely routes of HSV-1 adaptation to changing environments and will aid in the selection of vaccine antigens that are invariant worldwide. PMID:24227835

  9. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  10. Analysis of the A genome genetic diversity among Brassica napus, B. rapa and B. juncea accessions using specific simple sequence repeat markers

    International Nuclear Information System (INIS)

    Tian, H.; Yan, J.; Zhang, R.; Guo, Y.; Hu, S.; Channa, S.A.

    2017-01-01

    This investigation aimed to evaluate the genetic diversity of 127 accessions of Brassica napus, B. rapa, and B. juncea using 15 pairs of A genome-specific simple sequence repeat primers. The 127 accessions were clearly separated into three groups by cluster analysis, principal component analysis, and population structure analysis, and the three methods gave very similar results. Group I comprised mainly B. napus accessions, most of the B. juncea accessions formed Group II, and Group III included nearly all of the B. rapa accessions. The results showed that 36.86% of the variance was due to significant differences among populations of species, indicating that abundant genetic diversity exists in the A genome of B. napus, B. rapa, and B. juncea accessions. Some elite genes could therefore be used to broaden the genetic base of these species, especially B. napus, in future rapeseed breeding programs. (author)

  11. Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact.

    Science.gov (United States)

    Walters-Conte, Kathryn B; Johnson, Diana L E; Allard, Marc W; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.

  12. Genomic Diversity of Lactobacillus salivarius

    OpenAIRE

    Raftis, Emma J.; Salvetti, Elisa; Torriani, Sandra; Felis, Giovanna E.; O'Toole, Paul W.

    2010-01-01

    Strains of Lactobacillus salivarius are increasingly employed as probiotic agents for humans or animals. Despite the diversity of environmental sources from which they have been isolated, the genomic diversity of L. salivarius has been poorly characterized, and the implications of this diversity for strain selection have not been examined. To tackle this, we applied comparative genomic hybridization (CGH) and multilocus sequence typing (MLST) to 33 strains derived from humans, animals, or foo...

  13. Genetic diversity and host specificity varies across three genera of blood parasites in ducks of the Pacific Americas Flyway

    Science.gov (United States)

    Reeves, Andrew B.; Smith, Matthew M.; Meixell, Brandt W.; Fleskes, Joseph P.; Ramey, Andrew M.

    2015-01-01

    Birds of the order Anseriformes, commonly referred to as waterfowl, are frequently infected by Haemosporidia of the genera Haemoproteus, Plasmodium, and Leucocytozoon via dipteran vectors. We analyzed nucleotide sequences of the Cytochrome b (Cytb) gene from parasites of these genera detected in six species of ducks from Alaska and California, USA to characterize the genetic diversity of Haemosporidia infecting waterfowl at two ends of the Pacific Americas Flyway. In addition, parasite Cytb sequences were compared to those available on a public database to investigate specificity of genetic lineages to hosts of the order Anseriformes. Haplotype and nucleotide diversity of Haemoproteus Cytb sequences was lower than was detected for Plasmodium and Leucocytozoon parasites. Although waterfowl are presumed to be infected by only a single species of Leucocytozoon, L. simondi, diversity indices were highest for haplotypes from this genus and sequences formed five distinct clades separated by genetic distances of 4.9%–7.6%, suggesting potential cryptic speciation. All Haemoproteus and Leucocytozoon haplotypes derived from waterfowl samples formed monophyletic clades in phylogenetic analyses and were unique to the order Anseriformes with few exceptions. In contrast, waterfowl-origin Plasmodium haplotypes were identical or closely related to lineages found in other avian orders. Our results suggest a more generalist strategy for Plasmodium parasites infecting North American waterfowl as compared to those of the genera Haemoproteus and Leucocytozoon.

  14. Genomic Diversity of Lactobacillus salivarius

    Science.gov (United States)

    Raftis, Emma J.; Salvetti, Elisa; Torriani, Sandra; Felis, Giovanna E.; O'Toole, Paul W.

    2011-01-01

    Strains of Lactobacillus salivarius are increasingly employed as probiotic agents for humans or animals. Despite the diversity of environmental sources from which they have been isolated, the genomic diversity of L. salivarius has been poorly characterized, and the implications of this diversity for strain selection have not been examined. To tackle this, we applied comparative genomic hybridization (CGH) and multilocus sequence typing (MLST) to 33 strains derived from humans, animals, or food. The CGH, based on total genome content, including small plasmids, identified 18 major regions of genomic variation, or hot spots for variation. Three major divisions were thus identified, with only a subset of the human isolates constituting an ecologically discernible group. Omission of the small plasmids from the CGH or analysis by MLST provided broadly concordant fine divisions and separated human-derived and animal-derived strains more clearly. The two gene clusters for exopolysaccharide (EPS) biosynthesis corresponded to regions of significant genomic diversity. The CGH-based groupings of these regions did not correlate with levels of production of bound or released EPS. Furthermore, EPS production was significantly modulated by available carbohydrate. In addition to proving difficult to predict from the gene content, EPS production levels correlated inversely with production of biofilms, a trait considered desirable in probiotic commensals. L. salivarius displays a high level of genomic diversity, and while selection of L. salivarius strains for probiotic use can be informed by CGH or MLST, it also requires pragmatic experimental validation of desired phenotypic traits. PMID:21131523

  15. Site-specific recombination in the chicken genome using Flipase recombinase-mediated cassette exchange.

    Science.gov (United States)

    Lee, Hong Jo; Lee, Hyung Chul; Kim, Young Min; Hwang, Young Sun; Park, Young Hyun; Park, Tae Sub; Han, Jae Yong

    2016-02-01

    Targeted genome recombination has been applied in diverse research fields and has a wide range of possible applications. In particular, the discovery of specific loci in the genome that support robust and ubiquitous expression of integrated genes and the development of genome-editing technology have facilitated rapid advances in various scientific areas. In this study, we produced transgenic (TG) chickens that can induce recombinase-mediated gene cassette exchange (RMCE), one of the site-specific recombination technologies, and confirmed RMCE in TG chicken-derived cells. As a result, we established TG chicken lines that have Flipase (Flp) recognition target (FRT) pairs in the chicken genome, mediated by piggyBac transposition. The transgene integration patterns were diverse in each TG chicken line, and the integration diversity resulted in diverse levels of expression of exogenous genes in each tissue of the TG chickens. In addition, the replaced gene cassette was expressed successfully and maintained by RMCE in the FRT predominant loci of TG chicken-derived cells. These results indicate that targeted genome recombination technology with RMCE could be adaptable to TG chicken models and that the technology would be applicable to specific gene regulation by cis-element insertion and customized expression of functional proteins at predicted levels without epigenetic influence. © FASEB.

  16. Genetics of chloroquine-resistant malaria: a haplotypic view

    Directory of Open Access Journals (Sweden)

    Gauri Awasthi

    2013-12-01

    The development and rapid spread of chloroquine resistance (CQR) in Plasmodium falciparum have triggered the identification of several genetic target(s) in the P. falciparum genome. In particular, mutations in the Pfcrt gene, specifically K76T and mutations in three other amino acids in the region adjoining K76 (residues 72, 74, 75 and 76), are considered to be highly related to CQR. These various mutations form several different haplotypes, and Pfcrt gene polymorphisms and the global distribution of the different CQR-Pfcrt haplotypes in endemic and non-endemic regions of P. falciparum malaria have been the subject of extensive study. Despite the fact that the Pfcrt gene is considered to be the primary CQR gene in P. falciparum, several studies have suggested that this may not be the case. Furthermore, there is a poor correlation between the evolutionary implications of the Pfcrt haplotypes and the inferred migration of CQR P. falciparum based on CQR epidemiological surveillance data. The present paper aims to clarify the existing knowledge on the genetic basis of the different CQR-Pfcrt haplotypes that are prevalent in worldwide populations based on the published literature and to analyse the data to generate hypotheses on the genetics and evolution of CQR malaria.

  17. Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution

    Science.gov (United States)

    Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.; Peebles, Craig L.; Al-Atrache, Zein; Alcoser, Turi A.; Alexander, Lisa M.; Alfano, Matthew B.; Alford, Samantha T.; Amy, Nichols E.; Anderson, Marie D.; Anderson, Alexander G.; Ang, Andrew A. S.; Ares, Manuel; Barber, Amanda J.; Barker, Lucia P.; Barrett, Jonathan M.; Barshop, William D.; Bauerle, Cynthia M.; Bayles, Ian M.; Belfield, Katherine L.; Best, Aaron A.; Borjon, Agustin; Bowman, Charles A.; Boyer, Christine A.; Bradley, Kevin W.; Bradley, Victoria A.; Broadway, Lauren N.; Budwal, Keshav; Busby, Kayla N.; Campbell, Ian W.; Campbell, Anne M.; Carey, Alyssa; Caruso, Steven M.; Chew, Rebekah D.; Cockburn, Chelsea L.; Cohen, Lianne B.; Corajod, Jeffrey M.; Cresawn, Steven G.; Davis, Kimberly R.; Deng, Lisa; Denver, Dee R.; Dixon, Breyon R.; Ekram, Sahrish; Elgin, Sarah C. R.; Engelsen, Angela E.; English, Belle E. V.; Erb, Marcella L.; Estrada, Crystal; Filliger, Laura Z.; Findley, Ann M.; Forbes, Lauren; Forsyth, Mark H.; Fox, Tyler M.; Fritz, Melissa J.; Garcia, Roberto; George, Zindzi D.; Georges, Anne E.; Gissendanner, Christopher R.; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C.; Green, Russell D.; Guerra, Stephanie L.; Guiney-Olsen, Krysta R.; Guiza, Bridget G.; Haghighat, Leila; Hagopian, Garrett V.; Harmon, Catherine J.; Harmson, Jeremy S.; Hartzog, Grant A.; Harvey, Samuel E.; He, Siping; He, Kevin J.; Healy, Kaitlin E.; Higinbotham, Ellen R.; Hildebrandt, Erin N.; Ho, Jason H.; Hogan, Gina M.; Hohenstein, Victoria G.; Holz, Nathan A.; Huang, Vincent J.; Hufford, Ericka L.; Hynes, Peter M.; Jackson, Arrykka S.; Jansen, Erica C.; Jarvik, Jonathan; Jasinto, Paul G.; Jordan, Tuajuanda C.; Kasza, Tomas; Katelyn, Murray A.; Kelsey, Jessica S.; Kerrigan, Larisa A.; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z.; Ko, Ching-Chung; Larkin, Gail V.; Laroche, Jennifer R.; Latif, Asma; Leuba, Kohana D.; Leuba, Sequoia I.; Lewis, Lynn O.; Loesser-Casey, Kathryn E.; Long, Courtney A.; Lopez, A. Javier; Lowery, Nicholas; Lu, Tina Q.; Mac, Victor; Masters, Isaac R.; McCloud, Jazmyn J.; McDonough, Molly J.; Medenbach, Andrew J.; Menon, Anjali; Miller, Rachel; Morgan, Brandon K.; Ng, Patrick C.; Nguyen, Elvis; Nguyen, Katrina T.; Nguyen, Emilie T.; Nicholson, Kaylee M.; Parnell, Lindsay A.; Peirce, Caitlin E.; Perz, Allison M.; Peterson, Luke J.; Pferdehirt, Rachel E.; Philip, Seegren V.; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J.; Rabinowitz, Hannah S.; Resiss, Michael J.; Rhyan, Corwin N.; Robinson, Yetta M.; Rodriguez, Lauren L.; Rose, Andrew C.; Rubin, Jeffrey D.; Ruby, Jessica A.; Saha, Margaret S.; Sandoz, James W.; Savitskaya, Judith; Schipper, Dale J.; Schnitzler, Christine E.; Schott, Amanda R.; Segal, J. Bradley; Shaffer, Christopher D.; Sheldon, Kathryn E.; Shepard, Erica M.; Shepardson, Jonathan W.; Shroff, Madav K.; Simmons, Jessica M.; Simms, Erika F.; Simpson, Brandy M.; Sinclair, Kathryn M.; Sjoholm, Robert L.; Slette, Ingrid J.; Spaulding, Blaire C.; Straub, Clark L.; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M.; Taylor, Stephen B.; Taylor, Barbara J.; Temple, Louise M.; Thompson, Jasper V.; Tokarz, Michael P.; Trapani, Stephanie E.; Troum, Alexander P.; Tsay, Jonathan; Tubbs, Anthony T.; Walton, Jillian M.; Wang, Danielle H.; Wang, Hannah; Warner, John R.; Weisser, Emilie G.; Wendler, Samantha C.; Weston-Hafer, Kathleen A.; Whelan, Hilary M.; Williamson, Kurt E.; Willis, Angelica N.; Wirtshafter, Hannah S.; Wong, Theresa W.; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C.; Zaidins, David A.; Zhang, Bo; Zúniga, Melina Y.; Hendrix, Roger W.; Hatfull, Graham F.

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists. PMID:21298013

  18. Expanding the diversity of mycobacteriophages: insights into genome architecture and evolution.

    Directory of Open Access Journals (Sweden)

    Welkin H Pope

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists.

  19. Assessing the diversity, host-specificity and infection patterns of apicomplexan parasites in reptiles from Oman, Arabia.

    Science.gov (United States)

    Maia, João P; Harris, D James; Carranza, Salvador; Goméz-Díaz, Elena

    2016-11-01

    Understanding the processes that shape parasite diversification, their distribution and abundance provides valuable information on the dynamics and evolution of disease. In this study, we assessed the diversity, distribution, host-specificity and infection patterns of apicomplexan parasites in amphibians and reptiles from Oman, Arabia. Using a quantitative PCR approach we detected three apicomplexan parasites (haemogregarines, lankesterellids and sarcocystids). A total of 13 haemogregarine haplotypes were identified, which fell into four main clades in a phylogenetic framework. Phylogenetic analysis of six new lankesterellid haplotypes revealed that these parasites were distinct from, but phylogenetically related to, known Lankesterella species and might represent new taxa. The percentage of infected hosts (prevalence) and the number of haemogregarines in the blood (parasitaemia) varied significantly between gecko species. We also found significant differences in parasitaemia between haemogregarine parasite lineages (defined by phylogenetic clustering of haplotypes), suggesting differences in host-parasite compatibility between these lineages. For Pristurus rupestris, we found significant differences in haemogregarine prevalence between geographical areas. Our results suggest that host ecology and host relatedness may influence haemogregarine distributions and, more generally, highlight the importance of screening wild hosts from remote regions to provide new insights into parasite diversity.

  20. Genetic Competence Drives Genome Diversity in Bacillus subtilis

    Science.gov (United States)

    Chevreux, Bastien; Serra, Cláudia R; Schyns, Ghislain; Henriques, Adriano O

    2018-01-01

    Prokaryote genomes are the result of a dynamic flux of genes, with increases achieved via horizontal gene transfer and reductions occurring through gene loss. The ecological and selective forces that drive this genomic flexibility vary across species. Bacillus subtilis is a naturally competent bacterium that occupies various environments, including plant-associated, soil, and marine niches, and the gut of both invertebrates and vertebrates. Here, we quantify the genomic diversity of B. subtilis and infer the genome dynamics that explain the high genetic and phenotypic diversity observed. Phylogenomic and comparative genomic analyses of 42 B. subtilis genomes uncover a remarkable genome diversity that translates into a core genome of 1,659 genes and an asymptotic pangenome growth rate of 57 new genes per new genome added. This diversity is due to a large proportion of low-frequency genes that are acquired from closely related species. We find no gene-loss bias among wild isolates, which explains why the cloud genome, 43% of the species pangenome, represents only a small proportion of each genome. We show that B. subtilis can acquire xenologous copies of core genes that propagate laterally among strains within a niche. While not excluding the contributions of other mechanisms, our results strongly suggest a process of gene acquisition that is largely driven by competence, where the long-term maintenance of acquired genes depends on local and global fitness effects. This competence-driven genomic diversity provides B. subtilis with its generalist character, enabling it to occupy a wide range of ecological niches and cycle through them. PMID:29272410
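
    The core and pangenome figures quoted in this record can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the strain names, gene names and the "cloud" cutoff (families seen in at most one genome) are hypothetical choices for illustration, not the definitions used by the authors.

```python
from typing import Dict, Set, Tuple


def core_and_pan(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan): families present in every genome, and in any genome."""
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


def cloud_genes(genomes: Dict[str, Set[str]], max_genomes: int = 1) -> Set[str]:
    """Families found in at most `max_genomes` genomes (assumed cloud cutoff)."""
    counts: Dict[str, int] = {}
    for genes in genomes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, count in counts.items() if count <= max_genomes}


# Hypothetical toy input: three strains, each a set of gene-family names.
toy = {
    "strainA": {"dnaA", "gyrB", "comK", "xylA"},
    "strainB": {"dnaA", "gyrB", "comK", "amyE"},
    "strainC": {"dnaA", "gyrB", "sacB"},
}
core, pan = core_and_pan(toy)
cloud = cloud_genes(toy)
print(len(core), len(pan), len(cloud) / len(pan))
```

    In this toy case the cloud families make up half of the pangenome but only one gene per strain, which mirrors the pattern the abstract describes: a large cloud fraction at the species level can still be a small part of any single genome.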

  1. HERC1 polymorphisms: population-specific variations in haplotype composition.

    Science.gov (United States)

    Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen

    2009-08-01

    Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. © 2009 John Wiley & Sons, Ltd.

  2. An unusual haplotype structure on human chromosome 8p23 derived from the inversion polymorphism.

    Science.gov (United States)

    Deng, Libin; Zhang, Yuezheng; Kang, Jian; Liu, Tao; Zhao, Hongbin; Gao, Yang; Li, Chaohua; Pan, Hao; Tang, Xiaoli; Wang, Dunmei; Niu, Tianhua; Yang, Huanming; Zeng, Changqing

    2008-10-01

    Chromosomal inversion is an important type of genomic variation involved in both evolution and disease pathogenesis. Here, we describe the refined genetic structure of a 3.8-Mb inversion polymorphism at chromosome 8p23. Using HapMap data of 1,073 SNPs generated from 209 unrelated samples from CEPH-Utah residents with ancestry from northern and western Europe (CEU); Yoruba in Ibadan, Nigeria (YRI); and Asian (ASN) samples, which comprised Han Chinese from Beijing, China (CHB) and Japanese from Tokyo, Japan (JPT), we successfully deduced the inversion orientations of all their 418 haplotypes. In particular, distinct haplotype subgroups were identified based on principal component analysis (PCA). Such genetic substructures were consistent with clustering patterns based on neighbor-joining tree reconstruction, which revealed a total of four haplotype clades across all samples. Metaphase fluorescence in situ hybridization (FISH) in a subset of 10 HapMap samples verified their inversion orientations predicted by PCA or phylogenetic tree reconstruction. Positioning of the outgroup haplotype within one of the YRI clades suggested that the Human NCBI Build 36-inverted order is most likely the ancestral orientation. Furthermore, the population differentiation test and the relative extended haplotype homozygosity (REHH) analysis in this region discovered multiple selection signals, also in a population-specific manner. A positive selection signal was detected at XKR6 in the ASN population. These results revealed the correlation of inversion polymorphisms to population-specific genetic structures, and various selection patterns as possible mechanisms for the maintenance of a large chromosomal rearrangement at the 8p23 region during evolution. In addition, our study also showed that haplotype-based clustering methods, such as PCA, can be applied in scanning for cryptic inversion polymorphisms at a genome-wide scale.

  3. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

    Science.gov (United States)

    Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

    2014-07-01

    Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. Haplotype diversity of 16 Y-chromosomal STRs in three main ethnic populations (Malays, Chinese and Indians) in Malaysia.

    Science.gov (United States)

    Chang, Yuet Meng; Perumal, Revathi; Keat, Phoon Yoong; Kuehn, Daniel L C

    2007-03-22

    We have analyzed 16 Y-STR loci (DYS456, DYS389I, DYS390, DYS389II, DYS458, DYS19, DYS385a/b, DYS393, DYS391, DYS439, DYS635 or Y-GATA C4, DYS392, Y-GATA H4, DYS437, DYS438 and DYS448) from the non-recombining region of the human Y-chromosome in 980 male individuals from three main ethnic populations in Malaysia (Malay, Chinese, Indian) using the AmpFlSTR® Y-filer™ (Applied Biosystems, Foster City, CA). The observed 17-loci haplotypes and the individual allele frequencies for each locus were estimated, whilst the locus diversity, haplotype diversity and discrimination capacity were calculated in the three ethnic populations. Analysis of molecular variance indicated that 88.7% of the haplotypic variation is found within populations and 11.3% is between populations (fixation index F(ST)=0.113, p=0.000). This study has revealed Y-chromosomes with null alleles at several Y-loci, namely DYS458, DYS392, DYS389I, DYS389II, DYS439, DYS448 and Y-GATA H4; and several occurrences of duplications at the highly polymorphic DYS385 loci. Some of these deleted loci were in regions of the Y(q) arm that have been implicated in the occurrence of male infertility.
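
    Two of the summary statistics named in this record, discrimination capacity and locus diversity, can be sketched as follows. The formulas are the commonly used ones (distinct haplotypes divided by sample size, and an unbiased gene-diversity estimator); the record does not state which estimators the authors applied, and the three-locus haplotypes below are invented toy data.

```python
from collections import Counter
from typing import Hashable, Sequence


def discrimination_capacity(haplotypes: Sequence[Hashable]) -> float:
    """Number of distinct haplotypes divided by the number of samples."""
    return len(set(haplotypes)) / len(haplotypes)


def locus_diversity(alleles: Sequence[Hashable]) -> float:
    """Unbiased gene diversity at one locus: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two samples")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical toy haplotypes: repeat counts at three Y-STR loci per male.
toy_haplotypes = [
    (15, 13, 24),
    (15, 13, 24),
    (16, 13, 23),
    (15, 14, 24),
    (16, 12, 23),
]
dc = discrimination_capacity(toy_haplotypes)
first_locus = locus_diversity([hap[0] for hap in toy_haplotypes])
print(dc, first_locus)
```

    Haplotypes are stored as tuples so that they can be hashed and counted directly; a multi-copy locus such as DYS385a/b would need its own encoding, which this sketch does not attempt.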

  5. Specific single-cell isolation and genomic amplification of uncultured microorganisms

    DEFF Research Database (Denmark)

    Kvist, Thomas; Ahring, Birgitte Kiær; Lasken, R.S.

    2007-01-01

    In this study, we describe a new method for genomic studies of individual uncultured prokaryotic organisms, which was used for the isolation and partial genome sequencing of a soil archaeon. The diversity of Archaea in a soil sample was mapped by generating a clone library using group-specific primers in combination with a terminal restriction fragment length polymorphism profile. Intact cells were extracted from the environmental sample, and fluorescent in situ hybridization probing with Cy3-labeled probes designed from the clone library was subsequently used to detect the organisms of interest. Single cells with a bright fluorescent signal were isolated using a micromanipulator and the genome of the single isolated cells served as a template for multiple displacement amplification (MDA) using the Phi29 DNA polymerase. The generated MDA product was afterwards used for 16S rRNA gene...

  6. Evidence for high genetic diversity of NAD1 and COX1 mitochondrial haplotypes among triclabendazole resistant and susceptible populations and field isolates of Fasciola hepatica (liver fluke) in Australia.

    Science.gov (United States)

    Elliott, T; Muller, A; Brockwell, Y; Murphy, N; Grillo, V; Toet, H M; Anderson, G; Sangster, N; Spithill, T W

    2014-02-24

    In recent years, the global incidence of Fasciola hepatica (liver fluke) infections exhibiting resistance to triclabendazole (TCBZ) has increased, resulting in increased economic losses for livestock producers and threatening future control. The development of TCBZ resistance and the worldwide discovery of F. hepatica population diversity have emphasized the need to further understand the genetic structure of drug-susceptible and resistant Fasciola populations within Australia. In this study, the genetic diversity of liver flukes was estimated by sequencing mitochondrial DNA (mtDNA) encoding the NAD1 (530 bp) and COX1 (420 bp) genes of 208 liver flukes (F. hepatica) collected from three populations: field isolates obtained from abattoirs from New South Wales (NSW) and Victoria (Vic); three TCBZ-resistant fluke populations from NSW and Victoria; and the well-established TCBZ-susceptible Sunny Corner laboratory isolate. Overall nucleotide diversity across all flukes analysed was estimated at 0.00516 for the NAD1 gene and 0.00336 for the COX1 gene. Eighteen distinct haplotypes were established for the NAD1 gene and six haplotypes for the COX1 gene, resulting in haplotype diversity levels of 0.832 and 0.482, respectively. One field isolate showed a low level of haplotype diversity similar to that seen in the Sunny Corner laboratory isolate. Analysis of TCBZ-resistant infrapopulations from 3 individual cattle grazing one property revealed considerable parasite sequence diversity between cattle. Analysis of parasite TCBZ-resistant infrapopulations from sheep and cattle revealed haplotypes unique to each host, but no significant difference between parasite populations. Fst analysis of fluke populations revealed little differentiation between the resistant and field populations. This study has revealed a high level of diversity in field and drug-resistant flukes in South-Eastern Australia. Copyright © 2013 Elsevier B.V. All rights reserved.
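
    For readers unfamiliar with the two diversity measures reported in this record, the sketch below computes haplotype diversity and nucleotide diversity from a set of aligned sequences. It uses the standard textbook estimators (unbiased haplotype diversity and mean pairwise differences per site); the record does not say which software or estimator the authors used, and the four short sequences are invented.

```python
from collections import Counter
from itertools import combinations
from typing import Sequence


def haplotype_diversity(seqs: Sequence[str]) -> float:
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Mean number of pairwise differences per site (aligned, equal length)."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment: four flukes, four sites.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(h, 4), round(pi, 4))
```

    Haplotype diversity only asks whether two sequences differ, while nucleotide diversity counts how many sites differ, which is why a gene can show many haplotypes (high h) and still have a low per-site value such as those quoted above.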

  7. ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

    Directory of Open Access Journals (Sweden)

    Kamatani Naoyuki

    2011-05-01

    Background: Missing-genotype imputation and haplotype reconstruction are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan, and Han Chinese in Beijing, China, samples of the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo. Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.

  8. Genetic diversity and paternal origin of domestic donkeys.

    Science.gov (United States)

    Han, H; Chen, N; Jordana, J; Li, C; Sun, T; Xia, X; Zhao, X; Ji, C; Shen, S; Yu, J; Ainhoa, F; Chen, H; Lei, C; Dang, R

    2017-12-01

    Numerous studies have been conducted to investigate genetic diversity, origins and domestication of donkey using autosomal microsatellites and the mitochondrial genome, whereas the male-specific region of the Y chromosome of modern donkeys is largely uncharacterized. In the current study, 14 published equine Y chromosome-specific microsatellites (Y-STR) were investigated in 395 male donkey samples from China, Egypt, Spain and Peru using fluorescently labeled microsatellite markers. The results showed that seven Y-STRs (EcaYP9, EcaYM2, EcaYE2, EcaYE3, EcaYNO1, EcaYNO2 and EcaYNO4) were male specific and polymorphic, showing two to eight alleles in the donkeys studied. A total of 21 haplotypes corresponding to three haplogroups were identified, indicating three independent patrilines in domestic donkey. These markers are useful for the study of Y-chromosome diversity and population genetics of donkeys in Africa, Europe, South America and China. © 2017 Stichting International Foundation for Animal Genetics.

  9. Haplotype Diversity of COI Gene of Hylarana chalconota Species Found at State University of Malang

    Directory of Open Access Journals (Sweden)

    Dian Ratri Wulandari

    2014-01-01

    Hylarana chalconota is a cryptic species of frog endemic to Java Island [1]. This species is small, with long legs and brown skin. The snout-vent length (SVL) ranges between 30-40 mm for males and 45-65 mm for females. Reference [4] reports the existence of this species at State University of Malang, where it was not found in 1995 [5]. Sample #1 displays spots on its skin, which are absent in sample #2. To reveal the haplotype diversity of the COI gene in this species, we analyzed cytochrome c oxidase subunit 1 (COI) sequences of both samples. Using a pair of primers according to [6], the two samples yielded fragments of 604 bp and 574 bp, respectively. These fragments showed polymorphism, with mutations at sites 104, 105, and 124. Based on this result, we suggest that the two samples carry different haplotypes, proposed as UM1 and UM2.

  10. Haplotype diversity of 17 Y-chromosomal STRs in three native Sarawak populations (Iban, Bidayuh and Melanau) in East Malaysia.

    Science.gov (United States)

    Chang, Yuet Meng; Swaran, Yuvaneswari; Phoon, Yoong Keat; Sothirasan, Kavin; Sim, Hang Thiew; Lim, Kong Boon; Kuehn, Daniel

    2009-06-01

    17 Y-STRs (DYS456, DYS389I, DYS390, DYS389II, DYS458, DYS19, DYS385a/b, DYS393, DYS391, DYS439, DYS635 or Y-GATA C4, DYS392, Y-GATA H4, DYS437, DYS438 and DYS448) have been analyzed in 320 male individuals from Sarawak, an eastern state of Malaysia on the island of Borneo, using the AmpFlSTR Y-filer (Applied Biosystems, Foster City, CA). These individuals were from three indigenous ethnic groups in Sarawak, comprising 103 Ibans, 113 Bidayuhs and 104 Melanaus. The observed 17-loci haplotypes and the individual allele frequencies for each locus were estimated, whilst the locus diversity, haplotype diversity and discrimination capacity were calculated in the three groups. Analysis of molecular variance (AMOVA) indicated that 87.6% of the haplotypic variation was found within populations and 12.4% between populations (fixation index F(ST)=0.124, p=0.000). This study has revealed that the indigenous populations in Sarawak are distinctly different from each other, and from the three major ethnic groups in Malaysia (Malays, Chinese and Indians), with the Melanaus having a strikingly high degree of shared haplotypes within the group. There are rare unusual variants and microvariants that were not present in the Malaysian Malay, Chinese or Indian groups. In addition, DYS385 duplications, which were previously noticeably present only in the Chinese group, were also observed in the Iban group, whilst null alleles were detected at several Y-loci (namely DYS19, DYS392, DYS389II and DYS448) in the Iban and Melanau groups.

  11. Novel Harmful Recessive Haplotypes Identified for Fertility Traits in Nordic Holstein Cattle

    Science.gov (United States)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen; Lund, Mogens Sandø; Guldbrandtsen, Bernt

    2013-01-01

    Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygous state in live animals. This approach only requires genotype data from phenotypically normal (i.e. live) individuals and not from the affected embryos that die. A total of 7,937 Nordic Holstein animals were genotyped with the BovineSNP50 BeadChip, and haplotypes of 25 consecutive markers were constructed and tested for absence of the homozygous state. We have identified 17 homozygote-deficient haplotypes which could be loosely clustered into eight genomic regions harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rates and calving interval. Out of the eight identified genomic regions, six regions were confirmed as having an effect on fertility. The information can be used to avoid carrier-by-carrier matings in practical animal breeding. Further, identification of causative genes/polymorphisms responsible for lethal effects will lead to accurate testing of the individuals carrying a lethal allele. PMID:24376603
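
    The logic of a homozygote-deficiency screen can be illustrated with a deliberately simplified sketch. It assumes random mating, so that a haplotype at frequency p is expected to be homozygous in a fraction p² of animals; the authors' actual test is not described in this record and may differ (for example, by using pedigree information on carrier matings). The frequency of 0.03 is a hypothetical value; only the sample size of 7,937 comes from the abstract.

```python
def expected_homozygotes(hap_freq: float, n_animals: int) -> float:
    """Expected homozygous animals under random mating: N * p^2."""
    return n_animals * hap_freq ** 2


def prob_zero_homozygotes(hap_freq: float, n_animals: int) -> float:
    """Chance of seeing no homozygotes among N independent animals."""
    return (1.0 - hap_freq ** 2) ** n_animals


# Hypothetical haplotype frequency; sample size taken from the abstract.
p = 0.03
n = 7937
expected = expected_homozygotes(p, n)
p_zero = prob_zero_homozygotes(p, n)
print(f"expected homozygotes: {expected:.2f}, P(none observed): {p_zero:.2e}")
```

    Under these assumptions about seven homozygotes would be expected, so observing none is unlikely by chance alone, which is the kind of signal that flags a haplotype as a candidate carrier of a recessive lethal.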

  12. On detecting incomplete soft or hard selective sweeps using haplotype structure

    DEFF Research Database (Denmark)

    Ferrer-Admetlla, Anna; Liang, Mason; Korneliussen, Thorfinn Sand

    2014-01-01

    We present a new haplotype-based statistic (nSL) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics show some sensitivity to the assumptions of the demographic model. Additionally, we show that nSL has at least as much power as other methods under a number of different selection scenarios, most notably in the cases of sweeps from standing...

  13. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel.

    Science.gov (United States)

    Mitt, Mario; Kals, Mart; Pärn, Kalle; Gabriel, Stacey B; Lander, Eric S; Palotie, Aarno; Ripatti, Samuli; Morris, Andrew P; Metspalu, Andres; Esko, Tõnu; Mägi, Reedik; Palta, Priit

    2017-06-01

    Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF)≥5% and low-frequency variants (0.5≤MAF<5%) across diverse populations, but the imputation of rare variation (MAF<0.5%) is still rather limited. In the current study, we compare the imputation accuracy achieved with reference panels from diverse populations against that achieved with a population-specific high-coverage (30×) whole-genome sequencing (WGS) based reference panel comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies.
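
    The three frequency classes used in this record can be written down as a small helper. The cutoffs (5% and 0.5%) are taken from the abstract; the function name and the convention of folding an allele frequency onto its minor allele are illustrative assumptions.

```python
def maf_class(allele_freq: float) -> str:
    """Classify a variant by minor allele frequency using the abstract's bins."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    maf = min(allele_freq, 1.0 - allele_freq)  # fold onto the minor allele
    if maf >= 0.05:
        return "common"
    if maf >= 0.005:
        return "low-frequency"
    return "rare"


# Hypothetical allele frequencies for three variants.
for freq in (0.30, 0.98, 0.001):
    print(freq, maf_class(freq))
```

    Stratifying imputation accuracy by these bins is what allows a statement such as "higher accuracy for low-frequency and rare variants" to be checked separately from the common variants, where most panels already perform well.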

  14. Genomic association for sexual precocity in beef heifers using pre-selection of genes and haplotype reconstruction.

    Directory of Open Access Journals (Sweden)

    Luciana Takada

    Reproductive traits are of the utmost importance for any livestock farming, but are difficult to measure and to interpret since they are influenced by various factors. The objective of this study was to detect associations between known polymorphisms in candidate genes and sexual precocity in Nellore heifers, which could be used in breeding programs. Records of 1,689 precocious and non-precocious heifers from farms participating in the Conexão Delta G breeding program were analyzed. A subset of single nucleotide polymorphisms (SNPs) located in the region of the candidate genes, at a distance of up to 5 kb from the boundaries of each gene, was selected from the panel of 777,000 SNPs of the High-Density Bovine SNP BeadChip. Linear mixed models were used for statistical analysis of early heifer pregnancy, relating the trait with isolated SNPs or with haplotype groups. The model included the contemporary group (year and month of birth) as a fixed effect and the parent of the animal (sire effect) as a random effect. fastPHASE® and GenomeStudio® were used for reconstruction of the haplotypes and for analysis of linkage disequilibrium based on the r² statistic. A total of 125 candidate genes and 2,024 SNPs forming haplotypes were analyzed. Statistical analysis after Bonferroni correction showed that nine haplotypes exerted a significant effect (p<0.05) on sexual precocity. Four of these haplotypes were located in the Pregnancy-associated plasma protein-A2 gene (PAPP-A2), two in the Estrogen-related receptor gamma gene (ESRRG), and one each in the Pregnancy-associated plasma protein-A (PAPP-A), Kell blood group complex subunit-related family (XKR4) and mannose-binding lectin (MBL-1) genes. Although the present results indicate that the PAPP-A2, PAPP-A, XKR4, MBL-1 and ESRRG genes influence sexual precocity in Nellore heifers, further studies are needed to evaluate their possible use in breeding programs.
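
    The Bonferroni step mentioned in this record can be sketched as follows. The correction divides the significance level by the number of tests performed; the record does not state how many tests were counted, so the haplotype labels and p-values here are invented for illustration.

```python
from typing import Dict


def bonferroni_significant(pvalues: Dict[str, float], alpha: float = 0.05) -> Dict[str, float]:
    """Keep tests whose raw p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(pvalues)
    return {name: p for name, p in pvalues.items() if p < threshold}


# Hypothetical raw p-values for four haplotype effects.
toy_pvalues = {"hap1": 0.0001, "hap2": 0.02, "hap3": 0.004, "hap4": 0.3}
significant = bonferroni_significant(toy_pvalues)
print(sorted(significant))
```

    With four tests the per-test threshold is 0.0125, so a haplotype with a raw p-value of 0.02 would pass an uncorrected 0.05 cutoff but is dropped after correction.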

  15. Mice, humans and haplotypes--the hunt for disease genes in SLE.

    Science.gov (United States)

    Rigby, R J; Fernando, M M A; Vyse, T J

    2006-09-01

    Defining the polymorphisms that contribute to the development of complex genetic disease traits is a challenging, although increasingly tractable problem. Historically, the technical difficulties in conducting association studies across the entire human genome are such that murine models have been used to generate candidate genes for analysis in human complex diseases, such as SLE. In this article we discuss the advantages and disadvantages of this approach and specifically address some assumptions made in the transition from studying one species to another, using lupus as an example. These issues include differences in genetic structure and genetic organisation which are a reflection on the population history. Clearly there are major differences in the histories of the human population and inbred laboratory strains of mice. Both human and murine genomes do exhibit structure at the genetic level. That is to say, they comprise haplotypes which are genomic regions that carry runs of polymorphisms that are not independently inherited. Haplotypes therefore reduce the number of combinations of the polymorphisms in the DNA in that region and facilitate the identification of disease susceptibility genes in both mice and humans. There are now novel means of generating candidate genes in SLE using mutagenesis (with ENU) in mice and identifying mice that generate antinuclear autoimmunity. In addition, murine models still provide a valuable means of exploring the functional consequences of genetic variation. However, advances in technology are such that human geneticists can now screen large fractions of the human genome for disease associations using microchip technologies that provide information on upwards of 100,000 different polymorphisms. These approaches are aimed at identifying haplotypes that carry disease susceptibility mutations and rely less on the generation of candidate genes.

  16. The use of comparative genomic hybridization to characterize genome dynamics and diversity among the serotypes of Shigella

    Directory of Open Access Journals (Sweden)

    Sun Meisheng

    2006-08-01

    Full Text Available Background: Compelling evidence indicates that Shigella species, the etiologic agents of bacillary dysentery, as well as enteroinvasive Escherichia coli, are derived from multiple origins of Escherichia coli and form a single pathovar. To further understand the genome diversity and virulence evolution of Shigella, comparative genomic hybridization (CGH) microarray analysis was employed to compare the gene content of E. coli K-12 with those of 43 Shigella strains from all lineages. Results: For the 43 strains subjected to CGH microarray analyses, the common backbone of the Shigella genome was estimated to contain more than 1,900 open reading frames (ORFs), with a mean number of 726 undetectable ORFs. The mosaic distribution of absent regions indicated that insertions and/or deletions have led to the highly diversified genomes of pathogenic strains. Conclusion: These results support the hypothesis that by gain and loss of functions, Shigella species became successful human pathogens through convergent evolution from diverse genomic backgrounds. Moreover, we also found many specific differences between different lineages, providing a window into understanding bacterial speciation and taxonomic relationships.

  17. Genomic Diversity and Evolution of the Lyssaviruses

    Science.gov (United States)

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  18. Genomic diversity and evolution of the lyssaviruses.

    Directory of Open Access Journals (Sweden)

    Olivier Delmas

    2008-04-01

    Full Text Available Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as 'Lagos Bat'. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses.

  19. Cercospora zeina from Maize in South Africa Exhibits High Genetic Diversity and Lack of Regional Population Differentiation.

    Science.gov (United States)

    Muller, Mischa F; Barnes, Irene; Kunene, Ncobile T; Crampton, Bridget G; Bluhm, Burton H; Phillips, Sonia M; Olivier, Nicholas A; Berger, Dave K

    2016-10-01

    South Africa is one of the leading maize-producing countries in sub-Saharan Africa. Since the 1980s, Cercospora zeina, a causal agent of gray leaf spot of maize, has become endemic in South Africa, and is responsible for substantial yield reductions. To assess genetic diversity and population structure of C. zeina in South Africa, 369 isolates were collected from commercial maize farms in three provinces (KwaZulu-Natal, Mpumalanga, and North West). These isolates were evaluated with 14 microsatellite markers and species-specific mating type markers that were designed from draft genome sequences of C. zeina isolates from Africa (CMW 25467) and the United States (USPA-4). Sixty alleles were identified across 14 loci, and gene diversity values within each province ranged from 0.18 to 0.35. High levels of gene flow were observed (Nm = 5.51), and in a few cases, identical multilocus haplotypes were found in different provinces. Overall, 242 unique multilocus haplotypes were identified with a low clonal fraction of 34%. No distinct population clusters were identified using STRUCTURE, principal coordinate analysis, or Weir's theta θ statistic. The lack of population differentiation was supported by analysis of molecular variance tests, which indicated that only 2% of the variation was attributed to variability between populations from each province. Mating type ratios of MAT1-1 and MAT1-2 idiomorphs from 335 isolates were not significantly different from a 1:1 ratio in all provinces, which provided evidence for sexual reproduction. The draft genome of C. zeina CMW 25467 exhibited a complete genomic copy of the MAT1-1 idiomorph as well as exonic fragments of MAT genes from both idiomorphs. The high level of gene diversity, shared haplotypes at different geographical locations within South Africa, and presence of both MAT idiomorphs at all sites indicates widespread dispersal of C. zeina between maize fields in the country as well as evidence for sexual recombination. 
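
    The clonal fraction reported above is consistent with the commonly used definition 1 - G/N, where N is the number of isolates and G the number of unique multilocus haplotypes. The abstract does not state the formula, so this definition is an assumption; the short check below uses only the two counts given in the abstract.

```python
def clonal_fraction(n_isolates, n_unique_haplotypes):
    """Clonal fraction as 1 - G/N (assumed definition, not stated in the abstract)."""
    return 1 - n_unique_haplotypes / n_isolates


# 369 isolates and 242 unique multilocus haplotypes, as reported in the abstract.
cf = clonal_fraction(369, 242)
print(f"clonal fraction = {cf:.2f}")
```

    With these counts the value rounds to 0.34, matching the 34% given in the abstract.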

  20. Effects of Single Nucleotide Polymorphism Marker Density on Haplotype Block Partition

    Directory of Open Access Journals (Sweden)

    Sun Ah Kim

    2016-12-01

    Full Text Available Many researchers have found that one of the most important characteristics of the structure of linkage disequilibrium is that the human genome can be divided into non-overlapping block partitions in which only a small number of haplotypes are observed. The location and distribution of haplotype blocks can be seen as a population property influenced by population genetic events such as selection, mutation, recombination and population structure. In this study, we investigate the effects of the density of markers relative to the full set of all polymorphisms in the region on the results of haplotype partitioning for five popular haplotype block partition methods: three methods in Haploview (confidence interval, four gamete test, and solid spine), MIG++ implemented in PLINK 1.9, and S-MIG++. We used several experimental datasets obtained by sampling subsets of single nucleotide polymorphism (SNP) markers of a chromosome 22 region in the 1000 Genomes Project data and also the HapMap phase 3 data to compare the results of haplotype block partitions by the five methods. With decreasing sampling ratio down to 20% of the original SNP markers, the total number of haplotype blocks decreases and the length of haplotype blocks increases for all algorithms. When we examined the marker-independence of the haplotype block locations constructed from the datasets of different density, the results obtained using fewer than 50% of the SNP markers were very different from the results obtained using the entire set of SNP markers. We conclude that the haplotype block construction results should be used and interpreted carefully depending on the selection of markers and the purpose of the study.
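
    Of the methods named above, the four gamete test is the simplest to illustrate. The sketch below is a simplified greedy variant written for this listing, not Haploview's implementation: it assumes a non-empty list of phased 0/1 haplotypes with no missing data, and the names `four_gamete_violation` and `fgt_blocks` are hypothetical.

```python
def four_gamete_violation(haplotypes, i, j):
    """True if all four gametes (00, 01, 10, 11) are observed at SNPs i and j.

    Under an infinite-sites model, seeing all four implies a historical
    recombination between the two sites.
    """
    observed = {(h[i], h[j]) for h in haplotypes}
    return len(observed) == 4


def fgt_blocks(haplotypes):
    """Greedy left-to-right partition into blocks of mutually compatible SNPs.

    A new block starts whenever the next SNP shows all four gametes with
    any SNP already in the current block. Simplified for illustration.
    """
    n_snps = len(haplotypes[0])
    blocks, current = [], [0]
    for j in range(1, n_snps):
        if any(four_gamete_violation(haplotypes, i, j) for i in current):
            blocks.append(current)
            current = [j]
        else:
            current.append(j)
    blocks.append(current)
    return blocks


# Toy data: SNPs 0-1 are compatible, SNP 2 recombines with SNP 0, SNPs 2-3 are compatible.
toy = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
blocks = fgt_blocks(toy)
```

    Dropping columns from `toy` before calling `fgt_blocks` mimics marker subsampling: with fewer markers, fewer incompatible pairs are seen and blocks tend to merge, in line with the trend the abstract reports.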

  1. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    Science.gov (United States)

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we exami...

  2. Evaluation of haplotype diversity of Achatina fulica (Lissachatina) [Bowdich] from Indian sub-continent by means of 16S rDNA sequence and its phylogenetic relationships with other global populations.

    Science.gov (United States)

    Ayyagari, Vijaya Sai; Sreerama, Krupanidhi

    2017-08-01

    Achatina fulica (Lissachatina fulica) is one of the most invasive species found across the globe, causing significant damage to crops, vegetables, and horticultural plants. This terrestrial snail is native to east Africa and has spread to different parts of the world through introductions. India, a hot spot for biodiversity of several endemic gastropods, has witnessed an outburst of this snail population in several parts of the country, posing a serious threat to crops and also to human health. With the objective of evaluating the genetic diversity of this snail, we sampled it from different parts of India and analyzed its haplotype diversity by means of 16S rDNA sequence information. In addition, we studied the phylogenetic relationships of the isolates sequenced in the present study in relation to other global populations by Bayesian and maximum-likelihood approaches. Of the isolates sequenced, haplotype 'C' is the predominant one. A new haplotype 'S' from the state of Odisha was observed. The isolates sequenced in the present study clustered with their conspecifics from the Indian sub-continent. Haplotype network analyses were also carried out for studying the evolution of different haplotypes. It was observed that haplotype 'S' was associated with a Mauritius haplotype 'H', indicating the possibility of multiple introductions of A. fulica to India.

  3. Extended HLA-D region haplotype associated with celiac disease

    Energy Technology Data Exchange (ETDEWEB)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DP α-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion.

  4. Extended HLA-D region haplotype associated with celiac disease

    International Nuclear Information System (INIS)

    Howell, M.D.; Smith, J.R.; Austin, R.K.; Kelleher, D.; Nepom, G.T.; Volk, B.; Kagnoff, M.F.

    1988-01-01

    Celiac disease has one of the strongest associations with HLA (human leukocyte antigen) class II markers of the known HLA-linked diseases. This association is primarily with the class II serologic specificities HLA-DR3 and -DQw2. The authors previously described a restriction fragment length polymorphism (RFLP) characterized by the presence of a 4.0-kilobase Rsa I fragment derived from an HLA class II β-chain gene, which distinguishes the class II HLA haplotype of celiac disease patients from those of many serologically matched controls. They now report the isolation of this β-chain gene from a bacteriophage genomic library constructed from the DNA of a celiac disease patient. Based on restriction mapping and differential hybridization with class II cDNA and oligonucleotide probes, this gene was identified as one encoding an HLA-DP β-chain. This celiac disease-associated HLA-DP β-chain gene was flanked by HLA-DP α-chain genes and, therefore, was probably in its normal chromosomal location. The HLA-DPα-chain genes of celiac disease patients also were studied by RFLP analysis. Celiac disease is associated with a subset of HLA-DR3, -DQw2 haplotypes characterized by HLA-DP α- and β-chain gene RFLPs. Within the celiac-disease patient population, the joint segregation of these HLA-DP genes with those encoding the serologic specificities HLA-DR3 and -DQw2 indicates: (i) that the class II HLA haplotype associated with celiac disease is extended throughout the entire HLA-D region, and (ii) that celiac-disease susceptibility genes may reside as far centromeric on this haplotype as the HLA-DP subregion

  5. Optimized PCR with sequence specific primers (PCR-SSP) for fast and efficient determination of Interleukin-6 Promoter -597/-572/-174 Haplotypes

    Directory of Open Access Journals (Sweden)

    Bugert Peter

    2009-12-01

    Full Text Available Background: Interleukin-6 (IL-6) promoter polymorphisms at positions -597 (G→A), -572 (G→C) and -174 (G→C) were shown to have a clinical impact on different major diseases. At present, PCR-SSP protocols for IL-6 -597/-572/-174 haplotyping are elaborate and require large amounts of genomic DNA. Findings: We describe an improved typing technique requiring a decreased number of PCR reactions and a reduced PCR runtime due to optimized PCR conditions. Conclusion: This enables a fast and efficient determination of IL-6 -597/-572/-174 haplotypes in clinical diagnosis and further evaluation of IL-6 promoter polymorphisms in larger patient cohorts.

  6. Genomic Diversity in the Genus of Aspergillus

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo

    , sections and genus of Aspergillus. The work uncovers a large genomic diversity across all studied groups of species. The genomic diversity was especially evident on the section level, where the proteins shared by all species only represents ~55% of the proteome. This number decreases even further, to 38......, sections Nigri, Usti and Cavericolus, clade Tubingensis, and species A. niger. It lastly uses these results to predict genetic traits that take part in fungal speciation. Within a few years the Aspergillus whole-genus sequencing project will have published all currently-accepted Aspergillus genomes......Aspergillus is a highly important genus of saprotrophic filamentous fungi. It is a very diverse genus that is inextricably intertwined with human affairs on a daily basis, holding species relevant to plant and human pathology, enzyme and bulk chemistry production, food and beverage biotechnology...

  7. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  8. Visualization of Genome Diversity in German Shepherd Dogs

    OpenAIRE

    Sally-Anne Mortlock; Rachel Booth; Hamutal Mazrier; Mehar S. Khatkar; Peter Williamson

    2016-01-01

    A loss of genetic diversity may lead to increased disease risks in subpopulations of dogs. The canine breed structure has contributed to relatively small effective population size in many breeds and can limit the options for selective breeding strategies to maintain diversity. With the completion of the canine genome sequencing project, and the subsequent reduction in the cost of genotyping on a genomic scale, evaluating diversity in dogs has become much more accurate and accessible. This pro...

  9. Three Novel Haplotypes of Theileria bicornis in Black and White Rhinoceros in Kenya.

    Science.gov (United States)

    Otiende, M Y; Kivata, M W; Jowers, M J; Makumi, J N; Runo, S; Obanda, V; Gakuya, F; Mutinda, M; Kariuki, L; Alasaad, S

    2016-02-01

    Piroplasms, especially those in the genera Babesia and Theileria, have been found to naturally infect rhinoceros. Due to natural or human-induced stress factors such as capture and translocations, animals often develop clinical piroplasmosis, which causes death if not treated. This study examines the genetic diversity and occurrence of novel Theileria species infecting both black and white rhinoceros in Kenya. Samples collected opportunistically during routine translocations and clinical interventions from 15 rhinoceros were analysed by polymerase chain reaction (PCR) using a nested amplification of the small subunit ribosomal RNA (18S rRNA) gene fragments of Babesia and Theileria. Our study revealed for the first time in Kenya the presence of Theileria bicornis in white (Ceratotherium simum simum) and black (Diceros bicornis michaeli) rhinoceros and the existence of three new haplotypes: haplotypes H1 and H3 were present in white rhinoceros, while H2 was present in black rhinoceros. No specific haplotype was correlated to any specific geographical location. The Bayesian inference 50% consensus phylogram recovered the three haplotypes monophyletically, and Theileria bicornis had very high support (BPP: 0.98). Furthermore, the genetic p-uncorrected distances and substitutions between T. bicornis and the three haplotypes were the same in all three haplotypes, indicating a very close genetic affinity. This is the first report of the occurrence of Theileria species in white and black rhinoceros from Kenya. The three new haplotypes reported here for the first time have important ecological and conservation implications, especially for population management and translocation programs and as a means of avoiding the transport of infected animals into non-affected areas. © 2014 Blackwell Verlag GmbH.

  10. In Vivo Characterization of Human APOA5 Haplotypes

    Energy Technology Data Exchange (ETDEWEB)

    Ahituv, Nadav; Akiyama, Jennifer; Chapman-Helleboid, Audrey; Fruchart, Jamila; Pennacchio, Len A.

    2006-10-01

    Increased plasma triglyceride concentrations are an independent risk factor for cardiovascular disease. Numerous studies support a reproducible genetic association between two minor haplotypes in the human apolipoprotein A5 gene (APOA5) and increased plasma triglyceride concentrations. We thus sought to investigate the effect of these minor haplotypes (APOA5*2 and APOA5*3) on ApoAV plasma levels through the precise insertion of single-copy intact APOA5 haplotypes at a targeted location in the mouse genome. While we found no difference in the amount of human plasma ApoAV in mice containing the common APOA5*1 and minor APOA5*2 haplotypes, the introduction of the single APOA5*3 defining allele (19W) resulted in 3-fold lower ApoAV plasma levels, consistent with existing genetic association studies. These results indicate that the S19W polymorphism is likely to be functional and explain the strong association of this variant with plasma triglycerides, supporting the value of sensitive in vivo assays to define the functional nature of human haplotypes.

  11. Novel harmful recessive haplotypes identified for fertility traits in Nordic Holstein cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Nielsen, Ulrik Sander; Aamand, Gert Pedersen

    2013-01-01

    harboring possible recessive lethal alleles. Effects of the identified haplotypes were estimated on two fertility traits: non-return rates and calving interval. Out of the eight identified genomic regions, six regions were confirmed as having an effect on fertility. The information can be used to avoid......Using genomic data, lethal recessives may be discovered from haplotypes that are common in the population but never occur in the homozygote state in live animals. This approach only requires genotype data from phenotypically normal (i.e. live) individuals and not from the affected embryos that die...
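
    The screening idea in this record (a haplotype that is common yet never seen in homozygous form among live animals) can be sketched with a simple expectation. The version below assumes Hardy-Weinberg proportions and independent animals, which is a simplification and not necessarily the procedure used in the study; the function names and the example numbers are hypothetical.

```python
def expected_homozygotes(n_animals, hap_freq):
    """Expected number of homozygous carriers under Hardy-Weinberg proportions."""
    return n_animals * hap_freq ** 2


def prob_zero_homozygotes(n_animals, hap_freq):
    """Probability of observing no homozygotes by chance alone.

    Assumes each animal is independently homozygous with probability hap_freq^2.
    """
    return (1 - hap_freq ** 2) ** n_animals


# Hypothetical example: 10,000 genotyped animals, haplotype frequency 5%.
expected = expected_homozygotes(10_000, 0.05)
p_none = prob_zero_homozygotes(10_000, 0.05)
```

    When many homozygotes are expected but none is observed, and the chance probability of that outcome is negligible, the haplotype becomes a candidate carrier of a recessive lethal allele.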

  12. Population Genomics of Sub-Saharan Drosophila melanogaster: African diversity and non-African admixture.

    Directory of Open Access Journals (Sweden)

    John E Pool

    Full Text Available Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation.
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally

  13. Population Genomics of Sub-Saharan Drosophila melanogaster: African Diversity and Non-African Admixture

    Science.gov (United States)

    Pool, John E.; Corbett-Detig, Russell B.; Sugino, Ryuichi P.; Stevens, Kristian A.; Cardeno, Charis M.; Crepeau, Marc W.; Duchen, Pablo; Emerson, J. J.; Saelao, Perot; Begun, David J.; Langley, Charles H.

    2012-01-01

    Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan

  14. Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

    Science.gov (United States)

    Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

    2014-06-01

    To what extent the genomes of different species belonging to one genus can be diverse, and the relationship between genomic differentiation and environmental factors, remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycles in cold marine environments.

  15. Early Epstein-Barr Virus Genomic Diversity and Convergence toward the B95.8 Genome in Primary Infection.

    Science.gov (United States)

    Weiss, Eric R; Lamers, Susanna L; Henderson, Jennifer L; Melnikov, Alexandre; Somasundaran, Mohan; Garber, Manuel; Selin, Liisa; Nusbaum, Chad; Luzuriaga, Katherine

    2018-01-15

    Over 90% of the world's population is persistently infected with Epstein-Barr virus. While EBV does not cause disease in most individuals, it is the common cause of acute infectious mononucleosis (AIM) and has been associated with several cancers and autoimmune diseases, highlighting a need for a preventive vaccine. At present, very few primary, circulating EBV genomes have been sequenced directly from infected individuals. While low levels of diversity and low viral evolution rates have been predicted for double-stranded DNA (dsDNA) viruses, recent studies have demonstrated appreciable diversity in common dsDNA pathogens (e.g., cytomegalovirus). Here, we report 40 full-length EBV genome sequences obtained from matched oral wash and B cell fractions from a cohort of 10 AIM patients. Both intra- and interpatient diversity were observed across the length of the entire viral genome. Diversity was most pronounced in viral genes required for establishing latent infection and persistence, with appreciable levels of diversity also detected in structural genes, including envelope glycoproteins. Interestingly, intrapatient diversity declined significantly over time ( P < 0.01), and this was particularly evident on comparison of viral genomes sequenced from B cell fractions in early primary infection and convalescence ( P < 0.001). B cell-associated viral genomes were observed to converge, becoming nearly identical to the B95.8 reference genome over time (Spearman rank-order correlation test; r = -0.5589, P = 0.0264). The reduction in diversity was most marked in the EBV latency genes. In summary, our data suggest independent convergence of diverse viral genome sequences toward a reference-like strain within a relatively short period following primary EBV infection. IMPORTANCE Identification of viral proteins with low variability and high immunogenicity is important for the development of a protective vaccine. Knowledge of genome diversity within circulating viral

  16. Compound haplotypes at Xp11.23 and human population growth in Eurasia.

    Science.gov (United States)

    Alonso, S; Armour, J A L

    2004-09-01

    To investigate patterns of diversity and the evolutionary history of Eurasians, we have sequenced a 2.8 kb region at Xp11.23 in a sample of African and Eurasian chromosomes. This region is in a long intron of CLCN5 and is immediately flanked by a highly variable minisatellite, DXS255, and a human-specific Ta0 LINE. Compared to Africans, Eurasians showed a marked reduction in sequence diversity. The main Euro-Asiatic haplotype seems to be the ancestral haplotype for the whole sample. Coalescent simulations, including recombination and exponential growth, indicate a median length of strong linkage disequilibrium, up to approximately 9 kb for this area. The Ka/Ks ratio between the coding sequence of human CLCN5 and its mouse orthologue is much less than 1. This implies that the region sequenced is unlikely to be under the strong influence of positive selective processes on CLCN5, mutations in which have been associated with disorders such as Dent's disease. In contrast, a scenario based on a population bottleneck and exponential growth seems a more likely explanation for the reduced diversity observed in Eurasians. Coalescent analysis and linked minisatellite diversity (which reaches a gene diversity value greater than 98% in Eurasians) suggest an estimated age of origin of the Euro-Asiatic diversity compatible with a recent out-of-Africa model for colonization of Eurasia by modern Homo sapiens.

  17. Genetic diversity and natural selection of Plasmodium knowlesi merozoite surface protein 1 paralog gene in Malaysia.

    Science.gov (United States)

    Ahmed, Md Atique; Fauzi, Muh; Han, Eun-Taek

    2018-03-14

    Human infections due to the monkey malaria parasite Plasmodium knowlesi are on the rise in most Southeast Asian countries, specifically Malaysia. The C-terminal 19 kDa domain of PvMSP1P is a potential vaccine candidate; however, no study has been conducted on the orthologous gene of P. knowlesi. This study investigates the level of polymorphisms, haplotypes and natural selection of full-length pkmsp1p in clinical samples from Malaysia. A total of 36 full-length pkmsp1p sequences along with the reference H-strain and 40 C-terminal pkmsp1p sequences from clinical isolates of Malaysia were downloaded from published genomes. Genetic diversity, polymorphism, haplotype and natural selection were determined using DnaSP 5.10 and MEGA 5.0 software. Genealogical relationships were determined using haplotype network tree in NETWORK software v5.0. Population genetic differentiation index (FST) and population structure of the parasite were determined using Arlequin v3.5 and STRUCTURE v2.3.4 software. Comparison of 36 full-length pkmsp1p sequences along with the H-strain identified 339 SNPs (175 non-synonymous and 164 synonymous substitutions). The nucleotide diversity across the full-length gene was low compared to its ortholog pvmsp1p. The nucleotide diversity was higher toward the N-terminal domains (pkmsp1p-83 and 30) compared to the C-terminal domains (pkmsp1p-38, 33 and 19). Phylogenetic analysis of full-length genes identified 2 distinct clusters of P. knowlesi from Malaysian Borneo. The 40 pkmsp1p-19 sequences showed low polymorphisms with 16 polymorphisms leading to 18 haplotypes. In total there were 10 synonymous and 6 non-synonymous substitutions and 12 cysteine residues were intact within the two EGF domains. Evidence of strong purifying selection was observed within the full-length sequences as well as in all the domains. Shared haplotypes of 40 pkmsp1p-19 were identified within Malaysian Borneo haplotypes. This study is the first to report on the genetic diversity and natural

  18. Demography or selection on linked cultural traits or genes? Investigating the driver of low mtDNA diversity in the sperm whale using complementary mitochondrial and nuclear genome analyses.

    Science.gov (United States)

    Morin, Phillip A; Foote, Andrew D; Baker, C Scott; Hancock-Hanser, Brittany L; Kaschner, Kristin; Mate, Bruce R; Mesnick, Sarah L; Pease, Victoria L; Rosel, Patricia E; Alexander, Alana

    2018-04-19

    Mitochondrial DNA has been heavily utilized in phylogeography studies for several decades. However, underlying patterns of demography and phylogeography may be misrepresented due to coalescence stochasticity, selection, variation in mutation rates, and cultural hitchhiking (linkage of genetic variation to culturally transmitted traits affecting fitness). Cultural hitchhiking has been suggested as an explanation for low genetic diversity in species with strong social structures, counteracting even high mobility, abundance and limited barriers to dispersal. One such species is the sperm whale, which shows very limited phylogeographic structure and low mtDNA diversity despite a worldwide distribution and large population. Here, we use analyses of 175 globally distributed mitogenomes and three nuclear genomes to evaluate hypotheses of a population bottleneck/expansion versus a selective sweep due to cultural-hitchhiking or selection on mtDNA as the mechanism contributing to low worldwide mitochondrial diversity in sperm whales. In contrast to mtDNA control region (CR) data, mitogenome haplotypes are largely ocean-specific, with only one of 80 shared between the Atlantic and Pacific. Demographic analyses of nuclear genomes suggest low mtDNA diversity is consistent with a global reduction in population size that ended approximately 125,000 years ago, correlated with the Eemian interglacial. Phylogeographic analysis suggests that extant sperm whales descend from maternal lineages endemic to the Pacific during the period of reduced abundance, and have subsequently colonized the Atlantic several times. Results highlight the apparent impact of past climate change, and suggest selection and hitchhiking are not the sole processes responsible for low mtDNA diversity in this highly social species. This article is protected by copyright. All rights reserved.

  19. Approximation properties of haplotype tagging

    Directory of Open Access Journals (Sweden)

    Dreiseitl Stephan

    2006-01-01

    Background: Single nucleotide polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties. Results: It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n² - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ DTIME(n^(log log n)). A simple, very easily implementable algorithm that exhibits the above upper bound on solution quality is presented. This algorithm has running time O((np/2)(2m - p + 1)) ≤ O(m(n² - n)/2), where p ≤ min(n, m), for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound. Conclusion: The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computational efforts expended can only be expected if the computational effort is distributed and done in parallel.
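
    The abstract above does not spell out the algorithm, but a logarithmic bound over (n² - n)/2 haplotype pairs is what a greedy set-cover heuristic would give. The sketch below assumes that reading: each round picks the SNP site that separates the most haplotype pairs not yet told apart. It is an illustration only, not the authors' implementation; the function name `greedy_tag_snps` and the string encoding of haplotypes are hypothetical.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedy set-cover heuristic for tag SNP selection (illustrative sketch).

    haplotypes: list of equal-length strings, one character per SNP site.
    Returns site indices that together distinguish every pair of distinct
    haplotypes.
    """
    n_sites = len(haplotypes[0])
    # Every pair of distinct haplotypes must be separated by some chosen site.
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(haplotypes)), 2)
        if haplotypes[i] != haplotypes[j]
    }
    chosen = []
    while uncovered:
        best_site, best_pairs = None, set()
        for site in range(n_sites):
            # Pairs still uncovered that this site would separate.
            pairs = {
                (i, j)
                for i, j in uncovered
                if haplotypes[i][site] != haplotypes[j][site]
            }
            if len(pairs) > len(best_pairs):
                best_site, best_pairs = site, pairs
        chosen.append(best_site)
        uncovered -= best_pairs
    return chosen


# Toy data (hypothetical): four haplotypes over four biallelic sites.
toy_haplotypes = ["0000", "0011", "0101", "0110"]
tags = greedy_tag_snps(toy_haplotypes)
```

    On the toy data, site 0 is monomorphic and never selected; two of the remaining three sites suffice to tell all four haplotypes apart.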

  20. Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): range-wide evolutionary history and implications for conservation.

    Science.gov (United States)

    Potter, Kevin M; Hipkins, Valerie D; Mahalovich, Mary F; Means, Robert E

    2013-08-01

    Ponderosa pine (Pinus ponderosa Douglas ex P. Lawson & C. Lawson) exhibits complicated patterns of morphological and genetic variation across its range in western North America. This study aims to clarify P. ponderosa evolutionary history and phylogeography using a highly polymorphic mitochondrial DNA marker, with results offering insights into how geographical and climatological processes drove the modern evolutionary structure of tree species in the region. We amplified the mtDNA nad1 second intron minisatellite region for 3,100 trees representing 104 populations, and sequenced all length variants. We estimated population-level haplotypic diversity and determined diversity partitioning among varieties, races and populations. After aligning sequences of minisatellite repeat motifs, we evaluated evolutionary relationships among haplotypes. The geographical structuring of the 10 haplotypes corresponded with division between Pacific and Rocky Mountain varieties. Pacific haplotypes clustered with high bootstrap support, and appear to have descended from Rocky Mountain haplotypes. A greater proportion of diversity was partitioned between Rocky Mountain races than between Pacific races. Areas of highest haplotypic diversity were the southern Sierra Nevada mountain range in California, northwestern California, and southern Nevada. Pinus ponderosa haplotype distribution patterns suggest a complex phylogeographic history not revealed by other genetic and morphological data, or by the sparse paleoecological record. The results appear consistent with long-term divergence between the Pacific and Rocky Mountain varieties, along with more recent divergences not well-associated with race. Pleistocene refugia may have existed in areas of high haplotypic diversity, as well as the Great Basin, Southwestern United States/northern Mexico, and the High Plains.
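
    The abstract above does not say which estimator was used for population-level haplotypic diversity. A common choice is Nei's unbiased estimator, H = n/(n - 1) · (1 - Σ pᵢ²), where n is the sample size and pᵢ the frequency of haplotype i. The sketch below assumes that estimator; the function name and the haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for one population sample.

    haplotypes: list of haplotype labels, one per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# Hypothetical sample: two haplotypes at equal frequency among four trees.
h_even = haplotype_diversity(["H1", "H1", "H2", "H2"])
```

    A population fixed for a single haplotype scores 0, and a sample in which every individual carries a different haplotype scores 1.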

  1. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses.

    Science.gov (United States)

    Li, Ci-Xiu; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Kang, Yan-Jun; Chen, Liang-Jun; Qin, Xin-Cheng; Xu, Jianguo; Holmes, Edward C; Zhang, Yong-Zhen

    2015-01-29

    Although arthropods are important viral vectors, the biodiversity of arthropod viruses, as well as the role that arthropods have played in viral origins and evolution, is unclear. Through RNA sequencing of 70 arthropod species we discovered 112 novel viruses that appear to be ancestral to much of the documented genetic diversity of negative-sense RNA viruses, a number of which are also present as endogenous genomic copies. With this greatly enriched diversity we revealed that arthropods contain viruses that fall basal to major virus groups, including the vertebrate-specific arenaviruses, filoviruses, hantaviruses, influenza viruses, lyssaviruses, and paramyxoviruses. We similarly documented a remarkable diversity of genome structures in arthropod viruses, including a putative circular form, that sheds new light on the evolution of genome organization. Hence, arthropods are a major reservoir of viral genetic diversity and have likely been central to viral evolution.

  2. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.

  3. iHAP – integrated haplotype analysis pipeline for characterizing the haplotype structure of genes

    Directory of Open Access Journals (Sweden)

    Lim Yun Ping

    2006-12-01

    Background: The advent of genotype data from large-scale efforts that catalog the genetic variants of different populations has given rise to new avenues for multifactorial disease association studies. Recent work shows that genotype data from the International HapMap Project have a high degree of transferability to the wider population. This implies that the design of genotyping studies on local populations may be facilitated through inferences drawn from information contained in HapMap populations. Results: To facilitate analysis of HapMap data for characterizing the haplotype structure of genes or any chromosomal regions, we have developed an integrated web-based resource, iHAP. In addition to incorporating genotype and haplotype data from the International HapMap Project and gene information from the UCSC Genome Browser Database, iHAP also provides capabilities for inferring haplotype blocks and selecting tag SNPs that are representative of haplotype patterns. These include block partitioning algorithms, block definitions, tag SNP definitions, as well as SNPs to be "force included" as tags. Based on the parameters defined at the input stage, iHAP performs on-the-fly analysis and displays the result graphically as a webpage. To facilitate analysis, intermediate and final result files can be downloaded. Conclusion: The iHAP resource, available at http://ihap.bii.a-star.edu.sg, provides a convenient yet flexible approach for the user community to analyze HapMap data and identify candidate targets for genotyping studies.

  4. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two... was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  5. Genomic diversity of Escherichia isolates from diverse habitats.

    Directory of Open Access Journals (Sweden)

    Seungdae Oh

    Our understanding of the Escherichia genus is heavily biased toward pathogenic or commensal isolates from human or animal hosts. Recent studies have recovered Escherichia isolates that persist, and even grow, outside these hosts. Although the environmental isolates are typically phylogenetically distinct, they are highly related to and phenotypically indistinguishable from their human counterparts, including for the coliform test. To gain insights into the genomic diversity of Escherichia isolates from diverse habitats, including freshwater, soil, animal, and human sources, we carried out comparative DNA-DNA hybridizations using a multi-genome E. coli DNA microarray. The microarray was validated based on hybridizations with selected strains whose genome sequences were available and used to assess the frequency of microarray false positive and negative signals. Our results showed that human fecal isolates share two sets of genes (n>90) that are rarely found among environmental isolates, including genes presumably important for evading host immune mechanisms (e.g., a multi-drug transporter for acids and antimicrobials) and adhering to epithelial cells (e.g., hemolysin E and fimbrial-like adhesin protein). These results imply that environmental isolates are characterized by decreased ability to colonize host cells relative to human isolates. Our study also provides gene markers that can distinguish human isolates from those of warm-blooded animal and environmental origins, and thus can be used to more reliably assess fecal contamination in natural ecosystems.

  6. Diversity and population structure of Plasmodium falciparum in Thailand based on the spatial and temporal haplotype patterns of the C-terminal 19-kDa domain of merozoite surface protein-1.

    Science.gov (United States)

    Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Siripoon, Napaporn; Seugorn, Aree; Kaewthamasorn, Morakot; Butcher, Robert D J; Harnyuttanakorn, Pongchai

    2014-02-12

    The 19-kDa C-terminal region of the merozoite surface protein-1 of the human malaria parasite Plasmodium falciparum (PfMSP-1₁₉) constitutes the major component on the surface of merozoites and is considered as one of the leading candidates for asexual blood stage vaccines. Because the protein exhibits a level of sequence variation that may compromise the effectiveness of a vaccine, the global sequence diversity of PfMSP-1₁₉ has been subjected to extensive research, especially in malaria endemic areas. In Thailand, PfMSP-1₁₉ sequences have been derived from a single parasite population in Tak province, located along the Thailand-Myanmar border, since 1995. However, the extent of sequence variation and the spatiotemporal patterns of the MSP-1₁₉ haplotypes along the Thai borders with Laos and Cambodia are unknown. Sixty-three isolates of P. falciparum from five geographically isolated populations along the Thai borders with Myanmar, Laos and Cambodia in three transmission seasons between 2002 and 2008 were collected and culture-adapted. The msp-1 gene block 17 was sequenced and analysed for the allelic diversity, frequency and distribution patterns of PfMSP-1₁₉ haplotypes in individual populations. The PfMSP-1₁₉ haplotype patterns were then compared between parasite populations to infer the population structure and genetic differentiation of the malaria parasite. Five conserved polymorphic positions, which accounted for five distinct haplotypes, of PfMSP-1₁₉ were identified. Differences in the prevalence of PfMSP-1₁₉ haplotypes were detected in different geographical regions, with the highest levels of genetic diversity being found in the Kanchanaburi and Ranong provinces along the Thailand-Myanmar border and Trat province located at the Thailand-Cambodia border. Despite this variability, the distribution patterns of individual PfMSP-1₁₉ haplotypes seemed to be very similar across the country and over the three malarial transmission seasons, suggesting that gene flow

  7. Relationship between metabolic and genomic diversity in sesame (Sesamum indicum L.)

    Directory of Open Access Journals (Sweden)

    Karlovsky Petr

    2008-05-01

    Background: Diversity estimates in cultivated plants provide a rationale for conservation strategies and support the selection of starting material for breeding programs. Diversity measures applied to crops usually have been limited to the assessment of genome polymorphism at the DNA level. Occasionally, selected morphological features are recorded and the content of key chemical constituents determined, but unbiased and comprehensive chemical phenotypes have not been included systematically in diversity surveys. Our objective in this study was to assess metabolic diversity in sesame by nontargeted metabolic profiling and elucidate the relationship between metabolic and genome diversity in this crop. Results: Ten sesame accessions were selected that represent most of the genome diversity of sesame grown in India, Western Asia, Sudan and Venezuela based on previous AFLP studies. Ethanolic seed extracts were separated by HPLC, metabolites were ionized by positive and negative electrospray and ions were detected with an ion trap mass spectrometer in full-scan mode for m/z from 50 to 1000. Genome diversity was determined by Amplified Fragment Length Polymorphism (AFLP) using eight primer pair combinations. The relationship between biodiversity at the genome and at the metabolome levels was assessed by correlation analysis and multivariate statistics. Conclusion: Patterns of diversity at the genomic and metabolic levels differed, indicating that selection played a significant role in the evolution of metabolic diversity in sesame. This result implies that when used for the selection of genotypes in breeding and conservation, diversity assessment based on neutral DNA markers should be complemented with metabolic profiles. We hypothesize that this applies to all crops with a long history of domestication that possess commercially relevant traits affected by chemical phenotypes.

  8. Genome-level comparisons provide insight into the phylogeny and metabolic diversity of species within the genus Lactococcus.

    Science.gov (United States)

    Yu, Jie; Song, Yuqin; Ren, Yan; Qing, Yanting; Liu, Wenjun; Sun, Zhihong

    2017-11-03

    The genomic diversity of different species within the genus Lactococcus and the relationships between genomic differentiation and environmental factors remain unclear. In this study, type isolates of ten Lactococcus species/subspecies were sequenced to assess their genomic characteristics, metabolic diversity, and phylogenetic relationships. The total genome sizes varied between 1.99 (Lactococcus plantarum) and 2.46 megabases (Mb; L. lactis subsp. lactis), and the G + C content ranged from 34.81 (L. lactis subsp. hordniae) to 39.67% (L. raffinolactis) with an average value of 37.02%. Analysis of genome dynamics indicated that the genus Lactococcus has an open pan-genome, while the core genome size decreased with sequential addition at the genus and species group levels. A phylogenetic dendrogram based on the concatenated amino acid sequences of 643 core genes was largely consistent with the phylogenetic tree obtained by 16S ribosomal RNA (rRNA) genes, but it provided a more robust phylogenetic resolution than the 16S rRNA gene-based analysis. Comparative genomics indicated that species in the genus Lactococcus had high degrees of diversity in genome size, gene content, and carbohydrate metabolism. This may be important for the specific adaptations that allow different Lactococcus species to survive in different environments. These results provide a quantitative basis for understanding the genomic and metabolic diversity within the genus Lactococcus, laying the foundation for future studies on taxonomy and functional genomics.
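
    The statement that core genome size decreases with sequential addition while the pan-genome stays open can be made concrete with set operations: the core is the running intersection of gene sets and the pan-genome is the running union. The sketch below is a minimal illustration under that definition, not the authors' pipeline; the function name and gene-family labels are hypothetical.

```python
def core_and_pan_sizes(genomes):
    """Track core- and pan-genome sizes as genomes are added one at a time.

    genomes: list of sets of gene-family identifiers.
    Returns a list of (core_size, pan_size) tuples, one per added genome.
    """
    core, pan = None, set()
    sizes = []
    for genes in genomes:
        # Core: gene families present in every genome seen so far.
        core = set(genes) if core is None else core & genes
        # Pan: gene families present in at least one genome seen so far.
        pan |= genes
        sizes.append((len(core), len(pan)))
    return sizes


# Hypothetical gene-family content of three genomes.
toy_genomes = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
sizes = core_and_pan_sizes(toy_genomes)
```

    In this toy case the core shrinks with every added genome while the pan-genome keeps growing, which is the qualitative pattern an open pan-genome shows.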

  9. Direct chromosome-length haplotyping by single-cell sequencing

    NARCIS (Netherlands)

    Porubský, David; Sanders, Ashley D; van Wietmarschen, Niek; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Bevova, Marianna R; Guryev, Victor; Lansdorp, Peter Michael

    Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid

  10. Genetic relationships among native americans based on beta-globin gene cluster haplotype frequencies

    Directory of Open Access Journals (Sweden)

    Rita de Cassia Mousinho-Ribeiro

    2003-01-01

    The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (nos. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed by haplotype 6 (- + + - +), with frequencies between 7% and 18%. The frequency distribution of the β-globin gene haplotypes in the eighteen Brazilian Amerindian populations studied to date is characterized by a reduced number of haplotypes (average of 3.5) and low levels of heterozygosity and intrapopulational differentiation, with a single clearly predominant haplotype in most tribes (haplotype 2). The Parakanã, Urubú-Kaapór, Tiriyó and Xavante tribes constitute exceptions, presenting at least four haplotypes with relatively high frequencies. The closest genetic relationships were observed between the Brazilian and the Colombian Amerindians (Wayuu, Kamsa and Inga), and, to a lesser extent, with the Huichol of Mexico. North-American Amerindians are more differentiated and clearly separated from all other tribes, except the Xavante, from Brazil, and the Mapuche, from Argentina. A restricted pool of ancestral haplotypes may explain the low diversity observed among most present-day Brazilian and Colombian Amerindian groups, while interethnic admixture could be the most important factor to explain the high number of haplotypes and high levels of diversity observed in some South-American and most North-American tribes.

  11. Genome Size Diversity and Its Impact on the Evolution of Land Plants

    Directory of Open Access Journals (Sweden)

    Jaume Pellicer

    2018-02-01

    Genome size is a biodiversity trait that shows staggering diversity across eukaryotes, varying over 64,000-fold. Of all major taxonomic groups, land plants stand out due to their staggering genome size diversity, ranging ca. 2400-fold. As our understanding of the implications and significance of this remarkable genome size diversity in land plants grows, it is becoming increasingly evident that this trait plays not only an important role in shaping the evolution of plant genomes, but also in influencing plant community assemblages at the ecosystem level. Recent advances and improvements in novel sequencing technologies, as well as analytical tools, make it possible to gain critical insights into the genomic and epigenetic mechanisms underpinning genome size changes. In this review we provide an overview of our current understanding of genome size diversity across the different land plant groups, its implications on the biology of the genome and what future directions need to be addressed to fill key knowledge gaps.

  12. Reconstruction of Diverse Verrucomicrobial Genomes from Metagenome Datasets of Freshwater Reservoirs

    Directory of Open Access Journals (Sweden)

    Pedro J. Cabello-Yeves

    2017-11-01

    The phylum Verrucomicrobia contains freshwater representatives which remain poorly studied at the genomic, taxonomic, and ecological levels. In this work we present eighteen new reconstructed verrucomicrobial genomes from two freshwater reservoirs located close to each other (Tous and Amadorio, Spain). These metagenome-assembled genomes (MAGs) display a remarkable taxonomic diversity inside the phylum and comprise wide ranges of estimated genome sizes (from 1.8 to 6 Mb). Among all Verrucomicrobia studied we found some of the smallest genomes of the Spartobacteria and Opitutae classes described so far. Some of the Opitutae family MAGs were small, cosmopolitan, with a general heterotrophic metabolism with preference for carbohydrates, and capable of xylan, chitin, or cellulose degradation. Besides, we assembled large copiotroph genomes, which contain a higher number of transporters, polysaccharide degrading pathways and in general more strategies for the uptake of nutrients and carbohydrate-based metabolic pathways in comparison with the representatives with the smaller genomes. The diverse genomes revealed interesting features like green-light absorbing rhodopsins and a complete set of genes involved in nitrogen fixation. The large diversity in genome sizes and physiological properties emphasizes the diversity of this clade in freshwaters, enlarging even further the already broad eco-physiological range of these microbes.

  13. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer

    Science.gov (United States)

    2011-01-01

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a “natural metabolic engineer”. PMID:21995294

  14. Genome-wide association study identifies HLA 8.1 ancestral haplotype alleles as major genetic risk factors for myositis phenotypes.

    Science.gov (United States)

    Miller, F W; Chen, W; O'Hanlon, T P; Cooper, R G; Vencovsky, J; Rider, L G; Danko, K; Wedderburn, L R; Lundberg, I E; Pachman, L M; Reed, A M; Ytterberg, S R; Padyukov, L; Selva-O'Callaghan, A; Radstake, T R; Isenberg, D A; Chinoy, H; Ollier, W E R; Scheet, P; Peng, B; Lee, A; Byun, J; Lamb, J A; Gregersen, P K; Amos, C I

    2015-10-01

    Autoimmune muscle diseases (myositis) comprise a group of complex phenotypes influenced by genetic and environmental factors. To identify genetic risk factors in patients of European ancestry, we conducted a genome-wide association study (GWAS) of the major myositis phenotypes in a total of 1710 cases, which included 705 adult dermatomyositis, 473 juvenile dermatomyositis, 532 polymyositis and 202 adult dermatomyositis, juvenile dermatomyositis or polymyositis patients with anti-histidyl-tRNA synthetase (anti-Jo-1) autoantibodies, and compared them with 4724 controls. Single-nucleotide polymorphisms showing strong associations (Pmyositis phenotypes together, as well as for the four clinical and autoantibody phenotypes studied separately. Imputation and regression analyses found that alleles comprising the human leukocyte antigen (HLA) 8.1 ancestral haplotype (AH8.1) defined essentially all the genetic risk in the phenotypes studied. Although the HLA DRB1*03:01 allele showed slightly stronger associations with adult and juvenile dermatomyositis, and HLA B*08:01 with polymyositis and anti-Jo-1 autoantibody-positive myositis, multiple alleles of AH8.1 were required for the full risk effects. Our findings establish that alleles of the AH8.1 comprise the primary genetic risk factors associated with the major myositis phenotypes in geographically diverse Caucasian populations.

  15. ABO alleles are linked with haplotypes of an erythroid cell-specific regulatory element in intron 1 with a few exceptions attributable to genetic recombination.

    Science.gov (United States)

    Nakajima, T; Sano, R; Takahashi, Y; Watanabe, K; Kubo, R; Kobayashi, M; Takahashi, K; Takeshita, H; Kominato, Y

    2016-01-01

    Recent investigation of transcriptional regulation of the ABO genes has identified a candidate erythroid cell-specific regulatory element, named the +5·8-kb site, in the first intron of ABO. Six haplotypes of the site have been reported previously. The present genetic population study demonstrated that each haplotype was mostly linked with specific ABO alleles with a few exceptions, possibly as a result of hybrid formation between common ABO alleles. Thus, investigation of these haplotypes could provide a clue to further elucidation of ABO alleles. © 2015 International Society of Blood Transfusion.

  16. Evolution and Diversity of Transposable Elements in Vertebrate Genomes.

    Science.gov (United States)

    Sotero-Caio, Cibele G; Platt, Roy N; Suh, Alexander; Ray, David A

    2017-01-01

    Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Different patterns of evolution in the centromeric and telomeric regions of group A and B haplotypes of the human killer cell Ig-like receptor locus.

    Directory of Open Access Journals (Sweden)

    Chul-Woo Pyo

    The fast-evolving human KIR gene family encodes variable lymphocyte receptors specific for polymorphic HLA class I determinants. Nucleotide sequences for 24 representative human KIR haplotypes were determined. With three previously defined haplotypes, this gave a set of 12 group A and 15 group B haplotypes for assessment of KIR variation. The seven gene-content haplotypes are all combinations of four centromeric and two telomeric motifs. 2DL5, 2DS5 and 2DS3 can be present in centromeric and telomeric locations. With one exception, haplotypes having identical gene content differed in their combinations of KIR alleles. Sequence diversity varied between haplotype groups and between centromeric and telomeric halves of the KIR locus. The most variable A haplotype genes are in the telomeric half, whereas the most variable genes characterizing B haplotypes are in the centromeric half. Of the highly polymorphic genes, only the 3DL3 framework gene exhibits a similar diversity when carried by A and B haplotypes. Phylogenetic analysis and divergence time estimates point to the centromeric gene-content motifs that distinguish A and B haplotypes having emerged ~6 million years ago, contemporaneously with the separation of human and chimpanzee ancestors. In contrast, the telomeric motifs that distinguish A and B haplotypes emerged more recently, ~1.7 million years ago, before the emergence of Homo sapiens. Thus the centromeric and telomeric motifs that typify A and B haplotypes have likely been present throughout human evolution. The results suggest the common ancestor of A and B haplotypes combined a B-like centromeric region with an A-like telomeric region.

  18. Consequences of genomic diversity in Mycobacterium tuberculosis

    Science.gov (United States)

    Coscolla, Mireia; Gagneux, Sebastien

    2014-01-01

    The causative agent of human tuberculosis, Mycobacterium tuberculosis complex (MTBC), comprises seven phylogenetically distinct lineages associated with different geographical regions. Here we review the latest findings on the nature and amount of genomic diversity within and between MTBC lineages. We then review recent evidence for the effect of this genomic diversity on mycobacterial phenotypes measured experimentally and in clinical settings. We conclude that overall, the most geographically widespread Lineage 2 (includes Beijing) and Lineage 4 (also known as Euro-American) are more virulent than other lineages that are more geographically restricted. This increased virulence is associated with delayed or reduced pro-inflammatory host immune responses, greater severity of disease, and enhanced transmission. Future work should focus on the interaction between MTBC and human genetic diversity, as well as on the environmental factors that modulate these interactions. PMID:25453224

  19. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    Directory of Open Access Journals (Sweden)

    Atsuo Kawahara

    2016-05-01

    The zebrafish (Danio rerio) is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovations in genome editing technologies, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9) system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs) at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.

  20. India, Genomic diversity & Disease susceptibility

    Indian Academy of Sciences (India)

    Table of contents. India, Genomic diversity & Disease susceptibility · India, a paradise for Genetic Studies · Involved in earlier stages of Immune response protecting us from Diseases, Responsible for kidney and other transplant rejections Inherited from our parents.

  1. Genetics, Genomics and Evolution of Ergot Alkaloid Diversity

    Directory of Open Access Journals (Sweden)

    Carolyn A. Young

    2015-04-01

    The ergot alkaloid biosynthesis system has become an excellent model to study evolutionary diversification of specialized (secondary) metabolites. This is a very diverse class of alkaloids with various neurotropic activities, produced by fungi in several orders of the phylum Ascomycota, including plant pathogens and protective plant symbionts in the family Clavicipitaceae. Results of comparative genomics and phylogenomic analyses reveal multiple examples of three evolutionary processes that have generated ergot-alkaloid diversity: gene gains, gene losses, and gene sequence changes that have led to altered substrates or product specificities of the enzymes that they encode (neofunctionalization). The chromosome ends appear to be particularly effective engines for gene gains, losses and rearrangements, but not necessarily for neofunctionalization. Changes in gene expression could lead to accumulation of various pathway intermediates and affect levels of different ergot alkaloids. Genetic alterations associated with interspecific hybrids of Epichloë species suggest that such variation is also selectively favored. The huge structural diversity of ergot alkaloids probably represents adaptations to a wide variety of ecological situations by affecting the biological spectra and mechanisms of defense against herbivores, as evidenced by the diverse pharmacological effects of ergot alkaloids used in medicine.

  2. Exact algorithms for haplotype assembly from whole-genome sequence data.

    Science.gov (United States)

    Chen, Zhi-Zhong; Deng, Fei; Wang, Lusheng

    2013-08-15

    Haplotypes play a crucial role in genetic analysis and have many applications such as gene disease diagnoses, association studies, ancestry inference and so forth. The development of DNA sequencing technologies makes it possible to obtain haplotypes from a set of aligned reads originating from both copies of a chromosome of a single individual. This approach is often known as haplotype assembly. Exact algorithms that can give optimal solutions to the haplotype assembly problem are in high demand. Unfortunately, previous algorithms for this problem either fail to output optimal solutions or take too long even when executed on a PC cluster. We develop an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model. Most of the previous approaches assume that the columns in the input matrix correspond to (putative) heterozygous sites. This all-heterozygous assumption is correct for most columns, but it may be incorrect for a small number of columns. In this article, we consider the MEC model with or without the all-heterozygous assumption. In our approach, we first use new methods to decompose the input read matrix into small independent blocks and then model the problem for each block as an integer linear programming problem, which is then solved by an integer linear programming solver. We have tested our program on a single PC [a Linux (x64) desktop PC with i7-3960X CPU], using the filtered HuRef and the NA 12878 datasets (after applying some variant calling methods). With the all-heterozygous assumption, our approach can optimally solve the whole HuRef data set within a total time of 31 h (26 h for the most difficult block of the 15th chromosome and only 5 h for the other blocks). To our knowledge, this is the first time that MEC optimal solutions are completely obtained for the filtered HuRef dataset. Moreover, in the general case (without the all-heterozygous assumption), for the HuRef dataset our
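    The minimum-error-correction (MEC) objective named in this abstract can be illustrated with a small sketch. The code below is not the integer linear programming approach of the paper; it is a brute-force enumeration, practical only for a handful of sites, and it assumes the all-heterozygous case (the second haplotype is the complement of the first). The read encoding (a dict from site index to allele 0 or 1) and the function names are hypothetical choices made for illustration.

```python
from itertools import product


def mec_cost(reads, haplotype):
    """Count the allele corrections needed so every read matches
    either `haplotype` or its complement (all-heterozygous assumption)."""
    complement = [1 - allele for allele in haplotype]
    total = 0
    for read in reads:
        # A read is a dict {site_index: allele}; uncovered sites are absent.
        dist_hap = sum(1 for j, a in read.items() if a != haplotype[j])
        dist_comp = sum(1 for j, a in read.items() if a != complement[j])
        # Each read is assigned to whichever copy it disagrees with least.
        total += min(dist_hap, dist_comp)
    return total


def solve_mec_bruteforce(reads, n_sites):
    """Enumerate all haplotypes and return (best_cost, best_haplotype).

    The first allele is fixed to 0 because a haplotype and its
    complement always have the same cost. Runs in O(2^n_sites)."""
    best_cost, best_hap = None, None
    for tail in product((0, 1), repeat=n_sites - 1):
        hap = (0,) + tail
        cost = mec_cost(reads, hap)
        if best_cost is None or cost < best_cost:
            best_cost, best_hap = cost, hap
    return best_cost, best_hap
```

    For four sites and five reads, one of which carries a single sequencing error, `solve_mec_bruteforce` recovers the haplotype with exactly one correction. The exponential running time is the reason block decomposition and an exact solver, as described in the abstract, are needed for real data.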

  3. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  4. Cancer Genomics: Diversity and Disparity Across Ethnicity and Geography.

    Science.gov (United States)

    Tan, Daniel S W; Mok, Tony S K; Rebbeck, Timothy R

    2016-01-01

    Ethnic and geographic differences in cancer incidence, prognosis, and treatment outcomes can be attributed to diversity in the inherited (germline) and somatic genome. Although international large-scale sequencing efforts are beginning to unravel the genomic underpinnings of cancer traits, much remains to be known about the underlying mechanisms and determinants of genomic diversity. Carcinogenesis is a dynamic, complex phenomenon representing the interplay between genetic and environmental factors that results in divergent phenotypes across ethnicities and geography. For example, compared with whites, there is a higher incidence of prostate cancer among Africans and African Americans, and the disease is generally more aggressive and fatal. Genome-wide association studies have identified germline susceptibility loci that may account for differences between the African and non-African patients, but the lack of availability of appropriate cohorts for replication studies and the incomplete understanding of genomic architecture across populations pose major limitations. We further discuss the transformative potential of routine diagnostic evaluation for actionable somatic alterations, using lung cancer as an example, highlighting implications of population disparities, current hurdles in implementation, and the far-reaching potential of clinical genomics in enhancing cancer prevention, diagnosis, and treatment. As we enter the era of precision cancer medicine, a concerted multinational effort is key to addressing population and genomic diversity as well as overcoming barriers and geographical disparities in research and health care delivery. © 2015 by American Society of Clinical Oncology.

  5. Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping

    Directory of Open Access Journals (Sweden)

    Walid Mottawea

    2018-05-01

    Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacterial genomes and have potential to be used as high discrimination markers in Salmonella. In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S. Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken altogether, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.

  6. Y-STR haplotypes of Native American populations from the Brazilian Amazon region.

    Science.gov (United States)

    Palha, Teresinha Jesus Brabo Ferreira; Rodrigues, Elzemar Martins Ribeiro; dos Santos, Sidney Emanuel Batista

    2010-10-01

    The allele and haplotype frequencies of nine Y-STRs (DYS19, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS393, DYS385 I/II) were determined in a sample of six native tribes from the Brazilian Amazon (Tiriyó, Awa-Guajá, Waiãpi, Urubu-Kaapor, Zoé and Parakanã). Forty-eight different haplotypes were identified, 28 of which were unique. Five haplotypes are very frequent and were shared by over 10 individuals. The estimated haplotype diversity (0.9114) was very low compared to other geographic groups, including Africans, Europeans and Asians. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
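    The abstract reports a haplotype diversity value but does not say which estimator was used. A minimal sketch, assuming the commonly used unbiased estimator of Nei, H = n/(n-1) · (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sampled chromosomes, might look as follows. The function name and the toy haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumption: this is the estimator behind the reported value; the
    abstract itself does not state the formula."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (expected homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)
```

    A sample in which every individual carries the same haplotype gives 0, and a sample in which every haplotype is different gives 1, so values close to 1 indicate that two randomly drawn individuals almost always differ.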

  7. Genomic Diversity and Evolution of the Fish Pathogen Flavobacterium psychrophilum

    Directory of Open Access Journals (Sweden)

    Eric Duchaud

    2018-02-01

    Flavobacterium psychrophilum, the etiological agent of rainbow trout fry syndrome and bacterial cold-water disease in salmonid fish, is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide. In this study, the genomic diversity of the F. psychrophilum species is analyzed using a set of 41 genomes, including 30 newly sequenced isolates. These were selected on the basis of available MLST data with the two-fold objective of maximizing the coverage of the species diversity and of allowing a focus on the main clonal complex (CC-ST10) infecting farmed rainbow trout (Oncorhynchus mykiss) worldwide. The results reveal a bacterial species harboring a limited genomic diversity both in terms of nucleotide diversity, with ~0.3% nucleotide divergence inside CDSs in pairwise genome comparisons, and in terms of gene repertoire, with the core genome accounting for ~80% of the genes in each genome. The pan-genome seems nevertheless “open” according to the scaling exponent of a power-law fitted on the rate of new gene discovery when genomes are added one-by-one. Recombination is a key component of the evolutionary process of the species as seen in the high level of apparent homoplasy in the core genome. Using a Hidden Markov Model to delineate recombination tracts in pairs of closely related genomes, the average recombination tract length was estimated at ~4.0 Kbp and the typical ratio of the contributions of recombination and mutations to nucleotide-level differentiation (r/m) was estimated at ~13. Within CC-ST10, evolutionary distances computed on non-recombined regions and comparisons between 22 isolates sampled up to 27 years apart suggest a most recent common ancestor in the second half of the nineteenth century in North America with subsequent diversification and transmission of this clonal complex coinciding with the worldwide expansion of rainbow trout farming. With the goal to promote the development of

  8. Comparative Genomics of the Herbivore Gut Symbiont Lactobacillus reuteri Reveals Genetic Diversity and Lifestyle Adaptation

    Directory of Open Access Journals (Sweden)

    Jie Yu

    2018-06-01

    Lactobacillus reuteri is a catalase-negative, Gram-positive, non-motile, obligately heterofermentative bacterial species that has been used as a model to describe the ecology and evolution of vertebrate gut symbionts. However, the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of herbivores remain unknown. Therefore, 16 L. reuteri strains isolated from goat, sheep, cow, and horse in Inner Mongolia, China were sequenced in this study. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to the different hosts based on 21 published genomic sequences. Genome size, G + C content, and average nucleotide identity values of the L. reuteri strains from different hosts indicated that the strains have broad genetic diversity. The pan-genome of 37 L. reuteri strains contained 8,680 gene families, and the core genome contained 726 gene families. A total of 92,270 nucleotide mutation sites were discovered among 37 L. reuteri strains, and all core genes displayed a Ka/Ks ratio much lower than 1, suggesting strong purifying selective pressure (negative selection). A highly robust maximum likelihood tree based on the core genes showed that the herbivore isolates were divided into three clades; clades A and B contained most of the herbivore isolates and were more closely related to human isolates and vastly distinct from clade C. Some functional genes may be attributable to host specificity of the herbivore, omnivore, and sourdough groups. Moreover, the numbers of genes encoding cell surface proteins and active carbohydrate enzymes were host-specific. This study provides new insight into the adaptation of L. reuteri to the intestinal habitat of herbivores, suggesting that the genomic diversity of L. reuteri from different ecological origins is closely associated with their living environment.
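    The pan-genome and core-genome counts quoted in this abstract follow from simple set operations once each strain has been reduced to a set of gene-family identifiers. The sketch below shows only that final step; how the study clustered genes into families is not described here, and the strain names and family labels in the usage note are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) as sets of gene-family identifiers.

    `genomes` maps a strain name to an iterable of the gene families
    found in that strain."""
    if not genomes:
        raise ValueError("at least one genome is required")
    family_sets = [set(families) for families in genomes.values()]
    # Pan-genome: families present in at least one strain.
    pan = set().union(*family_sets)
    # Core genome: families present in every strain.
    core = set.intersection(*family_sets)
    return pan, core
```

    For three toy strains with families {a, b, c}, {a, b, d} and {a, e}, the pan-genome has five families and the core genome has one. Adding a strain can only enlarge the pan-genome and shrink the core, which is why the two numbers diverge as more strains are sequenced.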

  9. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800).

    Science.gov (United States)

    Artico, L O; Bianchini, A; Grubel, K S; Monteiro, D S; Estima, S C; Oliveira, L R de; Bonatto, S L; Marins, L F

    2010-09-01

    The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.
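    Nucleotide diversity (π), reported in this abstract as a percentage, is conventionally the average number of pairwise nucleotide differences per site. The sketch below assumes aligned sequences of equal length with no gaps or ambiguous bases; the abstract does not state how such positions were handled, and the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site over all sequence pairs.

    Assumes an ungapped alignment: all sequences have the same length."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    # Total number of mismatching positions summed over all pairs.
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b)
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)
```

    Multiplying the result by 100 gives the percentage form used in the abstract. Because the sum runs over sampled individuals rather than distinct haplotypes, a frequent haplotype pulls π down even when several rare haplotypes are present.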

  10. Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio.

    Science.gov (United States)

    Ahn, Anne-Catherine; Meier-Kolthoff, Jan P; Overmars, Lex; Richter, Michael; Woyke, Tanja; Sorokin, Dimitry Y; Muyzer, Gerard

    2017-01-01

    Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitats are soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described so far. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.

  11. Characterization of sequence diversity in Plasmodium falciparum SERA5 from Indian isolates

    Directory of Open Access Journals (Sweden)

    Rahul C.N

    2015-06-01

    Objective: To characterize the sequence diversity of blood-stage Plasmodium falciparum serine repeat antigen-5 (PfSERA5), data on which are lacking in a malaria-endemic country like India. Methods: In this study, parasitic DNA was obtained from field isolates collected from various geographic regions. Subsequently, the PfSERA5 gene sequence was PCR amplified and DNA sequenced. Results: We reported the existence of unique repeat polymorphisms and novel haplotypes for both the octamer repeat (OR) and serine repeat (SR) regions of the N-terminal fragment of PfSERA5 from Indian isolates. Several isolates from India were identical to low-frequency African haplotypes. A unique finding of our study was an Indian isolate showing a deletion in a perfectly conserved 14-mer sequence within the octamer repeat. Indian haplotypes reported in this study were found to be distributed into the three earlier classified allelic clusters of FCR3, K1 and Honduras, showcasing broad diversity as compared to worldwide haplotypes. Conclusions: This study is the first report on genetic diversity of PfSERA5 antigen from India. Further evaluation of these haplotypes by serotyping would provide useful information for investigating variant-specific immunity and aid in malaria vaccine research.

  12. Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome.

    Science.gov (United States)

    Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O'Connor, Timothy D; Abecasis, Gonçalo R; Wojcik, Genevieve L; Gignoux, Christopher R; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E; Bustamante, Carlos; Beaty, Terri H; Mathias, Rasika A; Barnes, Kathleen C; Qin, Zhaohui S

    2017-04-21

    A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.

  13. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    KAUST Repository

    Bracken-Grissom, Heather

    2013-12-12

    Over 95% of all metazoan (animal) species comprise the invertebrates, but very few genomes from these organisms have been sequenced. We have, therefore, formed a Global Invertebrate Genomics Alliance (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site has been launched to facilitate this collaborative venture.

  14. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    M.D. Patterson (Murray); T. Marschall (Tobias); N. Pisanti (Nadia); L.J.J. van Iersel (Leo); L. Stougie (Leen); G.W. Klau (Gunnar); A. Schönhuth (Alexander)

    2014-01-01

    The human genome is diploid, that is, each of its chromosomes comes in two copies. This requires phasing the single nucleotide polymorphisms (SNPs), that is, assigning them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are

  15. Tomato Fruits Show Wide Phenomic Diversity but Fruit Developmental Genes Show Low Genomic Diversity.

    Directory of Open Access Journals (Sweden)

    Vijee Mohan

    Domestication of tomato has resulted in large diversity in fruit phenotypes. An intensive phenotyping of 127 tomato accessions from 20 countries revealed extensive morphological diversity in fruit traits. The diversity in fruit traits clustered the accessions into nine classes and identified certain promising lines having desirable traits pertaining to total soluble salts (TSS), carotenoids, ripening index, weight and shape. Factor analysis of the morphometric data from Tomato Analyzer showed that the fruit shape is a complex trait shared by several factors. The 100% variance between round and flat fruit shapes was explained by one discriminant function having a canonical correlation of 0.874 by stepwise discriminant analysis. A set of 10 genes (ACS2, COP1, CYC-B, RIN, MSH2, NAC-NOR, PHOT1, PHYA, PHYB and PSY1) involved in various plant developmental processes were screened for SNP polymorphism by EcoTILLING. The genetic diversity in these genes revealed a total of 36 non-synonymous and 18 synonymous changes leading to the identification of 28 haplotypes. The average frequency of polymorphism across the genes was 0.038/Kb. A significant negative Tajima's D statistic in two of the genes, ACS2 and PHOT1, indicated the presence of rare alleles in low frequency. Our study indicates that while there is low polymorphic diversity in the genes regulating plant development, the population shows wider phenotype diversity. Nonetheless, morphological and genetic diversity of the present collection can be further exploited as potential resources in future.

  16. Interrelationships between Amerindian tribes of lower Amazonia as manifest by HLA haplotype disequilibria.

    Science.gov (United States)

    Black, F L

    1984-11-01

    HLA B-C haplotypes exhibit common disequilibria in populations drawn from four continents, indicating that they are subject to broadly active selective forces. However, the A-B and A-C associations we have examined show no consistent disequilibrium pattern, leaving open the possibility that these disequilibria are due to descent from common progenitors. By examining HLA haplotype distributions, I have explored the implications that would follow from the hypothesis that biological selection played no role in determining A-C disequilibria in 10 diverse tribes of the lower Amazon Basin. Certain haplotypes are in strong positive disequilibria across a broad geographic area, suggesting that members of diverse tribes descend from common ancestors. On the basis of the extent of diffusion of the components of these haplotypes, one can estimate that the progenitors lived less than 6,000 years ago. One widely encountered lineage entered the area within the last 1,200 years. When haplotype frequencies are used in genetic distance measurements, they give a pattern of relationships very similar to that obtained by conventional chord measurements based on several genetic markers; but more than that, when individual haplotype disequilibria in the several tribes are compared, multiple origins of a single tribe are discernible and relationships are revealed that correlate more closely to geographic and linguistic patterns than do the genetic distance measurements.

  17. SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines

    Directory of Open Access Journals (Sweden)

    Smith Oscar

    2002-10-01

    Full Text Available Abstract Background Recent studies of ancestral maize populations indicate that linkage disequilibrium tends to dissipate rapidly, sometimes within 100 bp. We set out to examine the linkage disequilibrium and diversity in maize elite inbred lines, which have been subject to population bottlenecks and intense selection by breeders. Such population events are expected to increase the amount of linkage disequilibrium, but reduce diversity. The results of this study will inform the design of genetic association studies. Results We examined the frequency and distribution of DNA polymorphisms at 18 maize genes in 36 maize inbreds, chosen to represent most of the genetic diversity in U.S. elite maize breeding pool. The frequency of nucleotide changes is high, on average one polymorphism per 31 bp in non-coding regions and 1 polymorphism per 124 bp in coding regions. Insertions and deletions are frequent in non-coding regions (1 per 85 bp), but rare in coding regions. A small number (2–8) of distinct and highly diverse haplotypes can be distinguished at all loci examined. Within genes, SNP loci comprising the haplotypes are in linkage disequilibrium with each other. Conclusions No decline of linkage disequilibrium within a few hundred base pairs was found in the elite maize germplasm. This finding, as well as the small number of haplotypes, relative to neutral expectation, is consistent with the effects of breeding-induced bottlenecks and selection on the elite germplasm pool. The genetic distance between haplotypes is large, indicative of an ancient gene pool and of possible interspecific hybridization events in maize ancestry.
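
    The linkage disequilibrium discussed in this record can be illustrated for a single pair of biallelic SNPs. The sketch below uses the textbook definitions D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)); it is not taken from the paper, and the function name and the 0/1 allele coding are assumptions made for illustration only.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r^2) for two biallelic SNPs.

    haplotypes: sequence of (allele_at_snp1, allele_at_snp2) pairs,
    each allele coded 0 or 1 (hypothetical coding for this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Frequencies of allele 1 at each SNP and of the 1-1 haplotype.
    p_a = sum(h[0] for h in haplotypes) / n
    p_b = sum(h[1] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either SNP is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2
```

    With inbred lines, as in this study, each line contributes one known haplotype per locus, so the input can be read directly from the genotype calls without phasing.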

  18. Inter- and intra-specific pan-genomes of Borrelia burgdorferi sensu lato: genome stability and adaptive radiation

    Science.gov (United States)

    2013-01-01

    Background Lyme disease is caused by spirochete bacteria from the Borrelia burgdorferi sensu lato (B. burgdorferi s.l.) species complex. To reconstruct the evolution of B. burgdorferi s.l. and identify the genomic basis of its human virulence, we compared the genomes of 23 B. burgdorferi s.l. isolates from Europe and the United States, including B. burgdorferi sensu stricto (B. burgdorferi s.s., 14 isolates), B. afzelii (2), B. garinii (2), B. “bavariensis” (1), B. spielmanii (1), B. valaisiana (1), B. bissettii (1), and B. “finlandensis” (1). Results Robust B. burgdorferi s.s. and B. burgdorferi s.l. phylogenies were obtained using genome-wide single-nucleotide polymorphisms, despite recombination. Phylogeny-based pan-genome analysis showed that the rate of gene acquisition was higher between species than within species, suggesting adaptive speciation. Strong positive natural selection drives the sequence evolution of lipoproteins, including chromosomally-encoded genes 0102 and 0404, cp26-encoded ospC and b08, and lp54-encoded dbpA, a07, a22, a33, a53, a65. Computer simulations predicted rapid adaptive radiation of genomic groups as population size increases. Conclusions Intra- and inter-specific pan-genome sizes of B. burgdorferi s.l. expand linearly with phylogenetic diversity. Yet gene-acquisition rates in B. burgdorferi s.l. are among the lowest in bacterial pathogens, resulting in high genome stability and few lineage-specific genes. Genome adaptation of B. burgdorferi s.l. is driven predominantly by copy-number and sequence variations of lipoprotein genes. New genomic groups are likely to emerge if the current trend of B. burgdorferi s.l. population expansion continues. PMID:24112474

  19. Evidence and Consequence of a Highly Adapted Clonal Haplotype within the Australian Ascochyta rabiei Population

    Directory of Open Access Journals (Sweden)

    Yasir Mehmood

    2017-06-01

    Full Text Available The Australian Ascochyta rabiei (Pass.) Labr. (syn. Phoma rabiei) population has low genotypic diversity with only one mating type detected to date, potentially precluding substantial evolution through recombination. However, a large diversity in aggressiveness exists. In an effort to better understand the risk from selective adaptation to currently used resistance sources and chemical control strategies, the population was examined in detail. For this, a total of 598 isolates were quasi-hierarchically sampled between 2013 and 2015 across all major Australian chickpea growing regions and commonly grown host genotypes. Although a large number of haplotypes were identified (66) through short sequence repeat (SSR) genotyping, overall low gene diversity (Hexp = 0.066) and genotypic diversity (D = 0.57) was detected. Almost 70% of the isolates assessed were of a single dominant haplotype (ARH01). Disease screening on a differential host set, including three commonly deployed resistance sources, revealed distinct aggressiveness among the isolates, with 17% of all isolates identified as highly aggressive. Almost 75% of these were of the ARH01 haplotype. A similar pattern was observed at the host level, with 46% of all isolates collected from the commonly grown host genotype Genesis090 (classified as “resistant” during the term of collection) identified as highly aggressive. Of these, 63% belonged to the ARH01 haplotype. In conclusion, the ARH01 haplotype represents a significant risk to the Australian chickpea industry, being not only widely adapted to the diverse agro-geographical environments of the Australian chickpea growing regions, but also containing a disproportionately large number of aggressive isolates, indicating fitness to survive and replicate on the best resistance sources in the Australian germplasm.

  20. The African Genome Variation Project shapes medical genetics in Africa

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2015-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  1. The African Genome Variation Project shapes medical genetics in Africa.

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O; Choudhury, Ananyo; Ritchie, Graham R S; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N; Young, Elizabeth H; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S

    2015-01-15

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

  2. An accurate clone-based haplotyping method by overlapping pool sequencing.

    Science.gov (United States)

    Li, Cheng; Cao, Changchang; Tu, Jing; Sun, Xiao

    2016-07-08

    Chromosome-long haplotyping of human genomes is important to identify genetic variants with differing gene expression, in human evolution studies, clinical diagnosis, and other biological and medical fields. Although several methods have realized haplotyping based on sequencing technologies or population statistics, accuracy and cost are factors that prohibit their wide use. Borrowing ideas from group testing theories, we proposed a clone-based haplotyping method by overlapping pool sequencing. The clones from a single individual were pooled combinatorially and then sequenced. According to the distinct pooling pattern for each clone in the overlapping pool sequencing, alleles for the recovered variants could be assigned to their original clones precisely. Subsequently, the clone sequences could be reconstructed by linking these alleles accordingly and assembling them into haplotypes with high accuracy. To verify the utility of our method, we constructed 130 110 clones in silico for the individual NA12878 and simulated the pooling and sequencing process. Ultimately, 99.9% of variants on chromosome 1 that were covered by clones from both parental chromosomes were recovered correctly, and 112 haplotype contigs were assembled with an N50 length of 3.4 Mb and no switch errors. A comparison with current clone-based haplotyping methods indicated our method was more accurate. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
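
    The group-testing idea behind overlapping pool sequencing can be shown with a toy example: each clone is placed in a distinct combination of pools, so an allele seen in exactly that combination can be traced back to its clone. The sketch below is a simplified illustration and not the authors' pipeline; the function names, the fixed number of pools per clone, and the assumption that each observed allele comes from a single clone are all hypothetical.

```python
from itertools import combinations


def assign_pool_patterns(clone_ids, n_pools, pools_per_clone):
    """Give each clone a distinct set of pool indices (its pooling pattern)."""
    patterns = combinations(range(n_pools), pools_per_clone)
    # zip stops at the shorter input, so a shortfall of patterns is detectable.
    assignment = {clone: frozenset(p) for clone, p in zip(clone_ids, patterns)}
    if len(assignment) < len(clone_ids):
        raise ValueError("not enough distinct patterns for all clones")
    return assignment


def decode_alleles(observations, assignment):
    """Assign each allele to the clone whose pattern equals the pools it was seen in.

    observations: dict mapping an allele key, e.g. (chrom, pos, base),
    to the iterable of pool indices in which that allele was detected.
    Alleles whose pool set matches no clone are left unassigned.
    """
    by_pattern = {pattern: clone for clone, pattern in assignment.items()}
    clone_alleles = {clone: [] for clone in assignment}
    for allele, pools_seen in observations.items():
        clone = by_pattern.get(frozenset(pools_seen))
        if clone is not None:
            clone_alleles[clone].append(allele)
    return clone_alleles
```

    In real data, overlapping clones carrying the same allele would produce a union of patterns, which this toy decoder does not attempt to resolve.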

  3. Report of the second Human Genome Diversity workshop

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1992-12-31

    The Second Human Genome Diversity Workshop was successfully held at Penn State University from October 29--31, 1992. The Workshop was essentially organized around 7 groups, each comprising approximately 10 participants, representing the sampling issues in different regions of the world. These groups worked independently, using a common format provided by the organizers; this was adjusted as needed by the individual groups. The Workshop began with a presentation of the mandate to the participants, and of the procedures to be followed during the workshop. Dr. Feldman presented a summary of the results from the First Workshop. He and the other organizers also presented brief comments giving their perspective on the objectives of the Second Workshop. Dr. Julia Bodmer discussed the study of European genetic diversity, especially in the context of the HLA experience there, and of plans to extend such studies in the coming years. She also discussed surveys of world HLA laboratories in regard to resources related to Human Genome Diversity. Dr. Mark Weiss discussed the relevance of nonhuman primate studies for understanding how demographic processes, such as mate exchange between local groups, affected the local dispersion of genetic variation. Primate population geneticists have some relevant experience in interpreting variation at this local level, in particular, with various DNA fingerprinting methods. This experience may be relevant to the Human Genome Diversity Project, in terms of practical and statistical issues.

  4. Genomic landscape of human diversity across Madagascar

    Science.gov (United States)

    Pierron, Denis; Heiske, Margit; Razafindrazaka, Harilanto; Rakoto, Ignace; Rabetokotany, Nelly; Ravololomanga, Bodo; Rakotozafy, Lucien M.-A.; Rakotomalala, Mireille Mialy; Razafiarivony, Michel; Rasoarifetra, Bako; Raharijesy, Miakabola Andriamampianina; Razafindralambo, Lolona; Ramilisonina; Fanony, Fulgence; Lejamble, Sendra; Thomas, Olivier; Mohamed Abdallah, Ahmed; Rocher, Christophe; Arachiche, Amal; Tonaso, Laure; Pereda-loth, Veronica; Schiavinato, Stéphanie; Brucato, Nicolas; Ricaut, Francois-Xavier; Kusuma, Pradiptajati; Sudoyo, Herawati; Ni, Shengyu; Boland, Anne; Deleuze, Jean-Francois; Beaujard, Philippe; Grange, Philippe; Adelaar, Sander; Stoneking, Mark; Rakotoarisoa, Jean-Aimé; Radimilahy, Chantal; Letellier, Thierry

    2017-01-01

    Although situated ∼400 km from the east coast of Africa, Madagascar exhibits cultural, linguistic, and genetic traits from both Southeast Asia and Eastern Africa. The settlement history remains contentious; we therefore used a grid-based approach to sample at high resolution the genomic diversity (including maternal lineages, paternal lineages, and genome-wide data) across 257 villages and 2,704 Malagasy individuals. We find a common Bantu and Austronesian descent for all Malagasy individuals with a limited paternal contribution from Europe and the Middle East. Admixture and demographic growth happened recently, suggesting a rapid settlement of Madagascar during the last millennium. However, the distribution of African and Asian ancestry across the island reveals that the admixture was sex biased and happened heterogeneously across Madagascar, suggesting independent colonization of Madagascar from Africa and Asia rather than settlement by an already admixed population. In addition, there are geographic influences on the present genomic diversity, independent of the admixture, showing that a few centuries is sufficient to produce detectable genetic structure in human populations. PMID:28716916

  5. WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

    NARCIS (Netherlands)

    Patterson, M.; Marschall, T.; Pisanti, N.; van Iersel, L.J.J.; Stougie, L.; Klau, G.W.; Schoenhuth, A.

    2014-01-01

    The human genome is diploid, that is each of its chromosomes comes in two copies. This requires to phase the single nucleotide polymorphisms (SNPs), that is, to assign them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for

  6. Genome Size Diversity in Lilium (Liliaceae) Is Correlated with Karyotype and Environmental Traits

    Directory of Open Access Journals (Sweden)

    Yun-peng Du

    2017-07-01

    Full Text Available Genome size (GS) diversity is of fundamental biological importance. The occurrence of giant genomes in angiosperms is restricted to just a few lineages in the analyzed genome size of plant species so far. It is still an open question whether GS diversity is shaped by neutral or natural selection. The genus Lilium, with giant genomes, is phylogenetically and horticulturally important and is distributed throughout the northern hemisphere. GS diversity in Lilium and the underlying evolutionary mechanisms are poorly understood. We performed a comprehensive study involving phylogenetically independent analysis on 71 species to explore the diversity and evolution of GS and its correlation with karyological and environmental traits within Lilium (including Nomocharis). The strong phylogenetic signal detected for GS in the genus provides evidence consistent with repetitive DNA being the primary contributor to the GS diversity, while the significant positive relationships detected between GS and the haploid chromosome length (HCL) provide insights into patterns of genome evolution. The relationships between GS and karyotypes indicate that ancestral karyotypes of Lilium are likely to have exhibited small genomes, low diversity in centromeric index (CVCI) values and relatively high relative variation in chromosome length (CVCL) values. Significant relationships identified between GS and annual temperature and between GS and annual precipitation suggest that adaptation to habitat strongly influences GS diversity. We conclude that GS in Lilium is shaped by both neutral (genetic drift) and adaptive evolution. These findings will have important consequences for understanding the evolution of giant plant genomes, and exploring the role of repetitive DNA fraction and chromosome changes in a plant group with large genomes and conservation of chromosome number.

  7. Modeling and E-M estimation of haplotype-specific relative risks from genotype data for a case-control study of unrelated individuals.

    Science.gov (United States)

    Stram, Daniel O; Leigh Pearce, Celeste; Bretsky, Phillip; Freedman, Matthew; Hirschhorn, Joel N; Altshuler, David; Kolonel, Laurence N; Henderson, Brian E; Thomas, Duncan C

    2003-01-01

    The US National Cancer Institute has recently sponsored the formation of a Cohort Consortium (http://2002.cancer.gov/scpgenes.htm) to facilitate the pooling of data on very large numbers of people, concerning the effects of genes and environment on cancer incidence. One likely goal of these efforts will be to generate a large population-based case-control series for which a number of candidate genes will be investigated using SNP haplotype as well as genotype analysis. The goal of this paper is to outline the issues involved in choosing a method of estimating haplotype-specific risk estimates for such data that is technically appropriate and yet attractive to epidemiologists who are already comfortable with odds ratios and logistic regression. Our interest is to develop and evaluate extensions of methods, based on haplotype imputation, that have been recently described (Schaid et al., Am J Hum Genet, 2002, and Zaykin et al., Hum Hered, 2002) as providing score tests of the null hypothesis of no effect of SNP haplotypes upon risk, which may be used for more complex tasks, such as providing confidence intervals, and tests of equivalence of haplotype-specific risks in two or more separate populations. In order to do so we (1) develop a cohort approach towards odds ratio analysis by expanding the E-M algorithm to provide maximum likelihood estimates of haplotype-specific odds ratios as well as genotype frequencies; (2) show how to correct the cohort approach, to give essentially unbiased estimates for population-based or nested case-control studies by incorporating the probability of selection as a case or control into the likelihood, based on a simplified model of case and control selection, and (3) finally, in an example data set (CYP17 and breast cancer, from the Multiethnic Cohort Study) we compare likelihood-based confidence interval estimates from the two methods with each other, and with the use of the single-imputation approach of Zaykin et al. applied under both
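
    The E-M step that the paper builds on can be sketched for the simplest case: estimating haplotype frequencies at two biallelic SNPs from unphased genotypes. This is the standard building block only, assuming Hardy-Weinberg proportions, and not the authors' extension to haplotype-specific odds ratios; the function name, the 0/1/2 genotype coding, and the iteration settings are assumptions for illustration.

```python
def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by E-M.

    genotypes: non-empty list of (g1, g2), each the count (0, 1 or 2) of
    allele 1 at SNP 1 and SNP 2. Haplotypes are keyed as (a, b), a, b in {0, 1}.
    """
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    freqs = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # At least one SNP is homozygous, so the phase is unambiguous.
                for hap in zip(alleles[g1], alleles[g2]):
                    counts[hap] += 1
        # M-step: new frequencies are expected counts per chromosome.
        new_freqs = {h: c / n_chrom for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in freqs)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs
```

    Only double heterozygotes carry phase uncertainty in the two-SNP case; with more SNPs the set of compatible haplotype pairs grows and the E-step enumerates all of them.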

  8. Haplotype Diversity at Sub1 Locus and Allelic Distribution Among Rice Varieties of Tide and Flood Prone Areas of South-East Asia

    Directory of Open Access Journals (Sweden)

    A.S.M. Masuduzzaman

    2017-07-01

    Full Text Available Single nucleotide polymorphisms and restriction digestion-based haplotype variations among 160 flood prone rice varieties were analyzed with enzymes Alu I and Cac8 I to generate polymorphisms at Sub1A and Sub1C loci (conferring submergence tolerance), respectively. Haplotype associated with phenotype was used to study the haplotype variations at Sub1A and Sub1C loci and to determine their functional influence on submergence tolerance and stem elongation. Three patterns at Sub1A locus, Sub1A0 (null allele), Sub1A1 (does not cut) and Sub1A2 (one SNP), and four patterns at Sub1C locus, Sub1C1, Sub1C2, Sub1C3 and Sub1C4, were generated. Both tolerant Sub1A1 and intolerant Sub1A2 had the same length, but the difference was presence of a restriction site in the Sub1A2, but absent at the Sub1A1. Further, two types of polymorphism were detected at the Sub1C, one included major length polymorphisms (165, 170 and 175 bp) and the other was a single restriction site at different position. Eight haplotypes (different combinations of the two loci), A1C1, A1C2, A1C4, A2C2, A2C4, A0C2, A0C3 and A0C4, were detected among 160 varieties. Haplotype A1C1 was comparatively more related to haplotypes A1C2 and A1C4, having the same Sub1A allele, and these haplotypes were found only in Bangladeshi, Sri Lankan and Indian varieties. Most tolerant varieties in A1C1 haplotype showed slow elongation, having tolerant specific Sub1A1 and Sub1C1 alleles. Further, the varieties Madabaru and Kottamali (A2C2) also showed moderate level of tolerance without Sub1A1 allele. These varieties were different with FR13A and also suspected to carry different novel tolerant genes at other loci. These materials could be used for hybridization with Sub1 varieties for pyramiding additional tolerant specific alleles into a single genotype for improving submergence tolerance in rice.

  9. Mitochondrial control region haplotypes of the South American sea lion Otaria flavescens (Shaw, 1800)

    Directory of Open Access Journals (Sweden)

    L.O. Artico

    2010-09-01

    Full Text Available The South American sea lion, Otaria flavescens, is widely distributed along the Pacific and Atlantic coasts of South America. However, along the Brazilian coast, there are only two nonbreeding sites for the species (Refúgio de Vida Silvestre da Ilha dos Lobos and Refúgio de Vida Silvestre do Molhe Leste da Barra do Rio Grande), both in Southern Brazil. In this region, the species is continuously under the effect of anthropic activities, mainly those related to environmental contamination with organic and inorganic chemicals and fishery interactions. This paper reports, for the first time, the genetic diversity of O. flavescens found along the Southern Brazilian coast. A 287-bp fragment of the mitochondrial DNA control region (D-loop) was analyzed. Seven novel haplotypes were found in 56 individuals (OFA1-OFA7), with OFA1 being the most frequent (47.54%). Nucleotide diversity was moderate (π = 0.62%) and haplotype diversity was relatively low (67%). Furthermore, the median joining network analysis indicated that Brazilian haplotypes formed a reciprocal monophyletic clade when compared to the haplotypes from the Peruvian population on the Pacific coast. These two populations do not share haplotypes and may have become isolated some time back. Further genetic studies covering the entire species distribution are necessary to better understand the biological implications of the results reported here for the management and conservation of South American sea lions.
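
    Haplotype diversity of the kind reported in this record is commonly computed with Nei's estimator, h = n / (n - 1) * (1 - sum of squared haplotype frequencies). Whether the authors used exactly this estimator is an assumption; the sketch below only illustrates the formula, and the function name is hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotype_labels):
    """Nei's haplotype diversity for a sample of haplotype labels.

    haplotype_labels: one label per sampled individual, e.g. "OFA1".
    Returns a value between 0 (one haplotype only) and 1 (all distinct).
    """
    n = len(haplotype_labels)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotype_labels)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # The n / (n - 1) factor corrects for the finite sample size.
    return n / (n - 1) * (1 - sum_sq)
```

    The input is one label per individual, so the per-haplotype counts from a study like this one would be expanded into a list before calling the function.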

  10. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing

    Directory of Open Access Journals (Sweden)

    M. Michelle Malmberg

    2018-04-01

    Full Text Available Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  11. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Science.gov (United States)

    Palta, Priit; Kaplinski, Lauris; Nagirnaja, Liina; Veidenberg, Andres; Möls, Märt; Nelis, Mari; Esko, Tõnu; Metspalu, Andres; Laan, Maris; Remm, Maido

    2015-01-01

    DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which enables to resolve the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm allows to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV enabled to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  12. Haplotype phasing and inheritance of copy number variants in nuclear families.

    Directory of Open Access Journals (Sweden)

    Priit Palta

    Full Text Available DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which enables to resolve the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm allows to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV enabled to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

  13. Correlation exploration of metabolic and genomic diversity in rice

    Directory of Open Access Journals (Sweden)

    Shinozaki Kazuo

    2009-12-01

    Full Text Available Abstract Background It is essential to elucidate the relationship between metabolic and genomic diversity to understand the genetic regulatory networks associated with the changing metabolo-phenotype among natural variation and/or populations. Recent innovations in metabolomics technologies allow us to grasp the comprehensive features of the metabolome. Metabolite quantitative trait analysis is a key approach for the identification of genetic loci involved in metabolite variation using segregated populations. Although several attempts have been made to find correlative relationships between genetic and metabolic diversity among natural populations in various organisms, it is still unclear whether it is possible to discover such correlations between each metabolite and the polymorphisms found at each chromosomal location. To assess the correlative relationship between the metabolic and genomic diversity found in rice accessions, we compared the distance matrices for these two "omics" patterns in the rice accessions. Results We selected 18 accessions from the world rice collection based on their population structure. To determine the genomic diversity of the rice genome, we genotyped 128 restriction fragment length polymorphism (RFLP) markers to calculate the genetic distance among the accessions. To identify the variations in the metabolic fingerprint, a soluble extract from the seed grain of each accession was analyzed with one dimensional 1H-nuclear magnetic resonance (NMR). We found no correlation between global metabolic diversity and the phylogenetic relationships among the rice accessions (rs = 0.14) by analyzing the distance matrices (calculated from the pattern of the metabolic fingerprint in the 4.29- to 0.71-ppm 1H chemical shift and the genetic distance on the basis of the RFLP markers). However, local correlation analysis between the distance matrices (derived from each 0.04-ppm integral region of the 1H chemical shift) against genetic
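
The record above compares a metabolic distance matrix with a genetic distance matrix and reports a rank correlation (rs). A minimal Python sketch of that kind of comparison follows, assuming two symmetric distance matrices over the same accessions in the same order. The helper names are hypothetical and the study's actual software is not stated in the abstract; this simply computes a Spearman coefficient over the off-diagonal pairs.

```python
from itertools import combinations
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def upper_triangle(matrix: Matrix) -> List[float]:
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_between_matrices(m1: Matrix, m2: Matrix) -> float:
    """Spearman correlation of the pairwise distances in two matrices."""
    return pearson(average_ranks(upper_triangle(m1)),
                   average_ranks(upper_triangle(m2)))
```

Because pairwise distances are not independent observations, a significance test for such a coefficient is usually done by permuting the rows and columns of one matrix (a Mantel-type test) rather than with a standard p-value; that part is not shown here.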

  14. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    KAUST Repository

    Neave, Matthew J.

    2017-01-17

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts.

  15. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    KAUST Repository

    Neave, Matthew J.; Michell, Craig; Apprill, Amy; Voolstra, Christian R.

    2017-01-01

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts.

  16. Detection of selection signatures of population-specific genomic regions selected during domestication process in Jinhua pigs.

    Science.gov (United States)

    Li, Zhengcao; Chen, Jiucheng; Wang, Zhen; Pan, Yuchun; Wang, Qishan; Xu, Ningying; Wang, Zhengguang

    2016-12-01

    Chinese pigs have been undergoing both natural and artificial selection for thousands of years. Jinhua pigs are of great importance, as they can be a valuable model for exploring the genetic mechanisms linked to meat quality and other traits such as disease resistance, reproduction and production. The purpose of this study was to identify distinctive footprints of selection between Jinhua pigs and other breeds utilizing genome-wide SNP data. Genotyping by genome reducing and sequencing was implemented in order to perform cross-population extended haplotype homozygosity to reveal strong signatures of selection for those economically important traits. This work was performed at a 2% genome level, which comprised 152 006 SNPs genotyped in a total of 517 individuals. Population-specific footprints of selective sweeps were searched for in the genome of Jinhua pigs using six native breeds and three European breeds as reference groups. Several candidate genes associated with meat quality, health and reproduction, such as GH1, CRHR2, TRAF4 and CCK, were found to be overlapping with the significantly positive outliers. Additionally, the results revealed that some genomic regions associated with meat quality, immune response and reproduction in Jinhua pigs have evolved directionally under domestication and subsequent selections. The identified genes and biological pathways in Jinhua pigs showed different selection patterns in comparison with the Chinese and European breeds. © 2016 Stichting International Foundation for Animal Genetics.

  17. Minimal sharing of Y-chromosome STR haplotypes among five endogamous population groups from western and southwestern India.

    Science.gov (United States)

    Das, Birajalaxmi; Chauhan, P S; Seshadri, M

    2004-10-01

    We attempt to address the issue of genetic variation and the pattern of male gene flow among and between five Indian population groups of two different geographic and linguistic affiliations using Y-chromosome markers. We studied 221 males at three Y-chromosome biallelic loci and 184 males for the five Y-chromosome STRs. We observed 111 Y-chromosome STR haplotypes. An analysis of molecular variance (AMOVA) based on Y-chromosome STRs showed that the variation observed between the population groups belonging to two major regions (western and southwestern India) was 0.17%, which was significantly lower than the level of genetic variance among the five populations (0.59%) considered as a single group. Combined haplotype analysis of the five STRs and the biallelic locus 92R7 revealed minimal sharing of haplotypes among these five ethnic groups, irrespective of the similar origin of the linguistic and geographic affiliations; this minimal sharing indicates restricted male gene flow. As a consequence, most of the haplotypes were population specific. Network analysis showed that the haplotypes, which were shared between the populations, seem to have originated from different mutational pathways at different loci. Biallelic markers showed that all five ethnic groups have a similar ancestral origin despite their geographic and linguistic diversity.

  18. The African Genome Variation Project shapes medical genetics in Africa

    Science.gov (United States)

    Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

    2014-01-01

    Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterisation of African genetic diversity is needed. The African Genome Variation Project (AGVP) provides a resource to help design, implement and interpret genomic studies in sub-Saharan Africa (SSA) and worldwide. The AGVP represents dense genotypes from 1,481 and whole genome sequences (WGS) from 320 individuals across SSA. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across SSA. We identify new loci under selection, including for malaria and hypertension. We show that modern imputation panels can identify association signals at highly differentiated loci across populations in SSA. Using WGS, we show further improvement in imputation accuracy supporting efforts for large-scale sequencing of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa, showing for the first time that such designs are feasible. PMID:25470054

  19. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    Directory of Open Access Journals (Sweden)

    Amaury Vaysse

    2011-10-01

    Full Text Available The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.

  20. Genome-wide association mapping for yield and other agronomic traits in an elite breeding population of tropical rice (Oryza sativa).

    Science.gov (United States)

    Begum, Hasina; Spindel, Jennifer E; Lalusin, Antonio; Borromeo, Teresita; Gregorio, Glenn; Hernandez, Jose; Virk, Parminder; Collard, Bertrand; McCouch, Susan R

    2015-01-01

    Genome-wide association mapping studies (GWAS) are frequently used to detect QTL in diverse collections of crop germplasm, based on historic recombination events and linkage disequilibrium across the genome. Generally, diversity panels genotyped with high density SNP panels are utilized in order to assay a wide range of alleles and haplotypes and to monitor recombination breakpoints across the genome. By contrast, GWAS have not generally been performed in breeding populations. In this study we performed association mapping for 19 agronomic traits including yield and yield components in a breeding population of elite irrigated tropical rice breeding lines so that the results would be more directly applicable to breeding than those from a diversity panel. The population was genotyped with 71,710 SNPs using genotyping-by-sequencing (GBS), and GWAS performed with the explicit goal of expediting selection in the breeding program. Using this breeding panel we identified 52 QTL for 11 agronomic traits, including large effect QTLs for flowering time and grain length/grain width/grain-length-breadth ratio. We also identified haplotypes that can be used to select plants in our population for short stature (plant height), early flowering time, and high yield, and thus demonstrate the utility of association mapping in breeding populations for informing breeding decisions. We conclude by exploring how the newly identified significant SNPs and insights into the genetic architecture of these quantitative traits can be leveraged to build genomic-assisted selection models.

  1. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  2. Low diversity, activity, and density of transposable elements in five avian genomes.

    Science.gov (United States)

    Gao, Bo; Wang, Saisai; Wang, Yali; Shen, Dan; Xue, Songlei; Chen, Cai; Cui, Hengmi; Song, Chengyi

    2017-07-01

    In this study, we analyzed the activity, diversity, and density of transposable elements (TEs) across five avian genomes (budgerigar, chicken, turkey, medium ground finch, and zebra finch) to explore a potential reason for the small genome sizes of birds. We found that these avian genomes exhibited a low density of TEs, at about 10% genome coverage, and a low diversity of TEs, with the TE landscapes dominated by CR1 and ERV elements; contrasting proliferation dynamics, both between TE types and between species, were observed across the five avian genomes. Phylogenetic analysis revealed that the CR1 clade was more diverse in family structure than the R2 clade in birds; avian ERVs were classified into four clades (alpha, beta, gamma, and ERV-L) and belonged to three classes of ERV, with an uneven distribution in these lineages. The activities of DNA and SINE TEs were very low in the evolutionary history of avian genomes; most LINEs and LTRs were ancient copies with a substantial recent decrease in activity, with only LTRs and LINEs in chicken and zebra finch exhibiting weak activity very recently, and very few TEs were intact; however, the recent activity may be underestimated due to the sequencing/assembly technologies used in some species. Overall, this study demonstrates low diversity, activity, and density of TEs in the five avian species; highlights the differences of TEs in these lineages; and suggests that the current and recent activity of TEs in avian genomes is very limited, which may be one of the reasons for small genome sizes in birds.

  3. Overview of worldwide diversity of Diaphorina citri Kuwayama mitochondrial cytochrome oxidase 1 haplotypes: two Old World lineages and a New World invasion

    Science.gov (United States)

    Boykin, L.M.; De Barro, P.; Hall, D.G.; Hunter, W.B.; McKenzie, C.L.; Powell, C.A.; Shatters, R.G.

    2012-01-01

    Relationships among worldwide collections of Diaphorina citri (Asian citrus psyllid) were analyzed using mitochondrial cytochrome oxidase I (mtCOI) haplotypes from novel primers. Sequences were produced from PCR amplicons of an 821bp portion of the mtCOI gene using D. citri specific primers, derived from an existing EST library. An alignment was constructed using 612bps of this fragment and consisted of 212 individuals from 52 collections representing 15 countries. There were a total of eight polymorphic sites that separated the sequences into eight different haplotypes (Dcit-1 through Dcit-8). Phylogenetic network analysis using the statistical parsimony software, TCS, suggests two major haplotype groups with preliminary geographic bias between southwestern Asia (SWA) and southeastern Asia (SEA). The recent (within the last 15 to 25 years) invasion into the New World originated from only the SWA group in the northern hemisphere (USA and Mexico) and from both the SEA and SWA groups in the southern hemisphere (Brazil). In only one case, Reunion Island, did haplotypes from both the SEA and SWA group appear in the same location. In Brazil, both groups were present, but in separate locations. The Dcit-1 SWA haplotype was the most frequently encountered, including ~50% of the countries sampled and 87% of the total sequences obtained from India, Pakistan and Saudi Arabia. The second most frequently encountered haplotype, Dcit-2, the basis of the SEA group, represented ~50% of the countries and contained most of the sequences from Southeast Asia and China. Interestingly, only the Caribbean collections (Puerto Rico and Guadeloupe) represented a unique haplotype not found in other countries, indicating no relationship between the USA (Florida) and Caribbean introductions. There is no evidence for cryptic speciation for D. citri based on the COI region included in this study. PMID:22717059
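
The record above reduces an alignment of mtCOI sequences to a small number of polymorphic sites and distinct haplotypes. The Python sketch below shows one simple way to do that bookkeeping for equal-length aligned sequences. The function names and toy data are illustrative assumptions, not the authors' pipeline (which used the TCS software for network analysis).

```python
from collections import Counter
from typing import List, Sequence


def polymorphic_sites(alignment: Sequence[str]) -> List[int]:
    """Return 0-based column indices where the sequences differ."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    return [i for i in range(length)
            if len({seq[i] for seq in alignment}) > 1]


def haplotype_counts(alignment: Sequence[str]) -> Counter:
    """Count haplotypes, each defined by its states at variable sites."""
    sites = polymorphic_sites(alignment)
    return Counter("".join(seq[i] for i in sites) for seq in alignment)


if __name__ == "__main__":
    # Toy alignment: four individuals, four aligned positions.
    toy = ["ACGT", "ACGT", "ATGT", "ATGA"]
    print(polymorphic_sites(toy))   # columns that vary
    print(haplotype_counts(toy))    # haplotype -> number of individuals
```

Collapsing each sequence to its states at the variable columns keeps haplotype labels short while preserving every difference between individuals.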

  4. Unique haplotypes of cacao trees as revealed by trnH-psbA chloroplast DNA

    Directory of Open Access Journals (Sweden)

    Nidia Gutiérrez-López

    2016-04-01

    Full Text Available Cacao trees have been cultivated in Mesoamerica for at least 4,000 years. In this study, we analyzed sequence variation in the chloroplast DNA trnH-psbA intergenic spacer from 28 cacao trees from different farms in the Soconusco region in southern Mexico. Genetic relationships were established by two analysis approaches based on geographic origin (five populations) and genetic origin (based on a previous study). We identified six polymorphic sites, including five insertion/deletion (indel) types and one transversion. The overall nucleotide diversity was low for both approaches (geographic = 0.0032 and genetic = 0.0038). Conversely, we obtained moderate to high haplotype diversity (0.66 and 0.80) with 10 and 12 haplotypes, respectively. The common haplotype (H1) for both networks included cacao trees from all geographic locations (geographic approach) and four genetic groups (genetic approach). This common haplotype (ancient) derived a set of intermediate haplotypes and singletons interconnected by one or two mutational steps, which suggested directional selection and event purification from the expansion of narrow populations. Cacao trees from Soconusco region were grouped into one cluster without any evidence of subclustering based on AMOVA (FST = 0) and SAMOVA (FST = 0.04393) results. One population (Mazatán) showed a high haplotype frequency; thus, this population could be considered an important reservoir of genetic material. The indels located in the trnH-psbA intergenic spacer of cacao trees could be useful as markers for the development of DNA barcoding.
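
The record above reports haplotype diversity and nucleotide diversity. As a hedged illustration, the Python sketch below computes the commonly used estimators: Nei's haplotype diversity, Hd = n/(n-1) * (1 - sum of squared haplotype frequencies), and nucleotide diversity as the mean number of pairwise differences per site. Whether the study used exactly these estimators, and how it scored indels, is not stated in the abstract; the function names and the choice to treat every aligned character (including a gap) as a state are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's haplotype (gene) diversity with the n/(n-1) correction."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences: Sequence[str]) -> float:
    """Mean pairwise differences per site for aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned and non-empty")
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / (len(pairs) * length)


if __name__ == "__main__":
    print(haplotype_diversity(["H1", "H1", "H2", "H3"]))
    print(nucleotide_diversity(["ACGT", "ACGT", "ATGT", "ATGA"]))
```

A sample with many haplotypes that differ at only a few positions gives high haplotype diversity but low nucleotide diversity, which is the pattern the abstract describes.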

  5. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication.

    Science.gov (United States)

    Avni, Raz; Nave, Moran; Barad, Omer; Baruch, Kobi; Twardziok, Sven O; Gundlach, Heidrun; Hale, Iago; Mascher, Martin; Spannagl, Manuel; Wiebe, Krystalee; Jordan, Katherine W; Golan, Guy; Deek, Jasline; Ben-Zvi, Batsheva; Ben-Zvi, Gil; Himmelbach, Axel; MacLachlan, Ron P; Sharpe, Andrew G; Fritz, Allan; Ben-David, Roi; Budak, Hikmet; Fahima, Tzion; Korol, Abraham; Faris, Justin D; Hernandez, Alvaro; Mikel, Mark A; Levy, Avraham A; Steffenson, Brian; Maccaferri, Marco; Tuberosa, Roberto; Cattivelli, Luigi; Faccioli, Primetta; Ceriotti, Aldo; Kashkush, Khalil; Pourkheirandish, Mohammad; Komatsuda, Takao; Eilam, Tamar; Sela, Hanan; Sharon, Amir; Ohad, Nir; Chamovitz, Daniel A; Mayer, Klaus F X; Stein, Nils; Ronen, Gil; Peleg, Zvi; Pozniak, Curtis J; Akhunov, Eduard D; Distelfeld, Assaf

    2017-07-07

    Wheat (Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent more than 10,000 years ago. Identifying genetic modifications underlying wheat's domestication requires knowledge about the genome of its allo-tetraploid progenitor, wild emmer (T. turgidum ssp. dicoccoides). We report a 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat, as well as analyses of gene content, genome architecture, and genetic diversity. With this fully assembled polyploid wheat genome, we identified the causal mutations in Brittle Rachis 1 (TtBtr1) genes controlling shattering, a key domestication trait. A study of genomic diversity among wild and domesticated accessions revealed genomic regions bearing the signature of selection under domestication. This reference assembly will serve as a resource for accelerating the genome-assisted improvement of modern wheat varieties. Copyright © 2017, American Association for the Advancement of Science.

  6. MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.

    Science.gov (United States)

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2013-01-01

    The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now reached several thousand. To efficiently explore the diversity of the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, an efficient incremental updating procedure can create an extended ortholog table by adding genomes to the default ortholog table generated from the representative set of genomes. Combined with the dynamic orthology calculation for any specified set of organisms, MBGD is an efficient and flexible tool for exploring microbial genome diversity.

  7. De Novo Assembly and Phasing of Dikaryotic Genomes from Two Isolates of Puccinia coronata f. sp. avenae, the Causal Agent of Oat Crown Rust.

    Science.gov (United States)

    Miller, Marisa E; Zhang, Ying; Omidvar, Vahid; Sperschneider, Jana; Schwessinger, Benjamin; Raley, Castle; Palmer, Jonathan M; Garnica, Diana; Upadhyaya, Narayana; Rathjen, John; Taylor, Jennifer M; Park, Robert F; Dodds, Peter N; Hirsch, Cory D; Kianian, Shahryar F; Figueroa, Melania

    2018-02-20

    Oat crown rust, caused by the fungus Puccinia coronata f. sp. avenae, is a devastating disease that impacts worldwide oat production. For much of its life cycle, P. coronata f. sp. avenae is dikaryotic, with two separate haploid nuclei that may vary in virulence genotype, highlighting the importance of understanding haplotype diversity in this species. We generated highly contiguous de novo genome assemblies of two P. coronata f. sp. avenae isolates, 12SD80 and 12NC29, from long-read sequences. In total, we assembled 603 primary contigs for 12SD80, for a total assembly length of 99.16 Mbp, and 777 primary contigs for 12NC29, for a total length of 105.25 Mbp; approximately 52% of each genome was assembled into alternate haplotypes. This revealed structural variation between haplotypes in each isolate equivalent to more than 2% of the genome size, in addition to about 260,000 and 380,000 heterozygous single-nucleotide polymorphisms in 12SD80 and 12NC29, respectively. Transcript-based annotation identified 26,796 and 28,801 coding sequences for isolates 12SD80 and 12NC29, respectively, including about 7,000 allele pairs in haplotype-phased regions. Furthermore, expression profiling revealed clusters of coexpressed secreted effector candidates, and the majority of orthologous effectors between isolates showed conservation of expression patterns. However, a small subset of orthologs showed divergence in expression, which may contribute to differences in virulence between 12SD80 and 12NC29. This study provides the first haplotype-phased reference genome for a dikaryotic rust fungus as a foundation for future studies into virulence mechanisms in P. coronata f. sp. avenae. IMPORTANCE Disease management strategies for oat crown rust are challenged by the rapid evolution of Puccinia coronata f. sp. avenae, which renders resistance genes in oat varieties ineffective. Despite the economic importance of understanding P. coronata f. sp. avenae, resources to study the

  8. Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Pauline M Goubet

    Full Text Available Self-incompatibility has been considered by geneticists a model system for reproductive biology and balancing selection, but our understanding of the genetic basis and evolution of this molecular lock-and-key system has remained limited by the extreme level of sequence divergence among haplotypes, resulting in a lack of appropriate genomic sequences. In this study, we report and analyze the full sequence of eleven distinct haplotypes of the self-incompatibility locus (S-locus) in two closely related Arabidopsis species, obtained from individual BAC libraries. We use this extensive dataset to highlight sharply contrasted patterns of molecular evolution of each of the two genes controlling self-incompatibility themselves, as well as of the genomic region surrounding them. We find strong collinearity of the flanking regions among haplotypes on each side of the S-locus together with high levels of sequence similarity. In contrast, the S-locus region itself shows spectacularly deep gene genealogies, high variability in size and gene organization, as well as complete absence of sequence similarity in intergenic sequences and striking accumulation of transposable elements. Of particular interest, we demonstrate that dominant and recessive S-haplotypes experience sharply contrasted patterns of molecular evolution. Indeed, dominant haplotypes exhibit larger size and a much higher density of transposable elements, being matched only by that in the centromere. Overall, these properties highlight that the S-locus presents many striking similarities with other regions involved in the determination of mating-types, such as sex chromosomes in animals or in plants, or the mating-type locus in fungi and green algae.

  9. Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages.

    Directory of Open Access Journals (Sweden)

    Karen K Klyczek

    Full Text Available The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45-68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate.

  10. Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages.

    Science.gov (United States)

    Klyczek, Karen K; Bonilla, J Alfred; Jacobs-Sera, Deborah; Adair, Tamarah L; Afram, Patricia; Allen, Katherine G; Archambault, Megan L; Aziz, Rahat M; Bagnasco, Filippa G; Ball, Sarah L; Barrett, Natalie A; Benjamin, Robert C; Blasi, Christopher J; Borst, Katherine; Braun, Mary A; Broomell, Haley; Brown, Conner B; Brynell, Zachary S; Bue, Ashley B; Burke, Sydney O; Casazza, William; Cautela, Julia A; Chen, Kevin; Chimalakonda, Nitish S; Chudoff, Dylan; Connor, Jade A; Cross, Trevor S; Curtis, Kyra N; Dahlke, Jessica A; Deaton, Bethany M; Degroote, Sarah J; DeNigris, Danielle M; DeRuff, Katherine C; Dolan, Milan; Dunbar, David; Egan, Marisa S; Evans, Daniel R; Fahnestock, Abby K; Farooq, Amal; Finn, Garrett; Fratus, Christopher R; Gaffney, Bobby L; Garlena, Rebecca A; Garrigan, Kelly E; Gibbon, Bryan C; Goedde, Michael A; Guerrero Bustamante, Carlos A; Harrison, Melinda; Hartwell, Megan C; Heckman, Emily L; Huang, Jennifer; Hughes, Lee E; Hyduchak, Kathryn M; Jacob, Aswathi E; Kaku, Machika; Karstens, Allen W; Kenna, Margaret A; Khetarpal, Susheel; King, Rodney A; Kobokovich, Amanda L; Kolev, Hannah; Konde, Sai A; Kriese, Elizabeth; Lamey, Morgan E; Lantz, Carter N; Lapin, Jonathan S; Lawson, Temiloluwa O; Lee, In Young; Lee, Scott M; Lee-Soety, Julia Y; Lehmann, Emily M; London, Shawn C; Lopez, A Javier; Lynch, Kelly C; Mageeney, Catherine M; Martynyuk, Tetyana; Mathew, Kevin J; Mavrich, Travis N; McDaniel, Christopher M; McDonald, Hannah; McManus, C Joel; Medrano, Jessica E; Mele, Francis E; Menninger, Jennifer E; Miller, Sierra N; Minick, Josephine E; Nabua, Courtney T; Napoli, Caroline K; Nkangabwa, Martha; Oates, Elizabeth A; Ott, Cassandra T; Pellerino, Sarah K; Pinamont, William J; Pirnie, Ross T; Pizzorno, Marie C; Plautz, Emilee J; Pope, Welkin H; Pruett, Katelyn M; Rickstrew, Gabbi; Rimple, Patrick A; Rinehart, Claire A; Robinson, Kayla M; Rose, Victoria A; Russell, Daniel A; Schick, Amelia M; Schlossman, Julia; Schneider, Victoria M; Sells, Chloe A; 
Sieker, Jeremy W; Silva, Morgan P; Silvi, Marissa M; Simon, Stephanie E; Staples, Amanda K; Steed, Isabelle L; Stowe, Emily L; Stueven, Noah A; Swartz, Porter T; Sweet, Emma A; Sweetman, Abigail T; Tender, Corrina; Terry, Katrina; Thomas, Chrystal; Thomas, Daniel S; Thompson, Allison R; Vanderveen, Lorianna; Varma, Rohan; Vaught, Hannah L; Vo, Quynh D; Vonberg, Zachary T; Ware, Vassie C; Warrad, Yasmene M; Wathen, Kaitlyn E; Weinstein, Jonathan L; Wyper, Jacqueline F; Yankauskas, Jakob R; Zhang, Christine; Hatfull, Graham F

    2017-01-01

    The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45-68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate.

  11. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    Directory of Open Access Journals (Sweden)

    João Perdigão

    2015-01-01

    Full Text Available Multidrug- (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidence rate and unusual and successful XDR-TB strains that have been found in circulation for almost two decades. For the last 20 years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS). Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon. It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB) isolates, mostly recovered in Lisbon, were genotyped by 24-loci Mycobacterial Interspersed Repetitive Unit – Variable Number of Tandem Repeats (MIRU-VNTR) and the genomes sequenced using a next generation sequencing platform – Illumina HiSeq 2000. The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1), two of which are associated with XDR-TB (Lisboa3-B and Q1), whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region

  12. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    KAUST Repository

    Perdigão, João; Silva, Hugo; Machado, Diana; Macedo, Rita; Maltez, Fernando; Silva, Carla; Jordao, Luisa; Couto, Isabel; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A.; McNerney, Ruth; Pain, Arnab; Clark, Taane G.; Viveiros, Miguel; Portugal, Isabel

    2015-01-01

    Multidrug- (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidence rate and unusual and successful XDR-TB strains that have been found in circulation for almost two decades. For the last 20 years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS). Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon. It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB) isolates, mostly recovered in Lisbon, were genotyped by 24-loci Mycobacterial Interspersed Repetitive Unit - Variable Number of Tandem Repeats (MIRU-VNTR) and the genomes sequenced using a next generation sequencing platform - Illumina HiSeq 2000. The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1), two of which are associated with XDR-TB (Lisboa3-B and Q1), whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5.
Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region: Lisboa3 and Q

  13. Genomic diversity of drug-resistant Mycobacterium tuberculosis isolates in Lisbon Portugal: Towards tuberculosis genomic epidemiology

    KAUST Repository

    Perdigão, João

    2015-03-01

    Multidrug- (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. Lisbon, Portugal, has a high TB incidence rate and unusual and successful XDR-TB strains that have been found in circulation for almost two decades. For the last 20 years, a continued circulation of two phylogenetic clades, Lisboa3 and Q1, which are highly associated with MDR and XDR, have been observed. In recent years, these strains have been well characterized regarding the molecular basis of drug resistance and have been inclusively subjected to whole genome sequencing (WGS). Researchers have been studying the genomic diversity of strains circulating in Lisbon and its genomic determinants through cutting-edge next generation sequencing. An enormous amount of whole genome sequence data are now available for the most prevalent and clinically relevant strains circulating in Lisbon. It is the persistence, prevalence and rapid evolution towards drug resistance that has prompted researchers to investigate the properties of these strains at the genomic level and in the future at a global transcriptomic level. Seventy Mycobacterium tuberculosis (MTB) isolates, mostly recovered in Lisbon, were genotyped by 24-loci Mycobacterial Interspersed Repetitive Unit - Variable Number of Tandem Repeats (MIRU-VNTR) and the genomes sequenced using a next generation sequencing platform - Illumina HiSeq 2000. The genotyping data revealed three major clusters associated with MDR-TB (Lisboa3-A, Lisboa3-B and Q1), two of which are associated with XDR-TB (Lisboa3-B and Q1), whilst the genomic data contributed to elucidating the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5.
Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of MTB clinical isolates, revealed two major clades responsible for MDR/XDR-TB in the region: Lisboa3 and Q

  14. Haplotype Analysis of the Pre-harvest Sprouting Resistance Locus Phs-A1 Reveals a Causal Role of TaMKK3-A in Global Germplasm.

    Science.gov (United States)

    Shorinola, Oluwaseyi; Balcárková, Barbara; Hyles, Jessica; Tibbits, Josquin F G; Hayden, Matthew J; Holušova, Katarina; Valárik, Miroslav; Distelfeld, Assaf; Torada, Atsushi; Barrero, Jose M; Uauy, Cristobal

    2017-01-01

    Pre-harvest sprouting (PHS) is an important cause of quality loss in many cereal crops and is particularly prevalent and damaging in wheat. Resistance to PHS is therefore a valuable target trait in many breeding programs. The Phs-A1 locus on wheat chromosome arm 4AL has been consistently shown to account for a significant proportion of natural variation to PHS in diverse mapping populations. However, the deployment of sprouting resistance is confounded by the fact that different candidate genes, including the tandem duplicated Plasma Membrane 19 (PM19) genes and the mitogen-activated protein kinase kinase 3 (TaMKK3-A) gene, have been proposed to underlie Phs-A1. To further define the Phs-A1 locus, we constructed a physical map across this interval in hexaploid and tetraploid wheat. We established close proximity of the proposed candidate genes which are located within a 1.2 Mb interval. Genetic characterization of diverse germplasm used in previous genetic mapping studies suggests that TaMKK3-A, and not PM19, is the major gene underlying the Phs-A1 effect in European, North American, Australian and Asian germplasm. We identified the non-dormant TaMKK3-A allele at low frequencies within the A-genome diploid progenitor Triticum urartu genepool, and show an increase in the allele frequency in modern varieties. In United Kingdom varieties, the frequency of the dormant TaMKK3-A allele was significantly higher in bread-making quality varieties compared to feed and biscuit-making cultivars. Analysis of exome capture data from 58 diverse hexaploid wheat accessions identified fourteen haplotypes across the extended Phs-A1 locus and four haplotypes for TaMKK3-A. Analysis of these haplotypes in a collection of United Kingdom and Australian cultivars revealed distinct major dormant and non-dormant Phs-A1 haplotypes in each country, which were either rare or absent in the opposing germplasm set.
The diagnostic markers and haplotype information reported in the study will

  15. Haplotype Analysis of the Pre-harvest Sprouting Resistance Locus Phs-A1 Reveals a Causal Role of TaMKK3-A in Global Germplasm

    Directory of Open Access Journals (Sweden)

    Oluwaseyi Shorinola

    2017-09-01

    Full Text Available Pre-harvest sprouting (PHS) is an important cause of quality loss in many cereal crops and is particularly prevalent and damaging in wheat. Resistance to PHS is therefore a valuable target trait in many breeding programs. The Phs-A1 locus on wheat chromosome arm 4AL has been consistently shown to account for a significant proportion of natural variation to PHS in diverse mapping populations. However, the deployment of sprouting resistance is confounded by the fact that different candidate genes, including the tandem duplicated Plasma Membrane 19 (PM19) genes and the mitogen-activated protein kinase kinase 3 (TaMKK3-A) gene, have been proposed to underlie Phs-A1. To further define the Phs-A1 locus, we constructed a physical map across this interval in hexaploid and tetraploid wheat. We established close proximity of the proposed candidate genes which are located within a 1.2 Mb interval. Genetic characterization of diverse germplasm used in previous genetic mapping studies suggests that TaMKK3-A, and not PM19, is the major gene underlying the Phs-A1 effect in European, North American, Australian and Asian germplasm. We identified the non-dormant TaMKK3-A allele at low frequencies within the A-genome diploid progenitor Triticum urartu genepool, and show an increase in the allele frequency in modern varieties. In United Kingdom varieties, the frequency of the dormant TaMKK3-A allele was significantly higher in bread-making quality varieties compared to feed and biscuit-making cultivars. Analysis of exome capture data from 58 diverse hexaploid wheat accessions identified fourteen haplotypes across the extended Phs-A1 locus and four haplotypes for TaMKK3-A. Analysis of these haplotypes in a collection of United Kingdom and Australian cultivars revealed distinct major dormant and non-dormant Phs-A1 haplotypes in each country, which were either rare or absent in the opposing germplasm set.
The diagnostic markers and haplotype information reported in the

  16. Patterns of linkage disequilibrium and haplotype distribution in disease candidate genes.

    Science.gov (United States)

    Long, Ji-Rong; Zhao, Lan-Juan; Liu, Peng-Yuan; Lu, Yan; Dvornyk, Volodymyr; Shen, Hui; Liu, Yong-Jun; Zhang, Yuan-Yuan; Xiong, Dong-Hai; Xiao, Peng; Deng, Hong-Wen

    2004-05-24

    The adequacy of association studies for complex diseases depends critically on the existence of linkage disequilibrium (LD) between functional alleles and surrounding SNP markers. We examined the patterns of LD and haplotype distribution in eight candidate genes for osteoporosis and/or obesity using 31 SNPs in 1,873 subjects. These eight genes are apolipoprotein E (APOE), type I collagen alpha1 (COL1A1), estrogen receptor-alpha (ER-alpha), leptin receptor (LEPR), parathyroid hormone (PTH)/PTH-related peptide receptor type 1 (PTHR1), transforming growth factor-beta1 (TGF-beta1), uncoupling protein 3 (UCP3), and vitamin D (1,25-dihydroxyvitamin D3) receptor (VDR). Yin yang haplotypes, two high-frequency haplotypes composed of completely mismatching SNP alleles, were examined. To quantify LD patterns, two common measures of LD, D' and r², were calculated for the SNPs within the genes. The haplotype distribution varied in the different genes. Yin yang haplotypes were observed only in PTHR1 and UCP3. D' ranged from 0.020 to 1.000 with an average of 0.475, whereas the average r² was 0.158 (ranging from 0.000 to 0.883). A decay of LD was observed as the intermarker distance increased; however, LD characteristics differed greatly between genes and even between regions within a gene. The differences in haplotype distributions and LD patterns among the genes underscore the importance of characterizing genomic regions of interest prior to association studies.
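
    The two LD measures named in this abstract can be illustrated with a short sketch. It uses the textbook definitions of D, D' and r² for a pair of biallelic SNPs; the function name `ld_stats` and the example frequencies are hypothetical and are not taken from the study, which does not state what software it used.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    All inputs are illustrative; real data would be estimated from phased haplotypes.
    """
    # Raw disequilibrium coefficient
    d = p_ab - p_a * p_b

    # D' rescales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Allele A only ever occurs together with B, but B also occurs without A
    d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.6)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

    In the example call D' reaches its maximum while r² stays well below 1, which is one reason average D' values tend to be higher than average r² values for the same set of SNP pairs.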

  17. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Full Text Available Human endogenous retroviruses (HERV) sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphic test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into the human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  18. Extensive Genomic Diversity among Bovine-Adapted Staphylococcus aureus: Evidence for a Genomic Rearrangement within CC97.

    Directory of Open Access Journals (Sweden)

    Kathleen E Budd

    Full Text Available Staphylococcus aureus is an important pathogen associated with both human and veterinary disease and is a common cause of bovine mastitis. Genomic heterogeneity exists between S. aureus strains and has been implicated in the adaptation of specific strains to colonise particular mammalian hosts. Knowledge of the factors required for host specificity and virulence is important for understanding the pathogenesis and management of S. aureus mastitis. In this study, a panel of mastitis-associated S. aureus isolates (n = 126) was tested for resistance to antibiotics commonly used to treat mastitis. Over half of the isolates (52%) demonstrated resistance to penicillin and ampicillin but all were susceptible to the other antibiotics tested. S. aureus isolates were further examined for their clonal diversity by Multi-Locus Sequence Typing (MLST). In total, 18 different sequence types (STs) were identified and eBURST analysis demonstrated that the majority of isolates grouped into clonal complexes CC97, CC151 or sequence type (ST) 136. Analysis of the role of recombination events in determining S. aureus population structure determined that ST diversification through nucleotide substitutions was more likely to be due to recombination than to point mutation, with regions of the genome possibly acting as recombination hotspots. DNA microarray analysis revealed a large number of differences amongst S. aureus STs in their variable genome content, including genes associated with capsule and biofilm formation and adhesion factors. Finally, evidence for a genomic rearrangement was observed within isolates from CC97 with the ST71-like subgroup showing evidence of an IS431 insertion element having replaced approximately 30 kb of DNA including the ica operon and histidine biosynthesis genes, resulting in histidine auxotrophy. This genomic rearrangement may be responsible for the diversification of ST71 into an emerging bovine adapted subgroup.

  19. Mitochondrial genome diversity in the Tubalar, Even, and Ulchi: contribution to prehistory of native Siberians and their affinities to Native Americans.

    Science.gov (United States)

    Sukernik, Rem I; Volodko, Natalia V; Mazunin, Ilya O; Eltsov, Nikolai P; Dryomov, Stanislav V; Starikovskaya, Elena B

    2012-05-01

    To fill remaining gaps in mitochondrial DNA diversity in the least surveyed eastern and western flanks of Siberia, 391 mtDNA samples (144 Tubalar from Altai, 87 Even from northeastern Siberia, and 160 Ulchi from the Russian Far East) were characterized via high-resolution restriction fragment length polymorphism/single nucleotide polymorphisms analysis. The subhaplogroup structure was extended through complete sequencing of 67 mtDNA samples selected from these and other related native Siberians. Specifically, we have focused on the evolutionary histories of the derivatives of M and N haplogroups, putatively reflecting different phases of settling Siberia by early modern humans. Population history and phylogeography of the resulting mtDNA genomes, combined with those from previously published data sets, revealed a wide range of tribal- and region-specific mtDNA haplotypes that emerged or diversified in Siberia before or after the last glacial maximum, ∼18 kya. Spatial distribution and ages of the "east" and "west" Eurasian mtDNA haploclusters suggest that anatomically modern humans that originally colonized Altai derived from macrohaplogroup N and came from Southwest Asia around 38,000 years ago. The derivatives of macrohaplogroup M, which largely emerged or diversified within the Russian Far East, came along with subsequent migrations to West Siberia millennia later. The last glacial maximum played a critical role in the timing and character of the settlement of the Siberian subcontinent. Copyright © 2012 Wiley Periodicals, Inc.

  20. The effect of genetic bottlenecks and inbreeding on the incidence of two major autoimmune diseases in standard poodles, sebaceous adenitis and Addison's disease.

    Science.gov (United States)

    Pedersen, Niels C; Brucker, Lynn; Tessier, Natalie Green; Liu, Hongwei; Penedo, Maria Cecilia T; Hughes, Shayne; Oberbauer, Anita; Sacks, Ben

    2015-01-01

    Sebaceous adenitis (SA) and Addison's disease (AD) increased rapidly in incidence among Standard Poodles after the mid-twentieth century. Previous attempts to identify specific genetic causes using genome wide association studies and interrogation of the dog leukocyte antigen (DLA) region have been non-productive. However, such studies led us to hypothesize that positive selection for desired phenotypic traits that arose in the mid-twentieth century led to intense inbreeding and the inadvertent amplification of AD and SA associated traits. This hypothesis was tested with genetic studies of 761 Standard, Miniature, and Miniature/Standard Poodle crosses from the USA, Canada and Europe, coupled with extensive pedigree analysis of thousands more dogs. Genome-wide diversity across the world-wide population was measured using a panel of 33 short tandem repeat (STR) loci. Allele frequency data were also used to determine the internal relatedness of individual dogs within the population as a whole. Assays based on linkage between STR genomic loci and DLA genes were used to identify class I and II haplotypes and disease associations. Genetic diversity statistics based on genomic STR markers indicated that Standard Poodles from North America and Europe were closely related and reasonably diverse across the breed. However, genetic diversity statistics, internal relatedness, principal coordinate analysis, and DLA haplotype frequencies showed a marked imbalance with 30 % of the diversity in 70 % of the dogs. Standard Poodles with SA and AD were strongly linked to this inbred population, with dogs suffering with SA being the most inbred. No single strong association was found between STR defined DLA class I or II haplotypes and SA or AD in the breed as a whole, although certain haplotypes present in a minority of the population appeared to confer moderate degrees of risk or protection against either or both diseases. Dogs possessing minor DLA class I haplotypes were half as

  1. Genetic diversity and relationship of Indian cattle inferred from microsatellite and mitochondrial DNA markers.

    Science.gov (United States)

    Sharma, Rekha; Kishore, Amit; Mukesh, Manishi; Ahlawat, Sonika; Maitra, Avishek; Pandey, Ashwni Kumar; Tantia, Madhu Sudan

    2015-06-30

    Indian agriculture is an economic symbiosis of crop and livestock production with cattle as the foundation. Sadly, the population of indigenous cattle (Bos indicus) is declining (8.94% in last decade) and needs immediate scientific management. Genetic characterization is the first step in the development of proper management strategies for preserving genetic diversity and preventing undesirable loss of alleles. Thus, in this study we investigated genetic diversity and relationship among eleven Indian cattle breeds using 21 microsatellite markers and mitochondrial D loop sequence. The analysis of autosomal DNA was performed on 508 cattle which exhibited sufficient genetic diversity across all the breeds. Estimates of mean allele number and observed heterozygosity across all loci and population were 8.784 ± 0.25 and 0.653 ± 0.014, respectively. Differences among breeds accounted for 13.3% of total genetic variability. Despite high genetic diversity, significant inbreeding was also observed within eight populations. Genetic distances and cluster analysis showed a close relationship between breeds according to proximity in geographic distribution. The genetic distance, STRUCTURE and Principal Coordinate Analysis concluded that the Southern Indian Ongole cattle are the most distinct among the investigated cattle populations. Sequencing of hypervariable mitochondrial DNA region on a subset of 170 cattle revealed sixty haplotypes with haplotypic diversity of 0.90240, nucleotide diversity of 0.02688 and average number of nucleotide differences as 6.07407. Two major star clusters for haplotypes indicated population expansion for Indian cattle. Nuclear and mitochondrial genomes show a similar pattern of genetic variability and genetic differentiation. Various analyses concluded that the Southern breed 'Ongole' was distinct from breeds of Northern/ Central India. Overall these results provide basic information about genetic diversity and structure of Indian cattle which
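
    The three mitochondrial summary statistics quoted in this abstract (haplotypic diversity, nucleotide diversity, and average number of nucleotide differences) can be illustrated with a minimal sketch. It assumes the standard Nei estimators and equal-length aligned sequences without gaps; the abstract does not say which software or estimator the authors used, and the function names and toy sequences below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences (k) over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): k divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


if __name__ == "__main__":
    # Toy alignment, not real D-loop data
    toy = ["AAAA", "AAAT", "AAAT", "TTAA"]
    print(haplotype_diversity(toy))
    print(mean_pairwise_differences(toy))
    print(nucleotide_diversity(toy))
```

    Under these definitions the average number of differences equals nucleotide diversity multiplied by the alignment length, so the two values reported in the abstract (6.07407 and 0.02688) would correspond to an alignment of roughly 226 sites if the same estimators were used.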

  2. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  3. Estimating intraspecific genetic diversity from community DNA metabarcoding data

    Directory of Open Access Journals (Sweden)

    Vasco Elbrecht

    2018-04-01

    Full Text Available Background: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity, it is nevertheless often useful e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal. Methods: This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotype information from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package “JAMP” and can be applied to COI amplicon datasets. We tested our haplotyping method by sequencing (i) a single-species mock community composed of 31 individuals with 15 different haplotypes spanning three orders of magnitude in biomass and (ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates. Results: We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but also can discard expected haplotypes mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40–3.30 haplotypes per OTU. The derived intraspecific diversity data showed population structures that were consistent between replicates and similar between primer pairs but resolution depended on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. the stonefly

  4. Diversity of 23S rRNA genes within individual prokaryotic genomes.

    Directory of Open Access Journals (Sweden)

    Anna Pei

    Full Text Available BACKGROUND: The concept of ribosomal constraints on rRNA genes is deduced primarily based on the comparison of consensus rRNA sequences between closely related species, but recent advances in whole-genome sequencing allow evaluation of this concept within organisms with multiple rRNA operons. METHODOLOGY/PRINCIPAL FINDINGS: Using the 23S rRNA gene as an example, we analyzed the diversity among individual rRNA genes within a genome. Of 184 prokaryotic species containing multiple 23S rRNA genes, diversity was observed in 113 (61.4%) genomes (mean 0.40%, range 0.01%-4.04%). Significant (1.17%-4.04%) intragenomic variation was found in 8 species. In 5 of the 8 species, the diversity in the primary structure had only minimal effect on the secondary structure (stem versus loop transition). In the remaining 3 species, the diversity significantly altered local secondary structure, but the alteration appears minimized through complex rearrangement. Intervening sequences (IVS), ranging between 9 and 1471 nt in size, were found in 7 species. IVS in Deinococcus radiodurans and Nostoc sp. encode transposases. T. tengcongensis was the only species in which intragenomic diversity >3% was observed among 4 paralogous 23S rRNA genes. CONCLUSIONS/SIGNIFICANCE: These findings indicate tight ribosomal constraints on individual 23S rRNA genes within a genome. Although classification using primary 23S rRNA sequences could be erroneous, significant diversity among paralogous 23S rRNA genes was observed only once in the 184 species analyzed, indicating little overall impact on the mainstream of 23S rRNA gene-based prokaryotic taxonomy.

  5. Experimental analysis of specification language diversity impact on NPP software diversity

    International Nuclear Information System (INIS)

    Yoo, Chang Sik

    1999-02-01

    In order to increase computer system reliability, software fault tolerance methods have been adopted in some safety-critical systems, including nuclear power plants (NPPs). Prevention of software common mode failure is a crucial problem in software fault tolerance, but no effective method for it has yet been found. In our research, to find an effective method for preventing software common mode failure, the impact of specification language diversity on NPP software diversity was examined experimentally. Three specification languages were used to compose three requirements specifications, and programmers produced twelve product codes from the specifications. From an analysis of the product codes using fault diversity criteria, we concluded that the diverse specification language method would enhance program diversity through diversification of requirements specification imperfections.

  6. Comparative genomic analysis of 45 type strains of the genus Bifidobacterium: a snapshot of its genetic diversity and evolution.

    Directory of Open Access Journals (Sweden)

    Zhihong Sun

    Bifidobacteria are well known for their human health-promoting effects and are therefore widely applied in the food industry. Members of the Bifidobacterium genus were first identified from the human gastrointestinal tract and were then found to be widely distributed across various ecological niches. Although the genetic diversity of Bifidobacterium has been determined based on several marker genes or a few genomes, the global diversity and evolution scenario for the entire genus remain unresolved. The present study comparatively analyzed the genomes of 45 type strains. We built a robust genealogy for Bifidobacterium based on 402 core genes and defined its root according to the phylogeny of the tree of bacteria. Our results support that all human isolates are of younger lineages, and although species isolated from bees dominate the more ancient lineages, the bee was not necessarily the original host for bifidobacteria. Moreover, the species isolated from different hosts are enriched with specific gene sets, suggesting host-specific adaptation. Notably, bee-specific genes are strongly associated with respiratory metabolism and may help those bacteria adapt to the oxygen-rich gut environment in bees. This study provides a snapshot of the genetic diversity and evolution of Bifidobacterium, paving the way for future studies on the taxonomy and functional genomics of the genus.

  7. Lampreys as Diverse Model Organisms in the Genomics Era.

    Science.gov (United States)

    McCauley, David W; Docker, Margaret F; Whyard, Steve; Li, Weiming

    2015-11-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome.

  8. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09) and dw3 (SBI-07) had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated; however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.

  9. A class representative model for Pure Parsimony Haplotyping under uncertain data.

    Directory of Open Access Journals (Sweden)

    Daniele Catanzaro

    The Pure Parsimony Haplotyping (PPH) problem is an NP-hard combinatorial optimization problem that consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes. PPH has attracted increasing attention in recent years due to its importance in the analysis of many fine-scale genetic data. Its application fields range from mapping complex disease genes to inferring population histories, and include drug design, functional genomics and pharmacogenetics. In this article we investigate, for the first time, a recent version of PPH called the Pure Parsimony Haplotype problem under Uncertain Data (PPH-UD). This version mainly arises when the input genotypes are not accurate, i.e., when some single nucleotide polymorphisms are missing or affected by errors. We propose an exact approach to solving PPH-UD based on an extended version of the class representative model for PPH of Catanzaro et al. [1], currently the state-of-the-art integer programming model for PPH. The model is efficient, accurate, compact, polynomial-sized, easy to implement, solvable with any solver for mixed integer programming, and usable in all those cases for which the parsimony criterion is well suited for haplotype estimation.
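
    To make the problem statement concrete, the following is a minimal brute-force sketch of Pure Parsimony Haplotyping on toy data. It is not the class representative integer programming model described in the abstract; it simply enumerates haplotype sets of increasing size, which is only feasible for a handful of sites. The genotype coding (0 and 1 for the two homozygous states, 2 for heterozygous) and all function names are assumptions made for illustration.

```python
from itertools import combinations, product


def explains(hap_a, hap_b, genotype):
    """True if the haplotype pair (hap_a, hap_b) resolves the genotype.

    Assumed coding per site: 0 or 1 = homozygous for that allele,
    2 = heterozygous (the two haplotypes must differ at that site).
    """
    for a, b, g in zip(hap_a, hap_b, genotype):
        if g == 2:
            if a == b:
                return False
        elif a != g or b != g:
            return False
    return True


def pure_parsimony(genotypes):
    """Smallest set of haplotypes explaining every genotype (exhaustive search)."""
    n_sites = len(genotypes[0])
    candidates = list(product((0, 1), repeat=n_sites))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if all(
                any(explains(a, b, g) for a in subset for b in subset)
                for g in genotypes
            ):
                return set(subset)
    return set()


# Hypothetical toy input: two homozygous genotypes and one double heterozygote.
toy_genotypes = [(0, 0), (1, 1), (2, 2)]
print(pure_parsimony(toy_genotypes))
```

    The search is exponential in the number of sites, which is consistent with the NP-hardness noted above and is the reason exact approaches rely on integer programming formulations rather than enumeration.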

  10. Comparative Genomics Reveals the Diversity of Restriction-Modification Systems and DNA Methylation Sites in Listeria monocytogenes.

    Science.gov (United States)

    Chen, Poyin; den Bakker, Henk C; Korlach, Jonas; Kong, Nguyet; Storey, Dylan B; Paxinos, Ellen E; Ashby, Meredith; Clark, Tyson; Luong, Khai; Wiedmann, Martin; Weimer, Bart C

    2017-02-01

    Listeria monocytogenes is a bacterial pathogen that is found in a wide variety of anthropogenic and natural environments. Genome sequencing technologies are rapidly becoming a powerful tool in facilitating our understanding of how genotype, classification phenotypes, and virulence phenotypes interact to predict the health risks of individual bacterial isolates. Currently, 57 closed L. monocytogenes genomes are publicly available, representing three of the four phylogenetic lineages, and they suggest that L. monocytogenes has high genomic synteny. This study contributes an additional 15 closed L. monocytogenes genomes that were used to determine the associations between the genome and methylome with host invasion magnitude. In contrast to previous findings, large chromosomal inversions and rearrangements were detected in five isolates at the chromosome terminus and within rRNA genes, including a previously undescribed inversion within rRNA-encoding regions. Each isolate's epigenome contained highly diverse methyltransferase recognition sites, even within the same serotype and methylation pattern. Eleven strains contained a single chromosomally encoded methyltransferase, one strain contained two methylation systems (one system on a plasmid), and three strains exhibited no methylation, despite the occurrence of methyltransferase genes. In three isolates a new, unknown DNA modification was observed in addition to diverse methylation patterns, accompanied by a novel methylation system. Neither chromosome rearrangement nor strain-specific patterns of epigenome modification observed within virulence genes were correlated with serotype designation, clonal complex, or in vitro infectivity. These data suggest that genome diversity is larger than previously considered in L. monocytogenes and that as more genomes are sequenced, additional structure and methylation novelty will be observed in this organism. 
Listeria monocytogenes is the causative agent of listeriosis, a disease

  11. Allelic recombination between distinct genomic locations generates copy number diversity in human β-defensins

    Science.gov (United States)

    Bakar, Suhaili Abu; Hollox, Edward J.; Armour, John A. L.

    2009-01-01

    β-Defensins are small secreted antimicrobial and signaling peptides involved in the innate immune response of vertebrates. In humans, a cluster of at least 7 of these genes shows extensive copy number variation, with a diploid copy number commonly ranging between 2 and 7. Using a genetic mapping approach, we show that this cluster is at not 1 but 2 distinct genomic loci ≈5 Mb apart on chromosome band 8p23.1, contradicting the most recent genome assembly. We also demonstrate that the predominant mechanism of change in β-defensin copy number is simple allelic recombination occurring in the interval between the 2 distinct genomic loci for these genes. In 416 meiotic transmissions, we observe 3 events creating a haplotype copy number not found in the parent, equivalent to a germ-line rate of copy number change of ≈0.7% per gamete. This places it among the fastest-changing copy number variants currently known. PMID:19131514
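
    The quoted germ-line rate follows directly from the counts given in the abstract (3 copy number changes in 416 meiotic transmissions). The short sketch below reproduces that arithmetic; the function name is a hypothetical helper introduced only for this illustration.

```python
def per_gamete_rate(events: int, transmissions: int) -> float:
    """Fraction of meiotic transmissions that show a copy number change."""
    if transmissions <= 0:
        raise ValueError("transmissions must be positive")
    return events / transmissions


# Counts taken from the abstract: 3 events in 416 meiotic transmissions.
rate = per_gamete_rate(3, 416)
print(f"{rate:.2%} per gamete")
```

    With so few observed events the estimate carries wide uncertainty, so the ≈0.7% figure is best read as an order-of-magnitude indication.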

  12. Comparison of relative efficiency of genomic SSR and EST-SSR markers in estimating genetic diversity in sugarcane.

    Science.gov (United States)

    Parthiban, S; Govindaraj, P; Senthilkumar, S

    2018-03-01

    Twenty-five primer pairs developed from genomic simple sequence repeats (SSR) were compared with 25 expressed sequence tag (EST) SSRs to evaluate the efficiency of these two sets of primers using 59 sugarcane genetic stocks. The mean polymorphism information content (PIC) of genomic SSR markers was higher (0.72) than that recorded for EST-SSR markers (0.62). The relatively low level of polymorphism in EST-SSR markers may be due to the location of these markers in more conserved and expressed sequences compared to genomic sequences, which are spread throughout the genome. Dendrograms based on the genomic SSR and EST-SSR marker data showed differences in the grouping of genotypes. A total of 59 sugarcane accessions were grouped into 6 and 4 clusters using genomic SSR and EST-SSR, respectively. The highly efficient genomic SSR could subcluster the genotypes of some of the clusters formed by EST-SSR markers. The difference observed between the dendrograms was probably due to the variation in the number of markers produced by genomic SSR and EST-SSR and the different portions of the genome amplified by the two marker types. The combined dendrogram (genomic SSR and EST-SSR) more clearly showed the genetic relationship among the sugarcane genotypes by forming four clusters. The mean genetic similarity (GS) value obtained using EST-SSR among 59 sugarcane accessions was 0.70, whereas the mean GS obtained using genomic SSR was 0.63. Although a relatively lower level of polymorphism was displayed by the EST-SSR markers, the genetic diversity shown by the EST-SSR markers was found to be promising, as they are functional markers. The high PIC and low genetic similarity values of genomic SSR may be more useful in DNA fingerprinting, selection of true hybrids, identification of variety-specific markers and genetic diversity analysis. Identification of diverse parents based on cluster analysis can be effectively done with EST-SSR as the genetic similarity estimates are based on functional attributes related to

  13. Natural selection shaped the rise and fall of passenger pigeon genomic diversity.

    Science.gov (United States)

    Murray, Gemma G R; Soares, André E R; Novak, Ben J; Schaefer, Nathan K; Cahill, James A; Baker, Allan J; Demboski, John R; Doll, Andrew; Da Fonseca, Rute R; Fulton, Tara L; Gilbert, M Thomas P; Heintzman, Peter D; Letts, Brandon; McIntosh, George; O'Connell, Brendan L; Peck, Mark; Pipes, Marie-Lorraine; Rice, Edward S; Santos, Kathryn M; Sohrweide, A Gregory; Vohr, Samuel H; Corbett-Detig, Russell B; Green, Richard E; Shapiro, Beth

    2017-11-17

    The extinct passenger pigeon was once the most abundant bird in North America, and possibly the world. Although theory predicts that large populations will be more genetically diverse, passenger pigeon genetic diversity was surprisingly low. To investigate this disconnect, we analyzed 41 mitochondrial and 4 nuclear genomes from passenger pigeons and 2 genomes from band-tailed pigeons, which are passenger pigeons' closest living relatives. Passenger pigeons' large population size appears to have allowed for faster adaptive evolution and removal of harmful mutations, driving a huge loss in their neutral genetic diversity. These results demonstrate the effect that selection can have on a vertebrate genome and contradict results that suggested that population instability contributed to this species's surprisingly rapid extinction. Copyright © 2017, American Association for the Advancement of Science.

  14. Two Tales of Prokaryotic Genomic Diversity: Escherichia coli and Halophiles

    Directory of Open Access Journals (Sweden)

    Lejla Pašić

    2014-01-01

    Prokaryotes are generally characterized by vast genomic diversity that has been shaped by mutations, horizontal gene transfer, bacteriocins and phage predation. Enormous genetic diversity has developed as a result of stresses imposed in harsh environments and the ability of microorganisms to adapt. Two examples of prokaryotic diversity are presented: diversity at the intraspecies level, exemplified by Escherichia coli, and the diversity of the hypersaline environment, with a discussion of food-related health issues and biotechnological potential.

  15. Evolution of genomic diversity and sex at extreme environments: Fungal life under hypersaline Dead Sea stress

    Science.gov (United States)

    Kis-Papo, Tamar; Kirzhner, Valery; Wasser, Solomon P.; Nevo, Eviatar

    2003-01-01

    We have found that genomic diversity is generally positively correlated with abiotic and biotic stress levels (1–3). However, beyond a high-threshold level of stress, the diversity declines to a few adapted genotypes. The Dead Sea is the harshest planetary hypersaline environment (340 g·liter⁻¹ total dissolved salts, ≈10 times sea water). Hence, the Dead Sea is an excellent natural laboratory for testing the “rise and fall” pattern of genetic diversity with stress proposed in this article. Here, we examined genomic diversity of the ascomycete fungus Aspergillus versicolor from saline, nonsaline, and hypersaline Dead Sea environments. We screened the coding and noncoding genomes of A. versicolor isolates by using >600 AFLP (amplified fragment length polymorphism) markers (equal to loci). Genomic diversity was positively correlated with stress, culminating in the Dead Sea surface but dropped drastically in 50- to 280-m-deep seawater. The genomic diversity pattern paralleled the pattern of sexual reproduction of fungal species across the same southward gradient of increasing stress in Israel. This parallel may suggest that diversity and sex are intertwined intimately according to the rise and fall pattern and adaptively selected by natural selection in fungal genome evolution. Future large-scale verification in micromycetes will define further the trajectories of diversity and sex in the rise and fall pattern. PMID:14645702

  16. Absence of genome reduction in diverse, facultative endohyphal bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Baltrus, David A. [Univ. of Arizona, Tucson, AZ (United States)]; Dougherty, Kevin [Univ. of Arizona, Tucson, AZ (United States)]; Arendt, Kayla R. [Univ. of Arizona, Tucson, AZ (United States)]; Huntemann, Marcel [Joint Genome Institute, Walnut Creek, CA (United States)]; Clum, Alicia [Joint Genome Institute, Walnut Creek, CA (United States)]; Pillay, Manoj [Joint Genome Institute, Walnut Creek, CA (United States)]; Palaniappan, Krishnaveni [Joint Genome Institute, Walnut Creek, CA (United States)]; Varghese, Neha [Joint Genome Institute, Walnut Creek, CA (United States)]; Mikhailova, Natalia [Joint Genome Institute, Walnut Creek, CA (United States)]; Stamatis, Dimitrios [Joint Genome Institute, Walnut Creek, CA (United States)]; Reddy, T. B. K. [Joint Genome Institute, Walnut Creek, CA (United States)]; Ngan, Chew Yee [Joint Genome Institute, Walnut Creek, CA (United States)]; Daum, Chris [Joint Genome Institute, Walnut Creek, CA (United States)]; Shapiro, Nicole [Joint Genome Institute, Walnut Creek, CA (United States)]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, CA (United States)]; Ivanova, Natalia [Joint Genome Institute, Walnut Creek, CA (United States)]; Kyrpides, Nikos [Joint Genome Institute, Walnut Creek, CA (United States)]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, CA (United States)]; Arnold, A. Elizabeth [Univ. of Arizona, Tucson, AZ (United States)]

    2017-02-28

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.

  17. Genomes of the Mouse Collaborative Cross.

    Science.gov (United States)

    Srivastava, Anuj; Morgan, Andrew P; Najarian, Maya L; Sarsani, Vishal Kumar; Sigmon, J Sebastian; Shorter, John R; Kashfeen, Anwica; McMullan, Rachel C; Williams, Lucy H; Giusti-Rodríguez, Paola; Ferris, Martin T; Sullivan, Patrick; Hock, Pablo; Miller, Darla R; Bell, Timothy A; McMillan, Leonard; Churchill, Gary A; de Villena, Fernando Pardo-Manuel

    2017-06-01

    The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of

  18. Analysis of Molecular Variance Inferred from Metric Distances among DNA Haplotypes: Application to Human Mitochondrial DNA Restriction Data

    OpenAIRE

    Excoffier, L.; Smouse, P. E.; Quattro, J. M.

    1992-01-01

    We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as φ-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivisi...

  19. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Science.gov (United States)

    Huang, Jian; Zhang, Chunmei; Zhao, Xing; Fei, Zhangjun; Wan, KangKang; Zhang, Zhong; Pang, Xiaoming; Yin, Xiao; Bai, Yang; Sun, Xiaoqing; Gao, Lizhi; Li, Ruiqiang; Zhang, Jinbo; Li, Xingang

    2016-12-01

    Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  20. Inferring mechanisms of copy number change from haplotype structures at the human DEFA1A3 locus.

    Science.gov (United States)

    Black, Holly A; Khan, Fayeza F; Tyson, Jess; Armour, John A L

    2014-07-21

    The determination of structural haplotypes at copy number variable regions can indicate the mechanisms responsible for changes in copy number, as well as explain the relationship between gene copy number and expression. However, obtaining spatial information at regions displaying extensive copy number variation, such as the DEFA1A3 locus, is complex, because of the difficulty in the phasing and assembly of these regions. The DEFA1A3 locus is intriguing in that it falls within a region of high linkage disequilibrium, despite its high variability in copy number (n = 3-16); hence, the mechanisms responsible for changes in copy number at this locus are unclear. In this study, a region flanking the DEFA1A3 locus was sequenced across 120 independent haplotypes with European ancestry, identifying five common classes of DEFA1A3 haplotype. Assigning DEFA1A3 class to haplotypes within the 1000 Genomes project highlights a significant difference in DEFA1A3 class frequencies between populations with different ancestry. The features of each DEFA1A3 class, for example, the associated DEFA1A3 copy numbers, were initially assessed in a European cohort (n = 599) and replicated in the 1000 Genomes samples, showing within-class similarity, but between-class and between-population differences in the features of the DEFA1A3 locus. Emulsion haplotype fusion-PCR was used to generate 61 structural haplotypes at the DEFA1A3 locus, showing a high within-class similarity in structure. Structural haplotypes across the DEFA1A3 locus indicate that intra-allelic rearrangement is the predominant mechanism responsible for changes in DEFA1A3 copy number, explaining the conservation of linkage disequilibrium across the locus. The identification of common structural haplotypes at the DEFA1A3 locus could aid studies into how DEFA1A3 copy number influences expression, which is currently unclear.

  1. The clinical application of single-sperm-based SNP haplotyping for PGD of osteogenesis imperfecta.

    Science.gov (United States)

    Chen, Linjun; Diao, Zhenyu; Xu, Zhipeng; Zhou, Jianjun; Yan, Guijun; Sun, Haixiang

    2018-05-15

    Osteogenesis imperfecta (OI) is a genetically heterogeneous disorder, presenting either autosomal dominant, autosomal recessive or X-linked inheritance patterns. The majority of OI cases are autosomal dominant and are caused by heterozygous mutations in either the COL1A1 or COL1A2 gene. In these dominant disorders, allele dropout (ADO) can lead to misdiagnosis in preimplantation genetic diagnosis (PGD). Polymorphic markers linked to the mutated genes have been used to establish haplotypes for identifying ADO and ensuring the accuracy of PGD. However, the haplotype of male patients cannot be determined without data from affected relatives. Here, we developed a method for single-sperm-based single-nucleotide polymorphism (SNP) haplotyping via next-generation sequencing (NGS) for the PGD of OI. After NGS, 10 informative polymorphic SNP markers located upstream and downstream of the COL1A1 gene and its pathogenic mutation site were linked to individual alleles in a single sperm from an affected male. After haplotyping, a normal blastocyst was transferred to the uterus for a subsequent frozen embryo transfer cycle. The accuracy of PGD was confirmed by amniocentesis at 19 weeks of gestation. A healthy infant weighing 4,250 g was born via vaginal delivery at the 40th week of gestation. Single-sperm-based SNP haplotyping can be applied for PGD of any monogenic disorders or de novo mutations in males in whom the haplotype of paternal mutations cannot be determined due to a lack of affected relatives. Abbreviations: ADO: allele dropout; DI: dentinogenesis imperfecta; ESHRE: European Society of Human Reproduction and Embryology; FET: frozen embryo transfer; gDNA: genomic DNA; ICSI: intracytoplasmic sperm injection; IVF: in vitro fertilization; MDA: multiple displacement amplification; NGS: next-generation sequencing; OI: osteogenesis imperfecta; PBS: phosphate-buffered saline; PCR: polymerase chain reaction; PGD: preimplantation genetic diagnosis; SNP: single-nucleotide polymorphism; STR

  2. De novo assembly and phasing of a Korean human genome.

    Science.gov (United States)

    Seo, Jeong-Sun; Rhie, Arang; Kim, Junsoo; Lee, Sangjin; Sohn, Min-Hwan; Kim, Chang-Uk; Hastie, Alex; Cao, Han; Yun, Ji-Young; Kim, Jihye; Kuk, Junho; Park, Gun Hwa; Kim, Juhyeok; Ryu, Hanna; Kim, Jongbum; Roh, Mira; Baek, Jeonghun; Hunkapiller, Michael W; Korlach, Jonas; Shin, Jong-Yeon; Kim, Changhoon

    2016-10-13

    Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation across population groups. Here we report the de novo assembly and haplotype phasing of the Korean individual AK1 (ref. 1) using single-molecule real-time sequencing, next-generation mapping, microfluidics-based linked reads, and bacterial artificial chromosome (BAC) sequencing approaches. Single-molecule sequencing coupled with next-generation mapping generated a highly contiguous assembly, with a contig N50 size of 17.9 Mb and a scaffold N50 size of 44.8 Mb, resolving 8 chromosomal arms into single scaffolds. The de novo assembly, along with local assemblies and spanning long reads, closes 105 and extends into 72 out of 190 euchromatic gaps in the reference genome, adding 1.03 Mb of previously intractable sequence. High concordance between the assembly and paired-end sequences from 62,758 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variants by direct comparison of the assembly with the human reference, identifying thousands of breakpoints that, to our knowledge, have not been reported before. Many of the insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly with short reads, long reads and linked reads from whole-genome sequencing and with short reads from 31,719 BAC clones, thereby achieving phased blocks with an N50 size of 11.6 Mb. Haplotigs assembled from single-molecule real-time reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable major histocompatibility complex region as well as demonstrating allele configuration in clinically relevant genes such as CYP2D6. This work presents the most contiguous diploid human genome assembly so far, with extensive investigation of

  3. Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

    Directory of Open Access Journals (Sweden)

    Alicia R Martin

    2014-08-01

    Full Text Available Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and

  4. De Novo Assembly and Phasing of Dikaryotic Genomes from Two Isolates of Puccinia coronata f. sp. avenae, the Causal Agent of Oat Crown Rust

    Directory of Open Access Journals (Sweden)

    Marisa E. Miller

    2018-02-01

    Full Text Available Oat crown rust, caused by the fungus Puccinia coronata f. sp. avenae, is a devastating disease that impacts worldwide oat production. For much of its life cycle, P. coronata f. sp. avenae is dikaryotic, with two separate haploid nuclei that may vary in virulence genotype, highlighting the importance of understanding haplotype diversity in this species. We generated highly contiguous de novo genome assemblies of two P. coronata f. sp. avenae isolates, 12SD80 and 12NC29, from long-read sequences. In total, we assembled 603 primary contigs for 12SD80, for a total assembly length of 99.16 Mbp, and 777 primary contigs for 12NC29, for a total length of 105.25 Mbp; approximately 52% of each genome was assembled into alternate haplotypes. This revealed structural variation between haplotypes in each isolate equivalent to more than 2% of the genome size, in addition to about 260,000 and 380,000 heterozygous single-nucleotide polymorphisms in 12SD80 and 12NC29, respectively. Transcript-based annotation identified 26,796 and 28,801 coding sequences for isolates 12SD80 and 12NC29, respectively, including about 7,000 allele pairs in haplotype-phased regions. Furthermore, expression profiling revealed clusters of coexpressed secreted effector candidates, and the majority of orthologous effectors between isolates showed conservation of expression patterns. However, a small subset of orthologs showed divergence in expression, which may contribute to differences in virulence between 12SD80 and 12NC29. This study provides the first haplotype-phased reference genome for a dikaryotic rust fungus as a foundation for future studies into virulence mechanisms in P. coronata f. sp. avenae.

  5. Switchgrass genomic diversity, ploidy, and evolution: novel insights from a network-based SNP discovery protocol.

    Directory of Open Access Journals (Sweden)

    Fei Lu

    Full Text Available Switchgrass (Panicum virgatum L.) is a perennial grass that has been designated as an herbaceous model biofuel crop for the United States of America. To facilitate accelerated breeding programs of switchgrass, we developed both an association panel and linkage populations for genome-wide association study (GWAS) and genomic selection (GS). All of the 840 individuals were then genotyped using genotyping by sequencing (GBS), generating 350 GB of sequence in total. As a highly heterozygous polyploid (tetraploid and octoploid) species lacking a reference genome, switchgrass is highly intractable with earlier methodologies of single nucleotide polymorphism (SNP) discovery. To access the genetic diversity of species like switchgrass, we developed a SNP discovery pipeline based on a network approach called the Universal Network-Enabled Analysis Kit (UNEAK). Complexities that hinder single nucleotide polymorphism discovery, such as repeats, paralogs, and sequencing errors, are easily resolved with UNEAK. Here, 1.2 million putative SNPs were discovered in a diverse collection of primarily upland, northern-adapted switchgrass populations. Further analysis of this data set revealed the fundamentally diploid nature of tetraploid switchgrass. Taking advantage of the high conservation of genome structure between switchgrass and foxtail millet (Setaria italica (L.) P. Beauv.), two parent-specific, synteny-based, ultra high-density linkage maps containing a total of 88,217 SNPs were constructed. Also, our results showed clear patterns of isolation-by-distance and isolation-by-ploidy in natural populations of switchgrass. Phylogenetic analysis supported a general south-to-north migration path of switchgrass. In addition, this analysis suggested that upland tetraploid arose from upland octoploid. All together, this study provides unparalleled insights into the diversity, genomic complexity, population structure, phylogeny, phylogeography, ploidy, and evolutionary dynamics

  6. Hidden diversity revealed by genome-resolved metagenomics of iron-oxidizing microbial mats from Lō'ihi Seamount, Hawai'i.

    Science.gov (United States)

    Fullerton, Heather; Hager, Kevin W; McAllister, Sean M; Moyer, Craig L

    2017-08-01

    The Zetaproteobacteria are ubiquitous in marine environments, yet this class of Proteobacteria is only represented by a few closely-related cultured isolates. In high-iron environments, such as diffuse hydrothermal vents, the Zetaproteobacteria are important members of the community driving its structure. Biogeography of Zetaproteobacteria has shown two ubiquitous operational taxonomic units (OTUs), yet much is unknown about their genomic diversity. Genome-resolved metagenomics allows for the specific binning of microbial genomes based on genomic signatures present in composite metagenome assemblies. This resulted in the recovery of 93 genome bins, of which 34 were classified as Zetaproteobacteria. Form II ribulose 1,5-bisphosphate carboxylase genes were recovered from nearly all the Zetaproteobacteria genome bins. In addition, the Zetaproteobacteria genome bins contain genes for uptake and utilization of bioavailable nitrogen, detoxification of arsenic, and a terminal electron acceptor adapted for low oxygen concentration. Our results also support the hypothesis of a Cyc2-like protein as the site for iron oxidation, now detected across a majority of the Zetaproteobacteria genome bins. Whole genome comparisons showed a high genomic diversity across the Zetaproteobacteria OTUs and genome bins that were previously unidentified by SSU rRNA gene analysis. A single lineage of cosmopolitan Zetaproteobacteria (zOTU 2) was found to be monophyletic, based on cluster analysis of average nucleotide identity and average amino acid identity comparisons. From these data, we can begin to pinpoint genomic adaptations of the more ecologically ubiquitous Zetaproteobacteria, and further understand their environmental constraints and metabolic potential.

  7. A LDA-based approach to promoting ranking diversity for genomics information retrieval.

    Science.gov (United States)

    Chen, Yan; Yin, Xiaoshi; Li, Zhoujun; Hu, Xiaohua; Huang, Jimmy Xiangji

    2012-06-11

    In the biomedical domain, there is an immense amount of data and a tremendous increase in genomics and biomedical publications. The wealth of information has led to an increasing amount of interest in and need for applying information retrieval techniques to access the scientific literature in genomics and related biomedical disciplines. In many cases, the information that biologists seek with a query is a list of entities of a certain type covering different aspects related to the question, such as cells, genes, diseases, proteins, mutations, etc. Hence, it is important for a biomedical IR system to be able to provide relevant and diverse answers to fulfill biologists' information needs. However, the traditional IR model is concerned only with the relevance between retrieved documents and the user query, and does not take redundancy between retrieved documents into account. This leads to high redundancy and low diversity in the retrieved ranked lists. In this paper, we propose an approach that employs a topic generative model called Latent Dirichlet Allocation (LDA) to promote ranking diversity for biomedical information retrieval. Unlike other approaches or models that consider aspects at the word level, our approach assumes that aspects should be identified by the topics of retrieved documents. We present an LDA model to discover the topic distribution of retrieved passages and the word distribution of each topic dimension, and then re-rank retrieval results by topic distribution similarity between passages based on an N-size sliding window. We evaluate our approach on the TREC 2007 Genomics collection and two distinctive IR baseline runs, achieving an 8% improvement over the highest Aspect MAP reported in the TREC 2007 Genomics track. The proposed method is the first study to adopt a topic model for genomics information retrieval, and demonstrates its effectiveness in promoting ranking diversity as well as in improving relevance of ranked lists of genomics search

  8. Natural Selection and Genetic Diversity in the Butterfly Heliconius melpomene.

    Science.gov (United States)

    Martin, Simon H; Möst, Markus; Palmer, William J; Salazar, Camilo; McMillan, W Owen; Jiggins, Francis M; Jiggins, Chris D

    2016-05-01

    A combination of selective and neutral evolutionary forces shape patterns of genetic diversity in nature. Among the insects, most previous analyses of the roles of drift and selection in shaping variation across the genome have focused on the genus Drosophila. A more complete understanding of these forces will come from analyzing other taxa that differ in population demography and other aspects of biology. We have analyzed diversity and signatures of selection in the neotropical Heliconius butterflies using resequenced genomes from 58 wild-caught individuals of Heliconius melpomene and another 21 resequenced genomes representing 11 related species. By comparing intraspecific diversity and interspecific divergence, we estimate that 31% of amino acid substitutions between Heliconius species are adaptive. Diversity at putatively neutral sites is negatively correlated with the local density of coding sites as well as nonsynonymous substitutions and positively correlated with recombination rate, indicating widespread linked selection. This process also manifests in significantly reduced diversity on longer chromosomes, consistent with lower recombination rates. Although hitchhiking around beneficial nonsynonymous mutations has significantly shaped genetic variation in H. melpomene, evidence for strong selective sweeps is limited overall. We did however identify two regions where distinct haplotypes have swept in different populations, leading to increased population differentiation. On the whole, our study suggests that positive selection is less pervasive in these butterflies as compared to fruit flies, a fact that curiously results in very similar levels of neutral diversity in these very different insects. Copyright © 2016 by the Genetics Society of America.
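
    Estimating the adaptive fraction of amino acid substitutions from polymorphism and divergence is commonly done in a McDonald-Kreitman framework. The abstract does not state which estimator the authors used, so the following is only a sketch of the basic form, alpha = 1 - (Ds * Pn) / (Dn * Ps), with hypothetical counts that are not taken from the study.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman-style estimate of the adaptive fraction.

    dn, ds: nonsynonymous and synonymous fixed differences between species
    pn, ps: nonsynonymous and synonymous polymorphisms within a species
    Returns alpha = 1 - (ds * pn) / (dn * ps).
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical toy counts, chosen only to show the arithmetic.
print(mk_alpha(dn=50, ds=100, pn=30, ps=100))
```

    Published estimates typically refine this basic form, for example by excluding low-frequency polymorphisms, so the sketch should be read as the starting point of the calculation rather than the method of this record.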

  9. Artificial selection on introduced Asian haplotypes shaped the genetic architecture in European commercial pigs.

    Science.gov (United States)

    Bosse, Mirte; Lopes, Marcos S; Madsen, Ole; Megens, Hendrik-Jan; Crooijmans, Richard P M A; Frantz, Laurent A F; Harlizius, Barbara; Bastiaansen, John W M; Groenen, Martien A M

    2015-12-22

    Early pig farmers in Europe imported Asian pigs to cross with their local breeds in order to improve traits of commercial interest. Current genomics techniques enabled genome-wide identification of these Asian introgressed haplotypes in modern European pig breeds. We propose that the Asian variants are still present because they affect phenotypes that were important for ancient traditional, as well as recent, commercial pig breeding. Genome-wide introgression levels were only weakly correlated with gene content and recombination frequency. However, regions with an excess or absence of Asian haplotypes (AS) contained genes that were previously identified as phenotypically important such as FASN, ME1, and KIT. Therefore, the Asian alleles are thought to have an effect on phenotypes that were historically under selection. We aimed to estimate the effect of AS in introgressed regions in Large White pigs on the traits of backfat (BF) and litter size. The majority of regions we tested that retained Asian deoxyribonucleic acid (DNA) showed significantly increased BF from the Asian alleles. Our results suggest that the introgression in Large White pigs has been strongly determined by the selective pressure acting upon the introgressed AS. We therefore conclude that human-driven hybridization and selection contributed to the genomic architecture of these commercial pigs. © 2015 The Author(s).

  10. Comparison of phasing strategies for whole human genomes.

    Science.gov (United States)

    Choi, Yongwook; Chan, Agnes P; Kirkness, Ewen; Telenti, Amalio; Schork, Nicholas J

    2018-04-01

    Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not 'phase' the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available 'Genome-In-A-Bottle' (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. We also considered a majority voting scheme for the construction of a
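
    Switch error rate, one of the metrics named in this record, can be illustrated with a short sketch. It assumes one common definition (the fraction of adjacent heterozygous-site pairs at which the inferred phase flips relative to the reference phasing) and a simple 0/1 encoding of which haplotype carries the alternate allele; the abstract does not spell out the exact definition used, and the function name and toy vectors are hypothetical.

```python
def switch_error_rate(truth, predicted):
    """Fraction of adjacent heterozygous-site pairs with a phase switch.

    truth, predicted: equal-length sequences of 0/1 giving, per
    heterozygous site, the haplotype that carries the alternate allele.
    """
    if len(truth) != len(predicted):
        raise ValueError("phasings must cover the same sites")
    if len(truth) < 2:
        raise ValueError("need at least two sites")
    # 0 where the prediction agrees with the truth, 1 where it is flipped.
    flipped = [t ^ p for t, p in zip(truth, predicted)]
    # A switch occurs wherever the agreement state changes between neighbours.
    switches = sum(a != b for a, b in zip(flipped, flipped[1:]))
    return switches / (len(truth) - 1)


# Hypothetical toy phasings: the prediction flips once, after the third site.
truth = [0, 1, 0, 0, 1, 1]
predicted = [0, 1, 0, 1, 0, 0]
print(switch_error_rate(truth, predicted))
```

    Because haplotype labels are arbitrary, a prediction that is the exact complement of the truth scores zero under this definition; only changes in the agreement state are counted.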

  11. Haplotype structure in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers

    DEFF Research Database (Denmark)

    Im, Kate M; Kirchhoff, Tomas; Wang, Xianshu

    2011-01-01

    Three founder mutations in BRCA1 and BRCA2 contribute to the risk of hereditary breast and ovarian cancer in Ashkenazi Jews (AJ). They are observed at increased frequency in the AJ compared to other BRCA mutations in Caucasian non-Jews (CNJ). Several authors have proposed that elevated allele...... the tools of statistical genomics to examine the likelihood of long-range LD at a deleterious locus in a population that faced a genetic bottleneck. We studied the genotypes of hundreds of women from a large international consortium of BRCA1 and BRCA2 mutation carriers and found that AJ women exhibited long......-range haplotypes compared to CNJ women. More than 50% of the AJ chromosomes with the BRCA1 185delAG mutation share an identical 2.1 Mb haplotype and nearly 16% of AJ chromosomes carrying the BRCA2 6174delT mutation share a 1.4 Mb haplotype. Simulations based on the best inference of Ashkenazi population demography...

  12. Insights into the Prunus-Specific S-RNase-Based Self-Incompatibility System from a Genome-Wide Analysis of the Evolutionary Radiation of S Locus-Related F-box Genes.

    Science.gov (United States)

    Akagi, Takashi; Henry, Isabelle M; Morimoto, Takuya; Tao, Ryutaro

    2016-06-01

    Self-incompatibility (SI) is an important plant reproduction mechanism that facilitates the maintenance of genetic diversity within species. Three plant families, the Solanaceae, Rosaceae and Plantaginaceae, share an S-RNase-based gametophytic SI (GSI) system that involves a single S-RNase as the pistil S determinant and several F-box genes as pollen S determinants that act via non-self-recognition. Previous evidence has suggested a specific self-recognition mechanism in Prunus (Rosaceae), raising questions about the generality of the S-RNase-based GSI system. We investigated the evolution of the pollen S determinant by comparing the sequences of the Prunus S haplotype-specific F-box gene (SFB) with those of its orthologs in other angiosperm genomes. Our results indicate that the Prunus SFB does not cluster with the pollen S of other plants and diverged early after the establishment of the Eudicots. Our results further indicate multiple F-box gene duplication events, specifically in the Rosaceae family, and suggest that the Prunus SFB gene originated in a recent Prunus-specific gene duplication event. Transcriptomic and evolutionary analyses of the Prunus S paralogs are consistent with the establishment of a Prunus-specific SI system, and the possibility of subfunctionalization differentiating the newly generated SFB from the original pollen S determinant. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  13. Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

    Science.gov (United States)

    Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

    2015-10-01

    Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms. Copyright © 2015. Published by Elsevier Ltd.

  14. Analysis of Multiallelic CNVs by Emulsion Haplotype Fusion PCR.

    Science.gov (United States)

    Tyson, Jess; Armour, John A L

    2017-01-01

    Emulsion-fusion PCR recovers long-range sequence information by combining products in cis from individual genomic DNA molecules. Emulsion droplets act as very numerous small reaction chambers in which different PCR products from a single genomic DNA molecule are condensed into short joint products, to unite sequences in cis from widely separated genomic sites. These products can therefore provide information about the arrangement of sequences and variants at a larger scale than established long-read sequencing methods. The method has been useful in defining the phase of variants in haplotypes, the typing of inversions, and determining the configuration of sequence variants in multiallelic CNVs. In this description we outline the rationale for the application of emulsion-fusion PCR methods to the analysis of multiallelic CNVs, and give practical details for our own implementation of the method in that context.

  15. Updated listing of haplotypes at the human phenylalanine hydroxylase (PAH) locus

    Energy Technology Data Exchange (ETDEWEB)

    Eisensmith, R.C.; Woo, S.L.C. (Baylor College of Medicine, Houston, TX (United States))

    1992-12-01

    Analysis of mutant PAH chromosomes has identified approximately 60 different single-base substitutions and deletions within the PAH locus. Nearly all of these molecular lesions are in strong linkage disequilibrium with specific RFLP haplotypes in different ethnic populations. Thus, haplotype analysis is not only useful for diagnostic purposes but is proving to be a valuable tool in population genetic studies of the origin and spread of phenylketonuria alleles in human populations. PCR-based methods have been developed to detect six of the eight polymorphic restriction sites used for determination of RFLP haplotypes at the PAH locus. A table of the proposed expanded haplotypes is given.

  16. A Genome-wide Combinatorial Strategy Dissects Complex Genetic Architecture of Seed Coat Color in Chickpea.

    Science.gov (United States)

    Bajaj, Deepak; Das, Shouvik; Upadhyaya, Hari D; Ranjan, Rajeev; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C L Laxmipathi; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-01-01

    The study identified 9045 high-quality SNPs employing both genome-wide GBS- and candidate gene-based SNP genotyping assays in 172, including 93 cultivated (desi and kabuli) and 79 wild chickpea accessions. The GWAS in a structured population of 93 sequenced accessions detected 15 major genomic loci exhibiting significant association with seed coat color. Five seed color-associated major genomic loci underlying robust QTLs mapped on a high-density intra-specific genetic linkage map were validated by QTL mapping. The integration of association and QTL mapping with gene haplotype-specific LD mapping and transcript profiling identified novel allelic variants (non-synonymous SNPs) and haplotypes in a MATE secondary transporter gene regulating light/yellow brown and beige seed coat color differentiation in chickpea. The down-regulation and decreased transcript expression of beige seed coat color-associated MATE gene haplotype was correlated with reduced proanthocyanidins accumulation in the mature seed coats of beige than light/yellow brown seed colored desi and kabuli accessions for their coloration/pigmentation. This seed color-regulating MATE gene revealed strong purifying selection pressure primarily in LB/YB seed colored desi and wild Cicer reticulatum accessions compared with the BE seed colored kabuli accessions. The functionally relevant molecular tags identified have potential to decipher the complex transcriptional regulatory gene function of seed coat coloration and for understanding the selective sweep-based seed color trait evolutionary pattern in cultivated and wild accessions during chickpea domestication. The genome-wide integrated approach employed will expedite marker-assisted genetic enhancement for developing cultivars with desirable seed coat color types in chickpea.

  17. Discovery, validation and characterization of Erbb4 and Nrg1 haplotypes using data from three genome-wide association studies of schizophrenia.

    Directory of Open Access Journals (Sweden)

    Zeynep Sena Agim

    Full Text Available Schizophrenia is one of the most common and complex neuropsychiatric disorders, which is contributed both by genetic and environmental exposures. Recently, it is shown that NRG1-mediated ErbB4 signalling regulates many important cellular and molecular processes such as cellular growth, differentiation and death, particularly in myelin-producing cells, glia and neurons. Recent association studies have revealed genomic regions of NRG1 and ERBB4, which are significantly associated with risk of developing schizophrenia; however, inconsistencies exist in terms of validation of findings between distinct populations. In this study, we aim to validate the previously identified regions and to discover novel haplotypes of NRG1 and ERBB4 using logistic regression models and Haploview analyses in three independent datasets from GWAS conducted on European subjects, namely, CATIE, GAIN and nonGAIN. We identified a significant 6-kb block in ERBB4 between chromosome locations 212,156,823 and 212,162,848 in CATIE and GAIN datasets (p = 0.0206 and 0.0095, respectively). In NRG1, a significant 25-kb block, between 32,291,552 and 32,317,192, was associated with risk of schizophrenia in all CATIE, GAIN, and nonGAIN datasets (p = 0.0005, 0.0589, and 0.0143, respectively). Fine mapping and FastSNP analysis of genetic variation located within significantly associated regions proved the presence of binding sites for several transcription factors such as SRY, SOX5, CEPB, and ETS1. In this study, we have discovered and validated haplotypes of ERBB4 and NRG1 in three independent European populations. These findings suggest that these haplotypes play an important role in the development of schizophrenia by affecting transcription factor binding affinity.

  18. Host specific diversity in Lactobacillus johnsonii as evidenced by a major chromosomal inversion and phage resistance mechanisms.

    Science.gov (United States)

    Guinane, Caitriona M; Kent, Robert M; Norberg, Sarah; Hill, Colin; Fitzgerald, Gerald F; Stanton, Catherine; Ross, R Paul

    2011-04-20

    Genetic diversity and genomic rearrangements are a driving force in bacterial evolution and niche adaptation. We sequenced and annotated the genome of Lactobacillus johnsonii DPC6026, a strain isolated from the porcine intestinal tract. Although the genome of DPC6026 is similar in size (1.97 mbp) and GC content (34.8%) to the sequenced human isolate L. johnsonii NCC 533, a large symmetrical inversion of approximately 750 kb differentiated the two strains. Comparative analysis among 12 other strains of L. johnsonii including 8 porcine, 3 human and 1 poultry isolate indicated that the genome architecture found in DPC6026 is more common within the species than that of NCC 533. Furthermore a number of unique features were annotated in DPC6026, some of which are likely to have been acquired by horizontal gene transfer (HGT) and contribute to protection against phage infection. A putative type III restriction-modification system was identified, as were novel Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) elements. Interestingly, these particular elements are not widely distributed among L. johnsonii strains. Taken together these data suggest intra-species genomic rearrangements and significant genetic diversity within the L. johnsonii species and indicate towards a host-specific divergence of L. johnsonii strains with respect to genome inversion and phage exposure.

  19. Host specific diversity in Lactobacillus johnsonii as evidenced by a major chromosomal inversion and phage resistance mechanisms.

    Directory of Open Access Journals (Sweden)

    Caitriona M Guinane

    Full Text Available Genetic diversity and genomic rearrangements are a driving force in bacterial evolution and niche adaptation. We sequenced and annotated the genome of Lactobacillus johnsonii DPC6026, a strain isolated from the porcine intestinal tract. Although the genome of DPC6026 is similar in size (1.97 mbp) and GC content (34.8%) to the sequenced human isolate L. johnsonii NCC 533, a large symmetrical inversion of approximately 750 kb differentiated the two strains. Comparative analysis among 12 other strains of L. johnsonii including 8 porcine, 3 human and 1 poultry isolate indicated that the genome architecture found in DPC6026 is more common within the species than that of NCC 533. Furthermore a number of unique features were annotated in DPC6026, some of which are likely to have been acquired by horizontal gene transfer (HGT) and contribute to protection against phage infection. A putative type III restriction-modification system was identified, as were novel Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) elements. Interestingly, these particular elements are not widely distributed among L. johnsonii strains. Taken together these data suggest intra-species genomic rearrangements and significant genetic diversity within the L. johnsonii species and indicate towards a host-specific divergence of L. johnsonii strains with respect to genome inversion and phage exposure.

  20. Genomic diversity and evolution of the head crest in the rock pigeon

    DEFF Research Database (Denmark)

    Shapiro, Michael D.; Kronenberg, Zev; Li, Cai

    2013-01-01

    The geographic origins of breeds and the genetic basis of variation within the widely distributed and phenotypically diverse domestic rock pigeon (Columba livia) remain largely unknown. We generated a rock pigeon reference genome and additional genome sequences representing domestic and feral...

  1. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

    Science.gov (United States)

    Renner, Daniel W; Szpara, Moriah L

    2018-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.

  2. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution

    Science.gov (United States)

    Renner, Daniel W.

    2017-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. PMID:29046445

  3. Two extended haplotype blocks are associated with adaptation to high altitude habitats in East African honey bees.

    Directory of Open Access Journals (Sweden)

    Andreas Wallberg

    2017-05-01

    Understanding the genetic basis of adaption is a central task in biology. Populations of the honey bee Apis mellifera that inhabit the mountain forests of East Africa differ in behavior and morphology from those inhabiting the surrounding lowland savannahs, which likely reflects adaptation to these habitats. We performed whole genome sequencing on 39 samples of highland and lowland bees from two pairs of populations to determine their evolutionary affinities and identify the genetic basis of these putative adaptations. We find that in general, levels of genetic differentiation between highland and lowland populations are very low, consistent with them being a single panmictic population. However, we identify two loci on chromosomes 7 and 9, each several hundred kilobases in length, which exhibit near fixation for different haplotypes between highland and lowland populations. The highland haplotypes at these loci are extremely rare in samples from the rest of the world. Patterns of segregation of genetic variants suggest that recombination between haplotypes at each locus is suppressed, indicating that they comprise independent structural variants. The haplotype on chromosome 7 harbors nearly all octopamine receptor genes in the honey bee genome. These have a role in learning and foraging behavior in honey bees and are strong candidates for adaptation to highland habitats. Molecular analysis of a putative breakpoint indicates that it may disrupt the coding sequence of one of these genes. Divergence between the highland and lowland haplotypes at both loci is extremely high, suggesting that they are ancient balanced polymorphisms that greatly predate divergence between the extant honey bee subspecies.

  4. Two extended haplotype blocks are associated with adaptation to high altitude habitats in East African honey bees

    Science.gov (United States)

    Schöning, Caspar

    2017-01-01

    Understanding the genetic basis of adaption is a central task in biology. Populations of the honey bee Apis mellifera that inhabit the mountain forests of East Africa differ in behavior and morphology from those inhabiting the surrounding lowland savannahs, which likely reflects adaptation to these habitats. We performed whole genome sequencing on 39 samples of highland and lowland bees from two pairs of populations to determine their evolutionary affinities and identify the genetic basis of these putative adaptations. We find that in general, levels of genetic differentiation between highland and lowland populations are very low, consistent with them being a single panmictic population. However, we identify two loci on chromosomes 7 and 9, each several hundred kilobases in length, which exhibit near fixation for different haplotypes between highland and lowland populations. The highland haplotypes at these loci are extremely rare in samples from the rest of the world. Patterns of segregation of genetic variants suggest that recombination between haplotypes at each locus is suppressed, indicating that they comprise independent structural variants. The haplotype on chromosome 7 harbors nearly all octopamine receptor genes in the honey bee genome. These have a role in learning and foraging behavior in honey bees and are strong candidates for adaptation to highland habitats. Molecular analysis of a putative breakpoint indicates that it may disrupt the coding sequence of one of these genes. Divergence between the highland and lowland haplotypes at both loci is extremely high suggesting that they are ancient balanced polymorphisms that greatly predate divergence between the extant honey bee subspecies. PMID:28542163

  5. High intraspecific genome diversity in the model arbuscular mycorrhizal symbiont Rhizophagus irregularis.

    Science.gov (United States)

    Chen, Eric C H; Morin, Emmanuelle; Beaudet, Denis; Noel, Jessica; Yildirir, Gokalp; Ndikumana, Steve; Charron, Philippe; St-Onge, Camille; Giorgi, John; Krüger, Manuela; Marton, Timea; Ropars, Jeanne; Grigoriev, Igor V; Hainaut, Matthieu; Henrissat, Bernard; Roux, Christophe; Martin, Francis; Corradi, Nicolas

    2018-01-22

    Arbuscular mycorrhizal fungi (AMF) are known to improve plant fitness through the establishment of mycorrhizal symbioses. Genetic and phenotypic variations among closely related AMF isolates can significantly affect plant growth, but the genomic changes underlying this variability are unclear. To address this issue, we improved the genome assembly and gene annotation of the model strain Rhizophagus irregularis DAOM197198, and compared its gene content with five isolates of R. irregularis sampled in the same field. All isolates harbor striking genome variations, with large numbers of isolate-specific genes, gene family expansions, and evidence of interisolate genetic exchange. The observed variability affects all gene ontology terms and PFAM protein domains, as well as putative mycorrhiza-induced small secreted effector-like proteins and other symbiosis differentially expressed genes. High variability is also found in active transposable elements. Overall, these findings indicate a substantial divergence in the functioning capacity of isolates harvested from the same field, and thus their genetic potential for adaptation to biotic and abiotic changes. Our data also provide a first glimpse into the genome diversity that resides within natural populations of these symbionts, and open avenues for future analyses of plant-AMF interactions that link AMF genome variation with plant phenotype and fitness. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

  6. Mitochondrial and Y chromosome haplotype motifs as diagnostic markers of Jewish ancestry: a reconsideration.

    Directory of Open Access Journals (Sweden)

    Sergio Tofanelli

    2014-11-01

    Several authors have proposed haplotype motifs based on site variants at the mitochondrial genome (mtDNA) and the non-recombining portion of the Y chromosome (NRY) to trace the genealogies of Jewish people. Here, we analyzed their main approaches and tested the feasibility of adopting motifs as ancestry markers through construction of a large database of mtDNA and NRY haplotypes from public genetic genealogical repositories. We verified the reliability of Jewish ancestry prediction based on the Cohen and Levite Modal Haplotypes in their classical 6 STR marker format or in the extended 12 STR format, as well as four founder mtDNA lineages (HVS-I segments) accounting for about 40% of the current population of Ashkenazi Jews. For this purpose we compared haplotype composition in individuals of self-reported Jewish ancestry with the rest of European, African or Middle Eastern samples, to test for non-random association of ethno-geographic groups and haplotypes. Overall, NRY and mtDNA based motifs, previously reported to differentiate between groups, were found to be more represented in Jewish compared to non-Jewish groups. However, this seems to stem from common ancestors of Jewish lineages being rather recent with respect to ancestors of non-Jewish lineages with the same haplotype signatures. Moreover, the polyphyly of haplotypes which contain the proposed motifs and the misuse of constant mutation rates heavily affected previous attempts to correctly date the origin of common ancestries. Accordingly, our results stress the limitations of using the above haplotype motifs as reliable Jewish ancestry predictors and show their inadequacy for forensic or genealogical purposes.

  7. Extremely Low Genomic Diversity of Rickettsia japonica Distributed in Japan.

    Science.gov (United States)

    Akter, Arzuba; Ooka, Tadasuke; Gotoh, Yasuhiro; Yamamoto, Seigo; Fujita, Hiromi; Terasoma, Fumio; Kida, Kouji; Taira, Masakatsu; Nakadouzono, Fumiko; Gokuden, Mutsuyo; Hirano, Manabu; Miyashiro, Mamoru; Inari, Kouichi; Shimazu, Yukie; Tabara, Kenji; Toyoda, Atsushi; Yoshimura, Dai; Itoh, Takehiko; Kitano, Tomokazu; Sato, Mitsuhiko P; Katsura, Keisuke; Mondal, Shakhinur Islam; Ogura, Yoshitoshi; Ando, Shuji; Hayashi, Tetsuya

    2017-01-01

    Rickettsiae are obligate intracellular bacteria that have small genomes as a result of reductive evolution. Many Rickettsia species of the spotted fever group (SFG) cause tick-borne diseases known as "spotted fevers". The life cycle of SFG rickettsiae is closely associated with that of the tick, which is generally thought to act as a bacterial vector and reservoir that maintains the bacterium through transstadial and transovarial transmission. Each SFG member is thought to have adapted to a specific tick species, thus restricting the bacterial distribution to a relatively limited geographic region. These unique features of SFG rickettsiae allow investigation of how the genomes of such biologically and ecologically specialized bacteria evolve after genome reduction and the types of population structures that are generated. Here, we performed a nationwide, high-resolution phylogenetic analysis of Rickettsia japonica, an etiological agent of Japanese spotted fever that is distributed in Japan and Korea. The comparison of complete or nearly complete sequences obtained from 31 R. japonica strains isolated from various sources in Japan over the past 30 years demonstrated an extremely low level of genomic diversity. In particular, only 34 single nucleotide polymorphisms were identified among the 27 strains of the major lineage containing all clinical isolates and tick isolates from the three tick species. Our data provide novel insights into the biology and genome evolution of R. japonica, including the possibilities of recent clonal expansion and a long generation time in nature due to the long dormant phase associated with tick life cycles. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. Haplotype association analysis of human disease traits using genotype data of unrelated individuals

    DEFF Research Database (Denmark)

    Tan, Qihua; Christiansen, Lene; Christensen, Kaare

    2005-01-01

    unphased multi-locus genotype data, ranging from the early approach by the simple gene-counting method to the recent work using the generalized linear model. However, these methods are either confined to case – control design or unable to yield unbiased point and interval estimates of haplotype effects....... Based on the popular logistic regression model, we present a new approach for haplotype association analysis of human disease traits. Using haplotype-based parameterization, our model infers the effects of specific haplotypes (point estimation) and constructs confidence interval for the risks...... on the well-known logistic regression model is a useful tool for haplotype association analysis of human disease traits....

  9. Echinococcus equinus and Echinococcus granulosus sensu stricto from the United Kingdom: genetic diversity and haplotypic variation.

    Science.gov (United States)

    Boufana, Belgees; Lett, Wai San; Lahmar, Samia; Buishi, Imad; Bodell, Anthony J; Varcasia, Antonio; Casulli, Adriano; Beeching, Nicholas J; Campbell, Fiona; Terlizzo, Monica; McManus, Donald P; Craig, Philip S

    2015-02-01

    Cystic echinococcosis is endemic in Europe including the United Kingdom. However, information on the molecular epidemiology of Echinococcus spp. from the United Kingdom is limited. Echinococcus isolates from intermediate and definitive animal hosts as well as from human cystic echinococcosis cases were analysed to determine species and genotypes within these hosts. Echinococcus equinus was identified from horse hydatid isolates, cysts retrieved from captive UK mammals and copro-DNA of foxhounds and farm dogs. Echinococcus granulosus sensu stricto (s.s.) was identified from hydatid cysts of sheep and cattle as well as in DNA extracted from farm dog and foxhound faecal samples, and from four human cystic echinococcosis isolates, including the first known molecular confirmation of E. granulosus s.s. infection in a Welsh sheep farmer. Low genetic variability for E. equinus from various hosts and from different geographical locations was detected using the mitochondrial cytochrome c oxidase subunit 1 gene (cox1), indicating the presence of a dominant haplotype (EQUK01). In contrast, greater haplotypic variation was observed for E. granulosus s.s. cox1 sequences. The haplotype network showed a star-shaped network with a centrally placed main haplotype (EgUK01) that had been reported from other world regions. Copyright © 2014 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  10. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...... used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP...... identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  11. Using genomic information to conserve genetic diversity in livestock

    NARCIS (Netherlands)

    Eynard, Sonia E.

    2018-01-01

    Concern about the status of livestock breeds and their conservation has increased as selection and small population sizes caused loss of genetic diversity. Meanwhile, dense SNP chips and whole genome sequences (WGS) became available, providing opportunities to accurately quantify the impact of

  12. European Chlamydia abortus livestock isolate genomes reveal unusual stability and limited diversity, reflected in geographical signatures.

    Science.gov (United States)

    Seth-Smith, H M B; Busó, Leonor Sánchez; Livingstone, M; Sait, M; Harris, S R; Aitchison, K D; Vretou, Evangelia; Siarkou, V I; Laroucau, K; Sachse, K; Longbottom, D; Thomson, N R

    2017-05-04

    Chlamydia abortus (formerly Chlamydophila abortus) is an economically important livestock pathogen, causing ovine enzootic abortion (OEA), and can also cause zoonotic infections in humans affecting pregnancy outcome. Large-scale genomic studies on other chlamydial species are giving insights into the biology of these organisms but have not yet been performed on C. abortus. Our aim was to investigate a broad collection of European isolates of C. abortus, using next generation sequencing methods, looking at diversity, geographic distribution and genome dynamics. Whole genome sequencing was performed on our collection of 57 C. abortus isolates originating primarily from the UK, Germany, France and Greece, but also from Tunisia, Namibia and the USA. Phylogenetic analysis of a total of 64 genomes shows a deep structural division within the C. abortus species with a major clade displaying limited diversity, in addition to a branch carrying two more distantly related Greek isolates, LLG and POS. Within the major clade, seven further phylogenetic groups can be identified, demonstrating geographical associations. The number of variable nucleotide positions across the sampled isolates is significantly lower than those published for C. trachomatis and C. psittaci. No recombination was identified within C. abortus, and no plasmid was found. Analysis of pseudogenes showed lineage specific loss of some functions, notably with several Pmp and TMH/Inc proteins predicted to be inactivated in many of the isolates studied. The diversity within C. abortus appears to be much lower compared to other species within the genus. There are strong geographical signatures within the phylogeny, indicating clonal expansion within areas of limited livestock transport. No recombination has been identified within this species, showing that different species of Chlamydia may demonstrate different evolutionary dynamics, and that the genome of C. abortus is highly stable.

  13. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Directory of Open Access Journals (Sweden)

    Jian Huang

    2016-12-01

    Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  14. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea.

    Science.gov (United States)

    Bajaj, Deepak; Saxena, Maneesha S; Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Tripathi, Shailesh; Upadhyaya, Hari D; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-03-01

    Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  15. Small RNA pathways and diversity in model legumes: lessons from genomics.

    Directory of Open Access Journals (Sweden)

    Pilar Bustos-Sanmamed

    2013-07-01

    Small non coding RNAs (smRNA) participate in the regulation of development, cell differentiation, adaptation to environmental constraints and defense responses in plants. They negatively regulate gene expression by degrading specific mRNA targets, repressing their translation or modifying chromatin conformation through homologous interaction with target loci. MicroRNAs (miRNA) and short-interfering RNAs (siRNA) are generated from long double stranded RNA (dsRNA) that are cleaved into 20- to 24-nucleotide dsRNAs by RNase III proteins called DICERs (DCL). One strand of the duplex is then loaded onto effective complexes containing different ARGONAUTE (AGO) proteins. In this review, we explored smRNA diversity in model legumes and compiled available data from miRBase, the miRNA database, and from 22 reports of smRNA deep sequencing or miRNA identification genome-wide in Medicago truncatula, Glycine max and Lotus japonicus. In addition to conserved miRNAs present in other plant species, 229, 179 and 35 novel miRNA families were identified respectively in these 3 legumes, among which several seem legume-specific. New potential functions of several miRNAs in the legume-specific nodulation process are discussed. Furthermore, a new category of siRNA, the phased siRNAs, which seem to mainly regulate disease-resistance genes, was recently discovered in legumes. Although the genome sequences of model legumes are not yet fully completed, further analysis was performed by database mining of gene families and protein characteristics of DCLs and AGOs in these genomes. Although most components of the smRNA pathways are conserved, identifiable homologs of key smRNA players from non-legumes could not yet be detected in M. truncatula available genomic and expressed sequence databases. In addition, an important gene diversification was observed in the three legumes. Functional significance of these variant isoforms may reflect peculiarities of smRNA biogenesis in

  16. Origins and Domestication of Cultivated Banana Inferred from Chloroplast and Nuclear Genes

    Science.gov (United States)

    Zhang, Cui; Wang, Xin-Feng; Shi, Feng-Xue; Chen, Wen-Na; Ge, Xue-Jun

    2013-01-01

    Background: Cultivated bananas are large, vegetatively-propagated members of the genus Musa. More than 1,000 cultivars are grown worldwide and they are major economic and food resources in numerous developing countries. It has been suggested that cultivated bananas originated from the islands of Southeast Asia (ISEA) and have been developed through complex geodomestication pathways. However, the maternal and parental donors of most cultivars are unknown, and the pattern of nucleotide diversity in domesticated banana has not been fully resolved. Methodology/Principal Findings: We studied the genetics of 16 cultivated and 18 wild Musa accessions using two single-copy nuclear (granule-bound starch synthase I, GBSS I, also known as Waxy, and alcohol dehydrogenase 1, Adh1) and two chloroplast (maturase K, matK, and the trnL-F gene cluster) genes. The results of phylogenetic analyses showed that all A-genome haplotypes of cultivated bananas were grouped together with those of ISEA subspecies of M. acuminata (A-genome). Similarly, the B- and S-genome haplotypes of cultivated bananas clustered with the wild species M. balbisiana (B-genome) and M. schizocarpa (S-genome), respectively. Notably, it has been shown that distinct haplotypes of each cultivar (A-genome group) were nested together to different ISEA subspecies M. acuminata. Analyses of nucleotide polymorphism in the Waxy and Adh1 genes revealed that, in comparison to the wild relatives, cultivated banana exhibited slightly lower nucleotide diversity both across all sites and specifically at silent sites. However, dramatically reduced nucleotide diversity was found at nonsynonymous sites for cultivated bananas. Conclusions/Significance: Our study not only confirmed the origin of cultivated banana as arising from multiple intra- and inter-specific hybridization events, but also showed that cultivated banana may have not suffered a severe genetic bottleneck during the domestication process. Importantly, our findings

  17. The family Rhabdoviridae: Mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins

    Science.gov (United States)

    Dietzgen, Ralf G.; Kondo, Hideki; Goodin, Michael M.; Kurath, Gael; Vasilakis, Nikos

    2017-01-01

    The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes.

  18. Honey bee-inspired algorithms for SNP haplotype reconstruction problem

    Science.gov (United States)

    PourkamaliAnaraki, Maryam; Sadeghi, Mehdi

    2016-03-01

    Reconstructing haplotypes from SNP fragments is an important problem in computational biology. There has been a lot of interest in this field because haplotypes have been shown to contain promising data for disease association research. Haplotype reconstruction under the Minimum Error Correction model has been proved to be an NP-hard problem. Therefore, several methods such as clustering techniques, evolutionary algorithms, neural networks and swarm intelligence approaches have been proposed in order to solve this problem in reasonable time. In this paper, we focus on various evolutionary clustering techniques and try to find an efficient technique for solving the haplotype reconstruction problem. It can be inferred from our experiments that the clustering methods relying on the behaviour of honey bee colonies in nature, specifically the bees algorithm and artificial bee colony methods, are expected to result in more efficient solutions. An application program of the methods is available at the following link. http://www.bioinf.cs.ipm.ir/software/haprs/
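
    As a reader's aid for the Minimum Error Correction (MEC) model named in this abstract, the following is a minimal sketch of how an MEC score is commonly computed for one candidate pair of haplotypes. It is not the authors' implementation and does not include any bee-inspired search. The function names, the 0/1 allele encoding, the use of None for uncovered sites, and the toy fragments are all assumptions made for illustration.

```python
from typing import Optional, Sequence

# A fragment holds one allele call per SNP site: 0 or 1, or None where
# the read does not cover the site (illustrative encoding, assumed here).
Fragment = Sequence[Optional[int]]


def mismatches(fragment: Fragment, haplotype: Sequence[int]) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(
        1 for allele, hap in zip(fragment, haplotype)
        if allele is not None and allele != hap
    )


def mec_score(fragments: Sequence[Fragment],
              hap1: Sequence[int],
              hap2: Sequence[int]) -> int:
    """Sum, over all fragments, the corrections needed to make each
    fragment agree with whichever of the two haplotypes is closer."""
    return sum(
        min(mismatches(f, hap1), mismatches(f, hap2)) for f in fragments
    )


# Toy data (hypothetical): four clean fragments and one with a single error.
fragments = [
    [0, 0, 1, None],
    [0, 0, None, 1],
    [1, 1, 0, None],
    [None, 1, 0, 0],
    [0, 1, 1, 1],  # differs from hap1 at the second site only
]
hap1 = [0, 0, 1, 1]
hap2 = [1, 1, 0, 0]

print(mec_score(fragments, hap1, hap2))  # prints 1
```

    A search heuristic of the kind the abstract describes would evaluate a score like this for many candidate haplotype pairs and keep the pair with the lowest value.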

  19. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists

    Directory of Open Access Journals (Sweden)

    Matheus Sanitá Lima

    2017-11-01

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA (coding and noncoding) is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells.

  20. Genetic and molecular characterization of three novel S-haplotypes in sour cherry (Prunus cerasus L.).

    Science.gov (United States)

    Tsukamoto, Tatsuya; Potter, Daniel; Tao, Ryutaro; Vieira, Cristina P; Vieira, Jorge; Iezzoni, Amy F

    2008-01-01

    Tetraploid sour cherry (Prunus cerasus L.) exhibits gametophytic self-incompatibility (GSI) whereby the specificity of self-pollen rejection is controlled by alleles of the stylar and pollen specificity genes, S-RNase and SFB (S haplotype-specific F-box protein gene), respectively. As sour cherry selections can be either self-compatible (SC) or self-incompatible (SI), polyploidy per se does not result in SC. Instead the genotype-dependent loss of SI in sour cherry is due to the accumulation of non-functional S-haplotypes. The presence of two or more non-functional S-haplotypes within sour cherry 2x pollen renders that pollen SC. Two new S-haplotypes from sour cherry, S(33) and S(34), that are presumed to be contributed by the P. fruticosa species parent, the complete S-RNase and SFB sequences of a third S-haplotype, S(35), plus the presence of two previously identified sweet cherry S-haplotypes, S(14) and S(16) are described here. Genetic segregation data demonstrated that the S(16)-, S(33)-, S(34)-, and S(35)-haplotypes present in sour cherry are fully functional. This result is consistent with our previous finding that 'hetero-allelic' pollen is incompatible in sour cherry. Phylogenetic analyses of the SFB and S-RNase sequences from available Prunus species reveal that the relationships among S-haplotypes show no correspondence to known organismal relationships at any taxonomic level within Prunus, indicating that polymorphisms at the S-locus have been maintained throughout the evolution of the genus. Furthermore, the phylogenetic relationships among SFB sequences are generally incongruent with those among S-RNase sequences for the same S-haplotypes. Hypotheses compatible with these results are discussed.

  1. Are Eimeria Genetically Diverse, and Does It Matter?

    Science.gov (United States)

    Clark, Emily L; Tomley, Fiona M; Blake, Damer P

    2017-03-01

    Eimeria pose a risk to all livestock species as a cause of coccidiosis, reducing productivity and compromising animal welfare. Pressure to reduce drug use in the food chain makes the development of cost-effective vaccines against Eimeria essential. For novel vaccines to be successful, understanding genetic and antigenic diversity in field populations is key. Eimeria species that infect chickens are most significant, with Eimeria tenella among the best studied and most economically important. Genome-wide single nucleotide polymorphism (SNP)-based haplotyping has been used to determine population structure, genotype distribution, and potential for cross-fertilization between E. tenella strains. Here, we discuss recent developments in our understanding of diversity for Eimeria in relation to its specialized life cycle, distribution across the globe, and the challenges posed to vaccine development. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Predicting Tissue-Specific Enhancers in the Human Genome

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Loots, Gabriela G.; Nobrega, Marcelo A.; Ovcharenko, Ivan

    2006-07-01

    Determining how transcriptional regulatory signals are encoded in vertebrate genomes is essential for understanding the origins of multi-cellular complexity; yet the genetic code of vertebrate gene regulation remains poorly understood. In an attempt to elucidate this code, we synergistically combined genome-wide gene expression profiling, vertebrate genome comparisons, and transcription factor binding site analysis to define sequence signatures characteristic of candidate tissue-specific enhancers in the human genome. We applied this strategy to microarray-based gene expression profiles from 79 human tissues and identified 7,187 candidate enhancers that defined their flanking gene expression, the majority of which were located outside of known promoters. We cross-validated this method for its ability to de novo predict tissue-specific gene expression and confirmed its reliability in 57 of the 79 available human tissues, with an average precision in enhancer recognition ranging from 32 percent to 63 percent, and a sensitivity of 47 percent. We used the sequence signatures identified by this approach to assign tissue-specific predictions to ~328,000 human-mouse conserved noncoding elements in the human genome. By overlapping these genome-wide predictions with a large in vivo dataset of enhancers validated in transgenic mice, we confirmed our results with a 28 percent sensitivity and 50 percent precision. These results indicate the power of combining complementary genomic datasets as an initial computational foray into the global view of tissue-specific gene regulation in vertebrates.

  3. Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting

    KAUST Repository

    Perdigão, João; Silva, Hugo; Machado, Diana; Macedo, Rita; Maltez, Fernando; Silva, Carla; Jordao, Luisa; Couto, Isabel; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A.; McNerney, Ruth; Pain, Arnab; Clark, Taane G; Viveiros, Miguel; Portugal, Isabel

    2014-01-01

    Globally, this study contributes with novel genome-wide phylogenetic data and has led to the identification of new genomic variants that support the notion of a growing genomic diversity facing both setting and host adaptation.

  4. Genome-Based Studies of Marine Microorganisms to Maximize the Diversity of Natural Products Discovery for Medical Treatments

    Directory of Open Access Journals (Sweden)

    Xin-Qing Zhao

    2011-01-01

    Marine microorganisms are a rich source of natural products, which play important roles in the pharmaceutical industry. Over the past decade, genome-based studies of marine microorganisms have unveiled the tremendous diversity of the producers of natural products and have also made it more efficient to harness the strain diversity, chemical diversity, and genetic diversity of marine microorganisms for the rapid discovery and generation of new natural products. In the meantime, genomic information retrieved from marine symbiotic microorganisms can also be employed for the discovery of new medical molecules from yet-unculturable microorganisms. In this paper, the recent progress in the genomic research of marine microorganisms is reviewed; new tools of genome mining as well as advances in the activation of orphan pathways and in metagenomic studies are summarized. Genome-based research of marine microorganisms will maximize the biodiscovery process and solve the problems of supply and sustainability of drug molecules for medical treatments.

  5. Haplotypes of CYP3A4 and their close linkage with CYP3A5 haplotypes in a Japanese population.

    Science.gov (United States)

    Fukushima-Uesaka, Hiromi; Saito, Yoshiro; Watanabe, Hidemi; Shiseki, Kisho; Saeki, Mayumi; Nakamura, Takahiro; Kurose, Kouichi; Sai, Kimie; Komamura, Kazuo; Ueno, Kazuyuki; Kamakura, Shiro; Kitakaze, Masafumi; Hanai, Sotaro; Nakajima, Toshiharu; Matsumoto, Kenji; Saito, Hirohisa; Goto, Yu-ichi; Kimura, Hideo; Katoh, Masaaki; Sugai, Kenji; Minami, Narihiro; Shirao, Kuniaki; Tamura, Tomohide; Yamamoto, Noboru; Minami, Hironobu; Ohtsu, Atsushi; Yoshida, Teruhiko; Saijo, Nagahiro; Kitamura, Yutaka; Kamatani, Naoyuki; Ozawa, Shogo; Sawada, Jun-ichi

    2004-01-01

    In order to identify single nucleotide polymorphisms (SNPs) and haplotype frequencies of CYP3A4 in a Japanese population, the distal enhancer and proximal promoter regions, all exons, and the surrounding introns were sequenced from genomic DNA of 416 Japanese subjects. We found 24 SNPs, including 17 novel ones: two in the distal enhancer, four in the proximal promoter, one in the 5'-untranslated region (UTR), seven in the introns, and three in the 3'-UTR. The most common SNP was c.1026+12G>A (IVS10+12G>A), with a 0.249 frequency. Four non-synonymous SNPs, c.554C>G (p.T185S, CYP3A4(*)16), c.830_831insA (p.E277fsX8, (*)6), c.878T>C (p.L293P, (*)18), and c.1088 C>T (p.T363M, (*)11) were found with frequencies of 0.014, 0.001, 0.028, and 0.002, respectively. No SNP was found in the known nuclear transcriptional factor-binding sites in the enhancer and promoter regions. Using these 24 SNPs, 16 haplotypes were unambiguously identified, and nine haplotypes were inferred by aid of an expectation-maximization-based program. In addition, using data from 186 subjects enabled a close linkage to be found between CYP3A4 and CYP3A5 SNPs, especially among the SNPs at c.1026+12 in CYP3A4 and c.219-237 (IVS3-237, a key SNP site for CYP3A5(*)3), c.865+77 (IVS9+77) and c.1523 in CYP3A5. This result suggested that CYP3A4 and CYP3A5 are within the same gene block. Haplotype analysis between CYP3A4 and CYP3A5 revealed several major haplotype combinations in the CYP3A4-CYP3A5 block. Our findings provide fundamental and useful information for genotyping CYP3A4 (and CYP3A5) in the Japanese, and probably Asian populations. Copyright 2003 Wiley-Liss, Inc.

  6. H-PoP and H-PoPG: heuristic partitioning algorithms for single individual haplotyping of polyploids.

    Science.gov (United States)

    Xie, Minzhu; Wu, Qiong; Wang, Jianxin; Jiang, Tao

    2016-12-15

    Some economically important plants including wheat and cotton have more than two copies of each chromosome. With the decreasing cost and increasing read length of next-generation sequencing technologies, reconstructing the multiple haplotypes of a polyploid genome from its sequence reads becomes practical. However, the computational challenge in polyploid haplotyping is much greater than that in diploid haplotyping, and there are few related methods. This article models the polyploid haplotyping problem as an optimal poly-partition problem of the reads, called the Polyploid Balanced Optimal Partition model. For the reads sequenced from a k-ploid genome, the model tries to divide the reads into k groups such that the difference between the reads of the same group is minimized while the difference between the reads of different groups is maximized. When the genotype information is available, the model is extended to the Polyploid Balanced Optimal Partition with Genotype constraint problem. These models are all NP-hard. We propose two heuristic algorithms, H-PoP and H-PoPG, based on dynamic programming and a strategy of limiting the number of intermediate solutions at each iteration, to solve the two models, respectively. Extensive experimental results on simulated and real data show that our algorithms can solve the models effectively, and are much faster and more accurate than the recent state-of-the-art polyploid haplotyping algorithms. The experiments also show that our algorithms can deal with long reads and deep read coverage effectively and accurately. Furthermore, H-PoP might be applied to help determine the ploidy of an organism. Availability: https://github.com/MinzhuXie/H-PoPG; contact: xieminzhu@hotmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
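
    The partition idea in this abstract can be illustrated with a toy sketch. This is not the H-PoP algorithm: the scoring function, the read encoding (a dict from SNP index to allele), and the exhaustive search below are all assumptions made for illustration, and the brute-force search is only feasible for a handful of reads.

```python
from itertools import combinations, product


def mismatches(read_a, read_b):
    """Count SNP sites covered by both reads where the alleles differ."""
    shared = read_a.keys() & read_b.keys()
    return sum(1 for site in shared if read_a[site] != read_b[site])


def partition_score(reads, labels):
    """Between-group mismatches minus within-group mismatches (higher is better).

    This is an illustrative objective, not the one defined in the paper.
    """
    score = 0
    for i, j in combinations(range(len(reads)), 2):
        d = mismatches(reads[i], reads[j])
        score += d if labels[i] != labels[j] else -d
    return score


def best_partition(reads, k):
    """Try every assignment of reads to k groups (toy sizes only)."""
    return max(
        product(range(k), repeat=len(reads)),
        key=lambda labels: partition_score(reads, labels),
    )


# Hypothetical reads: {SNP index: allele}
reads = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1, 3: 0},
    {0: 1, 1: 1, 2: 0},
    {2: 0, 3: 1},
]
labels = best_partition(reads, k=2)
print(labels, partition_score(reads, labels))
```

    Because the number of assignments grows as k to the power of the number of reads, an exact search like this is impractical on real data, which is consistent with the abstract's use of heuristics.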

  7. The family Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins.

    Science.gov (United States)

    Dietzgen, Ralf G; Kondo, Hideki; Goodin, Michael M; Kurath, Gael; Vasilakis, Nikos

    2017-01-02

    The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes. Copyright © 2016 Elsevier B.V. All rights reserved.

  8. Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

    Directory of Open Access Journals (Sweden)

    Shui-Lian He

    Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.

  9. Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

    Science.gov (United States)

    He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang

    2015-01-01

    Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
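
    For readers unfamiliar with how linkage disequilibrium is quantified, the following is a minimal sketch of the standard r² statistic between two biallelic sites, computed from phased haplotypes. The abstract does not say which LD statistic or software the authors used, so the function name and the 0/1 allele encoding are assumptions made for illustration.

```python
def r_squared(haplotypes):
    """Return r^2 between two biallelic sites.

    haplotypes: list of (allele_at_site_A, allele_at_site_B) pairs,
    with alleles encoded as 0 or 1 (an assumed encoding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Hypothetical samples: complete association versus no association.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(linked), r_squared(unlinked))
```

    An LD decay curve like the one summarized above is obtained by computing such a statistic for many site pairs and relating it to the physical distance between them.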

  10. Strain screen and haplotype association mapping of wheel running in inbred mouse strains.

    Science.gov (United States)

    Lightfoot, J Timothy; Leamy, Larry; Pomp, Daniel; Turner, Michael J; Fodor, Anthony A; Knab, Amy; Bowen, Robert S; Ferguson, David; Moore-Harrison, Trudy; Hamilton, Alicia

    2010-09-01

    Previous genetic association studies of physical activity, in both animal and human models, have been limited in number of subjects and genetically homozygous strains used as well as number of genomic markers available for analysis. Expansion of the available mouse physical activity strain screens and the recently published dense single-nucleotide polymorphism (SNP) map of the mouse genome (approximately 8.3 million SNPs) and associated statistical methods allowed us to construct a more generalizable map of the quantitative trait loci (QTL) associated with physical activity. Specifically, we measured wheel running activity in male and female mice (average age 9 wk) in 41 inbred strains and used activity data from 38 of these strains in a haplotype association mapping analysis to determine QTL associated with activity. As seen previously, there was a large range of activity patterns among the strains, with the highest and lowest strains differing significantly in daily distance run (27.4-fold), duration of activity (23.6-fold), and speed (2.9-fold). On a daily basis, female mice ran further (24%), longer (13%), and faster (11%). Twelve QTL were identified, with three (on Chr. 12, 18, and 19) in both male and female mice, five specific to males, and four specific to females. Eight of the 12 QTL, including the 3 general QTL found for both sexes, fell into intergenic areas. The results of this study further support the findings of a moderate to high heritability of physical activity and add general genomic areas applicable to a large number of mouse strains that can be further mined for candidate genes associated with regulation of physical activity. Additionally, results suggest that potential genetic mechanisms arising from traditional noncoding regions of the genome may be involved in regulation of physical activity.

  11. Mutation profile of all 49 exons of the human myosin VIIA gene, and haplotype analysis, in Usher 1B families from diverse origins.

    Science.gov (United States)

    Adato, A; Weil, D; Kalinski, H; Pel-Or, Y; Ayadi, H; Petit, C; Korostishevsky, M; Bonne-Tamir, B

    1997-10-01

    Usher syndrome types I (USH1A-USH1E) are a group of autosomal recessive diseases characterized by profound congenital hearing loss, vestibular areflexia, and progressive visual loss due to retinitis pigmentosa. The human myosin VIIA gene, located on 11q14, has been shown to be responsible for Usher syndrome type 1B (USH1B). Haplotypes were constructed in 28 USH1 families by use of the following polymorphic markers spanning the USH1B locus: D11S787, D11S527, D11S1789, D11S906, D11S4186, and OMP. Affected individuals and members of their families from 12 different ethnic origins were screened for the presence of mutations in all 49 exons of the myosin VIIA gene. In 15 families myosin VIIA mutations were detected, verifying their classification as USH1B. All these mutations are novel, including three missense mutations, one premature stop codon, two splicing mutations, one frameshift, and one deletion of >2 kb comprising exons 47 and 48, a part of exon 49, and the introns between them. Three mutations were shared by more than one family, consistent with haplotype similarities. Altogether, 16 USH1B haplotypes were observed in the 15 families; most haplotypes were population specific. Several exonic and intronic polymorphisms were also detected. None of the 20 known USH1B mutations reported so far in other world populations were identified in our families.

  12. Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications.

    Science.gov (United States)

    Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang

    2012-06-15

    Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication

  13. A unified framework for haplotype inference in nuclear families.

    Science.gov (United States)

    Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong

    2012-07-01

    Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/∼anastas/tds. © 2012 The Authors Annals of Human Genetics © 2012 Blackwell Publishing Ltd/University College London.

  14. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    Science.gov (United States)

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA (coding and noncoding) is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
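
    A coverage figure such as "≥85% of the genome" can be computed from mapped-read intervals roughly as follows. This is a hedged sketch: the abstract does not describe the authors' pipeline, and the function name, the 0-based half-open interval convention, and the example coordinates are assumptions.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a linear genome covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open,
    with non-negative starts (an assumed convention).
    """
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)   # skip the part already counted
        end = min(end, genome_length)     # clip to the genome
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


# Hypothetical alignments on a 100 bp genome; the two intervals overlap.
print(covered_fraction([(0, 50), (40, 90)], 100))
```

    Sorting first means overlapping alignments are counted once, so deep coverage of one region does not inflate the fraction.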

  15. Patterns of genome size diversity in bats (order Chiroptera).

    Science.gov (United States)

    Smith, Jillian D L; Bickham, John W; Gregory, T Ryan

    2013-08-01

    Despite being a group of particular interest in considering relationships between genome size and metabolic parameters, bats have not been well studied from this perspective. This study presents new estimates for 121 "microbat" species from 12 families and complements a previous study on members of the family Pteropodidae ("megabats"). The results confirm that diversity in genome size in bats is very limited even compared with other mammals, varying approximately 2-fold from 1.63 pg in Lophostoma carrikeri to 3.17 pg in Rhinopoma hardwickii and averaging only 2.35 pg ± 0.02 SE (versus 3.5 pg overall for mammals). However, contrary to some other vertebrate groups, and perhaps owing to the narrow range observed, genome size correlations were not apparent with any chromosomal, physiological, flight-related, developmental, or ecological characteristics within the order Chiroptera. Genome size is positively correlated with measures of body size in bats, though the strength of the relationships differs between pteropodids ("megabats") and nonpteropodids ("microbats").

  16. Mitochondrial genome diversity and population structure of the giant squid Architeuthis

    DEFF Research Database (Denmark)

    Winkelmann, Inger Eleanor Hall; Campos, Paula; Strugnell, Jan

    2013-01-01

    techniques, considerable controversy exists with regard to topics as varied as their taxonomy, biology and even behaviour. In this study, we have characterized the mitochondrial genome (mitogenome) diversity of 43 Architeuthis samples collected from across the range of the species, in order to use genetic...... a recent population expansion or selective sweep, which may explain the low level of genetic diversity....

  17. The Human Genome Diversity Project

    Energy Technology Data Exchange (ETDEWEB)

    Cavalli-Sforza, L. [Stanford Univ., CA (United States)]

    1994-12-31

    The Human Genome Diversity Project (HGD Project) is an international anthropology project that seeks to study the genetic richness of the entire human species. This kind of genetic information can add a unique thread to the tapestry of knowledge of humanity. Culture, environment, history, and other factors are often more important, but humanity's genetic heritage, when analyzed with recent technology, brings another type of evidence for understanding the species' past and present. The Project will deepen the understanding of this genetic richness and show both humanity's diversity and its deep and underlying unity. The HGD Project is still largely in its planning stages, seeking the best ways to reach its goals. The continuing discussions of the Project, throughout the world, should improve the plans for the Project and their implementation. The Project is as global as humanity itself; its implementation will require the kinds of partnerships among different nations and cultures that make the involvement of UNESCO and other international organizations particularly appropriate. The author will briefly discuss the Project's history, describe the Project, set out the core principles of the Project, and demonstrate how the Project will help combat the scourge of racism.

  18. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    Directory of Open Access Journals (Sweden)

    Antommattei Frances M

    2008-10-01

    Background: Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III) reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and better their use in microbial fuel cells. Results: The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70) homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively). Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP) homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors

  19. A comprehensive literature review of haplotyping software and methods for use with unrelated individuals

    Directory of Open Access Journals (Sweden)

    Salem Rany M

    2005-03-01

    Interest in the assignment and frequency analysis of haplotypes in samples of unrelated individuals has increased immeasurably as a result of the emphasis placed on haplotype analyses by, for example, the International HapMap Project and related initiatives. Although there are many available computer programs for haplotype analysis applicable to samples of unrelated individuals, many of these programs have limitations and/or very specific uses. In this paper, the key features of available haplotype analysis software for use with unrelated individuals, as well as pooled DNA samples from unrelated individuals, are summarised. Programs for haplotype analysis were identified through keyword searches on PUBMED and various internet search engines, a review of citations from retrieved papers and personal communications, up to June 2004. Priority was given to functioning computer programs, rather than theoretical models and methods. The available software was considered in light of a number of factors: the algorithm(s) used, algorithm accuracy, assumptions, the accommodation of genotyping error, implementation of hypothesis testing, handling of missing data, software characteristics and web-based implementations. Review papers comparing specific methods and programs are also summarised. Forty-six haplotyping programs were identified and reviewed. The programs were divided into two groups: those designed for individual genotype data (a total of 43 programs) and those designed for use with pooled DNA samples (a total of three programs). The accuracy of programs using various criteria are assessed and the programs are categorised and discussed in light of: algorithm and method, accuracy, assumptions, genotyping error, hypothesis testing, missing data, software characteristics and web implementation. Many available programs have limitations (eg some cannot accommodate missing data) and/or are designed with specific tasks in mind (eg estimating

  20. Aspects combinatoires des réarrangements génomiques et des réseaux d'haplotypes [Combinatorial aspects of genome rearrangements and haplotype networks]

    OpenAIRE

    Labarre, Anthony

    2008-01-01

    The dissertation covers two problems motivated by computational biology: genome rearrangements, and haplotype networks. Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform two given objects into one another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing th...
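
    The notion of a rearrangement distance under a fixed operation set can be made concrete with a small sketch. Here the allowed operation is assumed to be the unsigned reversal of a contiguous segment; the abstract does not say which operations the dissertation studies, and the breadth-first search below is a brute-force illustration that only scales to very short permutations.

```python
from collections import deque


def reversal_distance(start, target):
    """Minimum number of segment reversals turning start into target.

    Uses breadth-first search over permutations, so it is exact but
    exponential; intended for toy inputs only.
    """
    start, target = tuple(start), tuple(target)
    if sorted(start) != sorted(target):
        raise ValueError("inputs must contain the same elements")
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        perm, dist = queue.popleft()
        if perm == target:
            return dist
        n = len(perm)
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                # Reverse the segment perm[i:j] (length at least 2).
                nxt = perm[:i] + perm[i:j][::-1] + perm[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise AssertionError("unreachable: reversals generate every permutation")


print(reversal_distance([3, 1, 2], [1, 2, 3]))
```

    Because breadth-first search explores permutations in order of increasing number of operations, the first time the target is reached gives the distance itself, not just an upper bound.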

  1. Cis-acting mutation and duplication: History of molecular evolution in a P450 haplotype responsible for insecticide resistance in Culex quinquefasciatus.

    Science.gov (United States)

    Itokawa, Kentaro; Komagata, Osamu; Kasai, Shinji; Masada, Masahiro; Tomita, Takashi

    2011-07-01

    A cytochrome P450 gene, Cyp9m10, is more than 200-fold overexpressed in a pyrethroid-resistant strain of Culex quinquefasciatus, JPal-per. The haplotype of this strain contains two copies of Cyp9m10 resulting from a recent tandem duplication. In this study, we discovered and isolated a Cyp9m10 haplotype closely related to this duplicated Cyp9m10 haplotype from JHB, a strain used for the recent genome project for this mosquito species. The isolated haplotype (JHB-NIID-B haplotype) shared the same insertion of a transposable element upstream of the coding region with the JPal-per strain but was not duplicated. The JHB-NIID-B haplotype was considered to have diverged from the JPal-per lineage just before the duplication event. Cyp9m10 was moderately overexpressed in larvae with the JHB-NIID-B haplotype. The overexpression in the JHB-NIID-B and JPal-per haplotypes was developmentally regulated in a similar pattern, indicating that both haplotypes share a common cis-acting mutation responsible for the overexpression. The isolated moderately overexpressed haplotype conferred resistance; however, its efficacy was relatively small. We hypothesized that the first cis-acting mutation modified the consequence of the subsequent duplication in the JPal-per lineage to confer a stronger phenotypic effect than if it had occurred before the first cis-acting mutation. Copyright © 2011 Elsevier Ltd. All rights reserved.

  2. β-globin haplotypes in normal and hemoglobinopathic individuals from Reconcavo Baiano, State of Bahia, Brazil

    Directory of Open Access Journals (Sweden)

    Wellington dos Santos Silva

    2010-01-01

    Five restriction site polymorphisms in the β-globin gene cluster (HincII-5'ε, HindIII-Gγ, HindIII-Aγ, HincII-ψβ1 and HincII-3'ψβ1) were analyzed in three populations (n = 114) from Reconcavo Baiano, State of Bahia, Brazil. The groups included two urban populations from the towns of Cachoeira and Maragojipe and one rural Afro-descendant population, known as the "quilombo community", from Cachoeira municipality. The number of haplotypes found in the populations ranged from 10 to 13, which indicated higher diversity than in the parental populations. The haplotypes 2 (+----), 3 (----+), 4 (-+--+) and 6 (-++-+) on the βA chromosomes were the most common, and two haplotypes, 9 (-++++) and 14 (++--+), were found exclusively in the Maragojipe population. The other haplotypes (1, 5, 9, 11, 12, 13, 14 and 16) had lower frequencies. Restriction site analysis and the derived haplotypes indicated homogeneity among the populations. Thirty-two individuals with hemoglobinopathies (17 sickle cell disease, 12 HbSC disease and 3 HbCC disease) were also analyzed. The haplotype frequencies of these patients differed significantly from those of the general population. In the sickle cell disease subgroup, the predominant haplotypes were BEN (Benin) and CAR (Central African Republic), with frequencies of 52.9% and 32.4%, respectively. The high frequency of the BEN haplotype agreed with the historical origin of the Afro-descendant population in the state of Bahia. However, this frequency differed from that of Salvador, the state capital, where the CAR and BEN haplotypes have similar frequencies, probably as a consequence of domestic slave trade and subsequent internal migrations to other regions of Brazil.

  3. Genome-wide association study identifies shared risk loci common to two malignancies in golden retrievers.

    Directory of Open Access Journals (Sweden)

    Noriko Tonomura

    2015-02-01

    Dogs, with their breed-determined limited genetic background, are great models of human disease including cancer. Canine B-cell lymphoma and hemangiosarcoma are both malignancies of the hematologic system that are clinically and histologically similar to human B-cell non-Hodgkin lymphoma and angiosarcoma, respectively. Golden retrievers in the US show significantly elevated lifetime risk for both B-cell lymphoma (6%) and hemangiosarcoma (20%). We conducted genome-wide association studies for hemangiosarcoma and B-cell lymphoma, identifying two shared predisposing loci. The two associated loci are located on chromosome 5, and together contribute ~20% of the risk of developing these cancers. Genome-wide p-values for the top SNP of each locus are 4.6×10⁻⁷ and 2.7×10⁻⁶, respectively. Whole genome resequencing of nine cases and controls followed by genotyping and detailed analysis identified three shared and one B-cell lymphoma-specific risk haplotypes within the two loci, but no coding changes were associated with the risk haplotypes. Gene expression analysis of B-cell lymphoma tumors revealed that carrying the risk haplotypes at the first locus is associated with down-regulation of several nearby genes including the proximal gene TRPC6, a transient receptor Ca2+ channel involved in T-cell activation, among other functions. The shared risk haplotype in the second locus overlaps the vesicle transport and release gene STX8. Carrying the shared risk haplotype is associated with gene expression changes of 100 genes enriched for pathways involved in immune cell activation. Thus, the predisposing germ-line mutations in B-cell lymphoma and hemangiosarcoma appear to be regulatory, and affect pathways involved in T-cell mediated immune response in the tumor. This suggests that the interaction between the immune system and malignant cells plays a common role in the tumorigenesis of these relatively different cancers.

  4. Detection of haplotypes associated with prenatal death in dairy cattle and identification of deleterious mutations in GART, SHBG and SLC37A2.

    Science.gov (United States)

    Fritz, Sébastien; Capitan, Aurelien; Djari, Anis; Rodriguez, Sabrina C; Barbat, Anne; Baur, Aurélia; Grohs, Cécile; Weiss, Bernard; Boussaha, Mekki; Esquerré, Diane; Klopp, Christophe; Rocha, Dominique; Boichard, Didier

    2013-01-01

    The regular decrease of female fertility over time is a major concern in modern dairy cattle industry. Only half of this decrease is explained by indirect response to selection on milk production, suggesting the existence of other factors such as embryonic lethal genetic defects. Genomic regions harboring recessive deleterious mutations were detected in three dairy cattle breeds by identifying frequent haplotypes (>1%) showing a deficit in homozygotes among Illumina Bovine 50k Beadchip haplotyping data from the French genomic selection database (47,878 Holstein, 16,833 Montbéliarde, and 11,466 Normande animals). Thirty-four candidate haplotypes (pHH3 in Holstein breed were identified. Haplotype length varied from 1 to 4.8 Mb and frequencies from 1.7 up to 9%. A significant negative effect on calving rate, consistent in heifers and in lactating cows, was observed for 9 of these haplotypes in matings between carrier bulls and daughters of carrier sires, confirming their association with embryonic lethal mutations. Eight regions were further investigated using whole genome sequencing data from heterozygous bull carriers and control animals (45 animals in total). Six strong candidate causative mutations including polymorphisms previously reported in FANCI (Brachyspina), SLC35A3 (CVM), APAF1 (HH1) and three novel mutations with very damaging effect on the protein structure, according to SIFT and Polyphen-2, were detected in GART, SHBG and SLC37A2 genes. In conclusion, this study reveals a yet hidden consequence of the important inbreeding rate observed in intensively selected and specialized cattle breeds. Counter-selection of these mutations and management of matings will have positive consequences on female fertility in dairy cattle.
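
    The homozygote-deficit screen described above can be illustrated with a minimal sketch. It assumes random mating (so the expected number of homozygotes is N·p²) and a Poisson model for the observed count; this is a simplification for illustration, not the authors' actual procedure. The function name and the 3% haplotype frequency are hypothetical.

```python
import math

def homozygote_deficit(n_animals, hap_freq, observed_homozygotes):
    """Return (expected homozygotes, Poisson P(X <= observed)).

    Assumes random mating, so expected homozygotes = N * p^2.
    This is an illustrative simplification, not the published method.
    """
    expected = n_animals * hap_freq ** 2
    # Cumulative Poisson probability of seeing this few homozygotes or fewer
    p_value = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed_homozygotes + 1)
    )
    return expected, p_value

# Hypothetical haplotype at 3% frequency among 47,878 genotyped animals,
# with no homozygous animal observed
expected, p_value = homozygote_deficit(47878, 0.03, 0)
print(f"expected homozygotes: {expected:.1f}, P(deficit): {p_value:.2e}")
```

    Under these assumptions, a haplotype that is reasonably frequent yet never seen in the homozygous state gives a very small probability, which is what makes it a candidate carrier of a recessive lethal mutation.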

  5. The mitochondrial genome in embryo technologies.

    Science.gov (United States)

    Hiendleder, S; Wolf, E

    2003-08-01

    The mammalian mitochondrial genome encodes 37 genes which are involved in a broad range of cellular functions. The mitochondrial DNA (mtDNA) molecule is commonly assumed to be inherited through oocyte cytoplasm in a clonal manner, and apparently species-specific mechanisms have evolved to eliminate the contribution of sperm mitochondria after natural fertilization. However, recent evidence for paternal mtDNA inheritance in embryos and offspring questions the general validity of this model, particularly in the context of assisted reproduction and embryo biotechnology. In addition to normal mtDNA haplotype variation, oocytes and spermatozoa show remarkable differences in mtDNA content and may be affected by inherited or acquired mtDNA aberrations. All these parameters have been correlated with gamete quality and reproductive success rates. Nuclear transfer (NT) technology provides experimental models for studying interactions between nuclear and mitochondrial genomes. Recent studies demonstrated (i) a significant effect of mtDNA haplotype or other maternal cytoplasmic factors on the efficiency of NT; (ii) phenotypic differences between transmitochondrial clones pointing to functionally relevant nuclear-cytoplasmic interactions; and (iii) neutral or non-neutral selection of mtDNA haplotypes in heteroplasmic conditions. Mitochondria form a dynamic reticulum, enabling complementation of mitochondrial components and possibly mixing of different mtDNA populations in heteroplasmic individuals. Future directions of research on mtDNA in the context of reproductive biotechnology range from the elimination of adverse effects of artificial heteroplasmy, e.g. created by ooplasm transfer, to engineering of optimized constellations of nuclear and cytoplasmic genes for the production of superior livestock.

  6. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Directory of Open Access Journals (Sweden)

    Tamara Smokvina

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis
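
    The relationship between pan-genome, core genome and variable genome described above can be sketched with plain set operations. This is only an illustration: the strain names and ortholog-group identifiers are made up, and real analyses first require ortholog clustering, which is not shown.

```python
def pan_core_variable(strain_to_groups):
    """Split ortholog groups into pan-genome, core genome and variable genome.

    strain_to_groups maps a strain name to the ortholog groups found in it.
    pan      = groups present in at least one strain (union)
    core     = groups present in every strain (intersection)
    variable = pan minus core
    """
    gene_sets = [set(groups) for groups in strain_to_groups.values()]
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core, pan - core

# Hypothetical toy data: three strains and five ortholog groups
strains = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og2", "og5"},
}
pan, core, variable = pan_core_variable(strains)
print(len(pan), sorted(core), sorted(variable))
```

    In the toy data the pan-genome has five groups, of which two are core; in the study the corresponding figures are over 4200 and about 1800 ortholog groups.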

  7. Comparative genomics of plant-associated Pseudomonas spp.: insights into diversity and inheritance of traits involved in multitrophic interactions.

    Directory of Open Access Journals (Sweden)

    Joyce E Loper

    2012-07-01

    We provide here a comparative genome analysis of ten strains within the Pseudomonas fluorescens group including seven new genomic sequences. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and insects. Multilocus sequence analysis placed the strains in three sub-clades, which was reinforced by high levels of synteny, size of core genomes, and relatedness of orthologous genes between strains within a sub-clade. The heterogeneity of the P. fluorescens group was reflected in the large size of its pan-genome, which makes up approximately 54% of the pan-genome of the genus as a whole, and a core genome representing only 45-52% of the genome of any individual strain. We discovered genes for traits that were not known previously in the strains, including genes for the biosynthesis of the siderophores achromobactin and pseudomonine and the antibiotic 2-hexyl-5-propyl-alkylresorcinol; novel bacteriocins; type II, III, and VI secretion systems; and insect toxins. Certain gene clusters, such as those for two type III secretion systems, are present only in specific sub-clades, suggesting vertical inheritance. Almost all of the genes associated with multitrophic interactions map to genomic regions present in only a subset of the strains or unique to a specific strain. To explore the evolutionary origin of these genes, we mapped their distributions relative to the locations of mobile genetic elements and repetitive extragenic palindromic (REP) elements in each genome. The mobile genetic elements and many strain-specific genes fall into regions devoid of REP elements (i.e., REP deserts) and regions displaying atypical tri-nucleotide composition, possibly indicating relatively recent acquisition of these loci. Collectively, the results of this study highlight the enormous heterogeneity of the P. fluorescens group and the importance of the variable genome in tailoring

  8. Genetic diversity and natural selection footprints of the glycine amidinotransferase gene in various human populations.

    Science.gov (United States)

    Khan, Asifullah; Tian, Lei; Zhang, Chao; Yuan, Kai; Xu, Shuhua

    2016-01-05

    The glycine amidinotransferase gene (GATM) plays a vital role in energy metabolism in muscle tissues and is associated with multiple clinically important phenotypes. However, the genetic diversity of the GATM gene remains poorly understood within and between human populations. Here we analyzed the 1000 Genomes Project data through population genetics approaches and observed significant genetic diversity across the GATM gene among various continental human populations. We observed considerable variations in GATM allele frequencies and haplotype composition among different populations. Substantial genetic differences were observed between East Asian and European populations (FST = 0.56). In addition, the frequency of a distinct major GATM haplotype in these groups was congruent with population-wide diversity at this locus. Furthermore, we identified GATM as the top differentiated gene compared to the other statin drug response-associated genes. Composite multiple analyses identified signatures of positive selection at the GATM locus, which was estimated to have occurred around 850 generations ago in European populations. As GATM catalyzes the key step of creatine biosynthesis involved in energy metabolism, we speculate that the European prehistorical demographic transition from hunter-gatherer to farming cultures was the driving force of selection that fulfilled the creatine-based metabolic requirements of the populations.
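
    To give a sense of what an FST value of this size means, the sketch below computes Wright's FST for a single biallelic site in two populations as (HT - HS) / HT. It assumes equal population weights and known allele frequencies; the abstract does not state which estimator the study used, and the frequencies here are hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic site in two equally weighted populations.

    p1, p2 are the frequencies of the same allele in each population.
    HS = mean within-population expected heterozygosity
    HT = expected heterozygosity of the pooled population
    """
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return 0.0  # site is monomorphic overall, so there is no differentiation
    return (ht - hs) / ht

# Hypothetical allele frequencies of 0.9 and 0.1 in the two populations
print(round(wright_fst(0.9, 0.1), 2))  # 0.64
```

    Under this simplified definition, an FST above 0.5 requires allele frequencies that are close to opposite extremes in the two populations.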

  9. Evidence of a Native Northwest Atlantic COI Haplotype Clade in the Cryptogenic Colonial Ascidian Botryllus schlosseri.

    Science.gov (United States)

    Yund, Philip O; Collins, Catherine; Johnson, Sheri L

    2015-06-01

    The colonial ascidian Botryllus schlosseri should be considered cryptogenic (i.e., not definitively classified as either native or introduced) in the Northwest Atlantic. Although all the evidence is quite circumstantial, over the last 15 years most research groups have accepted the scenario of human-mediated dispersal and classified B. schlosseri as introduced; others have continued to consider it native or cryptogenic. We address the invasion status of this species by adding 174 sequences to the growing worldwide database for the mitochondrial gene cytochrome c oxidase subunit I (COI) and analyzing 1077 sequences to compare genetic diversity of one clade of haplotypes in the Northwest Atlantic with two hypothesized source regions (the Northeast Atlantic and Mediterranean). Our results lead us to reject the prevailing view of the directionality of transport across the Atlantic. We argue that the genetic diversity patterns at COI are far more consistent with the existence of at least one haplotype clade in the Northwest Atlantic (and possibly a second) that substantially pre-dates human colonization from Europe, with this native North American clade subsequently introduced to three sites in Northeast Atlantic and Mediterranean waters. However, we agree with past researchers that some sites in the Northwest Atlantic have more recently been invaded by alien haplotypes, so that some populations are currently composed of a mixture of native and invader haplotypes. © 2015 Marine Biological Laboratory.

  10. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity.

    Directory of Open Access Journals (Sweden)

    Nicolas Heslot

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.

  11. Impact of Marker Ascertainment Bias on Genomic Selection Accuracy and Estimates of Genetic Diversity

    Science.gov (United States)

    Heslot, Nicolas; Rutkoski, Jessica; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

    2013-01-01

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis. PMID:24040295
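
    The minor allele frequency and missing-data rate mentioned above can be computed per marker as in the following minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele with None for a missing call; this coding and the function name are illustrative choices, not taken from the study.

```python
def marker_summary(genotypes):
    """Return (minor allele frequency, fraction missing) for one marker.

    genotypes: allele counts per line (0, 1 or 2), with None for missing calls.
    The frequency is computed from called genotypes only; it is None when
    every call is missing.
    """
    called = [g for g in genotypes if g is not None]
    missing_fraction = 1 - len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return None, missing_fraction
    allele_freq = sum(called) / (2 * len(called))
    return min(allele_freq, 1 - allele_freq), missing_fraction

# Hypothetical marker scored on five lines, one of them missing
maf, missing = marker_summary([0, 1, 2, None, 0])
print(maf, missing)
```

    Summaries of this kind, collected over all markers, give the minor allele frequency distributions that the study compares between platforms.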

  12. Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution

    OpenAIRE

    Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.; Peebles, Craig L.; Al-Atrache, Zein; Alcoser, Turi A.; Alexander, Lisa M.; Alfano, Matthew B.; Alford, Samantha T.; Amy, Nichols E.; Anderson, Marie D.; Anderson, Alexander G.; Ang, Andrew A. S.; Ares, Manuel; Barber, Amanda J.

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we repo...

  13. Genome-wide analysis of the Dof transcription factor gene family reveals soybean-specific duplicable and functional characteristics.

    Directory of Open Access Journals (Sweden)

    Yong Guo

    The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.

  14. Genome-centric resolution of microbial diversity, metabolism and interactions in anaerobic digestion.

    Science.gov (United States)

    Vanwonterghem, Inka; Jensen, Paul D; Rabaey, Korneel; Tyson, Gene W

    2016-09-01

    Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.

  15. Natural selection among Eurasians at genomic regions associated with HIV-1 control

    Directory of Open Access Journals (Sweden)

    Allison David B

    2011-06-01

    Background: HIV susceptibility and pathogenicity exhibit both interindividual and intergroup variability. The etiology of intergroup variability is still poorly understood, and could be partly linked to genetic differences among racial/ethnic groups. These genetic differences may be traceable to different regimes of natural selection in the 60,000 years since the human radiation out of Africa. Here, we examine population differentiation and haplotype patterns at several loci identified through genome-wide association studies on HIV-1 control, as determined by viral-load setpoint, in European and African-American populations. We use genome-wide data from the Human Genome Diversity Project, consisting of 53 world-wide populations, to compare measures of FST and relative extended haplotype homozygosity (REHH) at these candidate loci to the rest of the respective chromosome. Results: We find that the Europe-Middle East and Europe-South Asia pairwise FST in the most strongly associated region are elevated compared to most pairwise comparisons with the sub-Saharan African group, which exhibit very low FST. We also find genetic signatures of recent positive selection (higher REHH) at these associated regions among all groups except for sub-Saharan Africans and Native Americans. This pattern is consistent with one in which genetic differentiation, possibly due to diversifying/positive selection, occurred at these loci among Eurasians. Conclusions: These findings are concordant with those from earlier studies suggesting recent evolutionary change at immunity-related genomic regions among Europeans, and shed light on the potential genetic and evolutionary origin of population differences in HIV-1 control.

  16. Evolutionary origin of Rosaceae-specific active non-autonomous hAT elements and their contribution to gene regulation and genomic structural variation.

    Science.gov (United States)

    Wang, Lu; Peng, Qian; Zhao, Jianbo; Ren, Fei; Zhou, Hui; Wang, Wei; Liao, Liao; Owiti, Albert; Jiang, Quan; Han, Yuepeng

    2016-05-01

    Transposable elements account for approximately 30% of the Prunus genome; however, their evolutionary origin and functionality remain largely unclear. In this study, we identified a hAT transposon family, termed Moshan, in Prunus. The Moshan elements consist of three types, aMoshan, tMoshan, and mMoshan. The aMoshan and tMoshan types contain intact or truncated transposase genes, respectively, while the mMoshan type is a miniature inverted-repeat transposable element (MITE). The Moshan transposons are unique to Rosaceae, and the copy numbers of different Moshan types are significantly correlated. Sequence homology analysis reveals that the mMoshan MITEs are direct deletion derivatives of the tMoshan progenitors, and one kind of mMoshan containing a MuDR-derived fragment was amplified predominantly in the peach genome. The mMoshan sequences contain cis-regulatory elements that can enhance gene expression up to 100-fold. The mMoshan MITEs can serve as potential sources of micro and long noncoding RNAs. Whole-genome re-sequencing analysis indicates that mMoshan elements are highly active, and an insertion into the S-haplotype-specific F-box gene was reported to cause the breakdown of self-incompatibility in sour cherry. Taken together, all these results suggest that the mMoshan elements play important roles in regulating gene expression and driving genomic structural variation in Prunus.

  17. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes

    Energy Technology Data Exchange (ETDEWEB)

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabian; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2013-03-05

    The class of Dothideomycetes is one of the largest and most diverse groups of fungi. Many are plant pathogens and pose a serious threat to agricultural crops that are grown for biofuel, food or feed. Most Dothideomycetes have only a single host plant, and related species can have very diverse hosts. Eighteen genomes of Dothideomycetes have currently been sequenced by the Joint Genome Institute and other sequencing centers. Here we describe the results of comparative analyses of the fungi in this group.

  18. Genome-wide survey of allele-specific splicing in humans

    Directory of Open Access Journals (Sweden)

    Scheffler Konrad

    2008-06-01

    Background: Accurate mRNA splicing depends on multiple regulatory signals encoded in the transcribed RNA sequence. Many examples of mutations within human splice regulatory regions that alter splicing qualitatively or quantitatively have been reported and allelic differences in mRNA splicing are likely to be a common and important source of phenotypic diversity at the molecular level, in addition to their contribution to genetic disease susceptibility. However, because the effect of a mutation on the efficiency of mRNA splicing is often difficult to predict, many mutations that cause disease through an effect on splicing are likely to remain undiscovered. Results: We have combined a genome-wide scan for sequence polymorphisms likely to affect mRNA splicing with analysis of publicly available Expressed Sequence Tag (EST) and exon array data. The genome-wide scan used published tools and identified 30,977 SNPs located within donor and acceptor splice sites, branch points and exonic splicing enhancer elements. For 1,185 candidate splicing polymorphisms the difference in splicing between alternative alleles was corroborated by publicly available exon array data from 166 lymphoblastoid cell lines. We developed a novel probabilistic method to infer allele-specific splicing from EST data. The method uses SNPs and alternative mRNA isoforms mapped to EST sequences and models both regulated alternative splicing as well as allele-specific splicing. We have also estimated heritability of splicing and report that a greater proportion of genes show evidence of splicing heritability than show heritability of overall gene expression level. Our results provide an extensive resource that can be used to assess the possible effect on splicing of human polymorphisms in putative splice-regulatory sites. Conclusion: We report a set of genes showing evidence of allele-specific splicing from an integrated analysis of genomic polymorphisms, EST data and exon array

  19. Analysis of DR4 haplotypes in insulin dependent diabetes (IDD)

    International Nuclear Information System (INIS)

    Monos, D.S.; Radka, S.F.; Zmijewski, C.M.; Kamoun, M.

    1986-01-01

    Population studies indicate that HLA-DR4 is implicated in susceptibility to IDD. However, biochemical characterization of the serologically defined DR4 haplotype from normal individuals revealed five DR4 and three DQW3 molecular forms. Hence, in this study, the authors investigated the heterogeneity of the DR4 haplotype, using B-lymphoblastoid cell lines (B-LCL) generated from patients with IDD and seropositive for DR4. Class II molecules, metabolically labeled with 35S-methionine, were immunoprecipitated with monoclonal antibodies specific for DR (L243), DQ (N297), DQW3 (IVD12) or DR and DQ (SG465) and analyzed by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE). The isoelectrofocusing (IEF) conditions employed in this study allow representation only of the DR4 haplotype from either DR3/4 or DR4/4 cell lines. The analysis of six different DR4 haplotypes from seven IDD patients revealed the presence of two DR4 β and two DQW3 β chains. Three of the six DR4 β haplotypes analyzed shared the same DR4 β chain and three others shared a different one. Additionally, five of the six haplotypes shared the same DQW3 β chain and only one carried a different one. Different combinations of the two DR4 and two DQW3 β chains constitute three distinct patterns of DR4 haplotypes. These results suggest the prevalence of one DQW3 β chain in the small sample of IDD patients studied. Studies of a large number of patients should clarify whether IDD is associated with unique variants of DR4 or DQW3 β chains

  20. Population genomic analysis reveals differential evolutionary histories and patterns of diversity across subgenomes and subpopulations of Brassica napus L.

    Directory of Open Access Journals (Sweden)

    Elodie Gazave

    2016-04-01

    Full Text Available The allotetraploid species Brassica napus L. is a global crop of major economic importance, providing canola oil (seed) and vegetables for human consumption and fodder and meal for livestock feed. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to identify and genotype 30,881 SNPs in a diversity panel of 782 B. napus accessions, representing samples of winter and spring growth habits originating from 33 countries across Europe, Asia and America. We detected strong population structure broadly concordant with growth habit and geography, and identified three major genetic groups: spring (SP), winter Europe (WE), and winter Asia (WA). Subpopulation-specific polymorphism patterns suggest enriched genetic diversity within the WA group and a smaller effective breeding population for the SP group compared to WE. Interestingly, the two subgenomes of B. napus appear to have different geographic origins, with phylogenetic analysis placing WE and WA as basal clades for the other subpopulations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differed markedly from the genome-wide average, several of which are suggestive of genomic inversions. The results obtained in this study constitute a valuable resource for worldwide breeding efforts and the genetic dissection and prediction of complex B. napus traits.

  1. Nanoparticles for Site Specific Genome Editing

    Science.gov (United States)

    McNeer, Nicole Ali

    Triplex-forming peptide nucleic acids (PNAs) can be used to coordinate the recombination of short 50-60 bp "donor DNA" fragments into genomic DNA, resulting in site-specific correction of genetic mutations or the introduction of advantageous genetic modifications. Site-specific gene editing in hematopoietic stem and progenitor cells (HSPCs) could result in treatment or cure of inherited disorders of the blood such as beta-thalassemia. Gene editing in HSPCs and differentiated T cells could help combat HIV/AIDS by modifying receptors, such as CCR5, necessary for R5-tropic HIV entry. However, translation of genome modification technologies to clinical practice is limited by challenges in intracellular delivery, especially in difficult-to-transfect hematolymphoid cells. In vivo gene editing could also provide novel treatment for systemic monogenic disorders such as cystic fibrosis, an autosomal recessive disorder caused by mutations in the cystic fibrosis transmembrane conductance regulator. Here, we have engineered biodegradable nanoparticles to deliver oligonucleotides for site-specific genome editing of disease-relevant genes in human cells, with high efficiency, low toxicity, and editing of clinically relevant cell types. We designed nanoparticles to edit the human beta-globin and CCR5 genes in hematopoietic cells. We show that poly(lactic-co-glycolic acid) (PLGA) nanoparticles can deliver PNA and donor DNA for site-specific gene modification in human hematopoietic cells in vitro and in vivo in NOD-scid IL2rgammanull mice. Nanoparticles delivered by tail vein localized to hematopoietic compartments in the spleen and bone marrow of humanized mice, resulting in modification of the beta-globin and CCR5 genes. Modification frequencies ranged from 0.005 to 20% of cells depending on the organ and cell type, without detectable toxicity.
This project developed highly versatile methods for delivery of therapeutics to hematolymphoid cells and hematopoietic stem cells, and will help to

  2. Population-specific haplotype association of the postsynaptic density gene DLG4 with schizophrenia, in family-based association studies.

    Directory of Open Access Journals (Sweden)

    Shabeesh Balan

    Full Text Available The post-synaptic density (PSD) of glutamatergic synapses harbors a multitude of proteins critical for maintaining synaptic dynamics. Alteration of protein expression levels in this matrix is a marked phenomenon of neuropsychiatric disorders including schizophrenia, where cognitive functions are impaired. To investigate the genetic relationship of genes expressed in the PSD with schizophrenia, a family-based association analysis of genetic variants in PSD genes such as DLG4, DLG1, PICK1 and MDM2 was performed, using Japanese samples (124 pedigrees, n = 376 subjects). Results showed a significant association of the rs17203281 variant from the DLG4 gene, with preferential transmission of the C allele (p = 0.02), although significance disappeared after correction for multiple testing. Replication analysis of this variant found no association in a Chinese schizophrenia cohort (293 pedigrees, n = 1163 subjects) or in a Japanese case-control sample (n = 4182 subjects). The DLG4 expression levels in postmortem brain samples from schizophrenia patients showed no significant changes from controls. Interestingly, a five-marker haplotype in DLG4, involving rs2242449, rs17203281, rs390200, rs222853 and rs222837, was enriched in a population-specific manner, where the sequences A-C-C-C-A and G-C-C-C-A accumulated in Japanese (p = 0.0009) and Chinese (p = 0.0007) schizophrenia pedigree samples, respectively. However, this could not be replicated in case-control samples. None of the variants in other examined candidate genes showed any significant association in these samples. The current study highlights a putative role for DLG4 in schizophrenia pathogenesis, evidenced by haplotype association, and warrants further dense screening for variants within these haplotypes.

  3. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus.

    Science.gov (United States)

    Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F Jerry; Glöckner, Frank O; Crowley, Susan P; O'Sullivan, Orla; Cotter, Paul D; Adams, Claire; Dobson, Alan D W; O'Gara, Fergal

    2016-01-01

    Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its

  4. Detection of haplotypes associated with prenatal death in dairy cattle and identification of deleterious mutations in GART, SHBG and SLC37A2.

    Directory of Open Access Journals (Sweden)

    Sébastien Fritz

    Full Text Available The regular decrease of female fertility over time is a major concern in modern dairy cattle industry. Only half of this decrease is explained by indirect response to selection on milk production, suggesting the existence of other factors such as embryonic lethal genetic defects. Genomic regions harboring recessive deleterious mutations were detected in three dairy cattle breeds by identifying frequent haplotypes (>1%) showing a deficit in homozygotes among Illumina Bovine 50k Beadchip haplotyping data from the French genomic selection database (47,878 Holstein, 16,833 Montbéliarde, and 11,466 Normande animals). Thirty-four candidate haplotypes (p<10^-4) including previously reported regions associated with Brachyspina, CVM, HH1, and HH3 in Holstein breed were identified. Haplotype length varied from 1 to 4.8 Mb and frequencies from 1.7 up to 9%. A significant negative effect on calving rate, consistent in heifers and in lactating cows, was observed for 9 of these haplotypes in matings between carrier bulls and daughters of carrier sires, confirming their association with embryonic lethal mutations. Eight regions were further investigated using whole genome sequencing data from heterozygous bull carriers and control animals (45 animals in total). Six strong candidate causative mutations including polymorphisms previously reported in FANCI (Brachyspina), SLC35A3 (CVM), APAF1 (HH1) and three novel mutations with very damaging effect on the protein structure, according to SIFT and Polyphen-2, were detected in GART, SHBG and SLC37A2 genes. In conclusion, this study reveals a yet hidden consequence of the important inbreeding rate observed in intensively selected and specialized cattle breeds. Counter-selection of these mutations and management of matings will have positive consequences on female fertility in dairy cattle.

  5. Detection of Haplotypes Associated with Prenatal Death in Dairy Cattle and Identification of Deleterious Mutations in GART, SHBG and SLC37A2

    Science.gov (United States)

    Fritz, Sébastien; Capitan, Aurelien; Djari, Anis; Rodriguez, Sabrina C.; Barbat, Anne; Baur, Aurélia; Grohs, Cécile; Weiss, Bernard; Boussaha, Mekki; Esquerré, Diane; Klopp, Christophe; Rocha, Dominique; Boichard, Didier

    2013-01-01

    The regular decrease of female fertility over time is a major concern in modern dairy cattle industry. Only half of this decrease is explained by indirect response to selection on milk production, suggesting the existence of other factors such as embryonic lethal genetic defects. Genomic regions harboring recessive deleterious mutations were detected in three dairy cattle breeds by identifying frequent haplotypes (>1%) showing a deficit in homozygotes among Illumina Bovine 50k Beadchip haplotyping data from the French genomic selection database (47,878 Holstein, 16,833 Montbéliarde, and 11,466 Normande animals). Thirty-four candidate haplotypes (p<10−4) including previously reported regions associated with Brachyspina, CVM, HH1, and HH3 in Holstein breed were identified. Haplotype length varied from 1 to 4.8 Mb and frequencies from 1.7 up to 9%. A significant negative effect on calving rate, consistent in heifers and in lactating cows, was observed for 9 of these haplotypes in matings between carrier bulls and daughters of carrier sires, confirming their association with embryonic lethal mutations. Eight regions were further investigated using whole genome sequencing data from heterozygous bull carriers and control animals (45 animals in total). Six strong candidate causative mutations including polymorphisms previously reported in FANCI (Brachyspina), SLC35A3 (CVM), APAF1 (HH1) and three novel mutations with very damaging effect on the protein structure, according to SIFT and Polyphen-2, were detected in GART, SHBG and SLC37A2 genes. In conclusion, this study reveals a yet hidden consequence of the important inbreeding rate observed in intensively selected and specialized cattle breeds. Counter-selection of these mutations and management of matings will have positive consequences on female fertility in dairy cattle. PMID:23762392
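    The homozygote-deficit screen described in this record can be illustrated with a minimal sketch. It assumes random mating (Hardy-Weinberg proportions) and a Poisson approximation for the homozygote count; the abstract does not state how the expected counts were actually derived, so the function names, the 5% haplotype frequency and the test itself are illustrative assumptions rather than the authors' method.

```python
import math


def expected_homozygotes(haplotype_freq: float, n_animals: int) -> float:
    """Expected homozygote count under random mating (assumption: HWE)."""
    return n_animals * haplotype_freq ** 2


def poisson_cdf(observed: int, expected: float) -> float:
    """P(X <= observed) for X ~ Poisson(expected), summed in log space."""
    if expected <= 0:
        return 1.0
    log_terms = (
        -expected + k * math.log(expected) - math.lgamma(k + 1)
        for k in range(observed + 1)
    )
    return min(1.0, sum(math.exp(t) for t in log_terms))


def homozygote_deficit(haplotype_freq: float, n_animals: int, observed: int):
    """Return (expected homozygotes, one-sided p-value for a deficit)."""
    expected = expected_homozygotes(haplotype_freq, n_animals)
    return expected, poisson_cdf(observed, expected)


# Hypothetical haplotype at 5% frequency among the 47,878 Holstein animals,
# with no homozygous animal observed.
expected, p_value = homozygote_deficit(0.05, 47_878, 0)
print(f"expected homozygotes: {expected:.1f}, p = {p_value:.2e}")
```

    A haplotype that is common yet never seen in the homozygous state gives a very small p-value, which is the signal such a screen looks for. A production analysis would condition on pedigree (for example, carrier status of sires and maternal grandsires) instead of assuming random mating.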

  6. A Candidate Trans-acting Modulator of Fetal Hemoglobin Gene Expression in the Arab-Indian Haplotype of Sickle Cell Anemia

    Science.gov (United States)

    Vathipadiekal, Vinod; Farrell, John J.; Wang, Shuai; Edward, Heather L.; Shappell, Heather; Al-Rubaish, A.M.; Al-Muhanna, Fahad; Naserullah, Z.; Alsuliman, A.; Qutub, Hatem Othman; Simkin, Irene; Farrer, Lindsay A.; Jiang, Zhihua; Luo, Hong-Yuan; Huang, Shengwen; Mostoslavsky, Gustavo; Murphy, George J.; Patra, Pradeep K.; Chui, David H.K.; Alsultan, Abdulrahman; Al-Ali, Amein K.; Sebastiani, Paola; Steinberg, Martin H.

    2016-01-01

    Fetal hemoglobin (HbF) levels are higher in the Arab-Indian (AI) β-globin gene haplotype of sickle cell anemia compared with African-origin haplotypes. To study genetic elements that affect HbF expression in the AI haplotype we completed whole genome sequencing in 14 Saudi AI haplotype sickle hemoglobin homozygotes: seven selected for low HbF (8.2±1.3%) and seven selected for high HbF (23.5±2.6%). An intronic single nucleotide polymorphism (SNP) in ANTXR1, an anthrax toxin receptor (chromosome 2p13), was associated with HbF. These results were replicated in two independent Saudi AI haplotype cohorts of 120 and 139 patients, but not in 76 Saudi Benin haplotype, 894 African origin haplotype and 44 Arab Indian haplotype patients of Indian descent, suggesting that this association is effective only in the Saudi AI haplotype background. ANTXR1 variants explained 10% of the HbF variability compared with 8% for BCL11A. These two genes had independent, additive effects on HbF and together explained about 15% of HbF variability in Saudi AI sickle cell anemia patients. ANTXR1 was expressed at mRNA and protein levels in erythroid progenitors derived from induced pluripotent stem cells (iPSCs) and CD34+ cells. As CD34+ cells matured and their HbF decreased, ANTXR1 expression increased; as iPSCs differentiated and their HbF increased, ANTXR1 expression decreased. Along with elements in cis to the HbF genes, ANTXR1 contributes to the variation in HbF in Saudi AI haplotype sickle cell anemia and is the first gene in trans to HBB that is associated with HbF only in carriers of the Saudi AI haplotype. PMID:27501013

  7. Sequence variation of bovine mitochondrial ND-5 between haplotypes of composite and Hereford Breeds of beef cattle

    Directory of Open Access Journals (Sweden)

    SUTARNO

    2002-07-01

    Full Text Available The aims of the study were to investigate polymorphisms in the ND-5 region of bovine mitochondrial DNA in the composite and purebred Hereford herds from the Wokalup selection experiment, to sequence each haplotype, and to compare the sequences between haplotypes and with the published sequence from GenBank. A total of 194 Hereford and 235 composite breed cattle from Wokalup Research Station were used in this study. Mitochondrial DNA was extracted using the Wizard genomic DNA purification system from Promega. The ND-5 fragment of mitochondrial DNA was amplified by PCR and then analyzed by RFLP. Each haplotype was sequenced: PCR products of each haplotype were cloned into pCR II and transformed, colonies were selected, and plasmid DNA was extracted for cycle sequencing. Polymorphisms were found in the ND-5 region of mitochondrial DNA in both breeds by PCR-RFLP analysis. Sequencing analysis confirmed the RFLP data.

  8. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates

    Energy Technology Data Exchange (ETDEWEB)

    Nordberg, Henrik [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Cantor, Michael [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dusheyko, Serge [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Hua, Susan [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Poliakov, Alexander [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Shabalov, Igor [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Smirnova, Tatyana [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Grigoriev, Igor V. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dubchak, Inna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)

    2013-11-12

    The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. In this paper, we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.

  9. ARG-walker: inference of individual specific strengths of meiotic recombination hotspots by population genomics analysis.

    Science.gov (United States)

    Chen, Hao; Yang, Peng; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

    2015-01-01

    Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrating the output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals were detected, a finding corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating individual recombination variation. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related to recombination hotspots.
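    As a small illustration of how the degenerate 13-mer CCNCCNTNNCCNC quoted in this record can be located in sequence data, the sketch below expands the IUPAC code N to any base and scans the forward strand only. The function names and the toy sequence are hypothetical; this is not part of ARG-walker, and the enrichment analysis mentioned in the abstract is not reproduced.

```python
import re

# Minimal IUPAC table: only the symbols needed for this motif.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a degenerate motif; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile(f"(?=({body}))")


def find_motif(sequence: str, motif: str = "CCNCCNTNNCCNC") -> list:
    """Return 0-based start positions of the motif on the forward strand."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Toy sequence (hypothetical) with one embedded instance of the motif.
toy = "AA" + "CCTCCCTAACCAC" + "GG"
print(find_motif(toy))
```

    Counting hits near associated SNPs and comparing against matched background windows would be the next step in an enrichment test; scanning the reverse complement as well is usually needed in practice.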

  10. Patterns of interaction specificity of fungus-growing termites and Termitomyces symbionts in South Africa

    Directory of Open Access Journals (Sweden)

    de Beer Z Wilhelm

    2007-07-01

    Full Text Available Abstract Background Termites of the subfamily Macrotermitinae live in a mutualistic symbiosis with basidiomycete fungi of the genus Termitomyces. Here, we explored interaction specificity in fungus-growing termites using samples from 101 colonies in South Africa and Senegal, belonging to eight species divided over three genera. Knowledge of interaction specificity is important to test the hypothesis that inhabitants (symbionts) are taxonomically less diverse than 'exhabitants' (hosts) and to test the hypothesis that transmission mode is an important determinant for interaction specificity. Results Analysis of Molecular Variance among symbiont ITS sequences across termite hosts at three hierarchical levels showed that 47% of the variation occurred between genera, 18% between species, and the remaining 35% between colonies within species. Different patterns of specificity were evident. High mutual specificity was found for the single Macrotermes species studied, as M. natalensis was associated with a single unique fungal haplotype. The three species of the genus Odontotermes showed low symbiont specificity: they were all associated with a genetically diverse set of fungal symbionts, but their fungal symbionts showed some host specificity, as none of the fungal haplotypes were shared between the studied Odontotermes species. Finally, bilaterally low specificity was found for the four tentatively recognized species of the genus Microtermes, which shared and apparently freely exchanged a common pool of divergent fungal symbionts. Conclusion Interaction specificity was high at the genus level and generally much lower at the species level. A comparison of the observed diversity among fungal symbionts with the diversity among termite hosts indicated that the fungal symbiont does not follow the general pattern of an endosymbiont, as we found either similar diversity at both sides or higher diversity in the symbiont. Our results further challenge the

  11. VNTR alleles associated with the α-globin locus are haplotype and population related

    Energy Technology Data Exchange (ETDEWEB)

    Martinson, J.J.; Clegg, J.B.; Boyce, A.J. [Univ. of Oxford (United Kingdom)

    1994-09-01

    The human α-globin complex contains several polymorphic restriction-enzyme sites (i.e., RFLPs) linked to form haplotypes and is flanked by two hypervariable VNTR loci, the 5′ hypervariable region (HVR) and the more highly polymorphic 3′HVR. Using a combination of RFLP analysis and PCR, the authors have characterized the 5′HVR and 3′HVR alleles associated with the α-globin haplotypes of 133 chromosomes, and they here show that specific α-globin haplotypes are each associated with discrete subsets of the alleles observed at these two VNTR loci. This statistically highly significant association is observed over a region spanning approximately 100 kb. With the exception of closely related haplotypes, different haplotypes do not share identically sized 3′HVR alleles. Earlier studies have shown that α-globin haplotype distributions differ between populations; the current findings also reveal extensive population substructure in the repertoire of α-globin VNTRs. If similar features are characteristic of other VNTR loci, this will have important implications for forensic and anthropological studies. 42 refs., 5 figs., 5 tabs.

  12. Vast diversity of prokaryotic virus genomes encoding double jelly-roll major capsid proteins uncovered by genomic and metagenomic sequence analysis.

    Science.gov (United States)

    Yutin, Natalya; Bäckström, Disa; Ettema, Thijs J G; Krupovic, Mart; Koonin, Eugene V

    2018-04-10

    Analysis of metagenomic sequences has become the principal approach for the study of the diversity of viruses. Many recent, extensive metagenomic studies on several classes of viruses have dramatically expanded the visible part of the virosphere, showing that previously undetected viruses, or those that have been considered rare, actually are important components of the global virome. We investigated the provenance of viruses related to tail-less bacteriophages of the family Tectiviridae by searching genomic and metagenomics sequence databases for distant homologs of the tectivirus-like Double Jelly-Roll major capsid proteins (DJR MCP). These searches resulted in the identification of numerous genomes of virus-like elements that are similar in size to tectiviruses (10-15 kilobases) and have diverse gene compositions. By comparison of the gene repertoires, the DJR MCP-encoding genomes were classified into 6 distinct groups that can be predicted to differ in reproduction strategies and host ranges. Only the DJR MCP gene that is present by design is shared by all these genomes, and most also encode a predicted DNA-packaging ATPase; the rest of the genes are present only in subgroups of this unexpectedly diverse collection of DJR MCP-encoding genomes. Only a minority encode a DNA polymerase which is a hallmark of the family Tectiviridae and the putative family "Autolykiviridae". Notably, one of the identified putative DJR MCP viruses encodes a homolog of Cas1 endonuclease, the integrase involved in CRISPR-Cas adaptation and integration of transposon-like elements called casposons. This is the first detected occurrence of Cas1 in a virus. Many of the identified elements are individual contigs flanked by inverted or direct repeats and appear to represent complete, extrachromosomal viral genomes, whereas others are flanked by bacterial genes and thus can be considered as proviruses. These contigs come from metagenomes of widely different environments, some dominated by

  13. Diversity of Plasmodium falciparum chloroquine resistance transporter (pfcrt) exon 2 haplotypes in the Pacific from 1959 to 1979.

    Directory of Open Access Journals (Sweden)

    Chim W Chan

    Full Text Available Nearly one million deaths are attributed to malaria every year. Recent reports of multi-drug treatment failure of falciparum malaria underscore the need to understand the molecular basis of drug resistance. Multiple mutations in the Plasmodium falciparum chloroquine resistance transporter (pfcrt) are involved in chloroquine resistance, but the evolution of complex haplotypes is not yet well understood. Using over 4,500 archival human serum specimens collected from 19 Pacific populations between 1959 and 1979, the period including and just prior to the appearance of chloroquine treatment failure in the Pacific, we PCR-amplified and sequenced a portion of the pfcrt exon 2 from 771 P. falciparum-infected individuals to explore the spatial and temporal variation in falciparum malaria prevalence and the evolution of chloroquine resistance. In the Pacific, the prevalence of P. falciparum varied considerably across ecological zones. On the island of New Guinea, the decreases in prevalence of P. falciparum in coastal, high-transmission areas over time were contrasted by the increase in prevalence during the same period in the highlands, where transmission was intermittent. We found 78 unique pfcrt haplotypes consisting of 34 amino acid substitutions and 28 synonymous mutations. More importantly, two pfcrt mutations (N75D and K76T) implicated in chloroquine resistance were present in parasites from New Hebrides (now Vanuatu) eight years before the first report of treatment failure. Our results also revealed unexpectedly high levels of genetic diversity in pfcrt exon 2 prior to the historical chloroquine resistance selective sweep, particularly in areas where disease burden was relatively low. In the Pacific, parasite genetic isolation, as well as host acquired immune status and genetic resistance to malaria, were important contributors to the evolution of chloroquine resistance in P. falciparum.

  14. A genome-wide scan for selection signatures in Nellore cattle.

    Science.gov (United States)

    Somavilla, A L; Sonstegard, T S; Higa, R H; Rosa, A N; Siqueira, F; Silva, L O C; Torres Júnior, R A A; Coutinho, L L; Mudadu, M A; Alencar, M M; Regitano, L C A

    2014-12-01

    Brazilian Nellore cattle (Bos indicus) have been selected for growth traits for more than four decades. In recent years, reproductive and meat quality traits have become more important because of increasing consumption, exports and consumer demand. The identification of genome regions altered by artificial selection can potentially permit a better understanding of the biology of specific phenotypes that are useful for the development of tools designed to increase selection efficiency. Therefore, the aims of this study were to detect evidence of recent selection signatures in Nellore cattle using extended haplotype homozygosity methodology and BovineHD marker genotypes (>777,000 single nucleotide polymorphisms) as well as to identify corresponding genes underlying these signals. Thirty-one significant regions (P meat quality, fatty acid profiles and immunity. In addition, 545 genes were identified in regions harboring selection signatures. Within this group, 58 genes were associated with growth, muscle and adipose tissue metabolism, reproductive traits or the immune system. Using relative extended haplotype homozygosity to analyze high-density single nucleotide polymorphism marker data allowed for the identification of regions potentially under artificial selection pressure in the Nellore genome, which might be used to better understand autozygosity and the effects of selection on the Nellore genome. © 2014 Stichting International Foundation for Animal Genetics.
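    For readers unfamiliar with the statistic behind this record, the following minimal sketch computes plain extended haplotype homozygosity (EHH): the probability that two haplotypes drawn at random from a set carrying the same core allele are identical over the interval from the core marker to a more distant marker. The study itself reports relative EHH on BovineHD genotypes, which additionally compares each core haplotype against the others at the locus; that normalization is not shown, and the haplotype strings below are made-up toy data.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, upto):
    """EHH over markers core..upto (inclusive) for haplotypes sharing a core allele.

    haplotypes: equal-length sequences of alleles (e.g. strings of '0'/'1').
    Returns sum over identical-haplotype classes of C(n_i, 2), divided by C(n, 2).
    """
    n = len(haplotypes)
    if n < 2:
        return 0.0
    lo, hi = sorted((core, upto))
    classes = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    return sum(comb(count, 2) for count in classes.values()) / comb(n, 2)


# Toy phased haplotypes (hypothetical), all carrying allele '0' at core marker 0.
haps = ["0101", "0101", "0111", "0100"]
decay = [ehh(haps, core=0, upto=j) for j in range(len(haps[0]))]
print(decay)
```

    EHH starts at 1 at the core marker and decays with distance as recombination and mutation break up the shared haplotype; unusually slow decay for a common core haplotype is what is read as a candidate selection signature.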

  15. Insular Celtic population structure and genomic footprints of migration.

    Directory of Open Access Journals (Sweden)

    Ross P Byrne

    2018-01-01

    Full Text Available Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  16. Phenotypic Diversity of Sickle Cell Disease in Patients with a Double Heterozygosity for Hb S and Hb D-Punjab.

    Science.gov (United States)

    Torres, Lidiane S; Okumura, Jéssika V; Belini-Júnior, Édis; Oliveira, Renan G; Nascimento, Patrícia P; Silva, Danilo G H; Lobo, Clarisse L C; Oliani, Sonia M; Bonini-Domingos, Claudia R

    2016-09-01

    Phenotypic heterogeneity for sickle cell disease is associated with several genetic factors such as genotype for sickle cell disease, β-globin gene cluster haplotypes and Hb F levels. The coinheritance of Hb S (HBB: c.20A > T) and Hb D-Punjab (HBB: c.364G > C) results in a double heterozygosity, which constitutes one of the genotypic causes of sickle cell disease. This study aimed to assess the phenotypic diversity of sickle cell disease presented by carriers of the Hb S/Hb D-Punjab genotype and the Bantu [- + - - - -] haplotype. We evaluated medical records from 12 patients with sickle cell disease whose Hb S/Hb D-Punjab genotype and Bantu haplotype were confirmed by molecular analysis. Hb S and Hb D-Punjab levels were quantified by chromatographic analysis. Mean concentrations of Hb S and Hb D-Punjab were 44.8 ± 2.3% and 43.3 ± 1.8%, respectively. Painful crises were present in eight (66.7%) patients evaluated, representing the most common clinical event. Acute chest syndrome (ACS) was the second most prevalent manifestation, occurring in two individuals (16.7%). Three patients were asymptomatic, while another two exhibited greater diversity of severe clinical manifestations. The medical records analyzed here reported a significant clinical diversity in sickle cell disease, ranging from the absence of symptoms to wide phenotypic variety. The sickle cell disease genotype, Bantu haplotype and hemoglobin (Hb) levels did not influence the clinical diversity. Thus, we concluded that the phenotypic variation in sickle cell disease was present within a specific genotype for the disease regardless of the β-globin gene cluster haplotypes.

  17. Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals.

    Science.gov (United States)

    Kaneko-Ishino, Tomoko; Ishino, Fumitoshi

    2015-01-01

    Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes occurred in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is "mammalian-specific genomic functions", a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also for mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of "mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons", based on the finding that LTR retrotransposons served as a critical driving force in mammalian evolution by generating mammalian-specific genes.

  18. KIR And HLA Haplotype Analysis in a Family Lacking The KIR 2DL1-2DP1 Genes

    Directory of Open Access Journals (Sweden)

    Vojvodić Svetlana

    2015-06-01

    The killer cell immunoglobulin-like receptor (KIR) gene cluster exhibits extensive allelic and haplotypic diversity that is observed as presence/absence of genes, resulting in expansion and contraction of KIR haplotypes, and by allelic variation of individual KIR genes. We report a case of KIR pseudogene 2DP1 and 2DL1 gene absence in members of one family with the children suffering from acute myelogenous leukemia (AML). Killer cell immunoglobulin-like receptor low resolution genotyping was performed by the polymerase chain reaction (PCR)-sequence-specific primers (SSP)/sequence-specific oligonucleotide (SSO) method and haplotype assignment was done by gene content analysis. Both parents and the maternal grandfather shared the same Cen-B2 KIR haplotype, containing KIR 3DL3, -2DS2, -2DL2 and -3DP1 genes. The second haplotype in the KIR genotype of the mother and grandfather was Tel-A1 with KIR 2DL4 (normal and deleted variant), -3DL1, -22 bp deletion variant of the 2DS4 gene and -3DL2, while the second haplotype in the KIR genotype of the father was Tel-B1 with 2DL4 (normal variant), -3DS1, -2DL5, -2DS5, -2DS1 and 3DL2 genes. Haplotype analysis in all three offspring revealed that the children inherited the Cen-B2 haplotype with the same gene content but two of the children inherited a deleted variant of the 2DL4 gene, while the third child inherited a normal one. The second haplotype of all three offspring contained KIR 2DL4, -2DL5, -2DS1, -2DS4 (del 22bp variant), -2DS5, -3DL1 and -3DL2 genes, which was the basis of the assumption that there is a hybrid haplotype and that the present 3DL1 gene is a variant of the 3DS1 gene. Due to consanguinity among the ancestors, the results of KIR segregation analysis showed the existence of a very rare KIR genotype in the offspring. The family who is the subject of this case is even more interesting because the father was 10/10 human leukocyte antigen (HLA)-matched to his daughter, all members of the family have

  19. A Back Migration from Asia to Sub-Saharan Africa Is Supported by High-Resolution Analysis of Human Y-Chromosome Haplotypes

    Science.gov (United States)

    Cruciani, Fulvio; Santolamazza, Piero; Shen, Peidong; Macaulay, Vincent; Moral, Pedro; Olckers, Antonel; Modiano, David; Holmes, Susan; Destro-Bisol, Giovanni; Coia, Valentina; Wallace, Douglas C.; Oefner, Peter J.; Torroni, Antonio; Cavalli-Sforza, L. Luca; Scozzari, Rosaria; Underhill, Peter A.

    2002-01-01

    The variation of 77 biallelic sites located in the nonrecombining portion of the Y chromosome was examined in 608 male subjects from 22 African populations. This survey revealed a total of 37 binary haplotypes, which were combined with microsatellite polymorphism data to evaluate internal diversities and to estimate coalescence ages of the binary haplotypes. The majority of binary haplotypes showed a nonuniform distribution across the continent. Analysis of molecular variance detected a high level of interpopulation diversity (ΦST=0.342), which appears to be partially related to the geography (ΦCT=0.230). In sub-Saharan Africa, the recent spread of a set of haplotypes partially erased pre-existing diversity, but a high level of population (ΦST=0.332) and geographic (ΦCT=0.179) structuring persists. Correspondence analysis shows that three main clusters of populations can be identified: northern, eastern, and sub-Saharan Africans. Among the latter, the Khoisan, the Pygmies, and the northern Cameroonians are clearly distinct from a tight cluster formed by the Niger-Congo–speaking populations from western, central western, and southern Africa. Phylogeographic analyses suggest that a large component of the present Khoisan gene pool is eastern African in origin and that Asia was the source of a back migration to sub-Saharan Africa. Haplogroup IX Y chromosomes appear to have been involved in such a migration, the traces of which can now be observed mostly in northern Cameroon. PMID:11910562

  20. Quantifying Temporal Genomic Erosion in Endangered Species.

    Science.gov (United States)

    Díez-Del-Molino, David; Sánchez-Barreiro, Fatima; Barnes, Ian; Gilbert, M Thomas P; Dalén, Love

    2018-03-01

    Many species have undergone dramatic population size declines over the past centuries. Although stochastic genetic processes during and after such declines are thought to elevate the risk of extinction, comparative analyses of genomic data from several endangered species suggest little concordance between genome-wide diversity and current population sizes. This is likely because species-specific life-history traits and ancient bottlenecks overshadow the genetic effect of recent demographic declines. Therefore, we advocate that temporal sampling of genomic data provides a more accurate approach to quantify genetic threats in endangered species. Specifically, genomic data from predecline museum specimens will provide valuable baseline data that enable accurate estimation of recent decreases in genome-wide diversity, increases in inbreeding levels, and accumulation of deleterious genetic variation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Haplotype Diversity and Reconstruction of Ancestral Haplotype Associated with the c.35delG Mutation in the GJB2 (Cx26) Gene among the Volgo-Ural Populations of Russia.

    Science.gov (United States)

    Dzhemileva, L U; Posukh, O L; Barashkov, N A; Fedorova, S A; Teryutin, F M; Akhmetova, V L; Khidiyatova, I M; Khusainova, R I; Lobov, S L; Khusnutdinova, E K

    2011-07-01

    The mutations in the GJB2 (Cx26) gene make the biggest contribution to hereditary hearing loss. The spectrum and prevalence of the GJB2 gene mutations are specific to populations of different ethnic origins. For several GJB2 mutations, their origin from appropriate ancestral founder chromosome was shown, approximate estimations of "age" obtained, and presumable regions of their origin outlined. This work presents the results of the carrier frequencies' analysis of the major (for European countries) mutation c.35delG (GJB2 gene) among 2,308 healthy individuals from 18 Eurasian populations of different ethnic origins: Bashkirs, Tatars, Chuvashs, Udmurts, Komi-Permyaks, Mordvins, and Russians (the Volga-Ural region of Russia); Byelorussians, Ukrainians (Eastern Europe); Abkhazians, Avars, Cherkessians, and Ingushes (Caucasus); Kazakhs, Uzbeks, Uighurs (Central Asia); and Yakuts, and Altaians (Siberia). The prevalence of the c.35delG mutation in the studied ethnic groups may act as additional evidence for a prospective role of the founder effect in the origin and distribution of this mutation in various populations worldwide. The haplotype analysis of chromosomes with the c.35delG mutation in patients with nonsyndromic sensorineural hearing loss (N=112) and in population samples (N=358) permitted the reconstruction of an ancestral haplotype with this mutation, established the common origin of the majority of the studied mutant chromosomes, and provided the estimated time of the c.35delG mutation carriers expansion (11,800 years) on the territory of the Volga-Ural region.

  2. Quantitative trait loci and the relevance of phased haplotypes

    DEFF Research Database (Denmark)

    Gregersen, Vivi Raundahl

    Genetic control of different production traits and diseases within livestock has been of great interest since domestication. SNPs have greatly facilitated the use of QTL studies in the search of genomic regions affecting different phenotypes. The studies have been conducted to identify regions...... underlying genetic control both as traditional linkage studies relying on genetic maps and as GWAS where an approach of phasing haplotypes within the QTL have been conducted to validate the regions. Overall, regions of interest have been identified for chronic pleuritis and osteochondrosis in addition to meat...... quality and boar taint in pigs, and for improved cheese production within cows...

  3. Genetic and genomic diversity studies of Acacia symbionts in Senegal reveal new species of Mesorhizobium with a putative geographical pattern.

    Science.gov (United States)

    Diouf, Fatou; Diouf, Diegane; Klonowska, Agnieszka; Le Queré, Antoine; Bakhoum, Niokhor; Fall, Dioumacor; Neyra, Marc; Parrinello, Hugues; Diouf, Mayecor; Ndoye, Ibrahima; Moulin, Lionel

    2015-01-01

    Acacia senegal (L) Willd. and Acacia seyal Del. are highly nitrogen-fixing and moderately salt tolerant species. In this study we focused on the genetic and genomic diversity of Acacia mesorhizobia symbionts from diverse origins in Senegal and investigated possible correlations between the genetic diversity of the strains, their soil of origin, and their tolerance to salinity. We first performed a multi-locus sequence analysis on five marker gene fragments on a collection of 47 mesorhizobia strains of A. senegal and A. seyal from 8 localities. Most of the strains (60%) clustered with the M. plurifarium type strain ORS 1032T, while the others form four new clades (MSP1 to MSP4). We sequenced and assembled seven draft genomes: four in the M. plurifarium clade (ORS3356, ORS3365, STM8773 and ORS1032T), and one each in MSP1 (STM8789), MSP2 (ORS3359) and MSP3 (ORS3324). The average nucleotide identities between these genomes together with the MLSA analysis reveal three new species of Mesorhizobium. A great variability of salt tolerance was found among the strains with a lack of correlation between the genetic diversity of mesorhizobia, their salt tolerance and the soil sample characteristics. A putative geographical pattern of A. senegal symbionts between the dryland north part and the center of Senegal was found, reflecting adaptations to specific local conditions such as the water regime. However, the presence of salt does not seem to be an important structuring factor of Mesorhizobium species.

  4. Haplotype and genetic relationship of 27 Y-STR loci in Han population of Chaoshan area of China

    Directory of Open Access Journals (Sweden)

    Qing-hua TIAN

    2017-04-01

    Objective  To investigate the genetic polymorphisms of 27 Y-chromosomal short tandem repeat (Y-STR) loci included in the Yfiler® Plus kit in the Han population of the Chaoshan area, to explore the population genetic relationships, and to evaluate the kit's application value in forensic medicine. Methods  By typing 795 unrelated Chaoshan Han males with the Yfiler® Plus kit, haplotype frequencies and population genetics parameters of the 27 Y-STR loci were statistically analyzed and compared with available data of other populations from different races and regions to analyze the genetic distance and clustering relation of the Chaoshan Han population. Results  Seven hundred and eighty-seven different haplotypes were observed in 795 unrelated male individuals, of which 779 haplotypes were unique, and 8 haplotypes occurred twice. The haplotype diversity (HD) was 0.999975 with a discriminative capacity (DC) of 98.99%. The gene diversity (GD) at the 27 Y-STR loci ranged from 0.3637 (DYS391) to 0.9559 (DYS385a/b). Compared with Asian reference populations, the genetic distance (Rst) between Chaoshan Han and Guangdong Han was the smallest (0.0036), while it was relatively larger between Chaoshan Han and the Gansu Tibetan population (0.0935). The multi-dimensional scaling (MDS) plot based on Rst values was similar to the results of clustering analysis. Conclusion  Multiplex detection of the 27 Y-STR loci reveals a highly polymorphic genetic distribution in the Chaoshan Han population, which demonstrates the important significance of the Yfiler® Plus kit for establishing a Y-STR database, studying population genetics, and for good practice in forensic medicine. DOI: 10.11855/j.issn.0577-7402.2017.03.08
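
    The haplotype counts given in the Results are enough to recompute the two summary statistics. The sketch below assumes Nei's unbiased estimator for haplotype diversity, HD = n/(n-1) × (1 − Σp_i²), and takes discriminative capacity as the number of distinct haplotypes divided by the number of individuals. The abstract does not state which formulas the authors used, so both are assumptions, and the function names are illustrative.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from per-haplotype counts."""
    n = sum(counts)
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_p2)


def discrimination_capacity(counts):
    """Distinct haplotypes divided by the number of sampled individuals."""
    return len(counts) / sum(counts)


# Counts taken from the abstract: 779 haplotypes seen once, 8 seen twice.
counts = [1] * 779 + [2] * 8
hd = haplotype_diversity(counts)
dc = discrimination_capacity(counts)
print(f"n = {sum(counts)}, HD = {hd:.6f}, DC = {dc:.2%}")
```

    Under these assumptions the counts sum to the 795 sampled males, and the computed values round to the reported HD of 0.999975 and DC of 98.99%.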

  5. Genetic diversity analysis of two commercial breeds of pigs using genomic and pedigree data.

    Science.gov (United States)

    Zanella, Ricardo; Peixoto, Jane O; Cardoso, Fernando F; Cardoso, Leandro L; Biegelmeyer, Patrícia; Cantão, Maurício E; Otaviano, Antonio; Freitas, Marcelo S; Caetano, Alexandre R; Ledur, Mônica C

    2016-03-30

    Genetic improvement in livestock populations can be achieved without significantly affecting genetic diversity if mating systems and selection decisions take genetic relationships among individuals into consideration. The objective of this study was to examine the genetic diversity of two commercial breeds of pigs. Genotypes from 1168 Landrace (LA) and 1094 Large White (LW) animals from a commercial breeding program in Brazil were obtained using the Illumina PorcineSNP60 Beadchip. Inbreeding estimates based on pedigree (F_x) and genomic information using runs of homozygosity (F_ROH) and the single nucleotide polymorphisms (SNP) by SNP inbreeding coefficient (F_SNP) were obtained. Linkage disequilibrium (LD), correlation of linkage phase (r) and effective population size (N_e) were also estimated. Estimates of inbreeding obtained with pedigree information were lower than those obtained with genomic data in both breeds. We observed that the extent of LD was slightly larger at shorter distances between SNPs in the LW population than in the LA population, which indicates that the LW population was derived from a smaller N_e. Estimates of N_e based on genomic data were equal to 53 and 40 for the current populations of LA and LW, respectively. The correlation of linkage phase between the two breeds was equal to 0.77 at distances up to 50 kb, which suggests that genome-wide association and selection should be performed within breed. Although selection intensities have been stronger in the LA breed than in the LW breed, levels of genomic and pedigree inbreeding were lower for the LA than for the LW breed. The use of genomic data to evaluate population diversity in livestock animals can provide new and more precise insights about the effects of intense selection for production traits. Resulting information and knowledge can be used to effectively increase response to selection by appropriately managing the rate of inbreeding, minimizing negative effects of inbreeding
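
    The record above reports inbreeding from runs of homozygosity without giving the calculation. A minimal sketch, assuming the widely used definition in which the ROH-based coefficient is the summed length of an animal's ROH segments divided by the length of the genome covered by markers, is shown below. The minimum segment length, the genome length and the segment coordinates are hypothetical values chosen for illustration, not figures from the study.

```python
def f_roh(roh_segments, genome_length_bp, min_length_bp=1_000_000):
    """Fraction of the genome lying in runs of homozygosity.

    `roh_segments` is a list of (start_bp, end_bp) tuples for one animal.
    Segments shorter than `min_length_bp` are ignored (assumed threshold).
    """
    total = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return total / genome_length_bp


# Hypothetical animal: two long ROH and one short run that is filtered out.
segments = [(1_000_000, 6_000_000), (20_000_000, 22_500_000), (40_000_000, 40_400_000)]
value = f_roh(segments, genome_length_bp=100_000_000)
print(round(value, 3))
```

    In practice the segments would come from an ROH-calling tool, and the choice of minimum length and marker-density settings affects the estimate.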

  6. Characterizing neutral genomic diversity and selection signatures in indigenous populations of Moroccan goats (Capra hircus) using WGS data

    Directory of Open Access Journals (Sweden)

    Badr Benjelloun

    2015-04-01

    Since the time of their domestication, goats (Capra hircus) have evolved in a large variety of locally adapted populations in response to different human and environmental pressures. In the present era, many indigenous populations are threatened with extinction due to their substitution by cosmopolitan breeds, while they might represent highly valuable genomic resources. It is thus crucial to characterize the neutral and adaptive genetic diversity of indigenous populations. A fine characterization of whole genome variation in farm animals is now possible by using new sequencing technologies. We sequenced the complete genome at 12X coverage of 44 goats geographically representative of the three phenotypically distinct indigenous populations in Morocco. The study of mitochondrial genomes showed a high diversity exclusively restricted to the haplogroup A. The 44 nuclear genomes showed a very high diversity (24 million variants) associated with low linkage disequilibrium. The overall genetic diversity was weakly structured according to geography and phenotypes. When looking for signals of positive selection in each population we identified many candidate genes, several of which gave insights into the metabolic pathways or biological processes involved in the adaptation to local conditions (e.g. panting in warm/desert conditions). This study highlights the interest of WGS data to characterize livestock genomic diversity. It illustrates the valuable genetic richness present in indigenous populations that have to be sustainably managed and may represent valuable genetic resources for the long-term preservation of the species.

  7. LifeStyle-Specific-Islands (LiSSI): Integrated Bioinformatics Platform for Genomic Island Analysis

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Rottger, Richard; Hauschild, Anne-Christin

    2017-01-01

    Distinct bacteria are able to cope with highly diverse lifestyles; for instance, they can be free living or host-associated. Thus, these organisms must possess a large and varied genomic arsenal to withstand different environmental conditions. To facilitate the identification of genomic features ...

  8. The family Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins

    OpenAIRE

    Dietzgen, Ralf G.; Kondo, Hideki; Goodin, Michael M.; Kurath, Gael; Vasilakis, Nikos

    2016-01-01

    The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the char...

  9. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabien; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2012-03-13

    The class of Dothideomycetes is one of the largest and most diverse groups of fungi. Many are plant pathogens and pose a serious threat to agricultural crops grown for biofuel, food or feed. Most Dothideomycetes have only a single host and related species can have very diverse host plants. Eighteen genomes of Dothideomycetes have currently been sequenced by the Joint Genome Institute and other sequencing centers. Here we describe the results of comparative analyses of the fungi in this group.

  10. Ecology and genomics of Bacillus subtilis.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2008-06-01

    Bacillus subtilis is a remarkably diverse bacterial species that is capable of growth within many environments. Recent microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genomic diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. The goal of identifying ecologically adaptive genes could soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about the ecology and evolution of this species.

  11. Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations

    Directory of Open Access Journals (Sweden)

    Omberg Larsson

    2012-06-01

    Abstract Background Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. Results Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. Conclusions By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought.

  12. HLA-G Haplotypes Are Differentially Associated with Asthmatic Features

    Directory of Open Access Journals (Sweden)

    Camille Ribeyre

    2018-02-01

    Human leukocyte antigen (HLA)-G, an HLA class Ib molecule, interacts with receptors on lymphocytes such as T cells, B cells, and natural killer cells to influence immune responses. Unlike classical HLA molecules, HLA-G expression is not found on all somatic cells, but restricted to tissue sites, including human bronchial epithelium cells (HBEC). Individual variation in HLA-G expression is linked to its genetic polymorphism and has been associated with many pathological situations such as asthma, which is characterized by epithelium abnormalities and inflammatory cell activation. Studies reported both higher and equivalent soluble HLA-G (sHLA-G) expression in different cohorts of asthmatic patients. In particular, we recently described impaired local expression of HLA-G and abnormal profiles for alternatively spliced isoforms in HBEC from asthmatic patients. sHLA-G dosage is challenging because of its many levels of polymorphism (dimerization, association with β2-microglobulin, and alternative splicing); thus many clinical studies focused on HLA-G single-nucleotide polymorphisms as predictive biomarkers, but few analyzed HLA-G haplotypes. Here, we aimed to characterize HLA-G haplotypes and describe their association with asthmatic clinical features and sHLA-G peripheral expression and to describe variations in transcription factor (TF) binding sites and alternative splicing sites. HLA-G haplotypes were differentially distributed in 330 healthy and 580 asthmatic individuals. Furthermore, HLA-G haplotypes were associated with asthmatic clinical features. However, we did not confirm an association between sHLA-G and genetic, biological, or clinical parameters. HLA-G haplotypes were phylogenetically split into distinct groups, with each group displaying particular variations in TF binding or RNA splicing sites that could reflect differential HLA-G qualitative or quantitative expression, with tissue-dependent specificities. Our results, based on a

  13. Genome sequence and genetic diversity of European ash trees

    DEFF Research Database (Denmark)

    Sollars, Elizabeth S A; Harper, Andrea L; Kelly, Laura J

    2017-01-01

    -heterozygosity Fraxinus excelsior tree from Gloucestershire, UK, annotating 38,852 protein-coding genes of which 25% appear ash specific when compared with the genomes of ten other plant species. Analyses of paralogous genes suggest a whole-genome duplication shared with olive (Olea europaea, Oleaceae). We also re...

  14. Factor IX gene haplotypes in Amerindians.

    Science.gov (United States)

    Franco, R F; Araújo, A G; Zago, M A; Guerreiro, J F; Figueiredo, M S

    1997-02-01

    We have determined the haplotypes of the factor IX gene for 95 Indians from 5 Brazilian Amazon tribes: Wayampí, Wayana-Apalaí, Kayapó, Arára, and Yanomámi. Eight polymorphisms linked to the factor IX gene were investigated: MseI (at 5', nt -698), BamHI (at 5', nt -561), DdeI (intron 1), BamHI (intron 2), XmnI (intron 3), TaqI (intron 4), MspI (intron 4), and HhaI (at 3', approximately 8 kb). The results of the haplotype distribution and the allele frequencies for each of the factor IX gene polymorphisms in Amerindians were similar to the results reported for Asian populations but differed from results for other ethnic groups. Only five haplotypes were identified within the entire Amerindian study population, and the haplotype distribution was significantly different among the five tribes, with one (Arára) to four (Wayampí) haplotypes being found per tribe. These findings indicate a significant heterogeneity among the Indian tribes and contrast with the homogeneous distribution of the beta-globin gene cluster haplotypes but agree with our recent findings on the distribution of alpha-globin gene cluster haplotypes and the allele frequencies for six VNTRs in the same Amerindian tribes. Our data represent the first study of factor IX-associated polymorphisms in Amerindian populations and emphasize the applicability of these genetic markers for population and human evolution studies.

  15. Genome‐scale diversity and niche adaptation analysis of Lactococcus lactis by comparative genome hybridization using multi‐strain arrays

    Science.gov (United States)

    Siezen, Roland J.; Bayjanov, Jumamurat R.; Felis, Giovanna E.; van der Sijde, Marijke R.; Starrenburg, Marjo; Molenaar, Douwe; Wels, Michiel; van Hijum, Sacha A. F. T.; van Hylckama Vlieg, Johan E. T.

    2011-01-01

    Summary Lactococcus lactis produces lactic acid and is widely used in the manufacturing of various fermented dairy products. However, the species is also frequently isolated from non‐dairy niches, such as fermented plant material. Recently, these non‐dairy strains have gained increasing interest, as they have been described to possess flavour‐forming activities that are rarely found in dairy isolates and have diverse metabolic properties. We performed an extensive whole‐genome diversity analysis on 39 L. lactis strains, isolated from dairy and plant sources. Comparative genome hybridization analysis with multi‐strain microarrays was used to assess presence or absence of genes and gene clusters in these strains, relative to all L. lactis sequences in public databases, whereby chromosomal and plasmid‐encoded genes were computationally analysed separately. Nearly 3900 chromosomal orthologous groups (chrOGs) were defined on the basis of four sequenced chromosomes of L. lactis strains (IL1403, KF147, SK11, MG1363). Of these, 1268 chrOGs are present in at least 35 strains and represent the presently known core genome of L. lactis, and 72 chrOGs appear to be unique for L. lactis. Nearly 600 and 400 chrOGs were found to be specific for either the subspecies lactis or subspecies cremoris, respectively. Strain variability was found in presence or absence of gene clusters related to growth on plant substrates, such as genes involved in the consumption of arabinose, xylan, α‐galactosides and galacturonate. Further niche‐specific differences were found in gene clusters for exopolysaccharides biosynthesis, stress response (iron transport, osmotolerance) and bacterial defence mechanisms (nisin biosynthesis). Strain variability of functions encoded on known plasmids included proteolysis, lactose fermentation, citrate uptake, metal ion resistance and exopolysaccharides biosynthesis. The present study supports the view of L. lactis as a species with a very flexible

  16. Differentiation analysis for estimating individual ancestry from the Tibetan Plateau by an archaic altitude adaptation EPAS1 haplotype among East Asian populations.

    Science.gov (United States)

    Jiang, Li; Peng, Jianxiong; Huang, Meisha; Liu, Jing; Wang, Ling; Ma, Quan; Zhao, Hui; Yang, Xin; Ji, Anquan; Li, Caixia

    2018-02-10

    Tibetans have adapted to the extreme environment of high altitude for hundreds of generations. A highly differentiated 5-SNP (Single Nucleotide Polymorphism) haplotype motif (AGGAA) on a hypoxic pathway gene, EPAS1, is observed in Tibetans and lowlanders. To evaluate the potential usage of the 5-SNP haplotype in ancestry inference for Tibetan or Tibetan-related populations, we analyzed this haplotype in 1053 individuals of 12 Chinese populations residing on the Tibetan Plateau, peripheral regions of Tibet, and plain regions. These data were integrated with the genotypes from the 1000 Genomes populations and populations in a previously reported paper for population structure analyses. We found that populations representing highland and lowland groups have different dominant ancestry components. The core Denisovan haplotype (AGGAA) was observed at a frequency of 72.32% in the Tibetan Plateau, with a frequency range from 9.48 to 21.05% in the peripheral regions and Tibetan Plateau carried the archaic haplotype, while < 5% of the Chinese Han people carried the haplotype. Our findings indicate that the 5-SNP haplotype has a special distribution pattern in populations of Tibet and peripheral regions and could be integrated into AISNP (Ancestry Informative Single Nucleotide Polymorphism) panels to enhance ancestry resolution.

  17. Genetic diversity and trait genomic prediction in a pea diversity panel.

    Science.gov (United States)

    Burstin, Judith; Salloignon, Pauline; Chabert-Martinello, Marianne; Magnin-Robert, Jean-Bernard; Siol, Mathieu; Jacquin, Françoise; Chauveau, Aurélie; Pont, Caroline; Aubert, Grégoire; Delaitre, Catherine; Truntzer, Caroline; Duc, Gérard

    2015-02-21

    Pea (Pisum sativum L.), a major pulse crop grown for its protein-rich seeds, is an important component of agroecological cropping systems in diverse regions of the world. New breeding challenges imposed by global climate change and new regulations urge pea breeders to undertake more efficient methods of selection and better take advantage of the large genetic diversity present in the Pisum sativum genepool. Diversity studies conducted so far in pea used Simple Sequence Repeat (SSR) and Retrotransposon Based Insertion Polymorphism (RBIP) markers. Recently, SNP marker panels have been developed that will be useful for genetic diversity assessment and marker-assisted selection. A collection of diverse pea accessions, including landraces and cultivars of garden, field or fodder peas as well as wild peas, was characterised at the molecular level using newly developed SNP markers, as well as SSR markers and RBIP markers. The three types of markers were used to describe the structure of the collection and revealed different pictures of the genetic diversity among the collection. SSR showed the fastest rate of evolution and RBIP the slowest rate of evolution, pointing to their contrasting modes of evolution. SNP markers were then used to predict three phenotypes recorded for the collection: the date of flowering (BegFlo), the number of seeds per plant (Nseed) and thousand seed weight (TSW). Different statistical methods were tested, including the LASSO (Least Absolute Shrinkage and Selection Operator), PLS (Partial Least Squares), SPLS (Sparse Partial Least Squares), Bayes A, Bayes B and GBLUP (Genomic Best Linear Unbiased Prediction) methods, and the structure of the collection was taken into account in the prediction. Despite a limited number of 331 markers used for prediction, TSW was reliably predicted. The development of marker assisted selection has not reached its full potential in pea until now. This paper shows that the high-throughput SNP arrays that are being

  18. Historically low mitochondrial DNA diversity in koalas (Phascolarctos cinereus).

    Science.gov (United States)

    Tsangaras, Kyriakos; Ávila-Arcos, María C; Ishida, Yasuko; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D

    2012-10-24

    The koala (Phascolarctos cinereus) is an arboreal marsupial that was historically widespread across eastern Australia until the end of the 19th century when it suffered a steep population decline. Hunting for the fur trade, habitat conversion, and disease contributed to a precipitous reduction in koala population size during the late 1800s and early 1900s. To examine the effects of these reductions in population size on koala genetic diversity, we sequenced part of the hypervariable region of mitochondrial DNA (mtDNA) in koala museum specimens collected in the 19th and 20th centuries, hypothesizing that the historical samples would exhibit greater genetic diversity. The mtDNA haplotypes present in historical museum samples were identical to haplotypes found in modern koala populations, and no novel haplotypes were detected. Rarefaction analyses suggested that the mtDNA genetic diversity present in the museum samples was similar to that of modern koalas. Low mtDNA diversity may have been present in koala populations prior to recent population declines. When considering management strategies, low genetic diversity of the mtDNA hypervariable region may not indicate recent inbreeding or founder events but may reflect an older historical pattern for koalas.

  19. Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting

    KAUST Repository

    Perdigão, João

    2014-11-18

    Background: Multidrug-resistant (MDR) and extensively drug-resistant (XDR) tuberculosis (TB) present a challenge to disease control and elimination goals. In Lisbon, Portugal, specific and successful XDR-TB strains have been found in circulation for almost two decades. Results: In the present study we genotyped and sequenced the genomes of 56 Mycobacterium tuberculosis isolates recovered mostly from Lisbon. The genotyping data revealed three major clusters associated with MDR-TB, two of which are also associated with XDR-TB. The genomic data helped elucidate the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group (5). Furthermore, a genome-wide phylogeny analysis of these strains, together with 19 publicly available genomes of Mycobacterium tuberculosis clinical isolates, revealed two major clades responsible for M/XDR-TB in the region: Lisboa3 and Q1 (LAM). The data presented in this study yielded insights into microevolution and identified novel compensatory mutations associated with rifampicin resistance in rpoB and rpoC. The screening for other structural variations revealed putative clade-defining variants. One deletion in PPE41, found among Lisboa3 isolates, is proposed to contribute to immune evasion and to confer a selective advantage. Insertion sequence (IS) mapping has also demonstrated the role of IS6110 as a major driver in mycobacterial evolution by affecting gene integrity and regulation. Conclusions: Overall, this study contributes novel genome-wide phylogenetic data and has led to the identification of new genomic variants that support the notion of a growing genomic diversity facing both setting and host adaptation.

  20. Diversity, abundance, and host relationships of avian malaria and related haemosporidians in New Mexico pine forests

    Directory of Open Access Journals (Sweden)

    Rosario A. Marroquin-Flores

    2017-08-01

    Avian malaria and related haemosporidian parasites (genera Haemoproteus, Plasmodium, and Leucocytozoon) affect bird demography, species range limits, and community structure, yet they remain unsurveyed in most bird communities and populations. We conducted a community-level survey of these vector-transmitted parasites in New Mexico, USA, to describe their diversity, abundance, and host associations. We focused on the breeding-bird community in the transition zone between piñon-juniper woodland and ponderosa pine forests (elevational range: 2,150–2,460 m). We screened 186 birds representing 49 species using both standard PCR and microscopy techniques to detect infections of all three avian haemosporidian genera. We detected infections in 68 out of 186 birds (36.6%), the highest proportion of which were infected with Haemoproteus (20.9%), followed by Leucocytozoon (13.4%), then Plasmodium (8.0%). We sequenced mtDNA for 77 infections representing 43 haplotypes (25 Haemoproteus, 12 Leucocytozoon, 6 Plasmodium). When compared to all previously known haplotypes in the MalAvi and GenBank databases, 63% (27) of the haplotypes we recovered were novel. We found evidence for host specificity at the avian clade and species level, but this specificity was variable among parasite genera, in that Haemoproteus and Leucocytozoon were each restricted to three avian groups (out of six), while Plasmodium occurred in all groups except non-passerines. We found striking variation in infection rate among host species, with nearly universal infection among vireos and no infection among nuthatches. Using rarefaction and extrapolation, we estimated the total avian haemosporidian diversity to be 70 haplotypes (95% CI [43–98]); thus, we may have already sampled ∼60% of the diversity of avian haemosporidians in New Mexico pine forests. It is possible that future studies will find higher diversity in microhabitats or host species that are under-sampled or unsampled in the
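
    The percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract (68 of 186 birds, 27 novel of 43 haplotypes, 70 estimated haplotypes with a 95% CI of 43 to 98). The variable names are illustrative, and the check does not reproduce the authors' rarefaction and extrapolation procedure.

```python
# Counts taken directly from the abstract; names are illustrative only.
infected_birds, screened_birds = 68, 186
observed_haplotypes, novel_haplotypes = 43, 27
estimated_total, ci_low, ci_high = 70, 43, 98

# Overall infection prevalence among screened birds.
prevalence = infected_birds / screened_birds

# Share of recovered haplotypes not previously recorded in the databases.
novel_fraction = novel_haplotypes / observed_haplotypes

# Fraction of the estimated total diversity already sampled,
# at the point estimate and at both ends of the confidence interval.
sampled_fraction = observed_haplotypes / estimated_total
sampled_fraction_best_case = observed_haplotypes / ci_low
sampled_fraction_worst_case = observed_haplotypes / ci_high

print(f"prevalence: {prevalence:.1%}")
print(f"novel haplotypes: {novel_fraction:.0%}")
print(f"sampled diversity: {sampled_fraction:.0%} "
      f"(range {sampled_fraction_worst_case:.0%} to {sampled_fraction_best_case:.0%})")
```

    The point estimate of about 61% matches the "∼60%" quoted above, while the width of the confidence interval shows that the sampled share could plausibly lie anywhere between roughly 44% and 100%.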

  1. Mapping of HLA-DQ haplotypes in a group of Danish patients with celiac disease

    DEFF Research Database (Denmark)

    Lund, Flemming; Hermansen, Mette N; Pedersen, Merete F

    2015-01-01

    BACKGROUND: A cost-effective identification of HLA-DQ risk haplotypes using the single nucleotide polymorphism (SNP) technique has recently been applied in the diagnosis of celiac disease (CD) in four European populations. The objective of the study was to map risk HLA-DQ haplotypes in a group of Danish CD patients using the SNP technique. METHODS: Cohort A: Among 65 patients with gastrointestinal symptoms we compared the HLA-DQ2 and HLA-DQ8 risk haplotypes obtained by the SNP technique (method 1) with results based on a sequence specific primer amplification technique (method 2...

  2. A high-density Diversity Arrays Technology (DArT) microarray for genome-wide genotyping in Eucalyptus

    Directory of Open Access Journals (Sweden)

    Myburg Alexander A

    2010-06-01

    Background: A number of molecular marker technologies have allowed important advances in the understanding of the genetics and evolution of Eucalyptus, a genus that includes over 700 species, some of which are used worldwide in plantation forestry. Nevertheless, the average marker density achieved with current technologies remains at the level of a few hundred markers per population. Furthermore, the transferability of markers produced with most existing technology across species and pedigrees is usually very limited. High throughput, combined with wide genome coverage and high transferability, is necessary to increase the resolution, speed and utility of molecular marker technology in eucalypts. We report the development of a high-density DArT genome profiling resource and demonstrate its potential for genome-wide diversity analysis and linkage mapping in several species of Eucalyptus. Findings: After testing several genome complexity reduction methods we identified the PstI/TaqI method as the most effective for Eucalyptus and developed 18 genomic libraries from PstI/TaqI representations of 64 different Eucalyptus species. A total of 23,808 cloned DNA fragments were screened and 13,300 (56%) were found to be polymorphic among 284 individuals. After a redundancy analysis, 6,528 markers were selected for the operational array and these were supplemented with 1,152 additional clones taken from a library made from the E. grandis tree whose genome has been sequenced. Performance validation for diversity studies revealed 4,752 polymorphic markers among 174 individuals. Additionally, 5,013 markers showed segregation when screened using six inter-specific mapping pedigrees, with an average of 2,211 polymorphic markers per pedigree and a minimum of 859 polymorphic markers that were shared between any two pedigrees. Conclusions: This operational DArT array will deliver 1,000-2,000 polymorphic markers for linkage mapping in most eucalypt pedigrees

  3. Detecting structure of haplotypes and local ancestry

    Science.gov (United States)

    We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local an...

  4. Assessing and Exploiting Functional Diversity in Germplasm Pools to Enhance Abiotic Stress Adaptation and Yield in Cereals and Food Legumes

    Science.gov (United States)

    Dwivedi, Sangam L.; Scheben, Armin; Edwards, David; Spillane, Charles; Ortiz, Rodomiro

    2017-01-01

    There is a need to accelerate crop improvement by introducing alleles conferring host plant resistance, abiotic stress adaptation, and high yield potential. Elite cultivars, landraces and wild relatives harbor useful genetic variation that needs to be more easily utilized in plant breeding. We review genome-wide approaches for assessing and identifying alleles associated with desirable agronomic traits in diverse germplasm pools of cereals and legumes. Major quantitative trait loci and single nucleotide polymorphisms (SNPs) associated with desirable agronomic traits have been deployed to enhance crop productivity and resilience. These include alleles associated with variation conferring enhanced photoperiod and flowering traits. Genetic variants in the florigen pathway can provide both environmental flexibility and improved yields. SNPs associated with length of growing season and tolerance to abiotic stresses (precipitation, high temperature) are valuable resources for accelerating breeding for drought-prone environments. Both genomic selection and genome editing can also harness allelic diversity and increase productivity by improving multiple traits, including phenology, plant architecture, yield potential and adaptation to abiotic stresses. Discovering rare alleles and useful haplotypes also provides opportunities to enhance abiotic stress adaptation, while epigenetic variation has potential to enhance abiotic stress adaptation and productivity in crops. By reviewing current knowledge on specific traits and their genetic basis, we highlight recent developments in the understanding of crop functional diversity and identify potential candidate genes for future use. The storage and integration of genetic, genomic and phenotypic information will play an important role in ensuring broad and rapid application of novel genetic discoveries by the plant breeding community. Exploiting alleles for yield-related traits would allow improvement of selection efficiency and

  5. Assessing and Exploiting Functional Diversity in Germplasm Pools to Enhance Abiotic Stress Adaptation and Yield in Cereals and Food Legumes

    Directory of Open Access Journals (Sweden)

    Sangam L. Dwivedi

    2017-08-01

    There is a need to accelerate crop improvement by introducing alleles conferring host plant resistance, abiotic stress adaptation, and high yield potential. Elite cultivars, landraces and wild relatives harbor useful genetic variation that needs to be more easily utilized in plant breeding. We review genome-wide approaches for assessing and identifying alleles associated with desirable agronomic traits in diverse germplasm pools of cereals and legumes. Major quantitative trait loci and single nucleotide polymorphisms (SNPs) associated with desirable agronomic traits have been deployed to enhance crop productivity and resilience. These include alleles associated with variation conferring enhanced photoperiod and flowering traits. Genetic variants in the florigen pathway can provide both environmental flexibility and improved yields. SNPs associated with length of growing season and tolerance to abiotic stresses (precipitation, high temperature) are valuable resources for accelerating breeding for drought-prone environments. Both genomic selection and genome editing can also harness allelic diversity and increase productivity by improving multiple traits, including phenology, plant architecture, yield potential and adaptation to abiotic stresses. Discovering rare alleles and useful haplotypes also provides opportunities to enhance abiotic stress adaptation, while epigenetic variation has potential to enhance abiotic stress adaptation and productivity in crops. By reviewing current knowledge on specific traits and their genetic basis, we highlight recent developments in the understanding of crop functional diversity and identify potential candidate genes for future use. The storage and integration of genetic, genomic and phenotypic information will play an important role in ensuring broad and rapid application of novel genetic discoveries by the plant breeding community. Exploiting alleles for yield-related traits would allow improvement of selection

  6. Haplotype analysis indicates an association between the DOPA decarboxylase (DDC) gene and nicotine dependence.

    Science.gov (United States)

    Ma, Jennie Z; Beuten, Joke; Payne, Thomas J; Dupont, Randolph T; Elston, Robert C; Li, Ming D

    2005-06-15

    DOPA decarboxylase (DDC; also known as L-amino acid decarboxylase; AADC) is involved in the synthesis of dopamine, norepinephrine and serotonin. Because the mesolimbic dopaminergic system is implicated in the reinforcing effects of many drugs, including nicotine, the DDC gene is considered a plausible candidate for involvement in the development of vulnerability to nicotine dependence (ND). Further, this gene is located within the 7p11 region that showed a 'suggestive linkage' to ND in our previous genome-wide scan in the Framingham Heart Study population. In the present study, we tested eight single nucleotide polymorphisms (SNPs) within DDC for association with ND, which was assessed by smoking quantity (SQ), the heaviness of smoking index (HSI) and the Fagerstrom test for ND (FTND) score, in a total of 2037 smokers and non-smokers from 602 nuclear families of African- or European-American (AA or EA, respectively) ancestry. Association analysis for individual SNPs using the PBAT-GEE program indicated that SNP rs921451 was significantly associated with two of the three adjusted ND measures in the EA sample (P=0.01-0.04). Haplotype-based association analysis revealed a protective T-G-T-G haplotype for rs921451-rs3735273-rs1451371-rs2060762 in the AA sample, which was significantly associated with all three adjusted ND measures after correction for multiple testing (min Z=-2.78, P=0.006 for HSI). In contrast, we found a high-risk T-G-T-G haplotype for a different SNP combination in the EA sample, rs921451-rs3735273-rs1451371-rs3757472, which showed a significant association after Bonferroni correction with the SQ and FTND score (max Z=2.73, P=0.005 for FTND). In summary, our findings provide the first evidence for the involvement of DDC in the susceptibility to ND and, further, reveal the racial specificity of its impact.

  7. Global Population Structure of a Worldwide Pest and Virus Vector: Genetic Diversity and Population History of the Bemisia tabaci Sibling Species Group

    Science.gov (United States)

    2016-01-01

    The whitefly Bemisia tabaci sibling species (sibsp.) group comprises morphologically indiscernible lineages of well-known exemplars referred to as biotypes. It is distributed throughout tropical and subtropical latitudes and includes the contemporary invasive haplotypes, termed B and Q. Several well-studied B. tabaci biotypes exhibit ecological and biological diversity; however, most members are poorly studied or completely uncharacterized. Genetic studies have revealed substantial diversity within the group based on a fragment of the mitochondrial cytochrome oxidase I (mtCOI) sequence (haplotypes), with other tested markers being less useful for deep phylogenetic comparisons. The view of global relationships within the B. tabaci sibsp. group is largely derived from this single marker, making assessment of gene flow and genetic structure difficult at the population level. Here, the population structure was explored for B. tabaci in a global context using nuclear data from variable microsatellite markers. Worldwide collections were examined representing most of the available diversity, including known monophagous, polyphagous, invasive, and indigenous haplotypes. Well-characterized biotypes and other related geographic lineages discovered represented highly differentiated genetic clusters with little or no evidence of gene flow. The invasive B and Q biotypes exhibited moderate to high levels of genetic diversity, suggesting that they stemmed from large founding populations that have maintained ancestral variation, despite homogenizing effects, possibly due to human-mediated among-population gene flow. Results of the microsatellite analyses are in general agreement with published mtCOI phylogenies; however, notable conflicts exist between the nuclear and mitochondrial relationships, highlighting the need for a multifaceted approach to delineate the evolutionary history of the group. This study supports the hypothesis that the extant B. tabaci sibsp. group contains

  8. Genetic and genomic diversity studies of Acacia symbionts in Senegal reveal new species of Mesorhizobium with a putative geographical pattern.

    Directory of Open Access Journals (Sweden)

    Fatou Diouf

    Acacia senegal (L.) Willd. and Acacia seyal Del. are highly nitrogen-fixing and moderately salt tolerant species. In this study we focused on the genetic and genomic diversity of Acacia mesorhizobia symbionts from diverse origins in Senegal and investigated possible correlations between the genetic diversity of the strains, their soil of origin, and their tolerance to salinity. We first performed a multi-locus sequence analysis (MLSA) on five marker gene fragments in a collection of 47 mesorhizobia strains of A. senegal and A. seyal from 8 localities. Most of the strains (60%) clustered with the M. plurifarium type strain ORS 1032T, while the others formed four new clades (MSP1 to MSP4). We sequenced and assembled seven draft genomes: four in the M. plurifarium clade (ORS3356, ORS3365, STM8773 and ORS1032T) and one each in MSP1 (STM8789), MSP2 (ORS3359) and MSP3 (ORS3324). The average nucleotide identities between these genomes, together with the MLSA analysis, reveal three new species of Mesorhizobium. A great variability of salt tolerance was found among the strains, with a lack of correlation between the genetic diversity of mesorhizobia, their salt tolerance and the characteristics of the soil samples. A putative geographical pattern of A. senegal symbionts between the dryland north part and the center of Senegal was found, reflecting adaptations to specific local conditions such as the water regime. However, the presence of salt does not seem to be an important structuring factor of Mesorhizobium species.

  9. Genome-wide analysis reveals signatures of selection for important traits in domestic sheep from different ecoregions.

    Science.gov (United States)

    Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin

    2016-11-03

    Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.
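
    The "reduced pooled heterozygosity" criterion mentioned in this abstract can be illustrated with a short sketch. It assumes one commonly used definition of pooled heterozygosity for a genomic window, Hp = 2 * sum(n_major) * sum(n_minor) / (sum(n_major) + sum(n_minor))^2, where the sums run over the major- and minor-allele read counts of all SNPs in the window. The abstract does not state which formula or window size the authors used, and the function name and read counts below are hypothetical.

```python
def pooled_heterozygosity(allele_counts):
    """Pooled heterozygosity (Hp) for one window.

    allele_counts: list of (major_reads, minor_reads) tuples, one per SNP.
    Assumed definition: Hp = 2 * N_maj * N_min / (N_maj + N_min) ** 2.
    """
    n_major = sum(major for major, _ in allele_counts)
    n_minor = sum(minor for _, minor in allele_counts)
    total = n_major + n_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * n_major * n_minor / total ** 2


# Hypothetical windows: balanced alleles, skewed alleles, and a fixed window.
balanced = [(5, 5), (5, 5)]
skewed = [(9, 1), (8, 2)]
fixed = [(10, 0), (8, 0)]

for name, window in [("balanced", balanced), ("skewed", skewed), ("fixed", fixed)]:
    print(name, round(pooled_heterozygosity(window), 3))
```

    Under this definition Hp ranges from 0 (one allele fixed across the window) to 0.5 (both alleles equally frequent), so candidate selection targets are windows whose Hp falls far below the genome-wide distribution.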

  10. Molecular analysis and genetic diversity of Aedes albopictus (Diptera, Culicidae) from China.

    Science.gov (United States)

    Ruiling, Zhang; Peien, Leng; Xuejun, Wang; Zhong, Zhang

    2018-05-01

    Aedes albopictus is one of the most invasive species and can carry Dengue virus, Yellow fever virus and more than twenty other arboviruses. Based on the mitochondrial gene cytochrome c oxidase I (COI) and samples collected from 17 populations, we investigated the molecular characteristics and genetic diversity of Ae. albopictus from China. Altogether, 25 haplotypes were detected, including 10 shared haplotypes and 15 private haplotypes. H1 was the dominant haplotype and is widely distributed, occurring in 13 populations. The Tajima's D value of most populations was significantly negative, indicating that these populations recently experienced rapid range expansion. Most haplotypes clustered together in both phylogenetic and median-joining network analyses, without clear phylogeographic patterns. However, neutrality tests revealed shallow divergences between Hainan and Guangxi and the other populations (0.15599 ≤ F_ST ≤ 0.75858), which is probably due to interrupted gene flow caused by geographical isolation. In conclusion, Ae. albopictus populations showed low genetic diversity in China.
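
    Tajima's D, the statistic cited in this abstract, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below implements the standard textbook formula for a set of aligned sequences of equal length. The function name and the toy alignments are hypothetical, and the sketch does not assess statistical significance or reproduce the authors' software.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (standard formula)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least 4 sequences are needed for a non-zero variance")
    length = len(seqs[0])

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for j in range(length) if len({s[j] for s in seqs}) > 1)
    if seg_sites == 0:
        return None  # D is undefined without variation

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment with only singleton variants (excess of rare alleles).
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]
# Toy alignment with two deeply split groups (intermediate-frequency alleles).
balanced_variants = ["AA", "AA", "TT", "TT"]

print(tajimas_d(rare_variants))
print(tajimas_d(balanced_variants))
```

    An excess of rare variants, as expected after a recent expansion, pushes D below zero, whereas variants at intermediate frequency push it above zero.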

  11. Haplotyping Problem, A Clustering Approach

    International Nuclear Information System (INIS)

    Eslahchi, Changiz; Sadeghi, Mehdi; Pezeshk, Hamid; Kargar, Mehdi; Poormohammadi, Hadi

    2007-01-01

    Construction of two haplotypes from a set of Single Nucleotide Polymorphism (SNP) fragments is called the haplotype reconstruction problem. One of the most popular computational models for this problem is Minimum Error Correction (MEC). Since MEC is an NP-hard problem, here we propose a novel heuristic algorithm based on clustering analysis in data mining for the haplotype reconstruction problem. Based on the Hamming distance and similarity between two fragments, our iterative algorithm produces two clusters of fragments; then, in each iteration, the algorithm assigns a fragment to one of the clusters. Our results suggest that the algorithm has a lower reconstruction error rate than other algorithms.
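
    The general idea of clustering fragments by Hamming distance and scoring the result with MEC can be sketched as follows. This is a generic two-cluster heuristic written for illustration, not the authors' algorithm: the seeding rule, the tie-breaking in the consensus, the gap symbol "-" and all function names are assumptions.

```python
from itertools import combinations

GAP = "-"  # assumed symbol for SNP positions a fragment does not cover


def hamming(a, b):
    """Mismatches between two fragments, ignoring positions with a gap."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP and x != y)


def consensus(fragments, length):
    """Majority allele per column; ties and uncovered columns default to '0'."""
    calls = []
    for j in range(length):
        ones = sum(1 for f in fragments if f[j] == "1")
        zeros = sum(1 for f in fragments if f[j] == "0")
        calls.append("1" if ones > zeros else "0")
    return "".join(calls)


def reconstruct(fragments, max_iter=20):
    """Return (haplotype_1, haplotype_2, mec_score) for binary SNP fragments."""
    length = len(fragments[0])
    # Seed the two clusters with the most distant pair of fragments.
    i, j = max(combinations(range(len(fragments)), 2),
               key=lambda p: hamming(fragments[p[0]], fragments[p[1]]))
    h1, h2 = fragments[i], fragments[j]
    assignment = None
    for _ in range(max_iter):
        # Assign every fragment to the closer of the two current haplotypes.
        new_assignment = [0 if hamming(f, h1) <= hamming(f, h2) else 1
                          for f in fragments]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        cluster_1 = [f for f, a in zip(fragments, assignment) if a == 0]
        cluster_2 = [f for f, a in zip(fragments, assignment) if a == 1]
        h1, h2 = consensus(cluster_1, length), consensus(cluster_2, length)
    # MEC score: corrections needed so every fragment matches its haplotype.
    mec = sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)
    return h1, h2, mec


# Hypothetical fragments drawn from 010101 and 101010; the last has one error.
reads = ["0101--", "--0101", "1010--", "--1010", "010111"]
print(reconstruct(reads))
```

    On this toy input the sketch recovers the two complementary haplotypes with an MEC score of 1, corresponding to the single erroneous call in the last fragment. Because the procedure is a local heuristic, it offers no guarantee of reaching the minimum MEC on harder inputs.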

  12. Estimating haplotype effects for survival data.

    Science.gov (United States)

    Scheike, Thomas H; Martinussen, Torben; Silver, Jeremy D

    2010-09-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplotype effects for survival data. These estimating equations are simple to implement and avoid the use of the EM algorithm, which may be slow in the context of the semiparametric Cox model with incomplete covariate information. These estimating equations also lead to easily computable, direct estimators of standard errors, and thus overcome some of the difficulty in obtaining variance estimators based on the EM algorithm in this setting. We also develop an easily implemented goodness-of-fit procedure for Cox's regression model including haplotype effects. Finally, we apply the procedures presented in this article to investigate possible haplotype effects of the PAF-receptor on cardiovascular events in patients with coronary artery disease, and compare our results to those based on the EM algorithm. © 2009, The International Biometric Society.

  13. Characterization of Human Cytomegalovirus Genome Diversity in Immunocompromised Hosts by Whole-Genome Sequencing Directly From Clinical Specimens.

    Science.gov (United States)

    Hage, Elias; Wilkie, Gavin S; Linnenweber-Held, Silvia; Dhingra, Akshay; Suárez, Nicolás M; Schmidt, Julius J; Kay-Fedorov, Penelope C; Mischak-Weissinger, Eva; Heim, Albert; Schwarz, Anke; Schulz, Thomas F; Davison, Andrew J; Ganzenmueller, Tina

    2017-06-01

    Advances in next-generation sequencing (NGS) technologies allow comprehensive studies of genetic diversity over the entire genome of human cytomegalovirus (HCMV), a significant pathogen for immunocompromised individuals. Next-generation sequencing was performed on target enriched sequence libraries prepared directly from a variety of clinical specimens (blood, urine, breast milk, respiratory samples, biopsies, and vitreous humor) obtained longitudinally or from different anatomical compartments from 20 HCMV-infected patients (renal transplant recipients, stem cell transplant recipients, and congenitally infected children). De novo-assembled HCMV genome sequences were obtained for 57 of 68 sequenced samples. Analysis of longitudinal or compartmental HCMV diversity revealed various patterns: no major differences were detected among longitudinal, intraindividual blood samples from 9 of 15 patients and in most of the patients with compartmental samples, whereas a switch of the major HCMV population was observed in 6 individuals with sequential blood samples and upon compartmental analysis of 1 patient with HCMV retinitis. Variant analysis revealed additional aspects of minor virus population dynamics and antiviral-resistance mutations. In immunosuppressed patients, HCMV can remain relatively stable or undergo drastic genomic changes that are suggestive of the emergence of minor resident strains or de novo infection. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.

  14. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

    Directory of Open Access Journals (Sweden)

    Saliha Hammoumi

    2016-09-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.

  15. Regions of homozygosity in the porcine genome: consequence of demography and the recombination landscape.

    Directory of Open Access Journals (Sweden)

    Mirte Bosse

    Inbreeding has long been recognized as a primary cause of fitness reduction in both wild and domesticated populations. Consanguineous matings cause inheritance of haplotypes that are identical by descent (IBD) and result in homozygous stretches along the genome of the offspring. Size and position of regions of homozygosity (ROHs) are expected to correlate with genomic features such as GC content and recombination rate, but also direction of selection. Thus, ROHs should be non-randomly distributed across the genome. Therefore, demographic history may not fully predict the effects of inbreeding. The porcine genome has a relatively heterogeneous distribution of recombination rate, making Sus scrofa an excellent model to study the influence of both recombination landscape and demography on genomic variation. This study utilizes next-generation sequencing data for the analysis of genomic ROH patterns, using a comparative sliding window approach. We present an in-depth study of genomic variation based on three different parameters: nucleotide diversity outside ROHs, the number of ROHs in the genome, and the average ROH size. We identified an abundance of ROHs in all genomes of multiple pigs from commercial breeds and wild populations from Eurasia. Size and number of ROHs are in agreement with known demography of the populations, with population bottlenecks highly increasing ROH occurrence. Nucleotide diversity outside ROHs is high in populations derived from a large ancient population, regardless of current population size. In addition, we show an unequal genomic ROH distribution, with strong correlations of ROH size and abundance with recombination rate and GC content. Global gene content does not correlate with ROH frequency, but some ROH hotspots do contain positively selected genes in commercial lines and wild populations. This study highlights the importance of the influence of demography and recombination on homozygosity in the genome to understand
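
    The window-based ROH scan mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses fixed, non-overlapping windows as a simplification of a sliding window, and the function name, the 10 kb window size, and the heterozygosity threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def find_low_het_windows(
    het_positions: List[int],
    chrom_length: int,
    window: int = 10_000,   # hypothetical window size in bp
    max_het: int = 1,       # hypothetical threshold: max heterozygous sites per window
) -> List[Tuple[int, int]]:
    """Return merged (start, end) intervals of consecutive low-heterozygosity windows.

    het_positions are 0-based coordinates of heterozygous sites on one chromosome.
    Intervals are half-open: start is inclusive, end is exclusive.
    """
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in het_positions:
        counts[pos // window] += 1

    runs: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(counts):
        if count <= max_het:
            if start is None:
                start = i * window          # open a new candidate ROH
        elif start is not None:
            runs.append((start, i * window))  # close the run at this window
            start = None
    if start is not None:
        runs.append((start, chrom_length))    # run extends to the chromosome end
    return runs
```

    From the returned intervals, the number of ROHs is the length of the list and the average ROH size is the mean of `end - start`, which correspond to two of the three parameters named in the abstract.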

  16. High-throughput multiplex cpDNA resequencing clarifies the genetic diversity and genetic relationships among Brassica napus, Brassica rapa and Brassica oleracea.

    Science.gov (United States)

    Qiao, Jiangwei; Cai, Mengxian; Yan, Guixin; Wang, Nian; Li, Feng; Chen, Binyun; Gao, Guizhen; Xu, Kun; Li, Jun; Wu, Xiaoming

    2016-01-01

    Brassica napus (rapeseed) is a recent allotetraploid plant and the second most important oilseed crop worldwide. The origin of B. napus and the genetic relationships with its diploid ancestor species remain largely unresolved. Here, chloroplast DNA (cpDNA) from 488 B. napus accessions of global origin, 139 B. rapa accessions and 49 B. oleracea accessions were resequenced at the population level using Illumina Solexa sequencing technologies. The intraspecific cpDNA variants and their allelic frequencies were called genomewide and further validated via EcoTILLING analyses of the rpo region. The cpDNA of the current global B. napus population comprises more than 400 variants (SNPs and short InDels) and maintains one predominant haplotype (Bncp1). Whole-genome resequencing of the cpDNA of Bncp1 haplotype eliminated its direct inheritance from any accession of the B. rapa or B. oleracea species. The distribution of the polymorphism information content (PIC) values for each variant demonstrated that B. napus has much lower cpDNA diversity than B. rapa; however, a vast majority of the wild and cultivated B. oleracea specimens appeared to share the same distinct cpDNA haplotype, in contrast to its wild C-genome relatives. This finding suggests that the cpDNA of the three Brassica species is well differentiated. The predominant B. napus cpDNA haplotype may have originated from uninvestigated relatives or from interactions between cpDNA mutations and natural/artificial selection during speciation and evolution. These exhaustive data on variation in cpDNA would provide fundamental data for research on cpDNA and chloroplasts. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
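
    The per-variant PIC values mentioned above can be computed from allele frequencies. The sketch below assumes the standard definition of polymorphism information content (Botstein et al., 1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the abstract does not state which formula the authors applied, and the function name is hypothetical.

```python
from typing import Sequence


def polymorphism_information_content(freqs: Sequence[float]) -> float:
    """PIC for one variant, given the frequencies of all its alleles.

    Assumes the standard definition:
        PIC = 1 - sum_i p_i^2 - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross_term
```

    For a biallelic SNP with both alleles at frequency 0.5 this gives 0.375, the maximum for two alleles; a fixed site gives 0.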

  17. Genome position specific priors for genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Lund, Mogens Sandø

    2012-01-01

    causal mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects...... for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data was Illumina 777K SNPs, real or imputed Results...

  18. Evolution and diversity of a fungal self/nonself recognition locus.

    Directory of Open Access Journals (Sweden)

    Charles Hall

    2010-11-01

    Self/nonself discrimination is an essential feature for pathogen recognition and graft rejection and is a ubiquitous phenomenon in many organisms. Filamentous fungi, such as Neurospora crassa, provide a model for analyses of population genetics/evolution of self/nonself recognition loci due to their haploid nature, small genomes and excellent genetic/genomic resources. In N. crassa, nonself discrimination during vegetative growth is determined by 11 heterokaryon incompatibility (het) loci. Cell fusion between strains that differ in allelic specificity at any of these het loci triggers a rapid programmed cell death response. In this study, we evaluated the evolution, population genetics and selective mechanisms operating at a nonself recognition complex consisting of two closely linked loci, het-c (NCU03493) and pin-c (NCU03494). The genomic position of pin-c next to het-c is unique to Neurospora/Sordaria species, and originated by gene duplication after divergence from other species within the Sordariaceae. The het-c pin-c alleles in N. crassa are in severe linkage disequilibrium and consist of three haplotypes, het-c1/pin-c1, het-c2/pin-c2 and het-c3/pin-c3, which are equally frequent in population samples and exhibit trans-species polymorphisms. The absence of recombinant haplotypes is correlated with divergence of the het-c/pin-c intergenic sequence. Tests for positive and balancing selection at het-c and pin-c support the conclusion that both of these loci are under non-neutral balancing selection; other regions of both genes appear to be under positive selection. Our data show that the het-c2/pin-c2 haplotype emerged by a recombination event between the het-c1/pin-c1 and het-c3/pin-c3 approximately 3-12 million years ago. These results support models by which loci that confer nonself discrimination form by the association of polymorphic genes with genes containing HET domains. Distinct allele classes can emerge by recombination and positive

  19. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    Science.gov (United States)

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. A Glimpse of the genomic diversity of haloarchaeal tailed viruses

    Directory of Open Access Journals (Sweden)

    Ana Sencilo

    2014-03-01

    Tailed viruses are the most common isolates infecting prokaryotic hosts residing in hypersaline environments. Archaeal tailed viruses represent only a small portion of all characterized tailed viruses of prokaryotes. But even this small dataset revealed that archaeal tailed viruses have many similarities to their counterparts infecting bacteria, the bacteriophages. Shared functional homologues and similar genome organizations suggested that all microbial tailed viruses have common virion architectural and assembly principles. Recent structural studies have provided evidence justifying this, thereby grouping archaeal and bacterial tailed viruses into a single lineage. Currently there are 17 haloarchaeal tailed viruses with entirely sequenced genomes. Nine viruses have at least one close relative among the 17 viruses and, according to the similarities, can be divided into three groups. Two other viruses share some homologues and therefore are distantly related, whereas the rest of the viruses are rather divergent (or singletons). Comparative genomics analysis of these viruses offers a glimpse into the genetic diversity and structure of haloarchaeal tailed virus communities.

  1. Historically low mitochondrial DNA diversity in koalas (Phascolarctos cinereus)

    Directory of Open Access Journals (Sweden)

    Tsangaras Kyriakos

    2012-10-01

    Background: The koala (Phascolarctos cinereus) is an arboreal marsupial that was historically widespread across eastern Australia until the end of the 19th century when it suffered a steep population decline. Hunting for the fur trade, habitat conversion, and disease contributed to a precipitous reduction in koala population size during the late 1800s and early 1900s. To examine the effects of these reductions in population size on koala genetic diversity, we sequenced part of the hypervariable region of mitochondrial DNA (mtDNA) in koala museum specimens collected in the 19th and 20th centuries, hypothesizing that the historical samples would exhibit greater genetic diversity. Results: The mtDNA haplotypes present in historical museum samples were identical to haplotypes found in modern koala populations, and no novel haplotypes were detected. Rarefaction analyses suggested that the mtDNA genetic diversity present in the museum samples was similar to that of modern koalas. Conclusions: Low mtDNA diversity may have been present in koala populations prior to recent population declines. When considering management strategies, low genetic diversity of the mtDNA hypervariable region may not indicate recent inbreeding or founder events but may reflect an older historical pattern for koalas.

  2. HLA alleles and haplotypes in Burmese (Myanmarese) and Karen in Thailand.

    Science.gov (United States)

    Kongmaroeng, C; Romphruk, A; Puapairoj, C; Leelayuwat, C; Kulski, J K; Inoko, H; Dunn, D S; Romphruk, A V

    2015-09-01

    This is the first report on human leukocyte antigen (HLA) allele and haplotype frequencies at three class I loci and two class II loci in unrelated healthy individuals from two ethnic groups, 170 Burmese and 200 Karen, originally from Burma (Myanmar), but sampled while residing in Thailand. Overall, the HLA allele and haplotype frequencies detected by polymerase chain reaction sequence-specific primer (PCR-SSP) at five loci (A, B, C, DRB1 and DQB1) at low resolution showed distinct differences between the Burmese and Karen. In Burmese, five HLA-B*15 haplotypes with different HLA-A and HLA-DR/DQ combinations were detected, with three of these not previously reported in other Asian populations. The data are important in the fields of anthropology, transplantation and disease-association studies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  3. Twenty-one genome sequences from Pseudomonas species and 19 genome sequences from diverse bacteria isolated from the rhizosphere and endosphere of Populus deltoides.

    Science.gov (United States)

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn M; Johnson, Courtney M; Martin, Stanton L; Land, Miriam L; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A

    2012-11-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for 21 Pseudomonas strains and 19 other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium, and Variovorax were generated.

  4. Metatranscriptomic analysis of diverse microbial communities reveals core metabolic pathways and microbiome-specific functionality.

    Science.gov (United States)

    Jiang, Yue; Xiong, Xuejian; Danska, Jayne; Parkinson, John

    2016-01-12

    Metatranscriptomics is emerging as a powerful technology for the functional characterization of complex microbial communities (microbiomes). Use of unbiased RNA-sequencing can reveal both the taxonomic composition and active biochemical functions of a complex microbial community. However, the lack of established reference genomes, computational tools and pipelines make analysis and interpretation of these datasets challenging. Systematic studies that compare data across microbiomes are needed to demonstrate the ability of such pipelines to deliver biologically meaningful insights on microbiome function. Here, we apply a standardized analytical pipeline to perform a comparative analysis of metatranscriptomic data from diverse microbial communities derived from mouse large intestine, cow rumen, kimchi culture, deep-sea thermal vent and permafrost. Sequence similarity searches allowed annotation of 19 to 76% of putative messenger RNA (mRNA) reads, with the highest frequency in the kimchi dataset due to its relatively low complexity and availability of closely related reference genomes. Metatranscriptomic datasets exhibited distinct taxonomic and functional signatures. From a metabolic perspective, we identified a common core of enzymes involved in amino acid, energy and nucleotide metabolism and also identified microbiome-specific pathways such as phosphonate metabolism (deep sea) and glycan degradation pathways (cow rumen). Integrating taxonomic and functional annotations within a novel visualization framework revealed the contribution of different taxa to metabolic pathways, allowing the identification of taxa that contribute unique functions. The application of a single, standard pipeline confirms that the rich taxonomic and functional diversity observed across microbiomes is not simply an artefact of different analysis pipelines but instead reflects distinct environmental influences. 
At the same time, our findings show how microbiome complexity and availability of

  5. Nucleotide variation in the mitochondrial genome provides evidence for dual routes of postglacial recolonization and genetic recombination in the northeastern brook trout (Salvelinus fontinalis).

    Science.gov (United States)

    Pilgrim, B L; Perry, R C; Barron, J L; Marshall, H D

    2012-09-26

    Levels and patterns of mitochondrial DNA (mtDNA) variation were examined to investigate the population structure and possible routes of postglacial recolonization of the world's northernmost native populations of brook trout (Salvelinus fontinalis), which are found in Labrador, Canada. We analyzed the sequence diversity of a 1960-bp portion of the mitochondrial genome (NADH dehydrogenase 1 gene and part of cytochrome oxidase 1) of 126 fish from 32 lakes distributed throughout seven regions of northeastern Canada. These populations were found to have low levels of mtDNA diversity, a characteristic trait of populations at northern extremes, with significant structuring at the level of the watershed. Upon comparison of northeastern brook trout sequences to the publicly available brook trout whole mitochondrial genome (GenBank AF154850), we infer that the GenBank sequence is from a fish whose mtDNA has recombined with that of Arctic charr (S. alpinus). The haplotype distribution provides evidence of two different postglacial founding groups contributing to present-day brook trout populations in the northernmost part of their range; the evolution of the majority of the haplotypes coincides with the timing of glacier retreat from Labrador. Our results exemplify the strong influence that historical processes such as glaciations have had on shaping the current genetic structure of northern species such as the brook trout.

  6. Genomic diversity guides conservation strategies among rare terrestrial orchid species when taxonomy remains uncertain.

    Science.gov (United States)

    Ahrens, Collin W; Supple, Megan A; Aitken, Nicola C; Cantrill, David J; Borevitz, Justin O; James, Elizabeth A

    2017-06-01

    Species are often used as the unit for conservation, but may not be suitable for species complexes where taxa are difficult to distinguish. Under such circumstances, it may be more appropriate to consider species groups or populations as evolutionarily significant units (ESUs). A population genomic approach was employed to investigate the diversity within and among closely related species to create a more robust, lineage-specific conservation strategy for a nationally endangered terrestrial orchid and its relatives from south-eastern Australia. Four putative species were sampled from a total of 16 populations in the Victorian Volcanic Plain (VVP) bioregion and one population of a sub-alpine outgroup in south-eastern Australia. Morphological measurements were taken in situ along with leaf material for genotyping by sequencing (GBS) and microsatellite analyses. Species could not be differentiated using morphological measurements. Microsatellite and GBS markers confirmed the outgroup as distinct, but only GBS markers provided resolution of population genetic structure. The nationally endangered Diuris basaltica was indistinguishable from two related species (D. chryseopsis and D. behrii), while the state-protected D. gregaria showed genomic differentiation. Genomic diversity identified among the four Diuris species suggests that conservation of this taxonomically complex group will be best served by considering them as one ESU rather than separately aligned with species as currently recognized. This approach will maximize evolutionary potential among all species during increased isolation and environmental change. The methods used here can be applied generally to conserve evolutionary processes for groups where taxonomic uncertainty hinders the use of species as conservation units. © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.

  7. Methods for Optimizing CRISPR-Cas9 Genome Editing Specificity

    Science.gov (United States)

    Tycko, Josh; Myer, Vic E.; Hsu, Patrick D.

    2016-01-01

    Summary Advances in the development of delivery, repair, and specificity strategies for the CRISPR-Cas9 genome engineering toolbox are helping researchers understand gene function with unprecedented precision and sensitivity. CRISPR-Cas9 also holds enormous therapeutic potential for the treatment of genetic disorders by directly correcting disease-causing mutations. Although the Cas9 protein has been shown to bind and cleave DNA at off-target sites, the field of Cas9 specificity is rapidly progressing with marked improvements in guide RNA selection, protein and guide engineering, novel enzymes, and off-target detection methods. We review important challenges and breakthroughs in the field as a comprehensive practical guide to interested users of genome editing technologies, highlighting key tools and strategies for optimizing specificity. The genome editing community should now strive to standardize such methods for measuring and reporting off-target activity, while keeping in mind that the goal for specificity should be continued improvement and vigilance. PMID:27494557

  8. Genomic Analyses Reveal the Influence of Geographic Origin, Migration, and Hybridization on Modern Dog Breed Development

    Directory of Open Access Journals (Sweden)

    Heidi G. Parker

    2017-04-01

    There are nearly 400 modern domestic dog breeds with unique histories and genetic profiles. To track the genetic signatures of breed development, we have assembled the most diverse dataset of dog breeds, reflecting their extensive phenotypic variation and heritage. Combining genetic distance, migration, and genome-wide haplotype sharing analyses, we uncover geographic patterns of development and independent origins of common traits. Our analyses reveal the hybrid history of breeds and elucidate the effects of immigration, revealing for the first time a suggestion of New World dog within some modern breeds. Finally, we used cladistics and haplotype sharing to show that some common traits have arisen more than once in the history of the dog. These analyses characterize the complexities of breed development, resolving longstanding questions regarding individual breed origination, the effect of migration on geographically distinct breeds, and, by inference, transfer of trait and disease alleles among dog breeds.

  9. Phenotypic Heterogeneity of Genomically-Diverse Isolates of Streptococcus mutans

    Science.gov (United States)

    Palmer, Sara R.; Miller, James H.; Abranches, Jacqueline; Zeng, Lin; Lefebure, Tristan; Richards, Vincent P.; Lemos, José A.; Stanhope, Michael J.; Burne, Robert A.

    2013-01-01

    High coverage, whole genome shotgun (WGS) sequencing of 57 geographically- and genetically-diverse isolates of Streptococcus mutans from individuals of known dental caries status was recently completed. Of the 57 sequenced strains, fifteen isolates were selected based primarily on differences in gene content and phenotypic characteristics known to affect virulence and compared with the reference strain UA159. A high degree of variability in these properties was observed between strains, with a broad spectrum of sensitivities to low pH, oxidative stress (air and paraquat) and exposure to competence stimulating peptide (CSP). Significant differences in autolytic behavior and in biofilm development in glucose or sucrose were also observed. Natural genetic competence varied among isolates, and this was correlated to the presence or absence of competence genes, comCDE and comX, and to bacteriocins. In general, strains that lacked the ability to become competent possessed fewer genes for bacteriocins and immunity proteins or contained polymorphic variants of these genes. WGS sequence analysis of the pan-genome revealed, for the first time, components of a Type VII secretion system in several S. mutans strains, as well as two putative ORFs that encode possible collagen binding proteins located upstream of the cnm gene, which is associated with host cell invasiveness. The virulence of these particular strains was assessed in a wax-worm model. This is the first study to combine a comprehensive analysis of key virulence-related phenotypes with extensive genomic analysis of a pathogen that evolved closely with humans. Our analysis highlights the phenotypic diversity of S. mutans isolates and indicates that the species has evolved a variety of adaptive strategies to persist in the human oral cavity and, when conditions are favorable, to initiate disease. PMID:23613838

  10. Phenotypic heterogeneity of genomically-diverse isolates of Streptococcus mutans.

    Directory of Open Access Journals (Sweden)

    Sara R Palmer

    High coverage, whole genome shotgun (WGS) sequencing of 57 geographically- and genetically-diverse isolates of Streptococcus mutans from individuals of known dental caries status was recently completed. Of the 57 sequenced strains, fifteen isolates were selected based primarily on differences in gene content and phenotypic characteristics known to affect virulence and compared with the reference strain UA159. A high degree of variability in these properties was observed between strains, with a broad spectrum of sensitivities to low pH, oxidative stress (air and paraquat) and exposure to competence stimulating peptide (CSP). Significant differences in autolytic behavior and in biofilm development in glucose or sucrose were also observed. Natural genetic competence varied among isolates, and this was correlated to the presence or absence of competence genes, comCDE and comX, and to bacteriocins. In general, strains that lacked the ability to become competent possessed fewer genes for bacteriocins and immunity proteins or contained polymorphic variants of these genes. WGS sequence analysis of the pan-genome revealed, for the first time, components of a Type VII secretion system in several S. mutans strains, as well as two putative ORFs that encode possible collagen binding proteins located upstream of the cnm gene, which is associated with host cell invasiveness. The virulence of these particular strains was assessed in a wax-worm model. This is the first study to combine a comprehensive analysis of key virulence-related phenotypes with extensive genomic analysis of a pathogen that evolved closely with humans. Our analysis highlights the phenotypic diversity of S. mutans isolates and indicates that the species has evolved a variety of adaptive strategies to persist in the human oral cavity and, when conditions are favorable, to initiate disease.

  11. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  12. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  13. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Utturkar, Sagar M [ORNL; Klingeman, Dawn Marie [ORNL; Johnson, Courtney M [ORNL; Martin, Stanton [ORNL; Land, Miriam L [ORNL; Lu, Tse-Yuan [ORNL; Schadt, Christopher Warren [ORNL; Doktycz, Mitchel John [ORNL; Pelletier, Dale A [ORNL

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome we generated draft genome sequences for twenty one Pseudomonas and twenty one other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  14. Founder haplotype analysis of Fanconi anemia in the Korean population finds common ancestral haplotypes for a FANCG variant.

    Science.gov (United States)

    Park, Joonhong; Kim, Myungshin; Jang, Woori; Chae, Hyojin; Kim, Yonggoo; Chung, Nack-Gyun; Lee, Jae-Wook; Cho, Bin; Jeong, Dae-Chul; Park, In Yang; Park, Mi Sun

    2015-05-01

    A common ancestral haplotype is strongly suggested in the Korean and Japanese patients with Fanconi anemia (FA), because common mutations have been frequently found: c.2546delC and c.3720_3724delAAACA of FANCA; c.307+1G>C, c.1066C>T, and c.1589_1591delATA of FANCG. Our aim in this study was to investigate the origin of these common mutations of FANCA and FANCG. We genotyped 13 FA patients consisting of five FA-A patients and eight FA-G patients from the Korean FA population. Microsatellite markers used for haplotype analysis included four CA repeat markers which are closely linked with FANCA and eight CA repeat markers which are contiguous with FANCG. As a result, Korean FA-A patients carrying c.2546delC or c.3720_3724delAAACA did not share the same haplotypes. However, three unique haplotypes carrying c.307+1G>C, c.1066C > T, or c.1589_1591delATA, that consisted of eight polymorphic loci covering a flanking region were strongly associated with Korean FA-G, consistent with founder haplotypes reported previously in the Japanese FA-G population. Our finding confirmed the common ancestral haplotypes on the origins of the East Asian FA-G patients, which will improve our understanding of the molecular population genetics of FA-G. To the best of our knowledge, this is the first report on the association between disease-linked mutations and common ancestral haplotypes in the Korean FA population. © 2015 John Wiley & Sons Ltd/University College London.

  15. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

    Directory of Open Access Journals (Sweden)

    Lippold Sebastian

    2011-11-01

    Full Text Available Abstract Background DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region were investigated for larger sample sets. Results In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73% already existed before the beginning of domestication about 5,000 years ago. Conclusions Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. 
The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the
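
    The distinction between variable and parsimony-informative positions in the record above can be illustrated with a minimal sketch. It assumes the usual textbook definition (a site is parsimony-informative when at least two different bases each occur in at least two sequences); the function name and the toy alignment are hypothetical and are not taken from the study.

```python
from collections import Counter

def classify_sites(alignment):
    """Count variable and parsimony-informative columns in an alignment.

    `alignment` is a list of equal-length DNA strings (hypothetical input).
    Gaps and ambiguity codes are ignored in this sketch.
    """
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative

# Toy alignment: column 2 is informative, column 4 is variable only.
toy_alignment = ["ACGTA", "ACGTA", "ATGCA", "ATGAA"]
n_variable, n_informative = classify_sites(toy_alignment)
```

    On real mitogenome alignments one would also need a policy for gaps and ambiguous bases, which this sketch simply skips.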

  16. Pan-genome analysis of Aeromonas hydrophila, Aeromonas veronii and Aeromonas caviae indicates phylogenomic diversity and greater pathogenic potential for Aeromonas hydrophila.

    Science.gov (United States)

    Ghatak, Sandeep; Blom, Jochen; Das, Samir; Sanjukta, Rajkumari; Puro, Kekungu; Mawlong, Michael; Shakuntala, Ingudam; Sen, Arnab; Goesmann, Alexander; Kumar, Ashok; Ngachan, S V

    2016-07-01

    Aeromonas species are important pathogens of fishes and aquatic animals capable of infecting humans and other animals via food. Due to the paucity of pan-genomic studies on aeromonads, the present study was undertaken to analyse the pan-genome of three clinically important Aeromonas species (A. hydrophila, A. veronii, A. caviae). Results of pan-genome analysis revealed an open pan-genome for all three species with pan-genome sizes of 9181, 7214 and 6884 genes for A. hydrophila, A. veronii and A. caviae, respectively. Core-genome: pan-genome ratio (RCP) indicated greater genomic diversity for A. hydrophila and interestingly RCP emerged as an effective indicator to gauge genomic diversity which could possibly be extended to other organisms too. Phylogenomic network analysis highlighted the influence of homologous recombination and lateral gene transfer in the evolution of Aeromonas spp. Prediction of virulence factors indicated no significant difference among the three species though analysis of pathogenic potential and acquired antimicrobial resistance genes revealed greater hazards from A. hydrophila. In conclusion, the present study highlighted the usefulness of whole genome analyses to infer evolutionary cues for Aeromonas species which indicated considerable phylogenomic diversity for A. hydrophila and hitherto unknown genomic evidence for pathogenic potential of A. hydrophila compared to A. veronii and A. caviae.
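
    The core-genome to pan-genome ratio (RCP) mentioned above can be sketched as follows. This is an illustration only: it assumes each genome has already been reduced to a set of gene-family identifiers, and the strain names and gene labels are hypothetical toy data, not values from the study.

```python
def core_pan_ratio(genomes):
    """Return (core size, pan size, core/pan ratio) for gene-family sets.

    `genomes` maps a strain name to the set of gene families it carries.
    """
    gene_sets = [set(families) for families in genomes.values()]
    core = set.intersection(*gene_sets)  # families present in every genome
    pan = set.union(*gene_sets)          # families present in any genome
    return len(core), len(pan), len(core) / len(pan)

# Hypothetical toy data: three strains with partly shared gene families.
toy_genomes = {
    "strain_1": {"gyrB", "rpoD", "aerA", "hlyA"},
    "strain_2": {"gyrB", "rpoD", "aerA"},
    "strain_3": {"gyrB", "rpoD", "ast"},
}
core_size, pan_size, rcp = core_pan_ratio(toy_genomes)
```

    Under this definition a smaller ratio means a smaller share of genes is common to all strains, which is presumably why the ratio can serve as a rough gauge of genomic diversity.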

  17. Comparison of 26 sphingomonad genomes reveals diverse environmental adaptations and biodegradative capabilities

    DEFF Research Database (Denmark)

    Aylward, Frank O.; McDonald, Bradon R.; Adams, Sandra M.

    2013-01-01

    to the genus Sphingobium. Our pan-genomic analysis of sphingomonads reveals numerous species-specific open reading frames (ORFs) but few signatures of genus-specific cores. The organization and coding potential of the sphingomonad genomes appear to be highly variable, and plasmid-mediated gene transfer...... and chromosome-plasmid recombination, together with prophage- and transposon-mediated rearrangements, appear to play prominent roles in the genome evolution of this group. We find that many of the sphingomonad genomes encode numerous oxygenases and glycoside hydrolases, which are likely responsible...... a basis for understanding the ecological strategies employed by sphingomonads and their role in environmental nutrient cycling....

  18. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  19. Endothelial Nitric Oxide Synthase Haplotypes Are Associated with Preeclampsia in Maya Mestizo Women

    Science.gov (United States)

    Díaz-Olguín, Lizbeth; Coral-Vázquez, Ramón Mauricio; Canto-Cetina, Thelma; Canizales-Quinteros, Samuel; Ramírez Regalado, Belem; Fernández, Genny; Canto, Patricia

    2011-01-01

    Preeclampsia is a disease specific to pregnancy and is believed to have a genetic component. The aim of this study was to investigate whether three polymorphisms in eNOS, or their haplotypes, are associated with preeclampsia in Maya mestizo women. A case-control study was performed that included 127 preeclamptic patients and 263 controls. Genotypes and haplotypes for the -786T→C, intron 4 and Glu298Asp variants of eNOS were determined by PCR and real-time PCR allelic discrimination. Logistic regression analysis with adjustment for age and body mass index (BMI) was used to test for associations between genotype and preeclampsia under recessive, codominant and dominant models. Pairwise linkage disequilibrium between single nucleotide polymorphisms was calculated by direct correlation (r²), and haplotype analysis was conducted. Women homozygous for the Asp298 allele showed an association with preeclampsia. In addition, analysis of the haplotype frequencies revealed that the -786C-4b-Asp298 haplotype was significantly more frequent in preeclamptic patients than in controls (0.143 vs. 0.041, respectively; OR = 3.01; 95% CI = 1.74–5.23; P = 2.9 × 10⁻⁴). Although the Asp298 genotype in a recessive model was associated with the presence of preeclampsia in Maya mestizo women, we believe that in this population the -786C-4b-Asp298 haplotype is a better genetic marker. PMID:21897002
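
    The pairwise linkage disequilibrium measure r² named in this record can be sketched from haplotype and allele frequencies. This is the standard two-locus formula, r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB; whether the authors computed it this way or from genotype correlations is not stated, and the function name and example frequencies below are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom

# Hypothetical frequencies: complete LD versus independence.
r2_complete = ld_r2(0.5, 0.5, 0.5)      # A always travels with B
r2_independent = ld_r2(0.25, 0.5, 0.5)  # haplotype frequency equals the product
```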

  20. Experimental analysis of specification language impact on NPP software diversity

    International Nuclear Information System (INIS)

    Yoo, Chang Sik; Seong, Poong Hyun

    1998-01-01

    When redundancy and diversity are applied in an NPP digital computer system, diversification of the system software may be critical for the dependability of the entire system. As a means of enhancing software diversity, specification language diversity is suggested in this study. We set up a simple hypothesis for the impact of specification language on common errors, and an experiment based on an NPP protection system application was performed. The experimental results showed that this hypothesis could be justified and that specification language diversity is effective in overcoming the software common mode failure problem.

  1. Whole-genome resequencing of honeybee drones to detect genomic selection in a population managed for royal jelly.

    Science.gov (United States)

    Wragg, David; Marti-Marimon, Maria; Basso, Benjamin; Bidanel, Jean-Pierre; Labarthe, Emmanuelle; Bouchez, Olivier; Le Conte, Yves; Vignal, Alain

    2016-06-03

    Four main evolutionary lineages of A. mellifera have been described including eastern Europe (C) and western and northern Europe (M). Many apiculturists prefer bees from the C lineage due to their docility and high productivity. In France, the routine importation of bees from the C lineage has resulted in the widespread admixture of bees from the M lineage. The haplodiploid nature of the honeybee Apis mellifera, and its small genome size, permits affordable and extensive genomics studies. As a pilot study of a larger project to characterise French honeybee populations, we sequenced 60 drones sampled from two commercial populations managed for the production of honey and royal jelly. Results indicate a C lineage origin, whilst mitochondrial analysis suggests two drones originated from the O lineage. Analysis of heterozygous SNPs identified potential copy number variants near to genes encoding odorant binding proteins and several cytochrome P450 genes. Signatures of selection were detected using the hapFLK haplotype-based method, revealing several regions under putative selection for royal jelly production. The framework developed during this study will be applied to a broader sampling regime, allowing the genetic diversity of French honeybees to be characterised in detail.

  2. Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome.

    Directory of Open Access Journals (Sweden)

    Regina S Baucom

    2009-11-01

    Recent comprehensive sequence analysis of the maize genome now permits detailed discovery and description of all transposable elements (TEs) in this complex nuclear environment. Reiteratively optimized structural and homology criteria were used in the computer-assisted search for retroelements, TEs that transpose by reverse transcription of an RNA intermediate, with the final results verified by manual inspection. Retroelements were found to occupy the majority (>75%) of the nuclear genome in maize inbred B73. Unprecedented genetic diversity was discovered in the long terminal repeat (LTR) retrotransposon class of retroelements, with >400 families (>350 newly discovered) contributing >31,000 intact elements. The two other classes of retroelements, SINEs (four families) and LINEs (at least 30 families), were observed to contribute 1,991 and approximately 35,000 copies, respectively, or a combined approximately 1% of the B73 nuclear genome. With regard to fully intact elements, the median copy number for all retroelement families in maize was 2 because >250 LTR retrotransposon families contained only one or two intact members that could be detected in the B73 draft sequence. The majority, perhaps all, of the investigated retroelement families exhibited non-random dispersal across the maize genome, with LINEs, SINEs, and many low-copy-number LTR retrotransposons exhibiting a bias for accumulation in gene-rich regions. In contrast, most (but not all) medium- and high-copy-number LTR retrotransposons were found to preferentially accumulate in gene-poor regions like pericentromeric heterochromatin, while a few high-copy-number families exhibited the opposite bias. Regions of the genome with the highest LTR retrotransposon density contained the lowest LTR retrotransposon diversity. These results indicate that the maize genome provides a great number of different niches for the survival and procreation of a great variety of retroelements that have evolved to

  3. Global spread and genetic variants of the two CYP9M10 haplotype forms associated with insecticide resistance in Culex quinquefasciatus Say.

    Science.gov (United States)

    Itokawa, K; Komagata, O; Kasai, S; Kawada, H; Mwatele, C; Dida, G O; Njenga, S M; Mwandawiro, C; Tomita, T

    2013-09-01

    Insecticide resistance develops as a genetic factor (allele) conferring lower susceptibility to insecticides proliferates within a target insect population under strong positive selection. Intriguingly, a resistance allele pre-existing in a population often bears a series of further adaptive allelic variants through new mutations. This phenomenon occasionally results in replacement of the predominating resistance allele by fitter new derivatives, and consequently, development of greater resistance at the population level. The overexpression of the cytochrome P450 gene CYP9M10 is associated with pyrethroid resistance in the southern house mosquito Culex quinquefasciatus. Previously, we have found two genealogically related overexpressing CYP9M10 haplotypes, which differ in gene copy number (duplicated and non-duplicated). The duplicated haplotype was derived from the non-duplicated overproducer, probably recently. In the present study, we investigated allelic series of CYP9M10 involved in three C. quinquefasciatus laboratory colonies recently collected from three different localities. Duplicated and non-duplicated overproducing haplotypes coexisted in African and Asian colonies, indicating a global distribution of both haplotype lineages. The duplicated haplotypes in both the Asian and African colonies were associated with higher expression levels and stronger resistance than non-duplicated overproducing haplotypes. There was slight variation in expression level among the non-duplicated overproducing haplotypes. The nucleotide sequences in coding and upstream regions among members of this group also showed a little diversity. Non-duplicated overproducing haplotypes with relatively higher expression were genealogically closer to the duplicated haplotypes than the other non-duplicated overproducing haplotypes, suggesting multiple cis-acting mutations before duplication.

  4. Estimating haplotype effects for survival data

    DEFF Research Database (Denmark)

    Scheike, Thomas; Martinussen, Torben; Silver, J

    2010-01-01

    Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplo...

  5. Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections.

    Science.gov (United States)

    Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J

    2016-05-12

    In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (Fws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide Fws fixation index (r = -0.88, P 10 % had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (Fws).
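
    The idea of a within-isolate fixation index as an inverse measure of diversity can be sketched with the commonly cited form Fws = 1 − Hw/Hs, where Hw is heterozygosity within the isolate and Hs is heterozygosity in the local population. The version below is a deliberate simplification that averages heterozygosity over SNPs; published pipelines typically work with allele-frequency bins, and the abstract does not say which procedure was followed. All names and frequencies are hypothetical.

```python
from statistics import mean

def heterozygosity(p):
    """Expected heterozygosity at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fws(within_isolate_freqs, population_freqs):
    """Simplified Fws = 1 - Hw / Hs using mean heterozygosity across SNPs."""
    hw = mean(heterozygosity(p) for p in within_isolate_freqs)
    hs = mean(heterozygosity(p) for p in population_freqs)
    if hs == 0:
        raise ValueError("population sample shows no diversity")
    return 1.0 - hw / hs

# Hypothetical allele frequencies at three SNPs.
population = [0.5, 0.3, 0.2]
clonal = fws([1.0, 0.0, 0.0], population)  # single genotype: alleles fixed
mixed = fws([0.5, 0.3, 0.2], population)   # as diverse as the population
```

    In this sketch a clonal infection scores 1 and an infection as diverse as the whole population scores 0, which matches the "inverse measure" reading in the abstract.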

  6. The impact of genomics on research in diversity and evolution of archaea.

    Science.gov (United States)

    Mardanov, A V; Ravin, N V

    2012-08-01

    Since the definition of archaea as a separate domain of life along with bacteria and eukaryotes, they have become one of the most interesting objects of modern microbiology, molecular biology, and biochemistry. Sequencing and analysis of archaeal genomes were especially important for studies on archaea because of a limited availability of genetic tools for the majority of these microorganisms and problems associated with their cultivation. Fifteen years since the publication of the first genome of an archaeon, more than one hundred complete genome sequences of representatives of different phylogenetic groups have been determined. Analysis of these genomes has expanded our knowledge of biology of archaea, their diversity and evolution, and allowed identification and characterization of new deep phylogenetic lineages of archaea. The development of genome technologies has allowed sequencing the genomes of uncultivated archaea directly from enrichment cultures, metagenomic samples, and even from single cells. Insights have been gained into the evolution of key biochemical processes in archaea, such as cell division and DNA replication, the role of horizontal gene transfer in the evolution of archaea, and new relationships between archaea and eukaryotes have been revealed.

  7. Data compression can discriminate broilers by selection line, detect haplotypes, and estimate genetic potential for complex phenotypes.

    Science.gov (United States)

    Hudson, N J; Hawken, R J; Okimoto, R; Sapp, R L; Reverter, A

    2017-09-01

    Accurately establishing the relationships among individuals lays the foundation for genetic analyses such as genome-wide association studies and identification of selection signatures. Of particular interest to the poultry industry are estimates of genetic merit based on molecular data. These estimates can be commercially exploited in marker-assisted breeding programs to accelerate genetic improvement. Here, we test the utility of a new method we have recently developed to estimate animal relatedness and apply it to genetic parameter estimation in commercial broilers. Our approach is based on the concept of data compression from information theory. Using the real-world compressor gzip to estimate normalized compression distance (NCD), we have built compression-based relationship matrices (CRM) for 988 chickens from 4 commercial broiler lines (2 male and 2 female). For all pairs of individuals, we found a strong negative relationship between the commonly used genomic relationship matrix (GRM) and NCD. This reflects the fact that "similarity" is the inverse of "distance." The CRM explained more genetic variation than the corresponding GRM in 2 of 3 phenotypes, with corresponding improvements in accuracy of genomic-enabled predictions of breeding value. A sliding-window version of the analysis highlighted haplotype regions of the genome apparently under selection in a line-specific manner. In the male lines, we retrieved high population-specific scores for IGF-1 and a cognate receptor, INSR. For the female lines, we detected an extreme score for a region containing a reproductive hormone receptor (GNRHR). We conclude that our compression-based method is a valid approach to establish relationships and identify regions under selective pressure in commercial lines of broiler chickens. © 2017 Poultry Science Association Inc.
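
    The normalized compression distance used in this record can be sketched with Python's standard gzip module. The formula is the usual NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. How the authors encoded genotypes before compression is not described here, so the 0/1/2 genotype strings and all helper names below are hypothetical.

```python
import gzip
import random

def compressed_size(data: bytes) -> int:
    """Size in bytes of the gzip-compressed input."""
    return len(gzip.compress(data))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Hypothetical genotype strings coded as 0/1/2 allele counts per SNP.
rng = random.Random(0)

def random_genotypes(n):
    return "".join(rng.choice("012") for _ in range(n)).encode()

animal_a = random_genotypes(2000)
unrelated = random_genotypes(2000)

# A "relative" of animal_a: same genotypes except at 40 random sites.
relative = bytearray(animal_a)
for site in rng.sample(range(2000), 40):
    relative[site] = ord(rng.choice("012"))
relative = bytes(relative)

d_self = ncd(animal_a, animal_a)
d_relative = ncd(animal_a, relative)
d_unrelated = ncd(animal_a, unrelated)
```

    Filling a matrix with such pairwise distances would give something in the spirit of the compression-based relationship matrix, although any scaling or transformation applied in the study is not specified in the abstract.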

  8. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis.

    Science.gov (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming

    2016-08-05

    Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. Whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have a great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computational approaches such as in silico PCR. Here we describe genome-wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type, representing 34.09% of the total SSR loci, and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers each having a single amplicon in the cucumber and melon draft genomes, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of the three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis
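
    Genome-wide SSR discovery of the kind described above is usually done with dedicated tools; the regular-expression sketch below only illustrates the underlying idea of scanning for perfect tandem repeats. The motif lengths, the minimum repeat count and the function name are assumptions chosen for the example and are not the parameters used in the study.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex prefer the shortest motif, so
    ATATAT is reported as (AT)3 rather than (ATAT)1.5. Runs of a single
    base are not filtered out in this sketch.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Toy sequence with one (AT)6 dinucleotide repeat starting at index 3.
toy_sequence = "GGC" + "AT" * 6 + "CCG"
ssr_hits = find_ssrs(toy_sequence)
```

    Dividing the number of hits by the sequence length in Mbp would give a density figure comparable in kind to the SSRs/Mbp value quoted in the abstract.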

  9. The performance of phylogenetic algorithms in estimating haplotype genealogies with migration.

    Science.gov (United States)

    Salzburger, Walter; Ewing, Greg B; Von Haeseler, Arndt

    2011-05-01

    Genealogies estimated from haplotypic genetic data play a prominent role in various biological disciplines in general and in phylogenetics, population genetics and phylogeography in particular. Several software packages have specifically been developed for the purpose of reconstructing genealogies from closely related, and hence, highly similar haplotype sequence data. Here, we use simulated data sets to test the performance of traditional phylogenetic algorithms, neighbour-joining, maximum parsimony and maximum likelihood in estimating genealogies from nonrecombining haplotypic genetic data. We demonstrate that these methods are suitable for constructing genealogies from sets of closely related DNA sequences with or without migration. As genealogies based on phylogenetic reconstructions are fully resolved, but not necessarily bifurcating, and without reticulations, these approaches outperform widespread 'network' constructing methods. In our simulations of coalescent scenarios involving panmictic, symmetric and asymmetric migration, we found that phylogenetic reconstruction methods performed well, while the statistical parsimony approach as implemented in TCS performed poorly. Overall, parsimony as implemented in the PHYLIP package performed slightly better than other methods. We further point out that we are not making the case that widespread 'network' constructing methods are bad, but that traditional phylogenetic tree finding methods are applicable to haplotypic data and exhibit reasonable performance with respect to accuracy and robustness. We also discuss some of the problems of converting a tree to a haplotype genealogy, in particular that it is nonunique. © 2011 Blackwell Publishing Ltd.

  10. Secondary uses and the governance of de-identified data: Lessons from the human genome diversity panel

    Directory of Open Access Journals (Sweden)

    Lee Sandra S-J

    2011-09-01

    Background: Recent changes to regulatory guidance in the US and Europe have complicated oversight of secondary research by rendering most uses of de-identified data exempt from human subjects oversight. To identify the implications of such guidelines for harms to participants and communities, this paper explores the secondary uses of one de-identified DNA sample collection with limited oversight: the Human Genome Diversity Project (HGDP)-Centre d'Etude du Polymorphisme Humain, Fondation Jean Dausset (CEPH) Human Genome Diversity Panel. Methods: Using a combination of keyword and cited reference search, we identified English-language scientific articles published between 2002 and 2009 that reported analysis of HGDP Diversity Panel samples and/or data. We then reviewed each article to identify the specific research use to which the samples and/or data was applied. Secondary uses were categorized according to the type and kind of research supported by the collection. Results: A wide variety of secondary uses were identified from 148 peer-reviewed articles. While the vast majority of these uses were consistent with the original intent of the collection, a minority of published reports described research whose primary findings could be regarded as controversial, objectionable, or potentially stigmatizing in their interpretation. Conclusions: We conclude that potential risks to participants and communities cannot be wholly eliminated by anonymization of individual data and suggest that explicit review of proposed secondary uses, by a Data Access Committee or similar internal oversight body with suitable stakeholder representation, should be a required component of the trustworthy governance of any repository of data or specimens.

  11. Co-invading symbiotic mutualists of Medicago polymorpha retain high ancestral diversity and contain diverse accessory genomes.

    Science.gov (United States)

    Porter, Stephanie S; Faber-Hammond, Joshua J; Friesen, Maren L

    2018-01-01

    Exotic, invasive plants and animals can wreak havoc on ecosystems by displacing natives and altering environmental conditions. However, much less is known about the identities or evolutionary dynamics of the symbiotic microbes that accompany invasive species. Most leguminous plants rely upon symbiotic rhizobium bacteria to fix nitrogen and are incapable of colonizing areas devoid of compatible rhizobia. We compare the genomes of symbiotic rhizobia in a portion of the legume's invaded range with those of the rhizobium symbionts from across the legume's native range. We show that in an area of California the legume Medicago polymorpha has invaded, its Ensifer medicae symbionts: (i) exhibit genome-wide patterns of relatedness that together with historical evidence support host-symbiont co-invasion from Europe into California, (ii) exhibit population genomic patterns consistent with the introduction of the majority of deep diversity from the native range, rather than a genetic bottleneck during colonization of California and (iii) harbor a large set of accessory genes uniquely enriched in binding functions, which could play a role in habitat invasion. Examining microbial symbiont genome dynamics during biological invasions is critical for assessing host-symbiont co-invasions whereby microbial symbiont range expansion underlies plant and animal invasions. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Exceptionally diverse morphotypes and genomes of crenarchaeal hyperthermophilic viruses

    DEFF Research Database (Denmark)

    Prangishvili, D; Garrett, R A

    2004-01-01

    and Rudiviridae. They all have double-stranded DNA genomes and infect hyperthermophilic crenarchaea of the orders Sulfolobales and Thermoproteales. Representatives of the different viral families share a few homologous ORFs (open reading frames). However, about 90% of all ORFs in the seven sequenced genomes show...... no significant matches to sequences in public databases. This suggests that these hyperthermophilic viruses have exceptional biochemical solutions for biological functions. Specific features of genome organization, as well as strategies for DNA replication, suggest that phylogenetic relationships exist between...... crenarchaeal rudiviruses and the large eukaryal DNA viruses: poxviruses, the African swine fever virus and Chlorella viruses. Sequence patterns at the ends of the linear genome of the lipothrixvirus AFV1 are reminiscent of the telomeric ends of linear eukaryal chromosomes and suggest that a primitive telomeric...

  13. Mitochondrial haplotypes are not associated with mice selectively bred for high voluntary wheel running.

    Science.gov (United States)

    Wone, Bernard W M; Yim, Won C; Schutz, Heidi; Meek, Thomas H; Garland, Theodore

    2018-04-04

    Mitochondrial haplotypes have been associated with human and rodent phenotypes, including nonshivering thermogenesis capacity, learning capability, and disease risk. Although the mammalian mitochondrial D-loop is highly polymorphic, D-loops in laboratory mice are identical, and variation occurs elsewhere mainly between nucleotides 9820 and 9830. Part of this region codes for the tRNA-Arg gene and is associated with mitochondrial densities and number of mtDNA copies. We hypothesized that the capacity for high levels of voluntary wheel-running behavior would be associated with mitochondrial haplotype. Here, we analyzed the mtDNA polymorphic region in mice from each of four replicate lines selectively bred for 54 generations for high voluntary wheel running (HR) and from four control lines (Control) randomly bred for 54 generations. Sequencing the polymorphic region revealed a variable number of adenine repeats. Single nucleotide polymorphisms (SNPs) varied from 2 to 3 adenine insertions, resulting in three haplotypes. We found significant genetic differentiation between the HR and Control groups (Fst = 0.779, p ≤ 0.0001), as well as among the replicate lines of mice within groups (Fsc = 0.757, p ≤ 0.0001). Haplotypes, however, were not strongly associated with voluntary wheel running (revolutions run per day), nor with either body mass or litter size. This system provides a useful experimental model to dissect the physiological processes linking mitochondrial, genomic SNPs, epigenetics, or nuclear-mitochondrial cross-talk to exercise activity. Copyright © 2018. Published by Elsevier B.V.

  14. Genomic and Metagenomic Analysis of Diversity-Generating Retroelements Associated with Treponema denticola

    Directory of Open Access Journals (Sweden)

    Sutichot Nimkulrat

    2016-06-01

    Diversity-generating retroelements (DGRs) are genetic cassettes that can produce massive protein sequence variation in prokaryotes. Presumably DGRs confer selective advantages to their hosts (bacteria or viruses) by generating variants of target genes, typically resulting in target proteins with altered ligand-binding specificity, through a specialized error-prone reverse transcription process. The only extensively studied DGR system is from the Bordetella phage BPP-1, although DGRs are predicted to exist in other species. Using bioinformatics analysis, we discovered that the DGR system associated with the Treponema denticola species (a human oral-associated periopathogen) is dynamic (with gains/losses of the system found in the isolates) and diverse (with multiple types found in isolated genomes and the human microbiota). The T. denticola DGR is found in only nine of the 17 sequenced T. denticola strains. Analysis of the DGR-associated template regions and reverse transcriptase gene sequences revealed two types of DGR systems in T. denticola: the ATCC35405-type shared by seven isolates including ATCC35405; and the SP32-type shared by two isolates (SP32 and SP33), suggesting multiple DGR acquisitions. We detected additional variants of the T. denticola DGR systems in the human microbiomes, and found that the SP32-type DGR is more abundant than the ATCC35405-type in the healthy human oral microbiome, although the latter is found in more sequenced isolates. This is the first comprehensive study to characterize the DGRs associated with T. denticola in individual genomes as well as human microbiomes, demonstrating the importance of utilizing both individual genomes and metagenomes for characterizing the elements, and for analyzing their diversity and distribution in human populations.

  15. The IGF1 small dog haplotype is derived from Middle Eastern grey wolves

    Directory of Open Access Journals (Sweden)

    Ostrander Elaine A

    2010-02-01

    Background: A selective sweep containing the insulin-like growth factor 1 (IGF1) gene is associated with size variation in domestic dogs. Intron 2 of IGF1 contains a SINE element and single nucleotide polymorphism (SNP) found in all small dog breeds that is almost entirely absent from large breeds. In this study, we surveyed a large sample of grey wolf populations to better understand the ancestral pattern of variation at IGF1 with a particular focus on the distribution of the small dog haplotype and its relationship to the origin of the dog. Results: We present DNA sequence data that confirms the absence of the derived small SNP allele in the intron 2 region of IGF1 in a large sample of grey wolves and further establishes the absence of a small dog associated SINE element in all wild canids and most large dog breeds. Grey wolf haplotypes from the Middle East have higher nucleotide diversity, suggesting an origin there. Additionally, PCA and phylogenetic analyses suggest a closer kinship of the small domestic dog IGF1 haplotype with those from Middle Eastern grey wolves. Conclusions: The absence of both the SINE element and SNP allele in grey wolves suggests that the mutation for small body size post-dates the domestication of dogs. However, because all small dogs possess these diagnostic mutations, the mutations likely arose early in the history of domestic dogs. Our results show that the small dog haplotype is closely related to those in Middle Eastern wolves and is consistent with an ancient origin of the small dog haplotype there. Thus, in concordance with past archeological studies, our molecular analysis is consistent with the early evolution of small size in dogs from the Middle East. See associated opinion by Driscoll and Macdonald: http://jbiol.com/content/9/2/10

  16. Identification of Padi2 as a novel angiogenesis-regulating gene by genome association studies in mice.

    Science.gov (United States)

    Khajavi, Mehrdad; Zhou, Yi; Birsner, Amy E; Bazinet, Lauren; Rosa Di Sant, Amanda; Schiffer, Alex J; Rogers, Michael S; Krishnaji, Subrahmanian Tarakkad; Hu, Bella; Nguyen, Vy; Zon, Leonard; D'Amato, Robert J

    2017-06-01

    Recent findings indicate that growth factor-driven angiogenesis is markedly influenced by genetic variation. This variation in angiogenic responsiveness may alter the susceptibility to a number of angiogenesis-dependent diseases. Here, we utilized the genetic diversity available in common inbred mouse strains to identify the loci and candidate genes responsible for differences in angiogenic response. The corneal micropocket neovascularization assay was performed on 42 different inbred mouse strains using basic fibroblast growth factor (bFGF) pellets. We performed a genome-wide association study utilizing efficient mixed-model association (EMMA) mapping using the induced vessel area from all strains. Our analysis yielded five loci with genome-wide significance on chromosomes 4, 8, 11, 15 and 16. We further refined the mapping on chromosome 4 within a haplotype block containing multiple candidate genes. These genes were evaluated by expression analysis in corneas of various inbred strains and in vitro functional assays in human microvascular endothelial cells (HMVECs). Of these, we found the expression of peptidyl arginine deiminase type II (Padi2), known to be involved in metabolic pathways, to have a strong correlation with a haplotype shared by multiple high angiogenic strains. In addition, inhibition of Padi2 demonstrated a dosage-dependent effect in HMVECs. To investigate its role in vivo, we knocked down Padi2 in transgenic kdrl:zsGreen zebrafish embryos using morpholinos. These embryos had disrupted vessel formation compared to control siblings. The impaired vascular pattern was partially rescued by human PADI2 mRNA, providing evidence for the specificity of the morphant phenotype. Taken together, our study is the first to indicate the potential role of Padi2 as an angiogenesis-regulating gene. The characterization of Padi2 and other genes in associated pathways may provide new understanding of angiogenesis regulation and novel targets for diagnosis and

  17. Identification of Padi2 as a novel angiogenesis-regulating gene by genome association studies in mice.

    Directory of Open Access Journals (Sweden)

    Mehrdad Khajavi

    2017-06-01

    Recent findings indicate that growth factor-driven angiogenesis is markedly influenced by genetic variation. This variation in angiogenic responsiveness may alter the susceptibility to a number of angiogenesis-dependent diseases. Here, we utilized the genetic diversity available in common inbred mouse strains to identify the loci and candidate genes responsible for differences in angiogenic response. The corneal micropocket neovascularization assay was performed on 42 different inbred mouse strains using basic fibroblast growth factor (bFGF) pellets. We performed a genome-wide association study utilizing efficient mixed-model association (EMMA) mapping using the induced vessel area from all strains. Our analysis yielded five loci with genome-wide significance on chromosomes 4, 8, 11, 15 and 16. We further refined the mapping on chromosome 4 within a haplotype block containing multiple candidate genes. These genes were evaluated by expression analysis in corneas of various inbred strains and in vitro functional assays in human microvascular endothelial cells (HMVECs). Of these, we found the expression of peptidyl arginine deiminase type II (Padi2), known to be involved in metabolic pathways, to have a strong correlation with a haplotype shared by multiple high angiogenic strains. In addition, inhibition of Padi2 demonstrated a dosage-dependent effect in HMVECs. To investigate its role in vivo, we knocked down Padi2 in transgenic kdrl:zsGreen zebrafish embryos using morpholinos. These embryos had disrupted vessel formation compared to control siblings. The impaired vascular pattern was partially rescued by human PADI2 mRNA, providing evidence for the specificity of the morphant phenotype. Taken together, our study is the first to indicate the potential role of Padi2 as an angiogenesis-regulating gene. The characterization of Padi2 and other genes in associated pathways may provide new understanding of angiogenesis regulation and novel targets for

  18. A novel protective MHC-I haplotype not associated with dominant Gag-specific CD8+ T-cell responses in SIVmac239 infection of Burmese rhesus macaques.

    Directory of Open Access Journals (Sweden)

    Naofumi Takahashi

    Several major histocompatibility complex class I (MHC-I) alleles are associated with lower viral loads and slower disease progression in human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV) infections. Immune-correlates analyses in these MHC-I-related HIV/SIV controllers would lead to elucidation of the mechanism for viral control. Viral control associated with some protective MHC-I alleles is attributed to CD8+ T-cell responses targeting Gag epitopes. We have been investigating the mechanism of SIV control in multiple groups of Burmese rhesus macaques sharing MHC-I genotypes at the haplotype level. Here, we found a protective MHC-I haplotype, 90-010-Id (D), which is not associated with dominant Gag-specific CD8+ T-cell responses. Viral loads in five D+ animals became significantly lower than those in our previous cohorts after 6 months. Most D+ animals showed predominant Nef-specific but not Gag-specific CD8+ T-cell responses after SIV challenge. Further analyses suggested two Nef-epitope-specific CD8+ T-cell responses exerting strong suppressive pressure on SIV replication. Another set of five D+ animals that received a prophylactic vaccine using a Gag-expressing Sendai virus vector showed significantly reduced viral loads compared to unvaccinated D+ animals at 3 months, suggesting rapid SIV control by Gag-specific CD8+ T-cell responses in addition to Nef-specific ones. These results present a pattern of SIV control with involvement of non-Gag antigen-specific CD8+ T-cell responses.

  19. The minimum information about a genome sequence (MIGS) specification

    DEFF Research Database (Denmark)

    Field, D; Garrity, G; Gray, T

    2008-01-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.

  20. Consequences for diversity when prioritizing animals for conservation with pedigree or genomic information

    NARCIS (Netherlands)

    Engelsma, K.A.; Veerkamp, R.F.; Calus, M.P.L.; Windig, J.J.

    2011-01-01

    Up to now, prioritization of animals for conservation has been mainly based on pedigree information; however, genomic information may improve prioritization. In this study, we used two Holstein populations to investigate the consequences for genetic diversity when animals are prioritized with

  1. Genetic diversity and structure of elite cotton germplasm (Gossypium hirsutum L.) using genome-wide SNP data.

    Science.gov (United States)

    Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing

    2017-10-01

    Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphism (SNP) markers. The average polymorphism information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.
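
    To put the reported average PIC of 0.25 in context, the following is a minimal sketch of the commonly used polymorphism information content formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The function name `pic` and the example allele frequencies are illustrative assumptions and are not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content of one marker.

    freqs: allele frequencies at the marker, expected to sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity term: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical biallelic SNPs with different minor-allele frequencies.
    for maf in (0.5, 0.2, 0.05):
        print(maf, round(pic([1 - maf, maf]), 4))
```

    For a biallelic SNP the formula peaks at 0.375 when both alleles have frequency 0.5, so SNP-based PIC values are bounded well below those of multi-allelic markers.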

  2. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin

    2014-01-01

    We review the level of genomic specificity regarding actinobacterial pathogenicity. As they occupy various niches in diverse habitats, one may assume the existence of lifestyle-specific genomic features. We include 240 actinobacteria classified into four pathogenicity classes: human pathogens (HPs...

  3. Alignment-free genome tree inference by learning group-specific distance metrics.

    Science.gov (United States)

    Patil, Kaustubh R; McHardy, Alice C

    2013-01-01

    Understanding the evolutionary relationships between organisms is vital for their in-depth study. Gene-based methods are often used to infer such relationships, which are not without drawbacks. One can now attempt to use genome-scale information, because of the ever increasing number of genomes available. This opportunity also presents a challenge in terms of computational efficiency. Two fundamentally different methods are often employed for sequence comparisons, namely alignment-based and alignment-free methods. Alignment-free methods rely on the genome signature concept and provide a computationally efficient way that is also applicable to nonhomologous sequences. The genome signature contains evolutionary signal as it is more similar for closely related organisms than for distantly related ones. We used genome-scale sequence information to infer taxonomic distances between organisms without additional information such as gene annotations. We propose a method to improve genome tree inference by learning specific distance metrics over the genome signature for groups of organisms with similar phylogenetic, genomic, or ecological properties. Specifically, our method learns a Mahalanobis metric for a set of genomes and a reference taxonomy to guide the learning process. By applying this method to more than a thousand prokaryotic genomes, we showed that, indeed, better distance metrics could be learned for most of the 18 groups of organisms tested here. Once a group-specific metric is available, it can be used to estimate the taxonomic distances for other sequenced organisms from the group. This study also presents a large scale comparison between 10 methods--9 alignment-free and 1 alignment-based.
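
    As a rough illustration of the distance the abstract describes, the sketch below computes a k-mer frequency vector (one simple form of genome signature) and a Mahalanobis-style distance sqrt((x - y)^T M (x - y)) between two such vectors. The choice of k = 2, the function names, and the example sequences are assumptions made here for illustration; the metric-learning step that would produce M from a reference taxonomy is not shown, and M is simply supplied by the caller.

```python
from itertools import product
from math import sqrt


def kmer_signature(seq, k=2):
    """Relative frequency of every DNA k-mer, in fixed lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a square matrix M (nested lists).

    M is assumed to be positive semidefinite; the identity matrix gives the
    ordinary Euclidean distance.
    """
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m * dj for m, dj in zip(row, d)) for row in M]
    return sqrt(sum(di * mi for di, mi in zip(d, Md)))


if __name__ == "__main__":
    a = kmer_signature("ACGTACGTTTGACA")
    b = kmer_signature("GGGCCCGGGCCCAT")
    identity = [[1.0 if i == j else 0.0 for j in range(len(a))]
                for i in range(len(a))]
    print(mahalanobis(a, b, identity))
```

    A learned, group-specific M would reweight and correlate the k-mer dimensions so that distances track the reference taxonomy more closely than the unweighted Euclidean case.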

  4. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    Science.gov (United States)

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  5. Strategies for haplotype-based association mapping in complex pedigreed populations

    DEFF Research Database (Denmark)

    Boleckova, J; Christensen, Ole Fredslund; Sørensen, Peter

    2012-01-01

    In association mapping, haplotype-based methods are generally regarded to provide higher power and increased precision than methods based on single markers. For haplotype-based association mapping most studies use a fixed haplotype effect in the model. However, an increase in haplotype length inc...

  6. The polymorphism and haplotypes of XRCC1 and survival of non-small-cell lung cancer after radiotherapy

    International Nuclear Information System (INIS)

    Yoon, Sang Min; Hong, Yun-Chul; Park, Heon Joo; Lee, Jong-Eun; Kim, Sang Yoon; Kim, Jong Hoon; Lee, Sang-Wook; Park, So-Yeon; Lee, Jung Shin; Choi, Eun Kyung

    2005-01-01

    Purpose: The X-ray repair cross-complementing Group 1 (XRCC1) protein is involved mainly in the base excision repair of the DNA repair process. This study examined the association of 3 polymorphisms (codon 194, 280, and 399) of XRCC1 and lung cancer in terms of whether or not these polymorphisms have an effect on the survival of lung cancer patients who have received radiotherapy. Methods and Materials: Between January 2000 and April 2004, 229 lung cancer patients with non-small-cell lung cancer in Stages I-III were enrolled. Genotyping was performed by single base primer extension assay using the SNP-IT Kit with genomic DNA samples from all patients. The haplotype of the XRCC1 polymorphisms was estimated by PHASE version 2.1. Results: The patients consisted of 191 (83.4%) males and 38 (16.6%) females with a median age of 62 (range, 26-88 years). Sixty percent of the patients were included in Stage I-IIIa. The median progression-free and overall survival was 13 months and 16 months, respectively. The XRCC1 codon 194, histology, and stage were shown to be significant predictors of the progression-free survival. The 6 haplotypes among the XRCC1 polymorphisms (194, 280, and 399) were estimated by PHASE v.2.1. The patients with haplotype pairs other than the homozygous TGG haplotype pairs survived significantly longer (p = 0.04). Conclusions: Polymorphisms of XRCC1 have an effect on the survival of lung cancer patients treated with radiotherapy, and this effect seems to be more significant after the haplotype pairs are considered

  7. Class I gene regulation of haplotype preference may influence antiviral immunity in vivo

    DEFF Research Database (Denmark)

    Thomsen, Allan Randrup; Marker, O

    1989-01-01

    targets. In regard to the in vivo significance of haplotype preference it was found that (C X C3) F1 mice expressed an earlier and stronger virus-specific delayed type hypersensitivity response and exerted a more efficient virus control than did (C-H-2dm2 X C3) F1. Taken together these findings suggest that haplotype preference reflects a selection process favoring the restriction element associated with the most efficient immune response in vivo. The implications of this are discussed.

  8. Genetic diversity and genetic structure of the striped field mouse Apodemus agrarius coreae (Muridae, Rodentia) in Korea.

    Science.gov (United States)

    Kim, Hye Ri; Park, Yung Chul

    2015-11-10

    The aim of this study was to investigate the genetic diversity and genetic structure of the striped field mouse Apodemus agrarius coreae in Korea. The Korean A. a. coreae is characterized by high levels of haplotype diversity (Hd=0.967) and low levels of nucleotide diversity (π=0.00683). Haplogroup 1 is well separated from the haplotypes of the neighboring regions of the Korean Peninsula, while the other haplogroups are closely related to those from the Russian Far East. Thus, further investigations are required to confirm the validity of the subspecies status of A. a. coreae by implementing additional morphological characters as well as genetic data from the populations present in the Korean Peninsula and its neighboring countries. Haplogroup 1 includes most Korean haplotypes and forms a star-like haplotype network structure, which reveals relatively low levels of sequence divergence and a high frequency of unique mutations (only a few mutations are shared in most of the haplotype nodes). The results indicate that the haplotypes of Haplogroup 1 might have experienced population expansion since their migration into Korea, which was further corroborated by the negative results of neutrality tests for the Korean population of A. a. coreae.
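
    The two summary statistics quoted above can be sketched as follows, assuming the standard definitions: haplotype diversity Hd = n/(n-1) * (1 - sum(p_i^2)) over haplotype frequencies p_i, and nucleotide diversity π as the mean number of pairwise differences per site. The function names and the toy alignment are illustrative assumptions and do not come from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over all pairs of aligned sequences."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment: four sampled sequences, three distinct haplotypes.
    sample = ["AAAA", "AAAT", "AATT", "AAAA"]
    print(haplotype_diversity(sample), nucleotide_diversity(sample))
```

    A high Hd combined with a low π, as reported here, means that most sampled individuals carry distinct haplotypes that differ from one another at only a few sites, which is the pattern a star-like network also shows.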

  9. Ancient DNA from Giant Panda (Ailuropoda melanoleuca) of South-Western China Reveals Genetic Diversity Loss during the Holocene.

    Science.gov (United States)

    Sheng, Gui-Lian; Barlow, Axel; Cooper, Alan; Hou, Xin-Dong; Ji, Xue-Ping; Jablonski, Nina G; Zhong, Bo-Jian; Liu, Hong; Flynn, Lawrence J; Yuan, Jun-Xia; Wang, Li-Rui; Basler, Nikolas; Westbury, Michael V; Hofreiter, Michael; Lai, Xu-Long

    2018-04-06

    The giant panda was widely distributed in China and south-eastern Asia during the middle to late Pleistocene, prior to its habitat becoming rapidly reduced in the Holocene. While conservation reserves have been established and population numbers of the giant panda have recently increased, the interpretation of its genetic diversity remains controversial. Previous analyses, surprisingly, have indicated relatively high levels of genetic diversity raising issues concerning the efficiency and usefulness of reintroducing individuals from captive populations. However, due to a lack of DNA data from fossil specimens, it is unknown whether genetic diversity was even higher prior to the most recent population decline. We amplified complete cyt b and 12s rRNA, partial 16s rRNA and ND1, and control region sequences from the mitochondrial genomes of two Holocene panda specimens. We estimated genetic diversity and population demography by analyzing the ancient mitochondrial DNA sequences alongside those from modern giant pandas, as well as from other members of the bear family (Ursidae). Phylogenetic analyses show that one of the ancient haplotypes is sister to all sampled modern pandas and the second ancient individual is nested among the modern haplotypes, suggesting that genetic diversity may indeed have been higher earlier during the Holocene. Bayesian skyline plot analysis supports this view and indicates a slight decline in female effective population size starting around 6000 years B.P., followed by a recovery around 2000 years ago. Therefore, while the genetic diversity of the giant panda has been affected by recent habitat contraction, it still harbors substantial genetic diversity. Moreover, while its still low population numbers require continued conservation efforts, there seem to be no immediate threats from the perspective of genetic evolutionary potential.

  10. Ancient DNA from Giant Panda (Ailuropoda melanoleuca) of South-Western China Reveals Genetic Diversity Loss during the Holocene

    Directory of Open Access Journals (Sweden)

    Gui-Lian Sheng

    2018-04-01

    The giant panda was widely distributed in China and south-eastern Asia during the middle to late Pleistocene, prior to its habitat becoming rapidly reduced in the Holocene. While conservation reserves have been established and population numbers of the giant panda have recently increased, the interpretation of its genetic diversity remains controversial. Previous analyses, surprisingly, have indicated relatively high levels of genetic diversity raising issues concerning the efficiency and usefulness of reintroducing individuals from captive populations. However, due to a lack of DNA data from fossil specimens, it is unknown whether genetic diversity was even higher prior to the most recent population decline. We amplified complete cytb and 12s rRNA, partial 16s rRNA and ND1, and control region sequences from the mitochondrial genomes of two Holocene panda specimens. We estimated genetic diversity and population demography by analyzing the ancient mitochondrial DNA sequences alongside those from modern giant pandas, as well as from other members of the bear family (Ursidae). Phylogenetic analyses show that one of the ancient haplotypes is sister to all sampled modern pandas and the second ancient individual is nested among the modern haplotypes, suggesting that genetic diversity may indeed have been higher earlier during the Holocene. Bayesian skyline plot analysis supports this view and indicates a slight decline in female effective population size starting around 6000 years B.P., followed by a recovery around 2000 years ago. Therefore, while the genetic diversity of the giant panda has been affected by recent habitat contraction, it still harbors substantial genetic diversity. Moreover, while its still low population numbers require continued conservation efforts, there seem to be no immediate threats from the perspective of genetic evolutionary potential.

  11. MtDNA diversity among four Portuguese autochthonous dog breeds: a fine-scale characterisation

    Directory of Open Access Journals (Sweden)

    Santa-Rita Pedro

    2005-06-01

    Background: The picture of dog mtDNA diversity, as obtained from geographically wide samplings but from a small number of individuals per region or breed, has revealed weak geographic correlation and high degree of haplotype sharing between very distant breeds. We aimed at a more detailed picture through extensive sampling (n = 143) of four Portuguese autochthonous breeds (Castro Laboreiro Dog, Serra da Estrela Mountain Dog, Portuguese Sheepdog and Azores Cattle Dog) and comparatively reanalysing published worldwide data. Results: Fifteen haplotypes belonging to four major haplogroups were found in these breeds, of which five are newly reported. The Castro Laboreiro Dog presented a 95% frequency of a new A haplotype, while all other breeds contained a diverse pool of existing lineages. The Serra da Estrela Mountain Dog, the most heterogeneous of the four Portuguese breeds, shared haplotypes with the other mainland breeds, while Azores Cattle Dog shared no haplotypes with the other Portuguese breeds. A review of mtDNA haplotypes in dogs across the world revealed that: (a) breeds tend to display haplotypes belonging to different haplogroups; (b) haplogroup A is present in all breeds, and even uncommon haplogroups are highly dispersed among breeds and continental areas; (c) haplotype sharing between breeds of the same region is lower than between breeds of different regions and (d) genetic distances between breeds do not correlate with geography. Conclusion: MtDNA haplotype sharing occurred between Serra da Estrela Mountain dogs (with putative origin in the centre of Portugal) and two breeds in the north and south of the country, with the Castro Laboreiro Dog (which behaves, at the mtDNA level, as a sub-sample of the Serra da Estrela Mountain Dog) and the southern Portuguese Sheepdog. In contrast, the Azores Cattle Dog did not share any haplotypes with the other Portuguese breeds, but with dogs sampled in Northern Europe. This suggested that the

  12. Associations of Haplotypes Upstream of IRS1 with Insulin Resistance, Type 2 Diabetes, Dyslipidemia, Preclinical Atherosclerosis, and Skeletal Muscle LOC646736 mRNA Levels

    Directory of Open Access Journals (Sweden)

    Selma M. Soyal

    2015-01-01

    The genomic region ~500 kb upstream of IRS1 has been implicated in insulin resistance, type 2 diabetes, adverse lipid profile, and cardiovascular risk. To gain further insight into this chromosomal region, we typed four SNPs in a cross-sectional cohort and subjects with type 2 diabetes recruited from the same geographic region. From 16 possible haplotypes, 6 haplotypes with frequencies >0.01 were observed. We identified one haplotype that was protective against insulin resistance (determined by HOMA-IR and fasting plasma insulin levels), type 2 diabetes, an adverse lipid profile, increased C-reactive protein, and asymptomatic atherosclerotic disease (assessed by intima media thickness of the common carotid arteries). BMI and total adipose tissue mass as well as visceral and subcutaneous adipose tissue mass did not differ between the reference and protective haplotypes. In 92 subjects, we observed an association of the protective haplotype with higher skeletal muscle mRNA levels of LOC646736, which is located in the same haplotype block as the informative SNPs and is mainly expressed in skeletal muscle, but only at very low levels in liver or adipose tissues. These data suggest a role for LOC646736 in human insulin resistance and warrant further studies on the functional effects of this locus.
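
    The count of 16 possible haplotypes follows from four biallelic SNPs (2^4 = 16). The sketch below enumerates them and applies a frequency threshold of 0.01 to a list of observed haplotypes; the allele labels, function names, and toy counts are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import product


def possible_haplotypes(snp_alleles):
    """All allele combinations across the SNPs, one allele chosen per SNP."""
    return ["".join(h) for h in product(*snp_alleles)]


def common_haplotypes(observed, min_freq=0.01):
    """Haplotypes whose sample frequency is strictly above min_freq."""
    n = len(observed)
    counts = Counter(observed)
    return {h: c / n for h, c in counts.items() if c / n > min_freq}


if __name__ == "__main__":
    # Four hypothetical biallelic SNPs, each with alleles A and G.
    snps = [("A", "G")] * 4
    print(len(possible_haplotypes(snps)))  # 2**4 combinations

    # Toy sample of 200 phased chromosomes.
    sample = ["AAAA"] * 190 + ["GGGG"] * 9 + ["AGAG"]
    print(common_haplotypes(sample))
```

    In practice only a subset of the possible combinations is observed above the threshold, because linkage disequilibrium between nearby SNPs makes many combinations rare or absent.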

  13. Genetic diversity and distribution of Senegalia senegal (L.) Britton under climate change scenarios in West Africa

    Science.gov (United States)

    Duque-Lazo, Joaquín; Durka, Walter; Hauenschild, Frank; Schnitzler, Jan; Michalak, Ingo; Ogundipe, Oluwatoyin Temitayo; Muellner-Riehl, Alexandra Nora

    2018-01-01

    Climate change is predicted to impact species’ genetic diversity and distribution. We used Senegalia senegal (L.) Britton, an economically important species distributed in the Sudano-Sahelian savannah belt of West Africa, to investigate the impact of climate change on intraspecific genetic diversity and distribution. We used ten nuclear and two plastid microsatellite markers to assess genetic variation, population structure and differentiation across thirteen sites in West Africa. We projected suitable range, and potential impact of climate change on genetic diversity using a maximum entropy approach, under four different climate change scenarios. We found higher genetic and haplotype diversity at both nuclear and plastid markers than previously reported. Genetic differentiation was strong for chloroplast and moderate for the nuclear genome. Both genomes indicated three spatially structured genetic groups. The distribution of Senegalia senegal is strongly correlated with extractable nitrogen, coarse fragments, soil organic carbon stock, precipitation of warmest and coldest quarter and mean temperature of driest quarter. We predicted 40.96 to 6.34 per cent of the current distribution to favourably support the species’ ecological requirements under future climate scenarios. Our results suggest that climate change is going to affect the population genetic structure of Senegalia senegal, and that patterns of genetic diversity are going to influence the species’ adaptive response to climate change. Our study contributes to the growing evidence predicting the loss of economically relevant plants in West Africa in the next decades due to climate change. PMID:29659603

  14. A recombination hotspot in a schizophrenia-associated region of GABRB2.

    Directory of Open Access Journals (Sweden)

    Siu-Kin Ng

    BACKGROUND: Schizophrenia is a major disorder with complex genetic mechanisms. Earlier, population genetic studies revealed the occurrence of strong positive selection in the GABRB2 gene encoding the beta(2) subunit of GABA(A) receptors, within a segment of 3,551 bp harboring twenty-nine single nucleotide polymorphisms (SNPs) and containing schizophrenia-associated SNPs and haplotypes. METHODOLOGY/PRINCIPAL FINDINGS: In the present study, the possible occurrence of recombination in this 'S1-S29' segment was assessed. The occurrence of hotspot recombination was indicated by high resolution recombination rate estimation, haplotype diversity, abundance of rare haplotypes, recurrent mutations and torsos in haplotype networks, and experimental haplotyping of somatic and sperm DNA. The sub-segment distribution of relative recombination strength, measured by the ratio of haplotype diversity (H(d)) over mutation rate (theta), was indicative of a human specific Alu-Yi6 insertion serving as a central recombining sequence facilitating homologous recombination. Local anomalous DNA conformation attributable to the Alu-Yi6 element, as suggested by enhanced DNase I sensitivity and obstruction to DNA sequencing, could be a contributing factor of the increased sequence diversity. Linkage disequilibrium (LD) analysis yielded prominent low LD points that supported ongoing recombination. LD contrast revealed significant dissimilarity between control and schizophrenic cohorts. Among the large array of inferred haplotypes, H26 and H73 were identified to be protective, and H19 and H81 risk-conferring, toward the development of schizophrenia. CONCLUSIONS/SIGNIFICANCE: The co-occurrence of hotspot recombination and positive selection in the S1-S29 segment of GABRB2 has provided a plausible contribution to the molecular genetics mechanisms for schizophrenia. The present findings therefore suggest that genome regions characterized by the co-occurrence of positive selection and

  15. Introgression of a Rare Haplotype from Southeastern Africa to Breed California Blackeyes with Larger Seeds

    Directory of Open Access Journals (Sweden)

    Mitchell R Lucas

    2015-03-01

    Seed size distinguishes most crops from their wild relatives and is an important quality trait for the grain legume cowpea. In order to breed cowpea varieties with larger seeds we introgressed a rare haplotype associated with large seeds at the Css-1 locus from an African buff seed type cultivar, IT82E-18 (18.5g/100 seeds), into a blackeye seed type cultivar, CB27 (22g/100 seeds). Four RILs derived from these two parents were chosen for marker-assisted breeding based on SNP genotyping with a goal of stacking large seed haplotypes into a CB27 background. Foreground and background selection were performed during two cycles of backcrossing based on genome-wide SNP markers. The average seed size of introgression lines homozygous for haplotypes associated with large seeds was 28.7g/100 seeds and 24.8g/100 seeds for cycles 1 and 2, respectively. One cycle 1 introgression line with desirable seed quality was selfed for two generations to make families with very large seeds (28-35g/100 seeds). Field-based performance trials helped identify breeding lines that not only have large seeds but are also desirable in terms of yield, maturity, and plant architecture when compared to industry standards. A principal component analysis was used to explore the relationships between the parents relative to a core set of landraces and improved varieties based on high-density SNP data. The geographic distribution of haplotypes at the Css-1 locus suggests that the haplotype associated with large seeds is unique to accessions collected from Southeastern Africa. Therefore this QTL has a strong potential to develop larger seeded varieties for other growing regions, which is demonstrated in this work using a California pedigree.

  16. Complete avian malaria parasite genomes reveal features associated with lineage-specific evolution in birds and mammals

    Science.gov (United States)

    Böhme, Ulrike; Otto, Thomas D.; Cotton, James A.; Steinbiss, Sascha; Sanders, Mandy; Oyola, Samuel O.; Nicot, Antoine; Gandon, Sylvain; Patra, Kailash P.; Herd, Colin; Bushell, Ellen; Modrzynska, Katarzyna K.; Billker, Oliver; Vinetz, Joseph M.; Rivero, Ana; Newbold, Chris I.; Berriman, Matthew

    2018-01-01

    Avian malaria parasites are prevalent around the world and infect a wide diversity of bird species. Here, we report the sequencing and analysis of high-quality draft genome sequences for two avian malaria species, Plasmodium relictum and Plasmodium gallinaceum. We identify 50 genes that are specific to avian malaria, located in an otherwise conserved core of the genome that shares gene synteny with all other sequenced malaria genomes. Phylogenetic analysis suggests that the avian malaria species form an outgroup to the mammalian Plasmodium species, and using amino acid divergence between species, we estimate the avian- and mammalian-infective lineages diverged in the order of 10 million years ago. Consistent with their phylogenetic position, we identify orthologs of genes that had previously appeared to be restricted to the clades of parasites containing Plasmodium falciparum and Plasmodium vivax, the species with the greatest impact on human health. From these orthologs, we explore differential diversifying selection across the genus and show that the avian lineage is remarkable in the extent to which invasion-related genes are evolving. The subtelomeres of the P. relictum and P. gallinaceum genomes contain several novel gene families, including an expanded surf multigene family. We also identify an expansion of reticulocyte binding protein homologs in P. relictum, and within these proteins, we detect distinct regions that are specific to nonhuman primate, humans, rodent, and avian hosts. For the first time in the Plasmodium lineage, we find evidence of transposable elements, including several hundred fragments of LTR-retrotransposons in both species and an apparently complete LTR-retrotransposon in the genome of P. gallinaceum. PMID:29500236

  17. Genomic diversity among drug sensitive and multidrug resistant isolates of Mycobacterium tuberculosis with identical DNA fingerprints.

    Directory of Open Access Journals (Sweden)

    Stefan Niemann

    2009-10-01

    Mycobacterium tuberculosis complex (MTBC), the causative agent of tuberculosis (TB), is characterized by low sequence diversity making this bacterium one of the classical examples of a genetically monomorphic pathogen. Because of this limited DNA sequence variation, routine genotyping of clinical MTBC isolates for epidemiological purposes relies on highly discriminatory DNA fingerprinting methods based on mobile and repetitive genetic elements. According to the standard view, isolates exhibiting the same fingerprinting pattern are considered direct progeny of the same bacterial clone, and most likely reflect ongoing transmission or disease relapse within individual patients. Here we further investigated this assumption and used massively parallel whole-genome sequencing to compare one drug-susceptible (K-1) and one multidrug resistant (MDR) isolate (K-2) of a rapidly spreading M. tuberculosis Beijing genotype clone from a high incidence region (Karakalpakstan, Uzbekistan). Both isolates shared the same IS6110 RFLP pattern and the same allele at 23 out of 24 MIRU-VNTR loci. We generated 23.9 million (K-1) and 33.0 million (K-2) paired 50 bp purity filtered reads corresponding to a mean coverage of 483.5 fold and 656.1 fold respectively. Compared with the laboratory strain H37Rv both Beijing isolates shared 1,209 SNPs. The two Beijing isolates differed by 130 SNPs and one large deletion. The susceptible isolate had 55 specific SNPs, while the MDR variant had 75 specific SNPs, including the five known resistance-conferring mutations. Our results suggest that M. tuberculosis isolates exhibiting identical DNA fingerprinting patterns can harbour substantial genomic diversity. Because this heterogeneity is not captured by traditional genotyping of MTBC, some aspects of the transmission dynamics of tuberculosis could be missed or misinterpreted. Furthermore, a valid differentiation between disease relapse and exogenous reinfection might be impossible using

  18. Genomic diversity among drug sensitive and multidrug resistant isolates of Mycobacterium tuberculosis with identical DNA fingerprints.

    Science.gov (United States)

    Niemann, Stefan; Köser, Claudio U; Gagneux, Sebastien; Plinke, Claudia; Homolka, Susanne; Bignell, Helen; Carter, Richard J; Cheetham, R Keira; Cox, Anthony; Gormley, Niall A; Kokko-Gonzales, Paula; Murray, Lisa J; Rigatti, Roberto; Smith, Vincent P; Arends, Felix P M; Cox, Helen S; Smith, Geoff; Archer, John A C

    2009-10-12

    Mycobacterium tuberculosis complex (MTBC), the causative agent of tuberculosis (TB), is characterized by low sequence diversity making this bacterium one of the classical examples of a genetically monomorphic pathogen. Because of this limited DNA sequence variation, routine genotyping of clinical MTBC isolates for epidemiological purposes relies on highly discriminatory DNA fingerprinting methods based on mobile and repetitive genetic elements. According to the standard view, isolates exhibiting the same fingerprinting pattern are considered direct progeny of the same bacterial clone, and most likely reflect ongoing transmission or disease relapse within individual patients. Here we further investigated this assumption and used massively parallel whole-genome sequencing to compare one drug-susceptible (K-1) and one multidrug resistant (MDR) isolate (K-2) of a rapidly spreading M. tuberculosis Beijing genotype clone from a high incidence region (Karakalpakstan, Uzbekistan). Both isolates shared the same IS6110 RFLP pattern and the same allele at 23 out of 24 MIRU-VNTR loci. We generated 23.9 million (K-1) and 33.0 million (K-2) paired 50 bp purity filtered reads corresponding to a mean coverage of 483.5 fold and 656.1 fold respectively. Compared with the laboratory strain H37Rv both Beijing isolates shared 1,209 SNPs. The two Beijing isolates differed by 130 SNPs and one large deletion. The susceptible isolate had 55 specific SNPs, while the MDR variant had 75 specific SNPs, including the five known resistance-conferring mutations. Our results suggest that M. tuberculosis isolates exhibiting identical DNA fingerprinting patterns can harbour substantial genomic diversity. Because this heterogeneity is not captured by traditional genotyping of MTBC, some aspects of the transmission dynamics of tuberculosis could be missed or misinterpreted. Furthermore, a valid differentiation between disease relapse and exogenous reinfection might be impossible using standard

  19. Application of site and haplotype-frequency based approaches for detecting selection signatures in cattle

    Directory of Open Access Journals (Sweden)

    Moore Stephen

    2011-06-01

    Background 'Selection signatures' delimit regions of the genome that are, or have been, functionally important and have therefore been under either natural or artificial selection. In this study, two different and complementary methods--integrated Haplotype Homozygosity Score (|iHS|) and population differentiation index (FST)--were applied to identify traces of decades of intensive artificial selection for traits of economic importance in modern cattle. Results We scanned the genome of a diverse set of dairy and beef breeds from Germany, Canada and Australia genotyped with a 50 K SNP panel. Across breeds, a total of 109 extreme |iHS| values exceeded the empirical threshold level of 5% with 19, 27, 9, 10 and 17 outliers in Holstein, Brown Swiss, Australian Angus, Hereford and Simmental, respectively. Annotating the regions harboring clustered |iHS| signals revealed a panel of interesting candidate genes like SPATA17, MGAT1, PGRMC2 and ACTC1, COL23A1, MATN2, respectively, in the context of reproduction and muscle formation. In a further step, a new Bayesian FST-based approach was applied with a set of geographically separated populations including Holstein, Brown Swiss, Simmental, North American Angus and Piedmontese for detecting differentiated loci. In total, 127 regions exceeding the 2.5 per cent threshold of the empirical posterior distribution were identified as extremely differentiated. In a substantial number (56 out of 127) of cases the extreme FST values were found to be positioned in poor gene content regions which deviated significantly (p ST values were found in regions of some relevant genes such as SMCP and FGF1. Conclusions Overall, 236 regions putatively subject to recent positive selection in the cattle genome were detected. Both |iHS| and FST suggested selection in the vicinity of the Sialic acid binding Ig-like lectin 5 gene on BTA18. 
This region was recently reported to be a major QTL with strong effects on productive life

  20. Genome sequences of lower Great Lakes Microcystis sp. reveal strain-specific genes that are present and expressed in western Lake Erie blooms.

    Directory of Open Access Journals (Sweden)

    Kevin Anthony Meyer

    Blooms of the potentially toxic cyanobacterium Microcystis are increasing worldwide. In the Laurentian Great Lakes they pose major socioeconomic, ecological, and human health threats, particularly in western Lake Erie. However, the interpretation of "omics" data is constrained by the highly variable genome of Microcystis and the small number of reference genome sequences from strains isolated from the Great Lakes. To address this, we sequenced two Microcystis isolates from Lake Erie (Microcystis aeruginosa LE3 and M. wesenbergii LE013-01) and one from upstream Lake St. Clair (M. cf aeruginosa LSC13-02), and compared these data to the genomes of seventeen Microcystis spp. from across the globe as well as one metagenome and seven metatranscriptomes from a 2014 Lake Erie Microcystis bloom. For the publicly available strains analyzed, the core genome is ~1900 genes, representing ~11% of total genes in the pan-genome and ~45% of each strain's genome. The flexible genome content was related to Microcystis subclades defined by phylogenetic analysis of both housekeeping genes and total core genes. To our knowledge this is the first evidence that the flexible genome is linked to the core genome of the Microcystis species complex. The majority of strain-specific genes were present and expressed in bloom communities in Lake Erie. Roughly 8% of these genes from the lower Great Lakes are involved in genome plasticity (rapid gain, loss, or rearrangement of genes) and resistance to foreign genetic elements (such as CRISPR-Cas systems). Intriguingly, strain-specific genes from Microcystis cultured from around the world were also present and expressed in the Lake Erie blooms, suggesting that the Microcystis pangenome is truly global. The presence and expression of flexible genes, including strain-specific genes, suggests that strain-level genomic diversity may be important in maintaining Microcystis abundance during bloom events.
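
    The core, pan-genome and strain-specific terms used in this abstract can be illustrated with a minimal Python sketch. It assumes each strain is represented simply as a set of gene-family identifiers; the strain names, gene names, and the helper functions `core_and_pan` and `strain_specific` are hypothetical and are not taken from the study, which does not describe its clustering pipeline here.

```python
from typing import Dict, Set, Tuple


def core_and_pan(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan): genes shared by every strain, and genes found in any strain."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)  # present in all strains
    pan = set.union(*gene_sets)          # present in at least one strain
    return core, pan


def strain_specific(genomes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each strain, return the genes that occur in no other strain."""
    result = {}
    for name, genes in genomes.items():
        others = set()
        for other_name, other_genes in genomes.items():
            if other_name != name:
                others |= other_genes
        result[name] = genes - others
    return result


# Hypothetical toy data: three strains, six gene families.
genomes = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g5"},
    "strain_C": {"g1", "g2", "g3", "g6"},
}

core, pan = core_and_pan(genomes)
core_share_of_pan = len(core) / len(pan)
unique = strain_specific(genomes)
```

    In this toy data the core is two of six gene families. The figures quoted in the abstract (a core of ~1900 genes at ~11% of the pan-genome) are ratios of the same kind, computed over real gene clusters.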

  1. The minimum information about a genome sequence (MIGS) specification

    Science.gov (United States)

    Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; dePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; Gil, Ingio San; Wilson, Gareth; Wipat, Anil

    2008-01-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases. PMID:18464787

  2. Haplotype-based association analysis of general cognitive ability in Generation Scotland, the English Longitudinal Study of Ageing, and UK Biobank [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    David M. Howard

    2017-08-01

    Background: Cognitive ability is a heritable trait with a polygenic architecture, for which several associated variants have been identified using genotype-based and candidate gene approaches. Haplotype-based analyses are a complementary technique that takes phased genotype data into account and potentially provides greater statistical power to detect lower frequency variants. Methods: In the present analysis, three cohort studies (n_total = 48,002) were utilised: Generation Scotland: Scottish Family Health Study (GS:SFHS), the English Longitudinal Study of Ageing (ELSA), and the UK Biobank. A genome-wide haplotype-based meta-analysis of cognitive ability was performed, as well as a targeted meta-analysis of several gene coding regions. Results: None of the assessed haplotypes provided evidence of a statistically significant association with cognitive ability in either the individual cohorts or the meta-analysis. Within the meta-analysis, the haplotype with the lowest observed P-value overlapped with the D-amino acid oxidase activator (DAOA) gene coding region. This coding region has previously been associated with bipolar disorder, schizophrenia and Alzheimer’s disease, which have all been shown to impact upon cognitive ability. Another potentially interesting region highlighted within the current genome-wide association analysis (GS:SFHS: P = 4.09 x 10^-7) was the butyrylcholinesterase (BCHE) gene coding region. The protein encoded by BCHE has been shown to influence the progression of Alzheimer’s disease and its role in cognitive ability merits further investigation. Conclusions: Although no evidence was found for any haplotypes with a statistically significant association with cognitive ability, our results did provide further evidence that the genetic variants contributing to the variance of cognitive ability are likely to be of small effect.

  3. Selection for silage yield and composition did not affect genomic diversity within the Wisconsin Quality Synthetic maize population.

    Science.gov (United States)

    Lorenz, Aaron J; Beissinger, Timothy M; Silva, Renato Rodrigues; de Leon, Natalia

    2015-02-02

    Maize silage is forage of high quality and yield, and represents the second most important use of maize in the United States. The Wisconsin Quality Synthetic (WQS) maize population has undergone five cycles of recurrent selection for silage yield and composition, resulting in a genetically improved population. The application of high-density molecular markers allows breeders and geneticists to identify important loci through association analysis and selection mapping, as well as to monitor changes in the distribution of genetic diversity across the genome. The objectives of this study were to identify loci controlling variation for maize silage traits through association analysis and the assessment of selection signatures and to describe changes in the genomic distribution of gene diversity through selection and genetic drift in the WQS recurrent selection program. We failed to find any significant marker-trait associations using the historical phenotypic data from WQS breeding trials combined with 17,719 high-quality, informative single nucleotide polymorphisms. Likewise, no strong genomic signatures were left by selection on silage yield and quality in the WQS despite genetic gain for these traits. These results could be due to the genetic complexity underlying these traits, or the role of selection on standing genetic variation. Variation in loss of diversity through drift was observed across the genome. Some large regions experienced much greater loss in diversity than what is expected, suggesting limited recombination combined with small populations in recurrent selection programs could easily lead to fixation of large swaths of the genome. Copyright © 2015 Lorenz et al.

  4. CPm gene diversity in field isolates of Citrus tristeza virus from Colombia.

    Science.gov (United States)

    Oliveros-Garay, Oscar Arturo; Martinez-Salazar, Natalhie; Torres-Ruiz, Yanneth; Acosta, Orlando

    2009-01-01

    The nucleotide sequence diversity of the CPm gene from 28 field isolates of Citrus tristeza virus (CTV) was assessed by SSCP and sequence analyses. These isolates showed two major shared haplotypes, which differed in distribution: A1 was the major haplotype in 23 isolates from different geographic regions, whereas R1 was found in isolates from a discrete region. Phylogenetic reconstruction clustered A1 within an independent group, while R1 was grouped with mild isolates T30 from Florida and T385 from Spain. Some isolates contained several minor haplotypes, which were very similar to, and associated with, the major haplotype.

  5. MHC Class II haplotypes of Colombian Amerindian tribes

    Science.gov (United States)

    Yunis, Juan J.; Yunis, Edmond J.; Yunis, Emilio

    2013-01-01

    We analyzed 1041 individuals belonging to 17 Amerindian tribes of Colombia: Chimila, Bari and Tunebo (Chibcha linguistic family); Embera and Waunana (Choco linguistic family); Puinave and Nukak (Maku-Puinave linguistic families); Cubeo, Guanano, Tucano, Desano and Piratapuyo (Tukano linguistic family); Guahibo and Guayabero (Guayabero linguistic family); Curripaco and Piapoco (Arawak linguistic family); and Yucpa (Karib linguistic family), for MHC class II haplotypes (HLA-DRB1, DQA1, DQB1). Approximately 90% of the MHC class II haplotypes found among these tribes are haplotypes frequently encountered in other Amerindian tribes. Nonetheless, striking differences were observed among Chibcha and non-Chibcha speaking tribes. The DRB1*04:04, DRB1*04:11, DRB1*09:01 carrying haplotypes were frequently found among non-Chibcha speaking tribes, while the DRB1*04:07 haplotype showed significant frequencies among Chibcha speaking tribes, and only marginal frequencies among non-Chibcha speaking tribes. Our results suggest that the differences in MHC class II haplotype frequency found among Chibcha and non-Chibcha speaking tribes could be due to genetic differentiation in Mesoamerica of the ancestral Amerindian population into Chibcha and non-Chibcha speaking populations before they entered into South America. PMID:23885196

  6. Birth of a healthy infant following preimplantation PKHD1 haplotyping for autosomal recessive polycystic kidney disease using multiple displacement amplification

    Science.gov (United States)

    Janson, Marleen M.; Roesler, Mark R.; Avner, Ellis D.; Strawn, Estil Y.; Bick, David P.

    2010-01-01

    Purpose To develop a reliable preimplantation genetic diagnosis protocol for couples who both carry a mutant PKHD1 gene wishing to conceive children unaffected with autosomal recessive polycystic kidney disease (ARPKD). Methods Development of a unique protocol for preimplantation genetic testing using whole genome amplification of single blastomeres by multiple displacement amplification (MDA), and haplotype analysis with novel short tandem repeat (STR) markers from the PKHD1 gene and flanking sequences, and a case report of successful utilization of the protocol followed by successful IVF resulting in the birth of an infant unaffected with ARPKD. Results We have developed 20 polymorphic STR markers suitable for linkage analysis of ARPKD. These linked STR markers have enabled unambiguous identification of the PKHD1 haplotypes of embryos produced by at-risk couples. Conclusions We have developed a reliable protocol for preimplantation genetic diagnosis of ARPKD using single-cell MDA products for PKHD1 haplotyping. PMID:20490649

  7. A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

    Science.gov (United States)

    Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

    2018-01-01

    Abstract Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/

  8. Homology models guide discovery of diverse enzyme specificities among dipeptide epimerases in the enolase superfamily

    Science.gov (United States)

    Lukk, Tiit; Sakai, Ayano; Kalyanaraman, Chakrapani; Brown, Shoshana D.; Imker, Heidi J.; Song, Ling; Fedorov, Alexander A.; Fedorov, Elena V.; Toro, Rafael; Hillerich, Brandan; Seidel, Ronald; Patskovsky, Yury; Vetting, Matthew W.; Nair, Satish K.; Babbitt, Patricia C.; Almo, Steven C.; Gerlt, John A.; Jacobson, Matthew P.

    2012-01-01

    The rapid advance in genome sequencing presents substantial challenges for protein functional assignment, with half or more of new protein sequences inferred from these genomes having uncertain assignments. The assignment of enzyme function in functionally diverse superfamilies represents a particular challenge, which we address through a combination of computational predictions, enzymology, and structural biology. Here we describe the results of a focused investigation of a group of enzymes in the enolase superfamily that are involved in epimerizing dipeptides. The first members of this group to be functionally characterized were Ala-Glu epimerases in Escherichia coli and Bacillus subtilis, based on the operon context and enzymological studies; these enzymes are presumed to be involved in peptidoglycan recycling. We have subsequently studied more than 65 related enzymes by computational methods, including homology modeling and metabolite docking, which suggested that many would have divergent specificities, i.e., they are likely to have different (unknown) biological roles. In addition to the Ala-Phe epimerase specificity reported previously, we describe the prediction and experimental verification of: (i) a new group of presumed Ala-Glu epimerases; (ii) several enzymes with specificity for hydrophobic dipeptides, including one from Cytophaga hutchinsonii that epimerizes D-Ala-D-Ala; and (iii) a small group of enzymes that epimerize cationic dipeptides. Crystal structures for certain of these enzymes further elucidate the structural basis of the specificities. The results highlight the potential of computational methods to guide experimental characterization of enzymes in an automated, large-scale fashion. PMID:22392983

  9. Origins of the amphiploid species Brassica napus L. investigated by chloroplast and nuclear molecular markers

    Directory of Open Access Journals (Sweden)

    Allender Charlotte J

    2010-03-01

    haplotypes in B. napus and B. rapa accessions was not correlated with nuclear genetic diversity as determined by AFLPs, indicating that such accessions do not represent recent hybrids. Whilst some chloroplast diversity observed within B. napus can be explained by introgression from inter-specific crosses made during crop improvement programmes, there is evidence that the original hybridisation event resulting in B. napus occurred on more than one occasion, and involved different maternal genotypes.

  10. The Human Genome Diversity (HGD) Project. Summary document

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1993-12-31

    In 1991 a group of human geneticists and molecular biologists proposed to the scientific community that a world wide survey be undertaken of variation in the human genome. To aid their considerations, the committee therefore decided to hold a small series of international workshops to explore the major scientific issues involved. The intention was to define a framework for the project which could provide a basis for much wider and more detailed discussion and planning--it was recognized that the successful implementation of the proposed project, which has come to be known as the Human Genome Diversity (HGD) Project, would not only involve scientists but also various national and international non-scientific groups, all of which should contribute to the project's development. The international HGD workshop held in Sardinia in September 1993 was the last in the initial series of planning workshops. As such it not only explored new ground but also pulled together into a more coherent form much of the formal and informal discussion that had taken place in the preceding two years. This report presents the deliberations of the Sardinia workshop within a consideration of the overall development of the HGD Project to date.

  11. eGenomics: Cataloguing Our Complete Genome Collection III

    Directory of Open Access Journals (Sweden)

    Dawn Field

    2007-01-01

    This meeting report summarizes the proceedings of the “eGenomics: Cataloguing our Complete Genome Collection III” workshop held September 11–13, 2006, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This 3rd workshop of the Genomic Standards Consortium was divided into two parts. The first half of the three-day workshop was dedicated to reviewing the genomic diversity of our current and future genome and metagenome collection, and exploring linkages to a series of existing projects through formal presentations. The second half was dedicated to strategic discussions. Outcomes of the workshop include a revised “Minimum Information about a Genome Sequence” (MIGS) specification (v1.1), consensus on a variety of features to be added to the Genome Catalogue (GCat), agreement by several researchers to adopt MIGS for imminent genome publications, and an agreement by the EBI and NCBI to input their genome collections into GCat for the purpose of quantifying the amount of optional data already available (e.g., for geographic location coordinates) and working towards a single, global list of all public genomes and metagenomes.

  12. BMP4 and FGF3 haplotypes increase the risk of tendinopathy in volleyball athletes.

    Science.gov (United States)

    Salles, José Inácio; Amaral, Marcus Vinícius; Aguiar, Diego Pinheiro; Lira, Daisy Anne; Quinelato, Valquiria; Bonato, Letícia Ladeira; Duarte, Maria Eugenia Leite; Vieira, Alexandre Rezende; Casado, Priscila Ladeira

    2015-03-01

    To investigate whether genetic variants can be correlated with tendinopathy in elite male volleyball athletes. Case-control study. Fifteen single nucleotide polymorphisms within BMP4, FGF3, FGF10, FGFR1 genes were investigated in 138 elite volleyball athletes, aged between 18 and 35 years, who undergo 4-5h of training per day: 52 with tendinopathy and 86 with no history of pain suggestive of tendinopathy in patellar, Achilles, shoulder, and hip abductors tendons. The clinical diagnostic criterion was progressive pain during training, confirmed by magnetic resonance image. Genomic DNA was obtained from saliva samples. Genetic markers were genotyped using TaqMan real-time PCR. Chi-square test compared genotypes and haplotype differences between groups. Multivariate logistic regression analyzed the significance of covariates and incidence of tendinopathy. Statistical analysis revealed participant age (p=0.005) and years of practice (p=0.004) were risk factors for tendinopathy. A significant association between BMP4 rs2761884 (p=0.03) and tendinopathy was observed. Athletes with a polymorphic genotype have 2.4 times more susceptibility to tendinopathy (OR=2.39; 95%CI=1.10-5.19). Also, association between disease and haplotype TTGGA in BMP4 (p=0.01) was observed. The FGF3 TGGTA haplotype showed a tendency of association with tendinopathy (p=0.05), and so did FGF10 rs900379. FGFR1 showed no association with disease. These findings indicate that haplotypes in BMP4 and FGF3 genes may contribute to the tendon disease process in elite volleyball athletes. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
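
    An odds ratio with a 95% confidence interval, as quoted in this abstract (OR=2.39; 95%CI=1.10-5.19), can be computed from a 2x2 case-control table. The Python sketch below uses the standard Wald interval on the log odds ratio. The abstract does not say how its interval was obtained (it may come from the logistic regression), and the counts in the example are hypothetical, not the study's data.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases carrying the variant      b: controls carrying the variant
    c: cases without the variant       d: controls without the variant
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_value, ci_low, ci_high = odds_ratio_wald_ci(a=20, b=10, c=10, d=20)
```

    An interval that excludes 1, as in the abstract, is what supports reading the genotype as a risk factor at the 5% level.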

  13. Distribution of QPY and RAH haplotypes of granzyme B gene in distinct Brazilian populations

    Directory of Open Access Journals (Sweden)

    Fernanda Bernadelli Garcia

    2012-08-01

    INTRODUCTION: The cytolysis mediated by granules is one of the most important effector functions of cytotoxic T lymphocytes and natural killer cells. Recently, three single nucleotide polymorphisms (SNPs) were identified at exons 2, 3, and 5 of the granzyme B gene, resulting in a haplotype in which three amino acids of the mature protein, Q48P88Y245, are changed to R48A88H245, which leads to loss of cytotoxic activity of the protein. In this study, we evaluated the frequency of these polymorphisms in Brazilian populations. METHODS: We evaluated the frequency of these polymorphisms in Brazilian ethnic groups (white, Afro-Brazilian, and Asian) by sequencing these regions. RESULTS: The allelic and genotypic frequencies of SNP 2364A/G at exon 2 in Afro-Brazilian individuals (42.3% and 17.3%) were significantly higher when compared with those in whites and Asians (p < 0.0001 and p = 0.0007, respectively). The polymorphisms 2933C/G and 4243C/T also were more frequent in Afro-Brazilians but without any significant difference regarding the other groups. The Afro-Brazilian group presented greater diversity of haplotypes, and the RAH haplotype seemed to be more frequent in this group (25%), followed by the whites (20.7%) and by the Asians (11.9%), similar to the frequency presented in the literature. CONCLUSIONS: There is a higher frequency of polymorphisms in Afro-Brazilians, and the RAH haplotype was more frequent in these individuals. We believe that further studies should aim to investigate the correlation of this haplotype with diseases related to immunity mediated by cytotoxic lymphocytes, and if this correlation is confirmed, novel treatment strategies might be elaborated.

  14. Tracing the route of modern humans out of Africa by using 225 human genome sequences from Ethiopians and Egyptians.

    Science.gov (United States)

    Pagani, Luca; Schiffels, Stephan; Gurdasani, Deepti; Danecek, Petr; Scally, Aylwyn; Chen, Yuan; Xue, Yali; Haber, Marc; Ekong, Rosemary; Oljira, Tamiru; Mekonnen, Ephrem; Luiselli, Donata; Bradman, Neil; Bekele, Endashaw; Zalloua, Pierre; Durbin, Richard; Kivisild, Toomas; Tyler-Smith, Chris

    2015-06-04

    The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  15. Genetic diversity of the Chinese goat in the littoral zone of the Yangtze River as assessed by microsatellite and mtDNA.

    Science.gov (United States)

    E, Guang-Xin; Zhao, Yong-Ju; Chen, Li-Peng; Ma, Yue-Hui; Chu, Ming-Xing; Li, Xiang-Long; Hong, Qiong-Hua; Li, Lan-Hui; Guo, Ji-Jun; Zhu, Lan; Han, Yan-Guo; Gao, Hui-Jiang; Zhang, Jia-Hua; Jiang, Huai-Zhi; Jiang, Cao-De; Wang, Gao-Fu; Ren, Hang-Xing; Jin, Mei-Lan; Sun, Yuan-Zhi; Zhou, Peng; Huang, Yong-Fu

    2018-05-01

    The objective of this study was to assess the genetic diversity and population structure of goats in the Yangtze River region using microsatellite and mtDNA markers, to better understand the current status of goat genetic diversity there and the effects of the natural landscape in shaping domestic animal genetic diversity. The genetic variability of 16 goat populations in the littoral zone of the Yangtze River was estimated using 21 autosomal microsatellites, which revealed high diversity and genetic population clustering with a dispersed geographical distribution. A phylogenetic analysis of the mitochondrial D-loop region (482 bp) was conducted in 494 goats from the Yangtze River region. In total, 117 SNPs were reconstructed, and 173 haplotypes were identified, 94.5% of which belonged to lineages A and B. Lineages C, D, and G had lower frequencies (5.2%), and lineage F haplotypes were undetected. Several high-frequency haplotypes were shared by different ecogeographically distributed populations, and the close phylogenetic relationships among certain low-frequency haplotypes indicated the historical exchange of genetic material among these populations. In particular, the lineage G haplotype suggests that some west Asian goat genetic material may have been transferred to China via Muslim migration.

  16. Hapsembler: An Assembler for Highly Polymorphic Genomes

    Science.gov (United States)

    Donmez, Nilgun; Brudno, Michael

    As whole genome sequencing has become a routine biological experiment, algorithms for assembly of whole genome shotgun data have become a topic of extensive research, with a plethora of off-the-shelf methods that can reconstruct the genomes of many organisms. Simultaneously, several recently sequenced genomes exhibit very high polymorphism rates. For these organisms genome assembly remains a challenge as most assemblers are unable to handle highly divergent haplotypes in a single individual. In this paper we describe Hapsembler, an assembler for highly polymorphic genomes, which makes use of paired reads. Our experiments show that Hapsembler produces accurate and contiguous assemblies of highly polymorphic genomes, while performing on par with the leading tools on haploid genomes. Hapsembler is available for download at http://compbio.cs.toronto.edu/hapsembler.

  17. The Genetic Diversity and Structure of Linkage Disequilibrium of the MTHFR Gene in Populations of Northern Eurasia.

    Science.gov (United States)

    Trifonova, E A; Eremina, E R; Urnov, F D; Stepanov, V A

    2012-01-01

    The structure of the haplotypes and linkage disequilibrium (LD) of the methylenetetrahydrofolate reductase gene (MTHFR) in 9 population groups from Northern Eurasia and populations of the international HapMap project was investigated in the present study. The data suggest that the architecture of LD in the human genome is largely determined by the evolutionary history of populations; however, the results of phylogenetic and haplotype analyses seem to suggest that in fact there may be a common "old" mechanism for the formation of certain patterns of LD. Variability in the structure of LD and the level of diversity of MTHFR haplotypes cause a certain set of tagSNPs with an established prognostic significance for each population. In our opinion, the results obtained in the present study are of considerable interest for understanding multiple genetic phenomena: namely, the association of interpopulation differences in the patterns of LD with structures possessing a genetic susceptibility to complex diseases, and the functional significance of the pleiotropic MTHFR gene effect. Summarizing the results of this study, a conclusion can be made that the genetic variability analysis with emphasis on the structure of LD in human populations is a powerful tool that can make a significant contribution to such areas of biomedical science as human evolutionary biology, functional genomics, genetics of complex diseases, and pharmacogenomics.

  18. Entangled fates of holobiont genomes during invasion: nested bacterial and host diversities in Caulerpa taxifolia

    KAUST Repository

    Arnaud-Haond, S.; Aires, T.; Candeias, R.; Teixeira, S. J. L; Duarte, Carlos M.; Valero, M.; Serrão, E. A.

    2017-01-01

    Successful prevention and mitigation of biological invasions requires retracing the initial steps of introduction, as well as understanding key elements enhancing the adaptability of invasive species. We studied the genetic diversity of the green alga Caulerpa taxifolia and its associated bacterial communities in several areas around the world. The striking congruence of α and β diversity of the algal genome and endophytic communities reveals a tight association, supporting the holobiont concept as best describing the unit of spreading and invasion. Both genomic compartments support the hypotheses of a unique accidental introduction in the Mediterranean and of multiple invasion events in Southern Australia. In addition to helping with tracing the origin of invasion, bacterial communities exhibit metabolic functions that can potentially enhance adaptability and competitiveness of the consortium they form with their host. We thus hypothesize that low genetic diversities of both host and symbiont communities may contribute to the recent regression in the Mediterranean, in contrast with the persistence of highly diverse assemblages in southern Australia. This study supports the importance of scaling up from the host to the holobiont for a comprehensive understanding of invasions. This article is protected by copyright. All rights reserved.

  19. Entangled fates of holobiont genomes during invasion: nested bacterial and host diversities in Caulerpa taxifolia

    KAUST Repository

    Arnaud-Haond, S.

    2017-01-30

    Successful prevention and mitigation of biological invasions requires retracing the initial steps of introduction, as well as understanding key elements enhancing the adaptability of invasive species. We studied the genetic diversity of the green alga Caulerpa taxifolia and its associated bacterial communities in several areas around the world. The striking congruence of α and β diversity of the algal genome and endophytic communities reveals a tight association, supporting the holobiont concept as best describing the unit of spreading and invasion. Both genomic compartments support the hypotheses of a unique accidental introduction in the Mediterranean and of multiple invasion events in Southern Australia. In addition to helping with tracing the origin of invasion, bacterial communities exhibit metabolic functions that can potentially enhance adaptability and competitiveness of the consortium they form with their host. We thus hypothesize that low genetic diversities of both host and symbiont communities may contribute to the recent regression in the Mediterranean, in contrast with the persistence of highly diverse assemblages in southern Australia. This study supports the importance of scaling up from the host to the holobiont for a comprehensive understanding of invasions. This article is protected by copyright. All rights reserved.

  20. Subspecific origin and haplotype diversity in the laboratory mouse

    Czech Academy of Sciences Publication Activity Database

    Yang, H.; Wang, J. R.; Didion, J. P.; Buus, R. J.; Bell, T. A.; Welsh, C. E.; Bonhomme, F.; Yu, A. H.-T.; Nachman, M. W.; Piálek, Jaroslav; Tucker, P.; Boursot, P.; McMillan, L.; Churchill, G. A.; de Villena, F. P.

    2011-01-01

    Vol. 45, No. 7 (2011), pp. 648-655 ISSN 1061-4036 R&D Projects: GA ČR GA206/08/0640 Institutional research plan: CEZ:AV0Z60930519 Keywords: inbred strains * house mice * resource * genome * genes * SNPS Subject RIV: EB - Genetics; Molecular Biology Impact factor: 35.532, year: 2011

  1. Transcriptome analysis reveals the same 17 S-locus F-box genes in two haplotypes of the self-incompatibility locus of Petunia inflata.

    Science.gov (United States)

    Williams, Justin S; Der, Joshua P; dePamphilis, Claude W; Kao, Teh-Hui

    2014-07-01

    Petunia possesses self-incompatibility, by which pistils reject self-pollen but accept non-self-pollen for fertilization. Self-/non-self-recognition between pollen and pistil is regulated by the pistil-specific S-RNase gene and by multiple pollen-specific S-locus F-box (SLF) genes. To date, 10 SLF genes have been identified by various methods, and seven have been shown to be involved in pollen specificity. For a given S-haplotype, each SLF interacts with a subset of its non-self S-RNases, and an as yet unknown number of SLFs are thought to collectively mediate ubiquitination and degradation of all non-self S-RNases to allow cross-compatible pollination. To identify a complete suite of SLF genes of P. inflata, we used a de novo RNA-seq approach to analyze the pollen transcriptomes of S2-haplotype and S3-haplotype, as well as the leaf transcriptome of the S3S3 genotype. We searched for genes that fit several criteria established from the properties of the known SLF genes and identified the same seven new SLF genes in S2-haplotype and S3-haplotype, suggesting that a total of 17 SLF genes constitute pollen specificity in each S-haplotype. This finding lays the foundation for understanding how multiple SLF genes evolved and the biochemical basis for differential interactions between SLF proteins and S-RNases. © 2014 American Society of Plant Biologists. All rights reserved.

  2. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  3. Genome-wide diversity and differentiation in New World populations of the human malaria parasite Plasmodium vivax.

    Science.gov (United States)

    de Oliveira, Thais C; Rodrigues, Priscila T; Menezes, Maria José; Gonçalves-Lopes, Raquel M; Bastos, Melissa S; Lima, Nathália F; Barbosa, Susana; Gerber, Alexandra L; Loss de Morais, Guilherme; Berná, Luisa; Phelan, Jody; Robello, Carlos; de Vasconcelos, Ana Tereza R; Alves, João Marcelo P; Ferreira, Marcelo U

    2017-07-01

    The Americas were the last continent colonized by humans carrying malaria parasites. Plasmodium falciparum from the New World shows very little genetic diversity and greater linkage disequilibrium, compared with its African counterparts, and is clearly subdivided into local, highly divergent populations. However, limited available data have revealed extensive genetic diversity in American populations of another major human malaria parasite, P. vivax. We used an improved sample preparation strategy and next-generation sequencing to characterize 9 high-quality P. vivax genome sequences from northwestern Brazil. These new data were compared with publicly available sequences from recently sampled clinical P. vivax isolates from Brazil (BRA, total n = 11 sequences), Peru (PER, n = 23), Colombia (COL, n = 31), and Mexico (MEX, n = 19). We found that New World populations of P. vivax are as diverse (nucleotide diversity π between 5.2 × 10⁻⁴ and 6.2 × 10⁻⁴) as P. vivax populations from Southeast Asia, where malaria transmission is substantially more intense. They display several non-synonymous nucleotide substitutions (some of them previously undescribed) in genes known or suspected to be involved in antimalarial drug resistance, such as dhfr, dhps, mdr1, mrp1, and mrp-2, but not in the chloroquine resistance transporter ortholog (crt-o) gene. Moreover, P. vivax in the Americas is much less geographically substructured than local P. falciparum populations, with relatively little between-population genome-wide differentiation (pairwise FST values ranging between 0.025 and 0.092). Finally, P. vivax populations show a rapid decline in linkage disequilibrium with increasing distance between pairs of polymorphic sites, consistent with very frequent outcrossing. We hypothesize that the high diversity of present-day P. vivax lineages in the Americas originated from successive migratory waves and subsequent admixture between parasite lineages from geographically diverse sites

  4. Platyhelminth Venom Allergen-Like (VAL) proteins: revealing structural diversity, class-specific features and biological associations across the phylum

    Science.gov (United States)

    Chalmers, Iain W.; Hoffmann, Karl F.

    2012-01-01

    SUMMARY During platyhelminth infection, a cocktail of proteins is released by the parasite to aid invasion, initiate feeding, facilitate adaptation and mediate modulation of the host immune response. Included amongst these proteins is the Venom Allergen-Like (VAL) family, part of the larger sperm coating protein/Tpx-1/Ag5/PR-1/Sc7 (SCP/TAPS) superfamily. To explore the significance of this protein family during Platyhelminthes development and host interactions, we systematically summarize all published proteomic, genomic and immunological investigations of the VAL protein family to date. By conducting new genomic and transcriptomic interrogations to identify over 200 VAL proteins (228) from species in all 4 traditional taxonomic classes (Trematoda, Cestoda, Monogenea and Turbellaria), we further expand our knowledge related to platyhelminth VAL diversity across the phylum. Subsequent phylogenetic and tertiary structural analyses reveal several class-specific VAL features, which likely indicate a range of roles mediated by this protein family. Our comprehensive analysis of platyhelminth VALs represents a unifying synopsis for understanding diversity within this protein family and a firm context in which to initiate future functional characterization of these enigmatic members. PMID:22717097

  5. Genome-wide identification of specific oligonucleotides using artificial neural network and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Chen Jiun-Ching

    2007-05-01

    Background: Genome-wide identification of specific oligonucleotides (oligos) is a computationally intensive task and is a requirement for designing microarray probes, primers, and siRNAs. An artificial neural network (ANN) is a machine learning technique that can effectively process complex and high-noise data. Here, ANNs are applied to process the unique subsequence distribution for prediction of specific oligos. Results: We present a novel and efficient algorithm, named the integration of ANN and BLAST (IAB) algorithm, to identify specific oligos. We establish the unique marker database for human and rat gene index databases using the hash table algorithm. We then create the input vectors, via the unique marker database, to train and test the ANN. The trained ANN predicted the specific oligos with high efficiency, and these oligos were subsequently verified by BLAST. To improve the prediction performance, the ANN over-fitting issue was avoided by early stopping with the best observed error, and a k-fold validation was also applied. The IAB algorithm was about 5.2, 7.1, and 6.7 times faster than the BLAST search without ANN for experimental results of 70-mer, 50-mer, and 25-mer specific oligos, respectively. In addition, the results of polymerase chain reactions showed that the primers predicted by the IAB algorithm could specifically amplify the corresponding genes. The IAB algorithm has been integrated into a previously published comprehensive web server to support microarray analysis and genome-wide iterative enrichment analysis, through which users can identify a group of desired genes and then discover the specific oligos of these genes. Conclusion: The IAB algorithm has been developed to construct SpecificDB, a web server that provides a specific and valid oligo database of the probe, siRNA, and primer design for the human genome.
We also demonstrate the ability of the IAB algorithm to predict specific oligos through
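The hash-table step mentioned in this abstract can be illustrated with a minimal Python sketch. It indexes every length-k subsequence by the genes that contain it and keeps the subsequences seen in exactly one gene. The function name `unique_kmers`, the toy sequences, and the use of exact k-mer matching are assumptions made for illustration only; the abstract does not say how the IAB unique marker database is actually built.

```python
from collections import defaultdict


def unique_kmers(genes, k):
    """Map each gene name to the k-mers that occur in no other gene.

    `genes` is a dict of {name: sequence}. This is a hypothetical helper,
    not the published IAB implementation.
    """
    owners = defaultdict(set)  # k-mer -> names of the genes containing it
    for name, seq in genes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    unique = {name: set() for name in genes}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one gene
            unique[next(iter(names))].add(kmer)
    return unique


# Toy input: two overlapping sequences (made-up data).
toy_genes = {"g1": "ACGTA", "g2": "CGTAC"}
toy_result = unique_kmers(toy_genes, 3)
```

A dictionary keyed by subsequence gives constant-time lookups on average, which is presumably why a hash table suits this kind of genome-wide uniqueness screen.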

  6. Phylogeography and Ecological Niche Modeling Reveal Reduced Genetic Diversity and Colonization Patterns of Skunk Cabbage (Symplocarpus foetidus; Araceae) From Glacial Refugia in Eastern North America

    Directory of Open Access Journals (Sweden)

    Seon-Hee Kim

    2018-05-01

    Alternating glacial and interglacial periods during the Quaternary have dramatically affected the distribution and population genetic structure of plant and animal species throughout the northern hemisphere. Surprisingly, little is known about the post-glacial recolonization history of wetland herbaceous perennials that are widely distributed in the understory of deciduous or mixed deciduous-evergreen forests in eastern North America. In this study, we investigated infraspecific variation among 32 populations of skunk cabbage, Symplocarpus foetidus, to test the hypothesis that the extant species diversity of skunk cabbage is the result of a post-glacial range expansion from southern refugia during the Quaternary Ice Age. A total of 4041 base pairs (bp) of the chloroplast intergenic spacer region (cpDNA) was sequenced from 485 individuals sampled from glaciated (18 populations, 275 individuals) and unglaciated (14 populations, 210 individuals) regions east and west of the Appalachian Mountains. Haplotype number, haplotype diversity, and nucleotide diversity were calculated, and genetic variation within and among populations was assessed by analysis of molecular variance (AMOVA). The geographic pattern of genetic differentiation was further investigated with a spatial analysis of molecular variance (SAMOVA). A total of eight haplotypes and three genetic groups (SAMOVA) were recovered, and a much higher haplotype number (eight haplotypes) and haplotype diversity (0.7425) were observed in unglaciated compared to glaciated populations (five haplotypes, haplotype diversity = 0.6099). All haplotypes found in glaciated regions represented a subset of haplotypes found in unglaciated regions. Haplotypes of S. foetidus likely diverged during the Tertiary (mid-Miocene and late Pliocene), predating the last glacial maximum (LGM).
Predictions based on ecological niche modeling (ENM) suggested that there was considerably less suitable habitat for skunk cabbage during the LGM
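The summary statistics named in this abstract (haplotype diversity and nucleotide diversity) can be sketched in a few lines of Python. The abstract does not say which estimators the authors used. The sketch assumes the commonly used forms: haplotype diversity H = n/(n-1) · (1 - Σ pᵢ²), and nucleotide diversity π as the mean number of pairwise differences per site over all pairs of aligned sequences. The function names and toy inputs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    sum_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site for equal-length aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    )
    return differences / (len(pairs) * length)


# Toy data, not taken from the study.
toy_h = haplotype_diversity(["H1", "H1", "H2", "H3"])
toy_pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

Both functions ignore alignment gaps and missing data, which a real analysis would need to handle explicitly.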

  7. The genome diversity and karyotype evolution of mammals

    Directory of Open Access Journals (Sweden)

    Trifonov Vladimir A

    2011-10-01

    The past decade has witnessed an explosion of genome sequencing and mapping in evolutionarily diverse species. While full genome sequencing of mammals is rapidly progressing, it is still not possible to assemble and align orthologous whole chromosome regions from more than a few species. The intense focus on building comparative maps for companion (dog and cat), laboratory (mouse and rat), and agricultural (cattle, pig, and horse) animals has traditionally been used as a means to understand the underlying basis of disease-related or economically important phenotypes. However, these maps also provide an unprecedented opportunity to use multispecies analysis as a tool for inferring karyotype evolution. Comparative chromosome painting and related techniques are now considered to be the most powerful approaches in comparative genome studies. Homologies can be identified with high accuracy using molecularly defined DNA probes for fluorescence in situ hybridization (FISH) on chromosomes of different species. Chromosome painting data are now available for members of nearly all mammalian orders. In most orders, there are species with rates of chromosome evolution that can be considered as 'default' rates. The number of rearrangements that have become fixed in evolutionary history seems comparatively low, bearing in mind the 180 million years of the mammalian radiation. Comparative chromosome maps record the history of karyotype changes that have occurred during evolution. The aim of this review is to provide an overview of these recent advances in our endeavor to decipher the karyotype evolution of mammals by integrating the published results together with some of our latest unpublished results.

  8. Fundamental problem of forensic mathematics--the evidential value of a rare haplotype.

    Science.gov (United States)

    Brenner, Charles H

    2010-10-01

    Y-chromosomal and mitochondrial haplotyping offer special advantages for criminal (and other) identification. For different reasons, each of them is sometimes detectable in a crime stain for which autosomal typing fails. But they also present special problems, including a fundamental mathematical one: When a rare haplotype is shared between suspect and crime scene, how strong is the evidence linking the two? Assume a reference population sample is available which contains n-1 haplotypes. The most interesting situation as well as the most common one is that the crime scene haplotype was never observed in the population sample. The traditional tools of product rule and sample frequency are not useful when there are no components to multiply and the sample frequency is zero. A useful statistic is the fraction κ of the population sample that consists of "singletons" - of once-observed types. A simple argument shows that the probability for a random innocent suspect to match a previously unobserved crime scene type is (1-κ)/n - distinctly less than 1/n, likely ten times less. The robust validity of this model is confirmed by testing it against a range of population models. This paper hinges above all on one key insight: probability is not frequency. The common but erroneous "frequency" approach adopts population frequency as a surrogate for matching probability and attempts the intractable problem of guessing how many instances exist of the specific haplotype at a certain crime. Probability, by contrast, depends by definition only on the available data. Hence if different haplotypes but with the same data occur in two different crimes, although the frequencies are different (and are hopelessly elusive), the matching probabilities are the same, and are not hard to find. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
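The (1-κ)/n formula in this abstract can be turned into a small worked example in Python. The sketch assumes that the crime scene haplotype is appended to the reference sample of n-1 haplotypes, so that n is the size of the extended sample and κ is the singleton fraction of that extended sample. This bookkeeping is an interpretation of the abstract, and the function name and toy data are hypothetical.

```python
from collections import Counter


def unobserved_match_probability(reference, crime_scene_type):
    """Return (kappa, (1 - kappa) / n) for a previously unobserved haplotype.

    Assumption: the reference sample holds n - 1 haplotypes and the crime
    scene type is added to it before counting singletons.
    """
    if crime_scene_type in reference:
        raise ValueError("formula applies only to a previously unobserved type")
    sample = list(reference) + [crime_scene_type]
    n = len(sample)
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    kappa = singletons / n  # fraction of the sample made up of once-observed types
    return kappa, (1 - kappa) / n


# Toy reference sample of 9 haplotypes (made-up labels); "D" is the new type.
toy_reference = ["A"] * 5 + ["B"] * 3 + ["C"]
toy_kappa, toy_probability = unobserved_match_probability(toy_reference, "D")
```

In this toy case n = 10 and two types ("C" and "D") are singletons, so κ = 0.2 and the matching probability is 0.8/10, which is below the naive 1/n. Real haplotype databases typically have a much larger singleton fraction, which is what drives the abstract's "likely ten times less" remark.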

  9. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Background: Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results: We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions: Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  10. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  11. Genome Variation Map: a data repository of genome variations in BIG Data Center.

    Science.gov (United States)

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2018-01-04

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species; it accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM integrates a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Genome Variation Map: a data repository of genome variations in BIG Data Center

    Science.gov (United States)

    Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

    2018-01-01

    The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species; it accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM integrates a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID:29069473

  13. A genetic assessment of the English bulldog

    OpenAIRE

    Pedersen, Niels C.; Pooch, Ashley S.; Liu, Hongwei

    2016-01-01

    Background: This study examines genetic diversity among 102 registered English Bulldogs used for breeding based on maternal and paternal haplotypes, allele frequencies in 33 highly polymorphic short tandem repeat (STR) loci on 25 chromosomes, STR-linked dog leukocyte antigen (DLA) class I and II haplotypes, and the number and size of genome-wide runs of homozygosity (ROH) determined from high density SNP arrays. The objective was to assess whether the breed retains enough genetic diversity to ...

  14. The Geobacillus pan-genome: implications for the evolution of the genus

    Directory of Open Access Journals (Sweden)

    Oliver Keoagile Ignatius Bezuidt

    2016-05-01

    The genus Geobacillus comprises a diverse group of spore-forming Gram-positive thermophilic bacterial species and is well known for both its ecological diversity and as a source of novel thermostable enzymes. Although the mechanisms underlying the thermophilicity of the organism and the thermostability of its macromolecules are reasonably well understood, relatively little is known of the evolutionary mechanisms that underlie the structural and functional properties of members of this genus. In this study, we have compared 29 Geobacillus genomes, with a specific focus on the elements that comprise the conserved core and flexible genomes. Based on comparisons of conserved core and flexible genomes, we present evidence of habitat delineation with specific Geobacillus genomes linked to specific niches. Interestingly, our analysis has shown that horizontal gene transfer is a major factor driving the evolution of Geobacillus from Bacillus, with genetic contributions from other phylogenetically distant taxa.

  15. Limits of variation, specific infectivity, and genome packaging of massively recoded poliovirus genomes.

    Science.gov (United States)

    Song, Yutong; Gorbatsevych, Oleksandr; Liu, Ying; Mugavero, JoAnn; Shen, Sam H; Ward, Charles B; Asare, Emmanuel; Jiang, Ping; Paul, Aniko V; Mueller, Steffen; Wimmer, Eckard

    2017-10-10

    Computer design and chemical synthesis generated viable variants of poliovirus type 1 (PV1), whose ORF (6,189 nucleotides) carried up to 1,297 "Max" mutations (excess of overrepresented synonymous codon pairs) or up to 2,104 "SD" mutations (randomly scrambled synonymous codons). "Min" variants (excess of underrepresented synonymous codon pairs) are nonviable except for P2 Min, a variant temperature-sensitive at 33 and 39.5 °C. Compared with WT PV1, P2 Min displayed a vastly reduced specific infectivity (si) (WT, 1 PFU/118 particles vs. P2 Min, 1 PFU/35,000 particles), a phenotype that will be discussed broadly. Si of haploid PV presents cellular infectivity of a single genotype. We performed a comprehensive analysis of sequence and structures of the PV genome to determine if evolutionarily conserved cis-acting packaging signal(s) were preserved after recoding. We showed that conserved synonymous sites and/or local secondary structures that might play a role in determining packaging specificity do not survive codon pair recoding. This makes it unlikely that numerous "cryptic, sequence-degenerate, dispersed RNA packaging signals mapping along the entire viral genome" [Patel N, et al. (2017) Nat Microbiol 2:17098] play the critical role in poliovirus packaging specificity. Considering all available evidence, we propose a two-step assembly strategy for +ssRNA viruses: step I, acquisition of packaging specificity, either (a) by specific recognition between capsid protein(s) and replication proteins (poliovirus), or (b) by the high affinity interaction of a single RNA packaging signal (PS) with capsid protein(s) (most +ssRNA viruses so far studied); step II, cocondensation of genome/capsid precursors in which an array of hairpin structures plays a role in virion formation.
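
    The size of the specific-infectivity drop quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the two particle-to-PFU ratios are taken from the abstract, while the function and variable names are hypothetical and not part of the study.

```python
# Particle-to-PFU ratios as reported in the abstract (particles per 1 PFU).
WT_PARTICLES_PER_PFU = 118
P2MIN_PARTICLES_PER_PFU = 35_000


def specific_infectivity(particles_per_pfu: float) -> float:
    """Return specific infectivity expressed as PFU per particle."""
    return 1.0 / particles_per_pfu


si_wt = specific_infectivity(WT_PARTICLES_PER_PFU)
si_p2min = specific_infectivity(P2MIN_PARTICLES_PER_PFU)

# Fold reduction of the recoded variant relative to wild type.
fold_reduction = si_wt / si_p2min

print(f"WT si:     {si_wt:.2e} PFU/particle")
print(f"P2 Min si: {si_p2min:.2e} PFU/particle")
print(f"Fold reduction: about {fold_reduction:.0f}x")
```

    Under these figures the recoded variant is roughly 300-fold less infectious per particle than wild type, which is the scale behind the phrase "vastly reduced" in the abstract.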

  16. Genetic Architecture of Aluminum Tolerance in Rice (Oryza sativa) Determined through Genome-Wide Association Analysis and QTL Mapping

    Science.gov (United States)

    Famoso, Adam N.; Zhao, Keyan; Clark, Randy T.; Tung, Chih-Wei; Wright, Mark H.; Bustamante, Carlos; Kochian, Leon V.; McCouch, Susan R.

    2011-01-01

    Aluminum (Al) toxicity is a primary limitation to crop productivity on acid soils, and rice has been demonstrated to be significantly more Al tolerant than other cereal crops. However, the mechanisms of rice Al tolerance are largely unknown, and no genes underlying natural variation have been reported. We screened 383 diverse rice accessions, conducted a genome-wide association (GWA) study, and conducted QTL mapping in two bi-parental populations using three estimates of Al tolerance based on root growth. Subpopulation structure explained 57% of the phenotypic variation, and the mean Al tolerance in Japonica was twice that of Indica. Forty-eight regions associated with Al tolerance were identified by GWA analysis, most of which were subpopulation-specific. Four of these regions co-localized with a priori candidate genes, and two highly significant regions co-localized with previously identified QTLs. Three regions corresponding to induced Al-sensitive rice mutants (ART1, STAR2, Nrat1) were identified through bi-parental QTL mapping or GWA to be involved in natural variation for Al tolerance. Haplotype analysis around the Nrat1 gene identified susceptible and tolerant haplotypes explaining 40% of the Al tolerance variation within the aus subpopulation, and sequence analysis of Nrat1 identified a trio of non-synonymous mutations predictive of Al sensitivity in our diversity panel. GWA analysis discovered more phenotype–genotype associations and provided higher resolution, but QTL mapping identified critical rare and/or subpopulation-specific alleles not detected by GWA analysis. Mapping using Indica/Japonica populations identified QTLs associated with transgressive variation where alleles from a susceptible aus or indica parent enhanced Al tolerance in a tolerant Japonica background. This work supports the hypothesis that selectively introgressing alleles across subpopulations is an efficient approach for trait enhancement in plant breeding programs and

  17. Identification of genome-specific transcripts in wheat–rye translocation lines

    Directory of Open Access Journals (Sweden)

    Tong Geon Lee

    2015-09-01

    Studying gene expression in wheat–rye translocation lines is complicated due to the presence of homeologs in hexaploid wheat and high levels of synteny between wheat and rye genomes (Naranjo and Fernandez-Rueda, 1991 [1]; Devos et al., 1995 [2]; Lee et al., 2010 [3]; Lee et al., 2013 [4]). To overcome limitations of current gene expression studies on wheat–rye translocation lines and identify genome-specific transcripts, we developed a custom Roche NimbleGen Gene Expression microarray that contains probes derived from the sequence of hexaploid wheat, diploid rye and diploid progenitors of the hexaploid wheat genome (Lee et al., 2014). Using the array developed, we identified genome-specific transcripts in a wheat–rye translocation line (Lee et al., 2014). Expression data are deposited in the NCBI Gene Expression Omnibus (GEO) under accession number GSE58678. Here we report the details of the methods used in the array workflow and data analysis.

  18. Intraspecies genomic diversity and natural population structure of the meat-borne lactic acid bacterium Lactobacillus sakei.

    Science.gov (United States)

    Chaillou, Stéphane; Daty, Marie; Baraige, Fabienne; Dudez, Anne-Marie; Anglade, Patricia; Jones, Rhys; Alpert, Carl-Alfred; Champomier-Vergès, Marie-Christine; Zagorec, Monique

    2009-02-01

    Lactobacillus sakei is a food-borne bacterium naturally found in meat and fish products. A study was performed to examine the intraspecies diversity among 73 isolates sourced from laboratory collections in several different countries. Pulsed-field gel electrophoresis analysis demonstrated a 25% variation in genome size between isolates, ranging from 1,815 kb to 2,310 kb. The relatedness between isolates was then determined using a PCR-based method that detects the possession of 60 chromosomal genes belonging to the flexible gene pool. Ten different strain clusters were identified that had noticeable differences in their average genome size reflecting the natural population structure. The results show that many different genotypes may be isolated from similar types of meat products, suggesting a complex ecological habitat in which intraspecies diversity may be required for successful adaptation. Finally, proteomic analysis revealed a slight difference between the migration patterns of highly abundant GapA isoforms of the two prevailing L. sakei subspecies (sakei and carnosus). This analysis was used to affiliate the genotypic clusters with the corresponding subspecies. These findings reveal for the first time the extent of intraspecies genomic diversity in L. sakei. Consequently, identification of molecular subtypes may in the future prove valuable for a better understanding of microbial ecosystems in food products.

  19. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng

    2018-03-14

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects of different ethnicities in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at position 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar inferred metabolic phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while there is substantial homogeneity of inferred metabolic phenotypes across the country, the response to drugs metabolized via CYP2B6 could be individually associated with an increased risk of treatment failure and toxicity. These are important facts since Botswana is facing malaria elimination and a very high HIV prevalence.

  20. Human cytochrome P450 2B6 genetic variability in Botswana: a case of haplotype diversity and convergent phenotypes

    KAUST Repository

    Tawe, Leabaneng; Motshoge, Thato; Ramatlho, Pleasure; Mutukwa, Naledi; Muthoga, Charles Waithaka; Dongho, Ghyslaine Bruna Djeunang; Martinelli, Axel; Peloewetse, Elias; Russo, Gianluca; Quaye, Isaac Kweku; Paganotti, Giacomo Maria

    2018-01-01

    Identification of inter-individual variability for drug metabolism through cytochrome P450 2B6 (CYP2B6) enzyme is important for understanding the differences in clinical responses to malaria and HIV. This study evaluates the distribution of CYP2B6 alleles, haplotypes and inferred metabolic phenotypes among subjects of different ethnicities in Botswana. A total of 570 subjects were analyzed for CYP2B6 polymorphisms at position 516 G > T (rs3745274), 785 A > G (rs2279343) and 983 T > C (rs28399499). Samples were collected in three districts of Botswana where the population belongs to Bantu (Serowe/Palapye and Chobe) and San-related (Ghanzi) ethnicity. The three districts showed different haplotype composition according to the ethnic background but similar inferred metabolic phenotypes, with 59.12%, 34.56%, 2.10% and 4.21% of the subjects having, respectively, an extensive, intermediate, slow and rapid metabolic profile. The results hint at the possibility of a convergent adaptation of detoxifying metabolic phenotypes despite a different haplotype structure due to the different genetic background. The main implication is that, while there is substantial homogeneity of inferred metabolic phenotypes across the country, the response to drugs metabolized via CYP2B6 could be individually associated with an increased risk of treatment failure and toxicity. These are important facts since Botswana is facing malaria elimination and a very high HIV prevalence.
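
    As a consistency check on the phenotype percentages quoted in this abstract, they can be converted back into approximate subject counts. This is a minimal sketch: the cohort size and percentages come from the abstract, but the per-phenotype counts it prints are derived estimates rather than figures reported by the authors, and all names in the code are hypothetical.

```python
# Cohort size and phenotype percentages as quoted in the abstract.
TOTAL_SUBJECTS = 570
PHENOTYPE_PERCENT = {
    "extensive": 59.12,
    "intermediate": 34.56,
    "slow": 2.10,
    "rapid": 4.21,
}

# Estimated counts (an assumption: simple rounding of percent * cohort size).
estimated_counts = {
    phenotype: round(TOTAL_SUBJECTS * percent / 100)
    for phenotype, percent in PHENOTYPE_PERCENT.items()
}

for phenotype, count in estimated_counts.items():
    print(f"{phenotype:>12}: ~{count} subjects")

print("sum of estimated counts:", sum(estimated_counts.values()))
print("sum of percentages:", round(sum(PHENOTYPE_PERCENT.values()), 2))
```

    The rounded estimates add up to the full cohort of 570, so the four percentages are mutually consistent with the stated sample size.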