WorldWideScience

Sample records for genomics reveals selective

  1. Signatures of selection in tilapia revealed by whole genome resequencing.

    Science.gov (United States)

    Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua

    2015-09-16

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10-100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.

  2. Single-Molecule FISH Reveals Non-selective Packaging of Rift Valley Fever Virus Genome Segments.

    Directory of Open Access Journals (Sweden)

    Paul J Wichgers Schreur

    2016-08-01

    The bunyavirus genome comprises a small (S), medium (M), and large (L) RNA segment of negative polarity. Although genome segmentation confers evolutionary advantages by enabling genome reassortment events with related viruses, genome segmentation also complicates genome replication and packaging. Accumulating evidence suggests that genomes of viruses with eight or more genome segments are incorporated into virions by highly selective processes. Remarkably, little is known about the genome packaging process of the tri-segmented bunyaviruses. Here, we evaluated, by single-molecule RNA fluorescence in situ hybridization (FISH), the intracellular spatio-temporal distribution and replication kinetics of the Rift Valley fever virus (RVFV) genome and determined the segment composition of mature virions. The results reveal that the RVFV genome segments start to replicate near the site of infection before spreading and replicating throughout the cytoplasm followed by translocation to the virion assembly site at the Golgi network. Although the average intracellular S, M and L genome segments approached a 1:1:1 ratio, major differences in genome segment ratios were observed among cells. We also observed a significant number of cells lacking evidence of M-segment replication. Analysis of two-segmented replicons and four-segmented viruses subsequently confirmed the previous notion that Golgi recruitment is mediated by the Gn glycoprotein. The absence of colocalization of the different segments in the cytoplasm and the successful rescue of a tri-segmented variant with a codon shuffled M-segment suggested that inter-segment interactions are unlikely to drive the copackaging of the different segments into a single virion. The latter was confirmed by direct visualization of RNPs inside mature virions which showed that the majority of virions lack one or more genome segments. Altogether, this study suggests that RVFV genome packaging is a non-selective process.

  3. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  4. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  5. The Slow:Fast substitution ratio reveals changing patterns of natural selection in gamma-proteobacterial genomes

    Energy Technology Data Exchange (ETDEWEB)

    Alm, Eric; Shapiro, B. Jesse

    2009-04-15

    Different microbial species are thought to occupy distinct ecological niches, subjecting each species to unique selective constraints, which may leave a recognizable signal in their genomes. Thus, it may be possible to extract insight into the genetic basis of ecological differences among lineages by identifying unusual patterns of substitutions in orthologous gene or protein sequences. We use the ratio of substitutions in slow versus fast-evolving sites (nucleotides in DNA, or amino acids in protein sequence) to quantify deviations from the typical pattern of selective constraint observed across bacterial lineages. We propose that elevated S:F in one branch (an excess of slow-site substitutions) can indicate a functionally-relevant change, due to either positive selection or relaxed evolutionary constraint. In a genome-wide comparative study of gamma-proteobacterial proteins, we find that cell-surface proteins involved with motility and secretion functions often have high S:F ratios, while information-processing genes do not. Change in evolutionary constraints in some species is evidenced by increased S:F ratios within functionally-related sets of genes (e.g., energy production in Pseudomonas fluorescens), while other species apparently evolve mostly by drift (e.g., uniformly elevated S:F across most genes in Buchnera spp.). Overall, S:F reveals several species-specific, protein-level changes with potential functional/ecological importance. As microbial genome projects yield more species-rich gene-trees, the S:F ratio will become an increasingly powerful tool for uncovering functional genetic differences among species.

  6. Strong signatures of selection in the domestic pig genome

    DEFF Research Database (Denmark)

    Rubin, Carl-Johan; Megens, Hendrik-Jan; Barrio, Alvaro Martinez

    2012-01-01

    Domestication of wild boar (Sus scrofa) and subsequent selection have resulted in dramatic phenotypic changes in domestic pigs for a number of traits, including behavior, body composition, reproduction, and coat color. Here we have used whole-genome resequencing to reveal some of the loci that underlie phenotypic evolution in European domestic pigs. Selective sweep analyses revealed strong signatures of selection at three loci harboring quantitative trait loci that explain a considerable part of one of the most characteristic morphological changes in the domestic pig: the elongation of the back ... to strong directional selection.

  7. Genome-Wide Analysis of the World's Sheep Breeds Reveals High Levels of Historic Mixture and Strong Recent Selection

    Science.gov (United States)

    Kijas, James W.; Lenstra, Johannes A.; Hayes, Ben; Boitard, Simon; Porto Neto, Laercio R.; San Cristobal, Magali; Servin, Bertrand; McCulloch, Russell; Whan, Vicki; Gietzen, Kimberly; Paiva, Samuel; Barendse, William; Ciani, Elena; Raadsma, Herman; McEwan, John; Dalrymple, Brian

    2012-01-01

    Through their domestication and subsequent selection, sheep have been adapted to thrive in a diverse range of environments. To characterise the genetic consequence of both domestication and selection, we genotyped 49,034 SNP in 2,819 animals from a diverse collection of 74 sheep breeds. We find the majority of sheep populations contain high SNP diversity and have retained an effective population size much higher than most cattle or dog breeds, suggesting domestication occurred from a broad genetic base. Extensive haplotype sharing and generally low divergence time between breeds reveal frequent genetic exchange has occurred during the development of modern breeds. A scan of the genome for selection signals revealed 31 regions containing genes for coat pigmentation, skeletal morphology, body size, growth, and reproduction. We demonstrate the strongest selection signal has occurred in response to breeding for the absence of horns. The high density map of genetic variability provides an in-depth view of the genetic history for this important livestock species. PMID:22346734

  8. Genome-wide analysis of the world's sheep breeds reveals high levels of historic mixture and strong recent selection.

    Directory of Open Access Journals (Sweden)

    James W Kijas

    2012-02-01

    Through their domestication and subsequent selection, sheep have been adapted to thrive in a diverse range of environments. To characterise the genetic consequence of both domestication and selection, we genotyped 49,034 SNP in 2,819 animals from a diverse collection of 74 sheep breeds. We find the majority of sheep populations contain high SNP diversity and have retained an effective population size much higher than most cattle or dog breeds, suggesting domestication occurred from a broad genetic base. Extensive haplotype sharing and generally low divergence time between breeds reveal frequent genetic exchange has occurred during the development of modern breeds. A scan of the genome for selection signals revealed 31 regions containing genes for coat pigmentation, skeletal morphology, body size, growth, and reproduction. We demonstrate the strongest selection signal has occurred in response to breeding for the absence of horns. The high density map of genetic variability provides an in-depth view of the genetic history for this important livestock species.

  9. Whole genome detection of signature of positive selection in African cattle reveals selection for thermotolerance.

    Science.gov (United States)

    Taye, Mengistie; Lee, Wonseok; Caetano-Anolles, Kelsey; Dessie, Tadelle; Hanotte, Olivier; Mwai, Okeyo Ally; Kemp, Stephen; Cho, Seoae; Oh, Sung Jong; Lee, Hak-Kyo; Kim, Heebal

    2017-12-01

    As African indigenous cattle evolved in a hot tropical climate, they have developed an inherent thermotolerance; survival mechanisms include a light-colored and shiny coat, increased sweating, and cellular and molecular mechanisms to cope with high environmental temperature. Here, we report the positive selection signature of genes in African cattle breeds which contribute to their heat tolerance mechanisms. We compared the genomes of five indigenous African cattle breeds with the genomes of four commercial cattle breeds using cross-population composite likelihood ratio (XP-CLR) and cross-population extended haplotype homozygosity (XP-EHH) statistical methods. We identified 296 (XP-EHH) and 327 (XP-CLR) positively selected genes. Gene ontology analysis resulted in 41 biological process terms and six Kyoto Encyclopedia of Genes and Genomes pathways. Several genes and pathways were found to be involved in oxidative stress response, osmotic stress response, heat shock response, hair and skin properties, sweat gland development and sweating, feed intake and metabolism, and reproduction functions. The genes and pathways identified directly or indirectly contribute to the superior heat tolerance mechanisms in African cattle populations. The results will improve our understanding of the biological mechanisms of heat tolerance in African cattle breeds and open an avenue for further study. © 2017 Japanese Society of Animal Science.

  10. Genome-Wide Footprints of Pig Domestication and Selection Revealed through Massive Parallel Sequencing of Pooled DNA

    NARCIS (Netherlands)

    Amaral, A.J.; Ferretti, L.; Megens, H.J.W.C.; Crooijmans, R.P.M.A.; Nie, H.; Ramos-Onsins, S.E.; Perez-Enciso, M.; Schook, L.B.; Groenen, M.A.M.

    2011-01-01

    Background: Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can help uncover the genetic basis of phenotypic diversity. Methodology/Main Findings: Genome wide footprints of pig domestication and

  11. Detection of selection signatures of population-specific genomic regions selected during domestication process in Jinhua pigs.

    Science.gov (United States)

    Li, Zhengcao; Chen, Jiucheng; Wang, Zhen; Pan, Yuchun; Wang, Qishan; Xu, Ningying; Wang, Zhengguang

    2016-12-01

    Chinese pigs have been undergoing both natural and artificial selection for thousands of years. Jinhua pigs are of great importance, as they can be a valuable model for exploring the genetic mechanisms linked to meat quality and other traits such as disease resistance, reproduction and production. The purpose of this study was to identify distinctive footprints of selection between Jinhua pigs and other breeds utilizing genome-wide SNP data. Genotyping by genome reducing and sequencing was implemented in order to perform cross-population extended haplotype homozygosity to reveal strong signatures of selection for those economically important traits. This work was performed at a 2% genome level, which comprised 152 006 SNPs genotyped in a total of 517 individuals. Population-specific footprints of selective sweeps were searched for in the genome of Jinhua pigs using six native breeds and three European breeds as reference groups. Several candidate genes associated with meat quality, health and reproduction, such as GH1, CRHR2, TRAF4 and CCK, were found to be overlapping with the significantly positive outliers. Additionally, the results revealed that some genomic regions associated with meat quality, immune response and reproduction in Jinhua pigs have evolved directionally under domestication and subsequent selections. The identified genes and biological pathways in Jinhua pigs showed different selection patterns in comparison with the Chinese and European breeds. © 2016 Stichting International Foundation for Animal Genetics.

  12. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing.

    Science.gov (United States)

    Jeong, Hyeonsoo; Song, Ki-Duk; Seo, Minseok; Caetano-Anollés, Kelsey; Kim, Jaemin; Kwak, Woori; Oh, Jae-Don; Kim, EuiSoo; Jeong, Dong Kee; Cho, Seoae; Kim, Heebal; Lee, Hak-Kyo

    2015-08-20

    Natural and artificial selection following domestication has led to the existence of more than a hundred pig breeds, as well as incredible variation in phenotypic traits. Berkshire pigs are regarded as having superior meat quality compared to other breeds. As the meat production industry seeks selective breeding approaches to improve profitable traits such as meat quality, information about genetic determinants of these traits is in high demand. However, most of the studies have been performed using trained sensory panel analysis without investigating the underlying genetic factors. Here we investigate the relationship between genomic composition and this phenotypic trait by scanning for signatures of positive selection in whole-genome sequencing data. We generated genomes of 10 Berkshire pigs at a total of 100.6 coverage depth, using the Illumina Hiseq2000 platform. Along with the genomes of 11 Landrace and 13 Yorkshire pigs, we identified genomic variants of 18.9 million SNVs and 3.4 million Indels in the mapped regions. We identified several associated genes related to lipid metabolism, intramuscular fatty acid deposition, and muscle fiber type which contribute to pork quality (TG, FABP1, AKIRIN2, GLP2R, TGFBR3, JPH3, ICAM2, and ERN1) by applying between-population statistical tests (XP-EHH and XP-CLR). A statistical enrichment test was also conducted to detect breed-specific genetic variation. In addition, a de novo short sequence read assembly strategy identified several candidate genes (SLC25A14, IGF1, PI4KA, CACNA1A) as also contributing to lipid metabolism. Results revealed several candidate genes involved in Berkshire meat quality; most of these genes are involved in lipid metabolism and intramuscular fat deposition. These results can provide a basis for future research on the genomic characteristics of Berkshire pigs.

  13. Genomic selection: genome-wide prediction in plant improvement.

    Science.gov (United States)

    Desta, Zeratsion Abera; Ortiz, Rodomiro

    2014-09-01

    Association analysis is used to measure relations between markers and quantitative trait loci (QTL). Their estimation ignores genes with small effects that trigger underpinning quantitative traits. By contrast, genome-wide selection estimates marker effects across the whole genome on the target population based on a prediction model developed in the training population (TP). Whole-genome prediction models estimate all marker effects in all loci and capture small QTL effects. Here, we review several genomic selection (GS) models with respect to both the prediction accuracy and genetic gain from selection. Phenotypic selection or marker-assisted breeding protocols can be replaced by selection, based on whole-genome predictions in which phenotyping updates the model to build up the prediction accuracy. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Analysis of adaptive evolution in Lyssavirus genomes reveals pervasive diversifying selection during species diversification.

    Science.gov (United States)

    Voloch, Carolina M; Capellão, Renata T; Mello, Beatriz; Schrago, Carlos G

    2014-11-19

    Lyssavirus is a diverse genus of viruses that infect a variety of mammalian hosts, typically causing encephalitis. The evolution of this lineage, particularly the rabies virus, has been a focus of research because of the extensive occurrence of cross-species transmission, and the distinctive geographical patterns present throughout the diversification of these viruses. Although numerous studies have examined pattern-related questions concerning Lyssavirus evolution, analyses of the evolutionary processes acting on Lyssavirus diversification are scarce. To clarify the relevance of positive natural selection in Lyssavirus diversification, we conducted a comprehensive scan for episodic diversifying selection across all lineages and codon sites of the five coding regions in lyssavirus genomes. Although the genomes of these viruses are generally conserved, the glycoprotein (G), RNA-dependent RNA polymerase (L) and polymerase (P) genes were frequently targets of adaptive evolution during the diversification of the genus. Adaptive evolution is particularly manifest in the glycoprotein gene, which was inferred to have experienced the highest density of positively selected codon sites along branches. Substitutions in the L gene were found to be associated with the early diversification of phylogroups. A comparison between the number of positively selected sites inferred along the branches of RABV population branches and Lyssavirus interspecies branches suggested that the occurrence of positive selection was similar on the five coding regions of the genome in both groups.

  15. Analysis of Adaptive Evolution in Lyssavirus Genomes Reveals Pervasive Diversifying Selection during Species Diversification

    Directory of Open Access Journals (Sweden)

    Carolina M. Voloch

    2014-11-01

    Lyssavirus is a diverse genus of viruses that infect a variety of mammalian hosts, typically causing encephalitis. The evolution of this lineage, particularly the rabies virus, has been a focus of research because of the extensive occurrence of cross-species transmission, and the distinctive geographical patterns present throughout the diversification of these viruses. Although numerous studies have examined pattern-related questions concerning Lyssavirus evolution, analyses of the evolutionary processes acting on Lyssavirus diversification are scarce. To clarify the relevance of positive natural selection in Lyssavirus diversification, we conducted a comprehensive scan for episodic diversifying selection across all lineages and codon sites of the five coding regions in lyssavirus genomes. Although the genomes of these viruses are generally conserved, the glycoprotein (G), RNA-dependent RNA polymerase (L) and polymerase (P) genes were frequently targets of adaptive evolution during the diversification of the genus. Adaptive evolution is particularly manifest in the glycoprotein gene, which was inferred to have experienced the highest density of positively selected codon sites along branches. Substitutions in the L gene were found to be associated with the early diversification of phylogroups. A comparison between the number of positively selected sites inferred along the branches of RABV population branches and Lyssavirus interspecies branches suggested that the occurrence of positive selection was similar on the five coding regions of the genome in both groups.

  16. Prehistoric genomes reveal the genetic foundation and cost of horse domestication

    DEFF Research Database (Denmark)

    Schubert, Mikkel; Jónsson, Hákon; Chang, Dan

    2014-01-01

    genetics alone. We therefore sequenced two complete horse genomes, predating domestication by thousands of years, to characterize the genetic footprint of domestication. These ancient genomes reveal predomestic population structure and a significant fraction of genetic variation shared with the domestic breeds but absent from Przewalski’s horses. We find positive selection on genes involved in various aspects of locomotion, physiology, and cognition. Finally, we show that modern horse genomes contain an excess of deleterious mutations, likely representing the genetic cost of domestication.

  17. Identifying artificial selection signals in the chicken genome.

    Directory of Open Access Journals (Sweden)

    Yunlong Ma

    Identifying the signals of artificial selection can contribute to further shaping economically important traits. Here, a chicken 600k SNP-array was employed to detect the signals of artificial selection using 331 individuals from 9 breeds, including Jingfen (JF), Jinghong (JH), Araucanas (AR), White Leghorn (WL), Pekin-Bantam (PB), Shamo (SH), Gallus-Gallus-Spadiceus (GA), Rheinlander (RH) and Vorwerkhuhn (VO). Per the population genetic structure, 9 breeds were combined into 5 breed-pools, and a 'two-step' strategy was used to reveal the signals of artificial selection. GA, which has little artificial selection, was defined as the reference population, and a total of 204, 155, 305 and 323 potential artificial selection signals were identified in AR_VO, PB, RH_WL and JH_JF, respectively. We also found signals derived from standing and de-novo genetic variations have contributed to adaptive evolution during artificial selection. Further enrichment analysis suggests that the genomic regions of artificial selection signals harbour genes, including THSR, PTHLH and PMCH, responsible for economic traits, such as fertility, growth and immunization. Overall, this study found a series of genes that contribute to the improvement of chicken breeds and revealed the genetic mechanisms of adaptive evolution, which can be used as fundamental information in future chicken functional genomics study.

  18. Genomic selection in plant breeding.

    Science.gov (United States)

    Newell, Mark A; Jannink, Jean-Luc

    2014-01-01

    Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor and major marker effects. Thus, the GEBV may capture more of the genetic variation for the particular trait under selection.

  19. High Resolution Genomic Scans Reveal Genetic Architecture Controlling Alcohol Preference in Bidirectionally Selected Rat Model.

    Directory of Open Access Journals (Sweden)

    Chiao-Ling Lo

    2016-08-01

    Investigations on the influence of nature vs. nurture on Alcoholism (Alcohol Use Disorder) in humans have yet to provide a clear view on potential genomic etiologies. To address this issue, we sequenced a replicated animal model system bidirectionally-selected for alcohol preference (AP). This model is uniquely suited to map genetic effects with high reproducibility and resolution. The origin of the rat lines (an 8-way cross) resulted in small haplotype blocks (HB) with a corresponding high level of resolution. We sequenced DNAs from 40 samples (10 per line of each replicate) to determine allele frequencies and HB. We achieved ~46X coverage per line and replicate. Excessive differentiation in the genomic architecture between lines, across replicates, termed signatures of selection (SS), were classified according to gene and region. We identified SS in 930 genes associated with AP. The majority (50%) of the SS were confined to single gene regions, the greatest numbers of which were in promoters (284) and intronic regions (169) with the least in exons (4), suggesting that differences in AP were primarily due to alterations in regulatory regions. We confirmed previously identified genes and found many new genes associated with AP. Of those newly identified genes, several demonstrated neuronal function involved in synaptic memory and reward behavior, e.g. ion channels (Kcnf1, Kcnn3, Scn5a), excitatory receptors (Grin2a, Gria3, Grip1), neurotransmitters (Pomc), and synapses (Snap29). This study not only reveals the polygenic architecture of AP, but also emphasizes the importance of regulatory elements, consistent with other complex traits.

  20. High Resolution Genomic Scans Reveal Genetic Architecture Controlling Alcohol Preference in Bidirectionally Selected Rat Model.

    Science.gov (United States)

    Lo, Chiao-Ling; Lossie, Amy C; Liang, Tiebing; Liu, Yunlong; Xuei, Xiaoling; Lumeng, Lawrence; Zhou, Feng C; Muir, William M

    2016-08-01

    Investigations on the influence of nature vs. nurture on Alcoholism (Alcohol Use Disorder) in humans have yet to provide a clear view on potential genomic etiologies. To address this issue, we sequenced a replicated animal model system bidirectionally-selected for alcohol preference (AP). This model is uniquely suited to map genetic effects with high reproducibility and resolution. The origin of the rat lines (an 8-way cross) resulted in small haplotype blocks (HB) with a corresponding high level of resolution. We sequenced DNAs from 40 samples (10 per line of each replicate) to determine allele frequencies and HB. We achieved ~46X coverage per line and replicate. Excessive differentiation in the genomic architecture between lines, across replicates, termed signatures of selection (SS), were classified according to gene and region. We identified SS in 930 genes associated with AP. The majority (50%) of the SS were confined to single gene regions, the greatest numbers of which were in promoters (284) and intronic regions (169) with the least in exons (4), suggesting that differences in AP were primarily due to alterations in regulatory regions. We confirmed previously identified genes and found many new genes associated with AP. Of those newly identified genes, several demonstrated neuronal function involved in synaptic memory and reward behavior, e.g. ion channels (Kcnf1, Kcnn3, Scn5a), excitatory receptors (Grin2a, Gria3, Grip1), neurotransmitters (Pomc), and synapses (Snap29). This study not only reveals the polygenic architecture of AP, but also emphasizes the importance of regulatory elements, consistent with other complex traits.

  1. Genome-wide analysis reveals signatures of selection for important traits in domestic sheep from different ecoregions.

    Science.gov (United States)

    Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin

    2016-11-03

    Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.

  2. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
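
    The pan- and core-genome figures in the abstract above rest on a simple set idea: the pan-genome is the union of gene families seen in any strain, and the core-genome is the intersection shared by all strains. The sketch below illustrates only that idea. It assumes gene families have already been clustered into shared labels (real pipelines do this with orthology clustering, which is not shown), and the strain names and gene labels are hypothetical.

```python
def pan_and_core(gene_sets):
    """Return (pan_genome, core_genome) as sets of gene-family labels.

    gene_sets: dict mapping strain name -> set of gene-family labels.
    """
    strains = [set(genes) for genes in gene_sets.values()]
    if not strains:
        raise ValueError("at least one strain is required")
    pan = set().union(*strains)          # families present in any strain
    core = set.intersection(*strains)    # families present in every strain
    return pan, core


# Hypothetical strains and gene-family labels, for illustration only.
strains = {
    "strain_A": {"gyrB", "recA", "rpoD", "luxA"},
    "strain_B": {"gyrB", "recA", "rpoD", "tnpA"},
    "strain_C": {"gyrB", "recA", "cas1"},
}

pan, core = pan_and_core(strains)
core_fraction = len(core) / len(pan)
```

    A pan-genome is usually called "open" when the union keeps growing as more strains are added, while the core fraction shrinks or levels off.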

  3. Genome-wide scans between two honeybee populations reveal putative signatures of human-mediated selection.

    Science.gov (United States)

    Parejo, M; Wragg, D; Henriques, D; Vignal, A; Neuditschko, M

    2017-12-01

    Human-mediated selection has left signatures in the genomes of many domesticated animals, including the European dark honeybee, Apis mellifera mellifera, which has been selected by apiculturists for centuries. Using whole-genome sequence information, we investigated selection signatures in spatially separated honeybee subpopulations (Switzerland, n = 39 and France, n = 17). Three different test statistics were calculated in windows of 2 kb (fixation index, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio) and combined into a recently developed composite selection score. Applying a stringent false discovery rate of 0.01, we identified six significant selective sweeps distributed across five chromosomes covering eight genes. These genes are associated with multiple molecular and biological functions, including regulation of transcription, receptor binding and signal transduction. Of particular interest is a selection signature on chromosome 1, which corresponds to the WNT4 gene, the family of which is conserved across the animal kingdom with a variety of functions. In Drosophila melanogaster, WNT4 alleles have been associated with differential wing, cross vein and abdominal phenotypes. Defining phenotypic characteristics of different Apis mellifera ssp., which are typically used as selection criteria, include colour and wing venation pattern. This signal is therefore likely to be a good candidate for human-mediated selection arising from different applied breeding practices in the two managed populations. © 2017 The Authors. Animal Genetics published by John Wiley & Sons Ltd on behalf of Stichting International Foundation for Animal Genetics.
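
    Of the three statistics named in the abstract above, the fixation index is the simplest to illustrate. The sketch below bins SNPs into 2 kb windows (the window size given in the abstract) and computes a two-population F_ST per window as a ratio of summed heterozygosities, F_ST = sum(H_T - H_S) / sum(H_T), with equal population weights. The estimator choice, the input format, and the allele frequencies are assumptions made for illustration; the abstract does not state which estimator was used, and the haplotype-based statistics and the composite score are not shown.

```python
from collections import defaultdict

WINDOW_SIZE = 2000  # 2 kb windows, as stated in the abstract


def windowed_fst(snps, window_size=WINDOW_SIZE):
    """Return {window_start: F_ST} for two populations.

    snps: iterable of (position, p1, p2), where p1 and p2 are the
    frequencies of the same allele in population 1 and population 2.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for position, p1, p2 in snps:
        start = (position // window_size) * window_size
        p_bar = (p1 + p2) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)
        h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
        numerator[start] += h_total - h_within
        denominator[start] += h_total
    # Windows with no variation in either population are skipped.
    return {s: numerator[s] / denominator[s]
            for s in denominator if denominator[s] > 0}


# Hypothetical SNP positions and allele frequencies, for illustration only.
example_snps = [
    (150, 0.5, 0.5),
    (900, 0.4, 0.6),
    (2500, 0.0, 1.0),
    (3100, 0.1, 0.9),
]
fst_by_window = windowed_fst(example_snps)
```

    In this toy input the second window, where the two populations carry nearly opposite alleles, scores far higher than the first, which is the kind of contrast a sweep scan looks for.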

  4. Genomic Selection in Multi-environment Crop Trials.

    Science.gov (United States)

    Oakey, Helena; Cullis, Brian; Thompson, Robin; Comadran, Jordi; Halpin, Claire; Waugh, Robbie

    2016-05-03

    Genomic selection in crop breeding introduces modeling challenges not found in animal studies. These include the need to accommodate replicate plants for each line, consider spatial variation in field trials, address line by environment interactions, and capture nonadditive effects. Here, we propose a flexible single-stage genomic selection approach that resolves these issues. Our linear mixed model incorporates spatial variation through environment-specific terms, and also randomization-based design terms. It considers marker, and marker by environment interactions using ridge regression best linear unbiased prediction to extend genomic selection to multiple environments. Since the approach uses the raw data from line replicates, the line genetic variation is partitioned into marker and nonmarker residual genetic variation (i.e., additive and nonadditive effects). This results in a more precise estimate of marker genetic effects. Using barley height data from trials, in 2 different years, of up to 477 cultivars, we demonstrate that our new genomic selection model improves predictions compared to current models. Analyzing single trials revealed improvements in predictive ability of up to 5.7%. For the multiple environment trial (MET) model, combining both year trials improved predictive ability up to 11.4% compared to a single environment analysis. Benefits were significant even when fewer markers were used. Compared to a single-year standard model run with 3490 markers, our partitioned MET model achieved the same predictive ability using between 500 and 1000 markers depending on the trial. Our approach can be used to increase accuracy and confidence in the selection of the best lines for breeding and/or, to reduce costs by using fewer markers. Copyright © 2016 Oakey et al.
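
    The model in the abstract above is a full linear mixed model with spatial, design, and marker-by-environment terms. The sketch below shows only its simplest ingredient, ridge regression of phenotypes on marker genotypes, solving (X'X + lambda * I) b = X'y for the marker effects b. It assumes centred phenotypes and genotypes coded -1/0/1, and a shrinkage parameter lambda supplied by the user (in RR-BLUP it is derived from variance components, which is not shown). All names and data are hypothetical.

```python
def ridge_marker_effects(genotypes, phenotypes, lam):
    """Solve (X'X + lam * I) b = X'y for marker effects b.

    genotypes: list of rows, one per line, each a list of marker codes.
    phenotypes: list of centred phenotype values, one per line.
    lam: ridge shrinkage parameter (assumed given).
    """
    n_markers = len(genotypes[0])
    # Build the augmented system [X'X + lam * I | X'y].
    system = []
    for j in range(n_markers):
        row = [sum(g[j] * g[k] for g in genotypes) + (lam if j == k else 0.0)
               for k in range(n_markers)]
        row.append(sum(g[j] * y for g, y in zip(genotypes, phenotypes)))
        system.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_markers):
        pivot = max(range(col, n_markers), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("singular system; use a larger lam")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(n_markers):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b
                             for a, b in zip(system[r], system[col])]
    return [system[j][n_markers] / system[j][j] for j in range(n_markers)]


def predict(genotypes, effects):
    """Predicted genetic value of each line: sum of code * marker effect."""
    return [sum(x * b for x, b in zip(g, effects)) for g in genotypes]


# Hypothetical data: 4 lines, 2 markers, true effects (2.0, -1.0), no noise.
X = [[1, 0], [0, 1], [-1, 0], [0, -1]]
y = [2.0, -1.0, -2.0, 1.0]

unshrunk = ridge_marker_effects(X, y, 0.0)
shrunk = ridge_marker_effects(X, y, 2.0)
new_line_prediction = predict([[1, 1]], shrunk)
```

    With lambda set to 0 the toy data return the true effects exactly; a positive lambda pulls every effect toward zero, which is what keeps the estimates stable when markers far outnumber lines.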

  5. Shifts in the evolutionary rate and intensity of purifying selection between two Brassica genomes revealed by analyses of orthologous transposons and relics of a whole genome triplication.

    Science.gov (United States)

    Zhao, Meixia; Du, Jianchang; Lin, Feng; Tong, Chaobo; Yu, Jingyin; Huang, Shunmou; Wang, Xiaowu; Liu, Shengyi; Ma, Jianxin

    2013-10-01

    Recent sequencing of the Brassica rapa and Brassica oleracea genomes revealed extremely contrasting genomic features such as the abundance and distribution of transposable elements between the two genomes. However, whether and how these structural differentiations may have influenced the evolutionary rates of the two genomes since their split from a common ancestor are unknown. Here, we investigated and compared the rates of nucleotide substitution between two long terminal repeats (LTRs) of individual orthologous LTR-retrotransposons, the rates of synonymous and non-synonymous substitution among triplicated genes retained in both genomes from a shared whole genome triplication event, and the rates of genetic recombination estimated/deduced by the comparison of physical and genetic distances along chromosomes and ratios of solo LTRs to intact elements. Overall, LTR sequences and genic sequences showed more rapid nucleotide substitution in B. rapa than in B. oleracea. Synonymous substitution of triplicated genes retained from a shared whole genome triplication was detected at higher rates in B. rapa than in B. oleracea. Interestingly, non-synonymous substitution was observed at lower rates in the former than in the latter, indicating shifted densities of purifying selection between the two genomes. In addition to evolutionary asymmetry, orthologous genes differentially regulated and/or disrupted by transposable elements between the two genomes were also characterized. Our analyses suggest that local genomic and epigenomic features, such as recombination rates and chromatin dynamics reshaped by independent proliferation of transposable elements and elimination between the two genomes, are perhaps partially the causes and partially the outcomes of the observed inter-specific asymmetric evolution. © 2013 Purdue University The Plant Journal © 2013 John Wiley & Sons Ltd.

  6. The research progress of genomic selection in livestock.

    Science.gov (United States)

    Li, Hong-wei; Wang, Rui-jun; Wang, Zhi-ying; Li, Xue-wu; Wang, Zhen-yu; Yanjun, Zhang; Rui, Su; Zhihong, Liu; Jinquan, Li

    2017-05-20

    With the development of gene chip and breeding technology, genomic selection in plants and animals has become a research hotspot in recent years. Genomic selection has been extensively applied to all kinds of economic livestock, due to its high accuracy, short generation intervals and low breeding costs. In this review, we summarize genotyping technology and the methods for genomic breeding value estimation, the latter including the least square method, RR-BLUP, GBLUP, ssGBLUP, BayesA and BayesB. We also cover basic principles of genomic selection and compare their genetic marker ranges, genomic selection accuracy and operational speed. In addition, we list common indicators, methods and influencing factors that are related to genomic selection accuracy. Lastly, we discuss the latest applications and the current problems of genomic selection at home and abroad. Importantly, we envision the future status of genomic selection research, including multi-trait and multi-population genomic selection, as well as the impact of whole genome sequencing and dominant effects on genomic selection. This review will provide some avenues for other breeders to further understand genomic selection.

  7. Whole-genome sequencing of two North American Drosophila melanogaster populations reveals genetic differentiation and positive selection.

    Science.gov (United States)

    Campo, D; Lehmann, K; Fjeldsted, C; Souaiaia, T; Kao, J; Nuzhdin, S V

    2013-10-01

    The prevailing demographic model for Drosophila melanogaster suggests that the colonization of North America occurred very recently from a subset of European flies that rapidly expanded across the continent. This model implies a sudden population growth and range expansion consistent with very low or no population subdivision. As flies adapt to new environments, local adaptation events may be expected. To describe demographic and selective events during North American colonization, we have generated a data set of 35 individual whole-genome sequences from inbred lines of D. melanogaster from a west coast US population (Winters, California, USA) and compared them with a public genome data set from Raleigh (Raleigh, North Carolina, USA). We analysed nuclear and mitochondrial genomes and described levels of variation and divergence within and between these two North American D. melanogaster populations. Both populations exhibit negative values of Tajima's D across the genome, a common signature of demographic expansion. We also detected a low but significant level of genome-wide differentiation between the two populations, as well as multiple allele surfing events, which can be the result of gene drift in local subpopulations on the edge of an expansion wave. In contrast to this genome-wide pattern, we uncovered a 50-kilobase segment in chromosome arm 3L that showed all the hallmarks of a soft selective sweep in both populations. A comparison of allele frequencies within this divergent region among six populations from three continents allowed us to cluster these populations in two differentiated groups, providing evidence for the action of natural selection on a global scale. © 2013 John Wiley & Sons Ltd.
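
    The abstract above reads negative genome-wide Tajima's D as a sign of demographic expansion. As a minimal sketch of what the statistic measures, the function below computes D for a small alignment by comparing the mean number of pairwise differences with the Watterson estimate based on the number of segregating sites, using the standard normalising constants. It assumes equal-length aligned sequences with no missing data; the toy alignments are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of sites at which more than one allele is observed.
    seg_sites = sum(1 for i in range(length)
                    if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    n_pairs = n * (n - 1) / 2
    pi = sum(sum(a != b for a, b in zip(s1, s2))
             for s1, s2 in combinations(sequences, 2)) / n_pairs

    # Normalising constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Hypothetical alignments: every variant is a singleton (excess of rare
# alleles) versus every variant split 3/3 (intermediate frequencies).
rare_variants = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
common_variants = ["AAAAA", "AAAAA", "AAAAA", "TTTTT", "TTTTT", "TTTTT"]

d_rare = tajimas_d(rare_variants)
d_common = tajimas_d(common_variants)
```

    An excess of rare variants, as expected after rapid population growth, pushes D below zero, while variants at intermediate frequency push it above zero.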

  8. Recent adaptive events in human brain revealed by meta-analysis of positively selected genes.

    Directory of Open Access Journals (Sweden)

    Yue Huang

    Full Text Available BACKGROUND AND OBJECTIVES: Analysis of positively-selected genes can help us understand how humans evolved, especially the evolution of highly developed cognitive functions. However, previous works have reached conflicting conclusions regarding whether human neuronal genes are over-represented among genes under positive selection. METHODS AND RESULTS: We divided positively-selected genes into four groups according to the identification approaches, compiling a comprehensive list from 27 previous studies. We showed that genes that are highly expressed in the central nervous system are enriched in recent positive selection events in human history identified by intra-species genomic scan, especially in brain regions related to cognitive functions. This pattern holds when different datasets, parameters and analysis pipelines were used. Functional category enrichment analysis supported these findings, showing that synapse-related functions are enriched in genes under recent positive selection. In contrast, immune-related functions, for instance, are enriched in genes under ancient positive selection revealed by inter-species coding region comparison. We further demonstrated that most of these patterns still hold even after controlling for genomic characteristics that might bias genome-wide identification of positively-selected genes including gene length, gene density, GC composition, and intensity of negative selection. CONCLUSION: Our rigorous analysis resolved previous conflicting conclusions and revealed recent adaptation of human brain functions.

  9. Genomic selection in dairy cattle

    NARCIS (Netherlands)

    Roos, de A.P.W.

    2011-01-01

    The objectives of this Ph.D. thesis were (1) to optimise genomic selection in dairy cattle with respect to the accuracy of predicting total genetic merit and (2) to optimise a dairy cattle breeding program using genomic selection. The study was performed using a combination of real data sets and

  10. Complex evolutionary patterns revealed by mitochondrial genomes of the domestic horse.

    Science.gov (United States)

    Ning, T; Li, J; Lin, K; Xiao, H; Wylie, S; Hua, S; Li, H; Zhang, Y-P

    2014-01-01

    The domestic horse is the most widely used and important stock and recreational animal, valued for its strength and endurance. The energy required by the domestic horse is mainly supplied by mitochondria via oxidative phosphorylation. Thus, selection may have played an essential role in the evolution of the horse mitochondria. Besides, demographic events also affect the DNA polymorphic pattern on mitochondria. To understand the evolutionary patterns of the mitochondria of the domestic horse, we used a deep sequencing approach to obtain the complete sequences of 15 mitochondrial genomes, and four mitochondrial gene sequences, ND6, ATP8, ATP6 and CYTB, collected from 509, 363, 363 and 409 domestic horses, respectively. Evidence of strong substitution rate heterogeneity was found at nonsynonymous sites across the genomes. Signatures of recent positive selection on mtDNA of domestic horse were detected. Specifically, five amino acids in the four mitochondrial genes were identified as the targets of positive selection. Coalescent-based simulations imply that recent population expansion is the most probable explanation for the matrilineal population history for domestic horse. Our findings reveal a complex pattern of non-neutral evolution of the mitochondrial genome in the domestic horses.

  11. Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies.

    Science.gov (United States)

    Mina, Marco; Raynaud, Franck; Tavernari, Daniele; Battistello, Elena; Sungalee, Stephanie; Saghafinia, Sadegh; Laessle, Titouan; Sanchez-Vega, Francisco; Schultz, Nikolaus; Oricchio, Elisa; Ciriello, Giovanni

    2017-08-14

    Cancer evolves through the emergence and selection of molecular alterations. Cancer genome profiling has revealed that specific events are more or less likely to be co-selected, suggesting that the selection of one event depends on the others. However, the nature of these evolutionary dependencies and their impact remain unclear. Here, we designed SELECT, an algorithmic approach to systematically identify evolutionary dependencies from alteration patterns. By analyzing 6,456 genomes from multiple tumor types, we constructed a map of oncogenic dependencies associated with cellular pathways, transcriptional readouts, and therapeutic response. Finally, modeling of cancer evolution shows that alteration dependencies emerge only under conditional selection. These results provide a framework for the design of strategies to predict cancer progression and therapeutic response. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Non-additive Effects in Genomic Selection

    Directory of Open Access Journals (Sweden)

    Luis Varona

    2018-03-01

    Full Text Available In the last decade, genomic selection has become a standard in the genetic evaluation of livestock populations. However, most procedures for the implementation of genomic selection only consider the additive effects associated with SNP (Single Nucleotide Polymorphism) markers used to calculate the prediction of the breeding values of candidates for selection. Nevertheless, the availability of estimates of non-additive effects is of interest because: (i) they contribute to an increase in the accuracy of the prediction of breeding values and the genetic response; (ii) they allow the definition of mate allocation procedures between candidates for selection; and (iii) they can be used to enhance non-additive genetic variation through the definition of appropriate crossbreeding or purebred breeding schemes. This study presents a review of methods for the incorporation of non-additive genetic effects into genomic selection procedures and their potential applications in the prediction of future performance, mate allocation, crossbreeding, and purebred selection. The work concludes with a brief outline of some ideas for future lines of research that may help the standard inclusion of non-additive effects in genomic selection.

  13. Genome Sequencing Reveals Loci under Artificial Selection that Underlie Disease Phenotypes in the Laboratory Rat

    NARCIS (Netherlands)

    Atanur, Santosh S.; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R.; Kaisaki, Pamela J.; Otto, Georg W.; Ma, Man Chun John; Keane, Thomas M.; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R.; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J.; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and

  14. Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax.

    Science.gov (United States)

    Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

    2018-01-01

    Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplications accompanying with geographic and ecological isolations promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both of the two cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphisms at the genic regions. Genome-wide scanning of positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes.

  15. Initiating genomic selection in tetraploid potato

    DEFF Research Database (Denmark)

    Sverrisdóttir, Elsa; Janss, Luc; Byrne, Stephen

    Breeding for more space and resource efficient crops is important to feed the world’s increasing population. Potatoes produce approximately twice the amount of calories per hectare compared to cereals. The traditional “mate and phenotype” breeding approach is costly and time-consuming; however......, the completion of the genome sequence of potato has enabled the application of genomics-assisted breeding technologies. Genomic selection using genome-wide molecular markers is becoming increasingly applicable to crops as the genotyping costs continue to reduce and it is thus an attractive breeding alternative...... selection, can be obtained with good prediction accuracies in tetraploid potato....

  16. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  17. A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection

    Directory of Open Access Journals (Sweden)

    Tetreau Guillaume

    2008-10-01

    Full Text Available BACKGROUND: For most organisms, developing hundreds of genetic markers spanning the whole genome still requires excessive if not unrealistic efforts. In this context, there is an obvious need for methodologies allowing the low-cost, fast and high-throughput genotyping of virtually any species, such as the Diversity Arrays Technology (DArT). One of the crucial steps of the DArT technique is the genome complexity reduction, which allows obtaining a genomic representation characteristic of the studied DNA sample and necessary for subsequent genotyping. In this article, using the mosquito Aedes aegypti as a study model, we describe a new genome complexity reduction method taking advantage of the abundance of miniature inverted repeat transposable elements (MITEs) in the genome of this species. RESULTS: Ae. aegypti genomic representations were produced following a two-step procedure: (1) restriction digestion of the genomic DNA and simultaneous ligation of a specific adaptor to compatible ends, and (2) amplification of restriction fragments containing a particular MITE element called Pony using two primers, one annealing to the adaptor sequence and one annealing to a conserved sequence motif of the Pony element. Using this protocol, we constructed a library comprising more than 6,000 DArT clones, of which at least 5.70% were highly reliable polymorphic markers for two closely related mosquito strains separated by only a few generations of artificial selection. Within this dataset, linkage disequilibrium was low, and marker redundancy was evaluated at 2.86% only. Most of the detected genetic variability was observed between the two studied mosquito strains, but individuals of the same strain could still be clearly distinguished. CONCLUSION: The new complexity reduction method was particularly efficient to reveal genetic polymorphisms in Ae. aegypti. Overall, our results testify to the flexibility of the DArT genotyping technique and open new

  18. Comprehensive Genomic Profiling of Esthesioneuroblastoma Reveals Additional Treatment Options.

    Science.gov (United States)

    Gay, Laurie M; Kim, Sungeun; Fedorchak, Kyle; Kundranda, Madappa; Odia, Yazmin; Nangia, Chaitali; Battiste, James; Colon-Otero, Gerardo; Powell, Steven; Russell, Jeffery; Elvin, Julia A; Vergilio, Jo-Anne; Suh, James; Ali, Siraj M; Stephens, Philip J; Miller, Vincent A; Ross, Jeffrey S

    2017-07-01

    Esthesioneuroblastoma (ENB), also known as olfactory neuroblastoma, is a rare malignant neoplasm of the olfactory mucosa. Despite surgical resection combined with radiotherapy and adjuvant chemotherapy, ENB often relapses with rapid progression. Current multimodality, nontargeted therapy for relapsed ENB is of limited clinical benefit. We queried whether comprehensive genomic profiling (CGP) of relapsed or refractory ENB can uncover genomic alterations (GA) that could identify potential targeted therapies for these patients. CGP was performed on formalin-fixed, paraffin-embedded sections from 41 consecutive clinical cases of ENBs using a hybrid-capture, adaptor ligation based next-generation sequencing assay to a mean coverage depth of 593X. The results were analyzed for base substitutions, insertions and deletions, select rearrangements, and copy number changes (amplifications and homozygous deletions). Clinically relevant GA (CRGA) were defined as GA linked to drugs on the market or under evaluation in clinical trials. A total of 28 ENBs harbored GA, with a mean of 1.5 GA per sample. Approximately half of the ENBs (21, 51%) featured at least one CRGA, with an average of 1 CRGA per sample. The most commonly altered gene was TP53 (17%), with GA in PIK3CA, NF1, CDKN2A, and CDKN2C occurring in 7% of samples. We report comprehensive genomic profiles for 41 ENB tumors. CGP revealed potential new therapeutic targets, including targetable GA in the mTOR, CDK and growth factor signaling pathways, highlighting the clinical value of genomic profiling in ENB. Comprehensive genomic profiling of 41 relapsed or refractory ENBs reveals recurrent alterations or classes of mutation, including amplification of tyrosine kinases encoded on chromosome 5q and mutations affecting genes in the mTOR/PI3K pathway. Approximately half of the ENBs (21, 51%) featured at least one clinically relevant genomic alteration (CRGA), with an average of 1 CRGA per sample. The most commonly altered

  19. Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

    Directory of Open Access Journals (Sweden)

    Guillermo Nourdin-Galindo

    2017-10-01

    Full Text Available Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these

  20. Advances and Challenges in Genomic Selection for Disease Resistance.

    Science.gov (United States)

    Poland, Jesse; Rutkoski, Jessica

    2016-08-04

    Breeding for disease resistance is a central focus of plant breeding programs, as any successful variety must have the complete package of high yield, disease resistance, agronomic performance, and end-use quality. With the need to accelerate the development of improved varieties, genomics-assisted breeding is becoming an important tool in breeding programs. With marker-assisted selection, there has been success in breeding for disease resistance; however, much of this work and research has focused on identifying, mapping, and selecting for major resistance genes that tend to be highly effective but vulnerable to breakdown with rapid changes in pathogen races. In contrast, breeding for minor-gene quantitative resistance tends to produce more durable varieties but is a more challenging breeding objective. As the genetic architecture of resistance shifts from single major R genes to a diffused architecture of many minor genes, the best approach for molecular breeding will shift from marker-assisted selection to genomic selection. Genomics-assisted breeding for quantitative resistance will therefore necessitate whole-genome prediction models and selection methodology as implemented for classical complex traits such as yield. Here, we examine multiple case studies testing whole-genome prediction models and genomic selection for disease resistance. In general, whole-genome models for disease resistance can produce prediction accuracy suitable for application in breeding. These models also largely outperform multiple linear regression as would be applied in marker-assisted selection. With the implementation of genomic selection for yield and other agronomic traits, whole-genome marker profiles will be available for the entire set of breeding lines, enabling genomic selection for disease at no additional direct cost. In this context, the scope of implementing genomics selection for disease resistance, and specifically for quantitative resistance and quarantined pathogens

  1. Genome size analyses of Pucciniales reveal the largest fungal genomes.

    Science.gov (United States)

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi has reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  2. Genomic consequences of selection and genome-wide association mapping in soybean.

    Science.gov (United States)

    Wen, Zixiang; Boyse, John F; Song, Qijian; Cregan, Perry B; Wang, Dechun

    2015-09-03

    Crop improvement always involves selection of specific alleles at genes controlling traits of agronomic importance, likely resulting in detectable signatures of selection within the genome of modern soybean (Glycine max L. Merr.). The identification of these signatures of selection is meaningful from the perspective of evolutionary biology and for uncovering the genetic architecture of agronomic traits. To this end, two populations of soybean, consisting of 342 landraces and 1062 improved lines, were genotyped with the SoySNP50K Illumina BeadChip containing 52,041 single nucleotide polymorphisms (SNPs), and systematically phenotyped for 9 agronomic traits. A cross-population composite likelihood ratio (XP-CLR) method was used to screen the signals of selective sweeps. A total of 125 candidate selection regions were identified, many of which harbored genes potentially involved in crop improvement. To further investigate whether these candidate regions were in fact enriched for genes affected by selection, genome-wide association studies (GWAS) were conducted on 7 selection traits targeted in soybean breeding (grain yield, plant height, lodging, maturity date, seed coat color, seed protein and oil content) and 2 non-selection traits (pubescence and flower color). Major genomic regions associated with selection traits overlapped with candidate selection regions, whereas no overlap of this kind occurred for the non-selection traits, suggesting that the selection sweeps identified are associated with traits of agronomic importance. Multiple novel loci and refined map locations of known loci related to these traits were also identified. These findings illustrate that comparative genomic analyses, especially when combined with GWAS, are a promising approach to dissect the genetic architecture of complex traits.

  3. Interplay of recombination and selection in the genomes of Chlamydia trachomatis

    Directory of Open Access Journals (Sweden)

    Dean Deborah

    2011-05-01

    Full Text Available Abstract Background: Chlamydia trachomatis is an obligate intracellular bacterial parasite, which causes several severe and debilitating diseases in humans. This study uses comparative genomic analyses of 12 complete published C. trachomatis genomes to assess the contribution of recombination and selection in this pathogen and to understand the major evolutionary forces acting on the genome of this bacterium. Results: The conserved core genes of C. trachomatis are a large proportion of the pan-genome: we identified 836 core genes in C. trachomatis out of a range of 874-927 total genes in each genome. The ratio of recombination events compared to mutation (ρ/θ) was 0.07 based on ancestral reconstructions using the ClonalFrame tool, but recombination had a significant effect on genetic diversification (r/m = 0.71). The distance-dependent decay of linkage disequilibrium also indicated that C. trachomatis populations behaved intermediately between sexual and clonal extremes. Fifty-five genes were identified as having a history of recombination and 92 were under positive selection based on statistical tests. Twenty-three genes showed evidence of being under both positive selection and recombination, which included genes with a known role in virulence and pathogenicity (e.g., ompA, pmps, tarp). Analysis of inter-clade recombination flux indicated non-uniform currents of recombination between clades, which suggests the possibility of spatial population structure in C. trachomatis infections. Conclusions: C. trachomatis is the archetype of a bacterial species where recombination is relatively frequent yet gene gains by horizontal gene transfer (HGT) and losses (by deletion) are rare. Gene conversion occurs at sites across the whole C. trachomatis genome but may be more often fixed in genes that are under diversifying selection. Furthermore, genome sequencing will reveal patterns of serotype specific gene exchange and selection that will generate important

  4. Comparative genomics Lactobacillus reuteri from sourdough reveals adaptation of an intestinal symbiont to food fermentations.

    Science.gov (United States)

    Zheng, Jinshui; Zhao, Xin; Lin, Xiaoxi B; Gänzle, Michael

    2015-12-11

    Lactobacillus reuteri is a dominant member of intestinal microbiota of vertebrates, and occurs in food fermentations. The stable presence of L. reuteri in sourdough provides the opportunity to study the adaptation of vertebrate symbionts to an extra-intestinal habitat. This study evaluated this adaptation by comparative genomics of 16 strains of L. reuteri. A core genome phylogenetic tree grouped L. reuteri into 5 clusters corresponding to the host-adapted lineages. The topology of a gene content tree, which includes accessory genes, differed from the core genome phylogenetic tree, suggesting that the differentiation of L. reuteri is shaped by gene loss or acquisition. About 10% of the core genome (124 core genes) were under positive selection. In lineage III sourdough isolates, 177 genes were under positive selection, mainly related to energy conversion and carbohydrate metabolism. The analysis of the competitiveness of L. reuteri in sourdough revealed that the competitiveness of sourdough isolates was equal or higher when compared to rodent isolates. This study provides new insights into the adaptation of L. reuteri to food and intestinal habitats, suggesting that these two habitats exert different selective pressure related to growth rate and energy (carbohydrate) metabolism.

  5. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

    Science.gov (United States)

    Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

    2014-01-01

    Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848

  6. AFLP genome scanning reveals divergent selection in natural populations of Liriodendron chinense (Magnoliaceae) along a latitudinal transect

    Directory of Open Access Journals (Sweden)

    Aihong Yang

    2016-05-01

    Full Text Available Understanding adaptive genetic variation and its relation to environmental factors is important for understanding how plants adapt to climate change and for managing genetic resources. Genome scans for the loci exhibiting either notably high or low levels of population differentiation (outlier loci) provide one means of identifying genomic regions possibly associated with convergent or divergent selection. In this study, we combined AFLP genome scan and environmental association analysis to test for signals of natural selection in natural populations of Liriodendron chinense (Chinese Tulip Tree; Magnoliaceae) along a latitudinal transect. We genotyped 276 individuals from 11 populations of L. chinense using 987 AFLP markers. Two complementary methods (Dfdist and BayeScan) and association analysis between AFLP loci and climate factors were applied to detect outlier loci. Our analyses recovered both neutral and potentially adaptive genetic differentiation among populations of L. chinense. We found moderate genetic diversity within populations and high genetic differentiation among populations with reduced genetic diversity towards the periphery of the species ranges. Nine AFLP marker loci showed evidence of being outliers for population differentiation for both detection methods. Of these, six were strongly associated with at least one climate factor. Temperature, precipitation and radiation were found to be three important factors influencing local adaptation of L. chinense. The outlier AFLP loci are likely not the target of natural selection, but the neighboring genes of these loci might be involved in local adaptation. Hence, these candidates should be validated by further studies.

  7. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate with the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  8. Impact of selective genotyping in the training population on accuracy and bias of genomic selection.

    Science.gov (United States)

    Zhao, Yusheng; Gowda, Manje; Longin, Friedrich H; Würschum, Tobias; Ranc, Nicolas; Reif, Jochen C

    2012-08-01

    Estimating marker effects based on routinely generated phenotypic data of breeding programs is a cost-effective strategy to implement genomic selection. Truncation selection in breeding populations, however, could have a strong impact on the accuracy to predict genomic breeding values. The main objective of our study was to investigate the influence of phenotypic selection on the accuracy and bias of genomic selection. We used experimental data of 788 testcross progenies from an elite maize breeding program. The testcross progenies were evaluated in unreplicated field trials in ten environments and fingerprinted with 857 SNP markers. Random regression best linear unbiased prediction method was used in combination with fivefold cross-validation based on genotypic sampling. We observed a substantial loss in the accuracy to predict genomic breeding values in unidirectional selected populations. In contrast, estimating marker effects based on bidirectional selected populations led to only a marginal decrease in the prediction accuracy of genomic breeding values. We concluded that bidirectional selection is a valuable approach to efficiently implement genomic selection in applied plant breeding programs.
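
    The fivefold cross-validation of a ridge-type marker model described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the genotypes and phenotypes are simulated, the shrinkage parameter `lam` is fixed by hand rather than estimated from variance components, and "accuracy" is taken here simply as the Pearson correlation between predicted and observed values in each held-out fold. All function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_fit(X, y, lam):
    """Marker effects b solving (X'X + lam * I) b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

def cross_validate(X, y, lam, k=5, seed=0):
    """Return the predictive correlation in each of k held-out folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        b = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        pred = [sum(bj * xj for bj, xj in zip(b, X[i])) for i in test]
        accs.append(pearson(pred, [y[i] for i in test]))
    return accs

# Simulated example (assumption): 60 genotypes, 5 markers coded -1/0/1.
rng = random.Random(1)
n, p = 60, 5
X = [[rng.choice([-1, 0, 1]) for _ in range(p)] for _ in range(n)]
true_b = [1.0, -0.5, 0.8, 0.3, -1.2]
y = [sum(b * x for b, x in zip(true_b, row)) + rng.gauss(0, 0.2) for row in X]

accs = cross_validate(X, y, lam=1.0)
mean_acc = sum(accs) / len(accs)
print("mean predictive correlation over folds:", round(mean_acc, 3))
```

    To mimic the study's question, one could restrict the training folds to the top-ranked phenotypes (unidirectional) or to both tails (bidirectional) before calling `ridge_fit`, and compare the resulting fold correlations.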

  9. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Science.gov (United States)

    Krishnan S, Gopala; Waters, Daniel L E; Henry, Robert J

    2014-01-01

    Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  10. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Directory of Open Access Journals (Sweden)

    Gopala Krishnan S

    Full Text Available BACKGROUND: Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. RESULTS: We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. CONCLUSIONS: Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  11. MtDNA genomes reveal a relaxation of selective constraints in low-BMI individuals in a Uyghur population.

    Science.gov (United States)

    Zheng, Hong-Xiang; Li, Lei; Jiang, Xiao-Yan; Yan, Shi; Qin, Zhendong; Wang, Xiaofeng; Jin, Li

    2017-10-01

    Considerable attention has been focused on the effect of deleterious mutations caused by the recent relaxation of selective constraints on human health, including the prevalence of obesity, which might represent an adaptive response of energy-conserving metabolism under the conditions of modern society. Mitochondrial DNA (mtDNA) encoding 13 core subunits of oxidative phosphorylation plays an important role in metabolism. Therefore, we hypothesized that a relaxation of selection constraints on mtDNA and an increase in the proportion of deleterious mutations have played a role in obesity prevalence. In this study, we collected and sequenced the mtDNA genomes of 722 Uyghurs, a typical population with a high prevalence of obesity. We identified the variants that occurred in the Uyghur population for each sample and found that the number of nonsynonymous mutations carried by Uyghur individuals declined with elevation of their BMI (P = 0.015). We further calculated the nonsynonymous and synonymous ratio (N/S) of the high-BMI and low-BMI haplogroups, and the results showed that a significantly higher N/S occurred in the whole mtDNA genomes of the low-BMI haplogroups (0.64) than in that of the high-BMI haplogroups (0.35, P = 0.030) and ancestor haplotypes (0.41, P = 0.032); these findings indicated that low-BMI individuals showed a recent relaxation of selective constraints. In addition, we investigated six clinical characteristics and found that fasting plasma glucose might be correlated with the N/S and selective pressures. We hypothesized that a higher proportion of deleterious mutations led to mild mitochondrial dysfunction, which helps to drive glucose consumption and thereby prevents obesity. Our results provide new insights into the relationship between obesity predisposition and mitochondrial genome evolution.
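
    The N/S comparison above reduces to counting nonsynonymous and synonymous variants per group and taking their ratio. A minimal sketch, assuming each variant has already been annotated with its effect; the variant counts below are invented purely so that the toy ratios land on the reported values of 0.64 and 0.35, and they are not the study's data.

```python
from collections import Counter

def ns_ratio(effects):
    """Ratio of nonsynonymous to synonymous variants in a list of effect labels."""
    counts = Counter(effects)
    if counts["synonymous"] == 0:
        raise ValueError("N/S is undefined without synonymous variants")
    return counts["nonsynonymous"] / counts["synonymous"]

# Hypothetical annotated variant lists (assumption, for illustration only).
low_bmi = ["nonsynonymous"] * 16 + ["synonymous"] * 25
high_bmi = ["nonsynonymous"] * 7 + ["synonymous"] * 20

low_ns = ns_ratio(low_bmi)
high_ns = ns_ratio(high_bmi)
print("low-BMI N/S:", low_ns, "high-BMI N/S:", high_ns)
```

    A higher N/S in one group is read as weaker purifying selection there; whether the difference is significant needs a separate test, which this sketch does not perform.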

  12. Nannochloropsis genomes reveal evolution of microalgal oleaginous traits.

    Directory of Open Access Journals (Sweden)

    Dongmei Wang

    2014-01-01

    Full Text Available Oleaginous microalgae are promising feedstock for biofuels, yet the genetic diversity, origin and evolution of oleaginous traits remain largely unknown. Here we present a detailed phylogenomic analysis of five oleaginous Nannochloropsis species (a total of six strains) and one time-series transcriptome dataset for triacylglycerol (TAG) synthesis on one representative strain. Despite small genome sizes, high coding potential and relative paucity of mobile elements, the genomes feature small cores of ca. 2,700 protein-coding genes and a large pan-genome of >38,000 genes. The six genomes share key oleaginous traits, such as the enrichment of selected lipid biosynthesis genes and certain glycoside hydrolase genes that potentially shift carbon flux from chrysolaminaran to TAG synthesis. The eleven type II diacylglycerol acyltransferase genes (DGAT-2) in every strain, each expressed during TAG synthesis, likely originated from three ancient genomes, including the secondary endosymbiosis host and the engulfed green and red algae. Horizontal gene transfers were inferred in most lipid synthesis nodes with expanded gene doses and many glycoside hydrolase genes. Thus multiple genome pooling and horizontal genetic exchange, together with selective inheritance of lipid synthesis genes and species-specific gene loss, have led to the enormous genetic apparatus for oleaginousness and the wide genomic divergence among present-day Nannochloropsis. These findings have important implications in the screening and genetic engineering of microalgae for biofuels.

  13. Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks.

    Science.gov (United States)

    Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun

    2014-11-25

    The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. To know whether there is a set of genes not only conserved in position

  14. Improving Genetic Gain with Genomic Selection in Autotetraploid Potato

    Directory of Open Access Journals (Sweden)

    Anthony T. Slater

    2016-11-01

    Full Text Available Potato (Solanum tuberosum L.) breeders consider a large number of traits during cultivar development, and progress in conventional breeding can be slow. There is accumulating evidence that some of these traits, such as yield, are affected by a large number of genes with small individual effects. Recently, significant efforts have been applied to the development of genomic resources to improve potato breeding, culminating in a draft genome sequence and the identification of a large number of single nucleotide polymorphisms (SNPs). The availability of these genome-wide SNPs is a prerequisite for implementing genomic selection for improvement of polygenic traits such as yield. In this review, we investigate opportunities for the application of genomic selection to potato, including novel breeding program designs. We have considered a number of factors that will influence this process, including the autotetraploid and heterozygous genetic nature of potato, the rate of decay of linkage disequilibrium, the number of required markers, the design of a reference population, and trait heritability. Based on estimates of the effective population size derived from a potato breeding program, we have calculated the expected accuracy of genomic selection for four key traits of varying heritability and propose that it will be reasonably accurate. We compared the expected genetic gain from genomic selection with the expected gain from phenotypic and pedigree selection, and found that genetic gain can be substantially improved by using genomic selection.

  15. Accuracy of genomic selection for alfalfa biomass yield in different reference populations.

    Science.gov (United States)

    Annicchiarico, Paolo; Nazzicari, Nelson; Li, Xuehui; Wei, Yanling; Pecetti, Luciano; Brummer, E Charles

    2015-12-01

    Genomic selection based on genotyping-by-sequencing (GBS) data could accelerate alfalfa yield gains, if it displayed moderate ability to predict parent breeding values. Its interest would be enhanced by predicting ability also for germplasm/reference populations other than those for which it was defined. Predicting accuracy may be influenced by statistical models, SNP calling procedures and missing data imputation strategies. Landrace and variety material from two genetically-contrasting reference populations, i.e., 124 elite genotypes adapted to the Po Valley (sub-continental climate; PV population) and 154 genotypes adapted to Mediterranean-climate environments (Me population), were genotyped by GBS and phenotyped in separate environments for dry matter yield of their dense-planted half-sib progenies. Both populations showed no sub-population genetic structure. Predictive accuracy was higher by joint rather than separate SNP calling for the two data sets, and using random forest imputation of missing data. Highest accuracy was obtained using Support Vector Regression (SVR) for PV, and Ridge Regression BLUP and SVR for Me germplasm. Bayesian methods (Bayes A, Bayes B and Bayesian Lasso) tended to be less accurate. Random Forest Regression was the least accurate model. Accuracy attained about 0.35 for Me in the range of 0.30-0.50 missing data, and 0.32 for PV at 0.50 missing data, using at least 10,000 SNP markers. Cross-population predictions based on a smaller subset of common SNPs implied a relative loss of accuracy of about 25% for Me and 30% for PV. Genome-wide association analyses based on large subsets of M. truncatula-aligned markers revealed many SNPs with modest association with yield, and some genome areas hosting putative QTLs. A comparison of genomic vs. conventional selection for parent breeding value assuming 1-year vs. 5-year selection cycles, respectively, indicated over three-fold greater predicted yield gain per unit time for genomic selection
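
    The gain-per-unit-time comparison in the last sentence can be illustrated with the standard breeder's equation, in which expected gain per year is selection intensity times selection accuracy times additive genetic standard deviation, divided by cycle length. In the sketch below, only the genomic accuracy (0.35) and the 1-year and 5-year cycle lengths come from the abstract; the conventional accuracy (0.50), the selection intensity (1.4) and the unit genetic standard deviation are assumptions chosen for illustration, and both schemes are assumed to share the same intensity.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year: i * r * sigma_A / L (breeder's equation)."""
    return intensity * accuracy * sigma_a / cycle_years

# Genomic selection: accuracy from the abstract, 1-year cycle.
genomic = gain_per_year(intensity=1.4, accuracy=0.35, sigma_a=1.0, cycle_years=1)

# Conventional selection: accuracy is a hypothetical value, 5-year cycle.
conventional = gain_per_year(intensity=1.4, accuracy=0.50, sigma_a=1.0, cycle_years=5)

ratio = genomic / conventional
print("genomic / conventional gain per year:", round(ratio, 2))
```

    Because intensity and genetic standard deviation cancel when they are equal in both schemes, the ratio depends only on the accuracy ratio multiplied by the cycle-length ratio, which is why a fivefold shorter cycle can outweigh a lower accuracy.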

  16. Long- and short-term selective forces on malaria parasite genomes

    KAUST Repository

    Nygaard, Sanne

    2010-09-09

    Plasmodium parasites, the causal agents of malaria, result in more than 1 million deaths annually. Plasmodium are unicellular eukaryotes with small ~23 Mb genomes encoding ~5200 protein-coding genes. The protein-coding genes comprise about half of these genomes. Although evolutionary processes have a significant impact on malaria control, the selective pressures within Plasmodium genomes are poorly understood, particularly in the non-protein-coding portion of the genome. We use evolutionary methods to describe selective processes in both the coding and non-coding regions of these genomes. Based on genome alignments of seven Plasmodium species, we show that protein-coding, intergenic and intronic regions are all subject to purifying selection and we identify 670 conserved non-genic elements. We then use genome-wide polymorphism data from P. falciparum to describe short-term selective processes in this species and identify some candidate genes for balancing (diversifying) selection. Our analyses suggest that there are many functional elements in the non-genic regions of these genomes and that adaptive evolution has occurred more frequently in the protein-coding regions of the genome. © 2010 Nygaard et al.

  17. Genome-wide selection signatures in Pinzgau cattle

    Directory of Open Access Journals (Sweden)

    Radovan Kasarda

    2015-08-01

    Full Text Available The aim of this study was to identify evidence of recent selection based on estimation of the integrated Haplotype Score (iHS) and the population differentiation index (FST), and to characterize affected regions near QTL associated with traits under strong selection in Pinzgau cattle. In total, 21 Austrian and 19 Slovak purebred bulls genotyped with the Illumina bovineHD and bovineSNP50 BeadChip were used to identify genomic regions under selection. Only autosomal loci with a call rate higher than 90%, a minor allele frequency higher than 0.01 and a Hardy-Weinberg equilibrium limit of 0.001 were included in the subsequent analyses of selective sweeps. The final dataset consisted of 30538 SNPs with an average spacing of 81.86 kb between adjacent SNPs. The iHS scores were averaged into non-overlapping 500 kb segments across the genome. The FST values were also plotted against genome position using a sliding-window approach and averaged over 8 consecutive SNPs. Based on the integrated Haplotype Score evaluation, only 7 regions with an iHS score higher than 1.7 were found. The average iHS score observed for each adjacent syntenic region indicated a slight effect of recent selection in the analysed group of Pinzgau bulls. The level of genetic differentiation between Austrian and Slovak bulls estimated from the FST index was low. Only 24% of the FST values calculated for each SNP were greater than 0.01. Using the sliding-window approach, 5% of the analysed windows had a value higher than 0.01. Our results indicated the use of similar selection schemes in the breeding programs of Slovak and Austrian Pinzgau bulls. The evidence for genome-wide association between signatures of selection and regions affecting complex traits such as milk production was insignificant, because the loci in segments identified as affected by selection were very distant from each other.
Identification of genomic regions that may be under pressure of selection for phenotypic traits to better understanding of the
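The two smoothing steps described in the Pinzgau abstract (iHS averaged into non-overlapping 500 kb segments, FST averaged over 8 consecutive SNPs) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and averaging the absolute iHS value is an assumption, since the abstract does not say whether signed or absolute scores were used.

```python
from collections import defaultdict


def mean_abs_ihs_by_window(positions, ihs, window_bp=500_000):
    """Average |iHS| in non-overlapping windows of window_bp base pairs.

    Returns {window_start_bp: mean |iHS|}. Using the absolute value is an
    assumption made for this sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, score in zip(positions, ihs):
        start = (pos // window_bp) * window_bp  # left edge of the window
        sums[start] += abs(score)
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


def sliding_mean(values, n_snps=8):
    """Mean of each run of n_snps consecutive values, moving one SNP at a time."""
    if len(values) < n_snps:
        return []
    running = sum(values[:n_snps])
    means = [running / n_snps]
    for i in range(n_snps, len(values)):
        running += values[i] - values[i - n_snps]  # slide the window by one SNP
        means.append(running / n_snps)
    return means
```

Windows defined by physical distance and windows defined by SNP count behave differently where marker density is uneven, which may be why the abstract reports both.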

  18. Positive selection for unpreferred codon usage in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    Galagan James E

    2007-07-01

    Full Text Available Background: Natural selection has traditionally been understood as a force responsible for pushing genes to states of higher translational efficiency, whereas lower translational efficiency has been explained by neutral mutation and genetic drift. We looked for evidence of directional selection resulting in increased unpreferred codon usage (and presumably reduced translational efficiency) in three divergent clusters of eukaryotic genomes using a simple optimal-codon-based metric (Kp/Ku). Results: Here we show that for some genes natural selection is indeed responsible for causing accelerated unpreferred codon substitution, and document the scope of this selection. In Cryptococcus and to a lesser extent Drosophila, we find many genes showing a statistically significant signal of selection for unpreferred codon usage in one or more lineages. We did not find evidence for this type of selection in Saccharomyces. The signal of positive selection observed from unpreferred synonymous codon substitutions is coincident in Cryptococcus and Drosophila with the distribution of upstream open reading frames (uORFs), another genic feature known to reduce translational efficiency. Functional enrichment analysis of genes exhibiting low Kp/Ku ratios reveals that genes in regulatory roles are particularly subject to this type of selection. Conclusion: Through genome-wide scans, we find recent selection for unpreferred codon usage at approximately 1% of genetic loci in a Cryptococcus and several genes in Drosophila. Unpreferred codons can impede translation efficiency, and we find that genes with translation-impeding uORFs are enriched for this selection signal. We find that regulatory genes are particularly likely to be subject to selection for unpreferred codon usage.
Given that expression noise can propagate through regulatory cascades, and that low translational efficiency can reduce expression noise, this finding supports the hypothesis that translational

  19. Integrating genomic selection into dairy cattle breeding programmes: a review.

    Science.gov (United States)

    Bouquet, A; Juga, J

    2013-05-01

    Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. 
However

  20. Sunflower Hybrid Breeding: From Markers to Genomic Selection.

    Science.gov (United States)

    Dimitrijevic, Aleksandra; Horn, Renate

    2017-01-01

    In sunflower, molecular markers for simple traits as, e.g., fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotype-by-sequencing, and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive toward other oil crops higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare.
Integrative approaches

  1. Sunflower Hybrid Breeding: From Markers to Genomic Selection

    Directory of Open Access Journals (Sweden)

    Aleksandra Dimitrijevic

    2018-01-01

    Full Text Available In sunflower, molecular markers for simple traits as, e.g., fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotype-by-sequencing, and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive toward other oil crops higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare

  2. Sunflower Hybrid Breeding: From Markers to Genomic Selection

    Science.gov (United States)

    Dimitrijevic, Aleksandra; Horn, Renate

    2018-01-01

    In sunflower, molecular markers for simple traits as, e.g., fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotype-by-sequencing, and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive toward other oil crops higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare. Integrative approaches

  3. Single-Molecule FISH Reveals Non-selective Packaging of Rift Valley Fever Virus Genome Segments

    NARCIS (Netherlands)

    Wichgers Schreur, Paul J.; Kortekaas, Jeroen

    2016-01-01

    The bunyavirus genome comprises a small (S), medium (M), and large (L) RNA segment of negative polarity. Although genome segmentation confers evolutionary advantages by enabling genome reassortment events with related viruses, genome segmentation also complicates genome replication and packaging.

  4. The draft genome of Tibetan hulless barley reveals adaptive patterns to the high stressful Tibetan Plateau.

    Science.gov (United States)

    Zeng, Xingquan; Long, Hai; Wang, Zhuo; Zhao, Shancen; Tang, Yawei; Huang, Zhiyong; Wang, Yulin; Xu, Qijun; Mao, Likai; Deng, Guangbing; Yao, Xiaoming; Li, Xiangfeng; Bai, Lijun; Yuan, Hongjun; Pan, Zhifen; Liu, Renjian; Chen, Xin; WangMu, QiMei; Chen, Ming; Yu, Lili; Liang, Junjun; DunZhu, DaWa; Zheng, Yuan; Yu, Shuiyang; LuoBu, ZhaXi; Guang, Xuanmin; Li, Jiang; Deng, Cao; Hu, Wushu; Chen, Chunhai; TaBa, XiongNu; Gao, Liyun; Lv, Xiaodan; Abu, Yuval Ben; Fang, Xiaodong; Nevo, Eviatar; Yu, Maoqun; Wang, Jun; Tashi, Nyima

    2015-01-27

    The Tibetan hulless barley (Hordeum vulgare L. var. nudum), also called "Qingke" in Chinese and "Ne" in Tibetan, is the staple food for Tibetans and an important livestock feed in the Tibetan Plateau. The diploid nature and adaptation to diverse environments of the highland give it unique resources for genetic research and crop improvement. Here we produced a 3.89-Gb draft assembly of Tibetan hulless barley with 36,151 predicted protein-coding genes. Comparative analyses revealed the divergence times and synteny between barley and other representative Poaceae genomes. The expansion of the gene family related to stress responses was found in Tibetan hulless barley. Resequencing of 10 barley accessions uncovered high levels of genetic variation in Tibetan wild barley and genetic divergence between Tibetan and non-Tibetan barley genomes. Selective sweep analyses demonstrate adaptive correlations of genes under selection with extensive environmental variables. Our results not only construct a genomic framework for crop improvement but also provide evolutionary insights of highland adaptation of Tibetan hulless barley.

  5. Review. Promises, pitfalls and challenges of genomic selection in breeding programs

    Energy Technology Data Exchange (ETDEWEB)

    Ibanez-Escriche, N.; Gonzalez-Recio, O.

    2011-07-01

    The aim of this work was to review the main challenges and pitfalls of the implementation of genomic selection in the breeding programs of different livestock species. Genomic selection is now one of the main challenges in animal breeding and genetics. Its application could considerably increase the genetic gain in traits of interest. However, the success of its practical implementation depends on the selection scheme characteristics, and these must be studied for each particular case. In dairy cattle, especially in Holsteins, genomic selection is a reality. However, in other livestock species (beef cattle, small ruminants, monogastrics and fish) genomic selection has mainly been used experimentally. The main limitation for its implementation in the mentioned livestock species is the high genotyping costs compared to the low selection value of the candidate. Nevertheless, nowadays the possibility of using single-nucleotide polymorphism (SNP) chips of low density to make genomic selection applications economically feasible is under study. Economic studies may optimize the benefits of genomic selection (GS) to include new traits in the breeding goals. It is evident that genomic selection offers great potential; however, a suitable genotyping strategy and recording system for each case is needed in order to properly exploit it. (Author) 50 refs.

  6. Will genomic selection be a practical method for plant breeding?

    OpenAIRE

    Nakaya, Akihiro; Isobe, Sachiko N.

    2012-01-01

    Background Genomic selection or genome-wide selection (GS) has been highlighted as a new approach for marker-assisted selection (MAS) in recent years. GS is a form of MAS that selects favourable individuals based on genomic estimated breeding values. Previous studies have suggested the utility of GS, especially for capturing small-effect quantitative trait loci, but GS has not become a popular methodology in the field of plant breeding, possibly because there is insufficient information avail...

  7. Genome size variation affects song attractiveness in grasshoppers: evidence for sexual selection against large genomes.

    Science.gov (United States)

    Schielzeth, Holger; Streitner, Corinna; Lampe, Ulrike; Franzke, Alexandra; Reinhold, Klaus

    2014-12-01

    Genome size is largely uncorrelated to organismal complexity and adaptive scenarios. Genetic drift as well as intragenomic conflict have been put forward to explain this observation. We here study the impact of genome size on sexual attractiveness in the bow-winged grasshopper Chorthippus biguttulus. Grasshoppers show particularly large variation in genome size due to the high prevalence of supernumerary chromosomes that are considered (mildly) selfish, as evidenced by non-Mendelian inheritance and fitness costs if present in high numbers. We ranked male grasshoppers by song characteristics that are known to affect female preferences in this species and scored genome sizes of attractive and unattractive individuals from the extremes of this distribution. We find that attractive singers have significantly smaller genomes, demonstrating that genome size is reflected in male courtship songs and that females prefer songs of males with small genomes. Such a genome size dependent mate preference effectively selects against selfish genetic elements that tend to increase genome size. The data therefore provide a novel example of how sexual selection can reinforce natural selection and can act as an agent in an intragenomic arms race. Furthermore, our findings indicate an underappreciated route of how choosy females could gain indirect benefits. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.

  8. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata

    Directory of Open Access Journals (Sweden)

    Meghann K. Devlin-Durante

    2017-11-01

    Full Text Available The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  9. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata.

    Science.gov (United States)

    Devlin-Durante, Meghann K; Baums, Iliana B

    2017-01-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  10. A primer on high-throughput computing for genomic selection.

    Science.gov (United States)

    Wu, Xiao-Lin; Beissinger, Timothy M; Bauck, Stewart; Woodward, Brent; Rosa, Guilherme J M; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

    2011-01-01

    High-throughput computing (HTC) uses computer clusters to solve advanced computational problems, with the goal of accomplishing high throughput over relatively long periods of time. In genomic selection, for example, a set of markers covering the entire genome is used to train a model based on known data, and the resulting model is used to predict the genetic merit of selection candidates. Sophisticated models are very computationally demanding and, with several traits to be evaluated sequentially, computing time is long, and output is low. In this paper, we present scenarios and basic principles of how HTC can be used in genomic selection, implemented using various techniques from simple batch processing to pipelining in distributed computer clusters. Various scripting languages, such as shell scripting, Perl, and R, are also very useful to devise pipelines. By pipelining, we can reduce total computing time and consequently increase throughput. In comparison to the traditional data processing pipeline residing on the central processors, performing general-purpose computation on a graphics processing unit provides a new-generation approach to massive parallel computing in genomic selection. While the concept of HTC may still be new to many researchers in animal breeding, plant breeding, and genetics, HTC infrastructures have already been built in many institutions, such as the University of Wisconsin-Madison, which can be leveraged for genomic selection, in terms of central processing unit capacity, network connectivity, storage availability, and middleware connectivity. Exploring existing HTC infrastructures as well as general-purpose computing environments will further expand our capability to meet increasing computing demands posed by unprecedented genomic data that we have today. We anticipate that HTC will impact genomic selection via better statistical models, faster solutions, and more competitive products (e.g., from design of marker panels to realized
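The simplest scenario the primer mentions, batch processing of several traits that would otherwise be evaluated one after another, can be sketched as below. This is an illustrative assumption rather than anything taken from the paper: `fit_trait` is a hypothetical toy stand-in for a per-trait model fit (a single-marker least-squares slope), and a local thread pool stands in for a cluster scheduler that would dispatch each job to its own node.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_trait(job):
    """Toy per-trait job: least-squares slope of the phenotype on one marker.

    A real genomic selection model would be far heavier; this only marks
    where that work would go.
    """
    trait, genotypes, phenotypes = job
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    return trait, sxy / sxx


def run_batch(jobs, max_workers=4):
    """Submit independent trait jobs concurrently and collect {trait: estimate}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fit_trait, jobs))


if __name__ == "__main__":
    jobs = [
        ("trait_a", [0, 1, 2], [1.0, 2.0, 3.0]),
        ("trait_b", [0, 1, 2], [0.0, 2.0, 4.0]),
    ]
    print(run_batch(jobs))
```

Because the trait jobs share no state, they can be scheduled in any order, which is the property that makes this kind of workload suited to batch submission.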

  11. Recent and ongoing selection in the human genome

    DEFF Research Database (Denmark)

    Nielsen, Rasmus; Hellmann, Ines; Hubisz, Melissa

    2007-01-01

    The recent availability of genome-scale genotyping data has led to the identification of regions of the human genome that seem to have been targeted by selection. These findings have increased our understanding of the evolutionary forces that affect the human genome, have augmented our knowledge...... of gene function and promise to increase our understanding of the genetic basis of disease. However, inferences of selection are challenged by several confounding factors, especially the complex demographic history of human populations, and concordance between studies is variable. Although such studies...

  12. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    Genomic selection is widely used in both animal and plant species, however, it is performed with no input from known genomic or biological role of genetic variants and therefore is a black box approach in a genomic era. This study investigated the role of different genomic regions and detected QTLs...... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP...

  13. Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy

    Science.gov (United States)

    Jia, Yi; Jannink, Jean-Luc

    2012-01-01

    Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes are not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored. PMID:23086217

  14. Identification of genomic variants putatively targeted by selection during dog domestication.

    Science.gov (United States)

    Cagan, Alex; Blass, Torsten

    2016-01-12

    Dogs (Canis lupus familiaris) were the first animal species to be domesticated and continue to occupy an important place in human societies. Recent studies have begun to reveal when and where dog domestication occurred. While much progress has been made in identifying the genetic basis of phenotypic differences between dog breeds, we still know relatively little about the genetic changes underlying the phenotypes that differentiate all dogs from their wild progenitors, wolves (Canis lupus). In particular, dogs generally show reduced aggression and fear towards humans compared to wolves. Therefore, selection for tameness was likely a necessary prerequisite for dog domestication. With the increasing availability of whole-genome sequence data it is possible to try and directly identify the genetic variants contributing to the phenotypic differences between dogs and wolves. We analyse the largest available database of genome-wide polymorphism data in a global sample of 69 dogs and 7 wolves. We perform a scan to identify regions of the genome that are highly differentiated between dogs and wolves. We identify putatively functional genomic variants that are segregating or at high frequency (Fst ≥ 0.75) for alternative alleles between dogs and wolves. A biological pathways analysis of the genes containing these variants suggests that there has been selection on the 'adrenaline and noradrenaline biosynthesis pathway', well known for its involvement in the fight-or-flight response. We identify 11 genes with putatively functional variants fixed for alternative alleles between dogs and wolves. The segregating variants in these genes are strong candidates for having been targets of selection during early dog domestication. We present the first genome-wide analysis of the different categories of putatively functional variants that are fixed or segregating at high frequency between a global sampling of dogs and wolves. We find evidence that selection has been strongest

  15. A Primer on High-Throughput Computing for Genomic Selection

    Directory of Open Access Journals (Sweden)

    Xiao-Lin Wu

    2011-02-01

    Full Text Available High-throughput computing (HTC) uses computer clusters to solve advanced computational problems, with the goal of accomplishing high throughput over relatively long periods of time. In genomic selection, for example, a set of markers covering the entire genome is used to train a model based on known data, and the resulting model is used to predict the genetic merit of selection candidates. Sophisticated models are very computationally demanding and, with several traits to be evaluated sequentially, computing time is long and output is low. In this paper, we present scenarios and basic principles of how HTC can be used in genomic selection, implemented using various techniques from simple batch processing to pipelining in distributed computer clusters. Various scripting languages, such as shell scripting, Perl and R, are also very useful to devise pipelines. By pipelining, we can reduce total computing time and consequently increase throughput. In comparison to the traditional data processing pipeline residing on the central processors, performing general purpose computation on a graphics processing unit (GPU) provides a new-generation approach to massive parallel computing in genomic selection. While the concept of HTC may still be new to many researchers in animal breeding, plant breeding, and genetics, HTC infrastructures have already been built in many institutions, such as the University of Wisconsin-Madison, which can be leveraged for genomic selection, in terms of central processing unit (CPU) capacity, network connectivity, storage availability, and middleware connectivity. Exploring existing HTC infrastructures as well as general purpose computing environments will further expand our capability to meet increasing computing demands posed by unprecedented genomic data that we have today. We anticipate that HTC will impact genomic selection via better statistical models, faster solutions, and more competitive products (e.g., from design of

  16. Selection on Optimal Haploid Value Increases Genetic Gain and Preserves More Genetic Diversity Relative to Genomic Selection.

    Science.gov (United States)

    Daetwyler, Hans D; Hayden, Matthew J; Spangenberg, German C; Hayes, Ben J

    2015-08-01

    Doubled haploids are routinely created and phenotypically selected in plant breeding programs to accelerate the breeding cycle. Genomic selection, which makes use of both phenotypes and genotypes, has been shown to further improve genetic gain through prediction of performance before or without phenotypic characterization of novel germplasm. Additional opportunities exist to combine genomic prediction methods with the creation of doubled haploids. Here we propose an extension to genomic selection, optimal haploid value (OHV) selection, which predicts the best doubled haploid that can be produced from a segregating plant. This method focuses selection on the haplotype and optimizes the breeding program toward its end goal of generating an elite fixed line. We rigorously tested OHV selection breeding programs, using computer simulation, and show that it results in up to 0.6 standard deviations more genetic gain than genomic selection. At the same time, OHV selection preserved a substantially greater amount of genetic diversity in the population than genomic selection, which is important to achieve long-term genetic gain in breeding populations. Copyright © 2015 by the Genetics Society of America.
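
    The haplotype-focused scoring described above can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the genome is split into contiguous haplotype segments, that each of a plant's two haplotypes is scored per segment by summing estimated marker effects, and that the better haplotype per segment is kept and the total doubled because a doubled haploid carries two copies. The function names, the marker effects, and the segment layout are hypothetical.

```python
import numpy as np

def optimal_haploid_value(hap_a, hap_b, effects, n_segments):
    """Sketch of an OHV-style score for one segregating plant.

    hap_a, hap_b : 0/1 allele arrays for the plant's two haplotypes.
    effects      : estimated marker effects (hypothetical values here).
    n_segments   : number of contiguous haplotype segments (an assumption).
    """
    hap_a = np.asarray(hap_a, dtype=float)
    hap_b = np.asarray(hap_b, dtype=float)
    effects = np.asarray(effects, dtype=float)
    total = 0.0
    # Split the marker indices into contiguous segments and keep,
    # for each segment, the haplotype with the larger summed effect.
    for idx in np.array_split(np.arange(effects.size), n_segments):
        value_a = float(hap_a[idx] @ effects[idx])
        value_b = float(hap_b[idx] @ effects[idx])
        total += max(value_a, value_b)
    # A doubled haploid is homozygous, so the best haplotype counts twice.
    return 2.0 * total

def genomic_breeding_value(hap_a, hap_b, effects):
    """Conventional additive value: sum of effects over both haplotypes."""
    hap_a = np.asarray(hap_a, dtype=float)
    hap_b = np.asarray(hap_b, dtype=float)
    return float((hap_a + hap_b) @ np.asarray(effects, dtype=float))

# Toy plant with four markers in two segments (all values hypothetical).
effects = [1.0, -0.5, 0.2, 0.8]
hap_a = [1, 0, 0, 0]
hap_b = [0, 1, 1, 1]
ohv = optimal_haploid_value(hap_a, hap_b, effects, n_segments=2)
gebv = genomic_breeding_value(hap_a, hap_b, effects)
```

    In this toy case the plant's additive value is modest, while its best possible doubled haploid combines the favourable segment from each haplotype, which is the contrast the abstract draws between the two selection criteria.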

  17. [Genomic selection of milk cattle. The practical application over five years].

    Science.gov (United States)

    Smaragdov, M G

    2013-11-01

    Genomic selection is a method that uses single nucleotide polymorphisms (SNPs) as markers for estimating the genotypic values of animals or plants. The review describes genomic selection in dairy cattle 5 years after the design of dense SNP chips. References to the application of genomic selection to other animal and plant species are given. The main principles of constructing linear and nonlinear mathematical models that allow one to determine genomic estimates in animals are briefly described. Particular attention is paid to the accuracy and additivity of genomic estimates, as well as to the prospective use of various genomic selection schemes that take them into account over dozens of generations. Information on international organizations that consolidate genomic information from different countries in order to build global reference populations of dairy cattle is reported. The results of the practical application of genomic selection to estimating the breeding value of dairy cattle over 5 years are presented in a table, which makes it possible to visually assess the achievements of this high-technology field of cattle breeding.

  18. Genomic Selection for Drought Tolerance Using Genome-Wide SNPs in Maize

    Directory of Open Access Journals (Sweden)

    Thirunavukkarasu Nepolean

    2017-04-01

    Full Text Available Traditional breeding strategies for selecting superior genotypes depending on phenotypic traits have proven to be of limited success, as this direct selection is hindered by low heritability, genetic interactions such as epistasis, environmental-genotype interactions, and polygenic effects. With the advent of new genomic tools, breeders have paved a way for selecting superior breeds. Genomic selection (GS) has emerged as one of the most important approaches for predicting genotype performance. Here, we tested the breeding values of 240 maize subtropical lines phenotyped for drought at different environments using 29,619 cured SNPs. Prediction accuracies of seven genomic selection models (ridge regression, LASSO, elastic net, random forest, reproducing kernel Hilbert space, Bayes A and Bayes B) were tested for their agronomic traits. Though prediction accuracies of Bayes B, Bayes A and RKHS were comparable, Bayes B outperformed the other models by achieving the highest Pearson correlation coefficient in all three environments. From Bayes B, a set of the top 1053 significant SNPs with higher marker effects was selected across all datasets to validate the genes and QTLs. Out of these 1053 SNPs, 77 SNPs were associated with 10 drought-responsive transcription factors. These transcription factors were associated with different physiological and molecular functions (stomatal closure, root development, hormonal signaling and photosynthesis). Of several models, Bayes B has been shown to have the highest level of prediction accuracy for our data sets. Our experiments also highlighted several SNPs based on their performance and relative importance to drought tolerance. The result of our experiments is important for the selection of superior genotypes and candidate genes for breeding drought-tolerant maize hybrids.
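
    Prediction accuracy in studies like this one is typically reported as the Pearson correlation between predicted and observed values in held-out lines. The sketch below shows that measurement for ridge regression only, on simulated data. It is not the authors' pipeline: the marker count, the effect sizes, the penalty `lam`, and the single train/held-out split are all assumptions chosen for illustration (only the figure of 240 lines comes from the abstract).

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression on centered data: solve (X'X + lam*I) b = X'y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, x_mean, y_mean

def ridge_predict(X, beta, x_mean, y_mean):
    """Predict phenotypes for new genotypes using the fitted effects."""
    return (X - x_mean) @ beta + y_mean

def prediction_accuracy(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Simulated data: 240 lines, 500 SNPs coded 0/1/2 (SNP count is a toy value).
rng = np.random.default_rng(0)
n_lines, n_snps = 240, 500
X = rng.integers(0, 3, size=(n_lines, n_snps)).astype(float)
true_effects = rng.normal(0.0, 0.1, n_snps)          # hypothetical effects
y = X @ true_effects + rng.normal(0.0, 1.0, n_lines)  # phenotype with noise

# One train / held-out split; real studies usually repeat this (cross-validation).
train, held_out = np.arange(0, 192), np.arange(192, 240)
beta, x_mean, y_mean = ridge_fit(X[train], y[train], lam=50.0)
accuracy = prediction_accuracy(
    y[held_out], ridge_predict(X[held_out], beta, x_mean, y_mean)
)
```

    Comparing models such as those listed in the abstract would amount to swapping the fit and predict steps while keeping the same held-out lines and the same correlation measure.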

  19. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...

  20. Parallel altitudinal clines reveal trends in adaptive evolution of genome size in Zea mays

    Science.gov (United States)

    Berg, Jeremy J.; Birchler, James A.; Grote, Mark N.; Lorant, Anne; Quezada, Juvenal

    2018-01-01

    While the vast majority of genome size variation in plants is due to differences in repetitive sequence, we know little about how selection acts on repeat content in natural populations. Here we investigate parallel changes in intraspecific genome size and repeat content of domesticated maize (Zea mays) landraces and their wild relative teosinte across altitudinal gradients in Mesoamerica and South America. We combine genotyping, low coverage whole-genome sequence data, and flow cytometry to test for evidence of selection on genome size and individual repeat abundance. We find that population structure alone cannot explain the observed variation, implying that clinal patterns of genome size are maintained by natural selection. Our modeling additionally provides evidence of selection on individual heterochromatic knob repeats, likely due to their large individual contribution to genome size. To better understand the phenotypes driving selection on genome size, we conducted a growth chamber experiment using a population of highland teosinte exhibiting extensive variation in genome size. We find weak support for a positive correlation between genome size and cell size, but stronger support for a negative correlation between genome size and the rate of cell production. Reanalyzing published data of cell counts in maize shoot apical meristems, we then identify a negative correlation between cell production rate and flowering time. Together, our data suggest a model in which variation in genome size is driven by natural selection on flowering time across altitudinal clines, connecting intraspecific variation in repetitive sequence to important differences in adaptive phenotypes. PMID:29746459

  1. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
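
    The 30 % read frequency cut-off reported in the Results can be pictured as a simple filter. This sketch assumes that read frequency means the fraction of reads at a position that support the variant allele; the record layout, the positions, and the read counts are hypothetical and do not come from the study's in-house pipeline.

```python
def passes_read_frequency(alt_reads, total_reads, cutoff=0.30):
    """Return True when the variant allele is supported by at least
    `cutoff` of the reads covering the position (assumed definition)."""
    if total_reads <= 0:
        return False
    return alt_reads / total_reads >= cutoff

# Hypothetical variant calls: position, reads supporting the variant, depth.
variants = [
    {"pos": 1001, "alt_reads": 12, "total_reads": 100},  # 12 %: rejected
    {"pos": 2002, "alt_reads": 45, "total_reads": 120},  # 37.5 %: kept
    {"pos": 3003, "alt_reads": 60, "total_reads": 100},  # 60 %: kept
]
kept = [
    v["pos"]
    for v in variants
    if passes_read_frequency(v["alt_reads"], v["total_reads"])
]
```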

  2. Maximizing Crossbred Performance through Purebred Genomic Selection

    DEFF Research Database (Denmark)

    Esfandyari, Hadi; Sørensen, Anders Christian; Bijma, Pieter

    Genomic selection (GS) can be used to select purebreds for crossbred performance (CP). As dominance is the likely genetic basis of heterosis, explicitly including dominance in the GS model may be beneficial for selection of purebreds for CP, when estimating allelic effects from pure line data. Th...

  3. Does genomic selection have a future in plant breeding?

    Science.gov (United States)

    Jonas, Elisabeth; de Koning, Dirk-Jan

    2013-09-01

    Plant breeding largely depends on phenotypic selection in plots and only for some, often disease-resistance-related traits, uses genetic markers. The more recently developed concept of genomic selection, using a black box approach with no need of prior knowledge about the effect or function of individual markers, has also been proposed as a great opportunity for plant breeding. Several empirical and theoretical studies have focused on the possibility to implement this as a novel molecular method across various species. Although we do not question the potential of genomic selection in general, in this Opinion, we emphasize that genomic selection approaches from dairy cattle breeding cannot be easily applied to complex plant breeding. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Directory of Open Access Journals (Sweden)

    Zhang Wei

    2009-09-01

    Full Text Available Abstract Background Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on

  5. Practical Approaches for Detecting Selection in Microbial Genomes

    OpenAIRE

    Hedge, Jessica; Wilson, Daniel J.

    2016-01-01

    Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers th...

  6. Signatures of selection in the Iberian honey bee (Apis mellifera iberiensis) revealed by a genome scan analysis of single nucleotide polymorphisms.

    Science.gov (United States)

    Chávez-Galarza, Julio; Henriques, Dora; Johnston, J Spencer; Azevedo, João C; Patton, John C; Muñoz, Irene; De la Rúa, Pilar; Pinto, M Alice

    2013-12-01

    Understanding the genetic mechanisms of adaptive population divergence is one of the most fundamental endeavours in evolutionary biology and is becoming increasingly important as it will allow predictions about how organisms will respond to global environmental crisis. This is particularly important for the honey bee, a species of unquestionable ecological and economical importance that has been exposed to increasing human-mediated selection pressures. Here, we conducted a single nucleotide polymorphism (SNP)-based genome scan in honey bees collected across an environmental gradient in Iberia and used four FST-based outlier tests to identify genomic regions exhibiting signatures of selection. Additionally, we analysed associations between genetic and environmental data for the identification of factors that might be correlated or act as selective pressures. With these approaches, 4.4% (17 of 383) of outlier loci were cross-validated by four FST-based methods, and 8.9% (34 of 383) were cross-validated by at least three methods. Of the 34 outliers, 15 were found to be strongly associated with one or more environmental variables. Further support for selection, provided by functional genomic information, was particularly compelling for SNP outliers mapped to different genes putatively involved in the same function such as vision, xenobiotic detoxification and innate immune response. This study enabled a more rigorous consideration of selection as the underlying cause of diversity patterns in Iberian honey bees, representing an important first step towards the identification of polymorphisms implicated in local adaptation and possibly in response to recent human-mediated environmental changes. © 2013 John Wiley & Sons Ltd.
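
    Cross-validating outliers across several tests, as described above, comes down to counting how many methods flag each locus. The sketch below uses made-up method names and SNP labels purely for illustration; only the final two percentages are recomputed from the counts given in the abstract (17, 34, and 383).

```python
from collections import Counter

def loci_supported_by(method_hits, min_methods):
    """Return the loci flagged as outliers by at least `min_methods` tests."""
    counts = Counter(
        locus for hits in method_hits.values() for locus in set(hits)
    )
    return sorted(locus for locus, n in counts.items() if n >= min_methods)

# Hypothetical outlier sets from four tests (names and loci are invented).
method_hits = {
    "method_1": {"snp_a", "snp_b", "snp_c"},
    "method_2": {"snp_a", "snp_b"},
    "method_3": {"snp_a", "snp_b", "snp_d"},
    "method_4": {"snp_a", "snp_d"},
}
all_four = loci_supported_by(method_hits, 4)
at_least_three = loci_supported_by(method_hits, 3)

# Percentages quoted in the abstract, recomputed from its counts.
pct_all_four = round(100 * 17 / 383, 1)
pct_at_least_three = round(100 * 34 / 383, 1)
```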

  7. Comparative genomics of four closely related Clostridium perfringens bacteriophages reveals variable evolution among core genes with therapeutic potential

    Directory of Open Access Journals (Sweden)

    Siragusa Gregory R

    2011-06-01

    Full Text Available Abstract Background Because biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context, we sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricultural and human pathogen. Results Phage whole-genome tetra-nucleotide signatures and proteomic tree topologies correlated closely with host phylogeny. Comparisons of our phage genomes to 26 others revealed three shared COGs; of particular interest within this core genome was an endolysin (PF01520, an N-acetylmuramoyl-L-alanine amidase) and a holin (PF04531). Comparative analyses of the evolutionary history and genomic context of these common phage proteins revealed two important results: (1) strongly significant host-specific sequence variation within the endolysin, and (2) a protein domain architecture apparently unique to our phage genomes in which the endolysin is located upstream of its associated holin. Endolysin sequences from our phages were one of two very distinct genotypes distinguished by variability within the putative enzymatically-active domain. The shared or core genome was comprised of genes with multiple sequence types belonging to five pfam families, and genes belonging to 12 pfam families, including the holin genes, which were nearly identical. Conclusions Significant genomic diversity exists even among closely-related bacteriophages. Holins and endolysins represent conserved functions across divergent phage genomes and, as we demonstrate here, endolysins can have significant variability and host-specificity even among closely-related genomes. Endolysins in our phage genomes may be subject to different selective pressures than the rest of the genome. These findings may have important implications for potential biotechnological applications of phage gene products.

  8. Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species.

    Science.gov (United States)

    Chen, Ze-Hui; Zhang, Min; Lv, Feng-Hua; Ren, Xue; Li, Wen-Rong; Liu, Ming-Jun; Nam, Kiwoong; Bruford, Michael W; Li, Meng-Hua

    2018-04-01

    Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X-chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world's sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05-79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X:autosome (X/A) diversity ratio and contrasting to the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on the X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep's recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on the sheep X-chromosome.
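
    The neutral expectation of 0.75 quoted in this abstract follows from counting chromosome copies. The short derivation below assumes equal numbers of breeding males and females, the same mutation rate on the X chromosome and the autosomes, and no selection; here N_e is the diploid effective population size and \mu the per-site mutation rate per generation (symbols introduced for this illustration, not taken from the abstract).

```latex
\begin{align*}
\text{copies per breeding pair:}\quad & 4 \text{ autosomal}, \quad 3 \text{ X-linked (two in the female, one in the male)} \\
\pi_A &= 4 N_e \mu, \qquad \pi_X = 3 N_e \mu \\
\frac{\pi_X}{\pi_A} &= \frac{3 N_e \mu}{4 N_e \mu} = \frac{3}{4} = 0.75
\end{align*}
```

    An observed ratio near 0.6 therefore indicates lower X-linked diversity than copy number alone would predict.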

  9. Genomic Selection Using Extreme Phenotypes and Pre-Selection of SNPs in Large Yellow Croaker (Larimichthys crocea).

    Science.gov (United States)

    Dong, Linsong; Xiao, Shijun; Chen, Junwei; Wan, Liang; Wang, Zhiyong

    2016-10-01

    Genomic selection (GS) is an effective method to improve the predictive accuracy of genetic values. However, the high cost of genotyping limits the application of this technology in some species, so methods that reduce genotyping costs in genomic selection are needed. Large yellow croaker is one of the most commercially important marine fish species in southeast China and Eastern Asia. In this study, genotyping-by-sequencing was used to construct libraries for next-generation sequencing (NGS), and 29,748 SNPs were identified in the genome. Two traits, eviscerated weight (EW) and the ratio of eviscerated weight to whole body weight (REW), were studied. Two strategies to reduce costs were proposed: selecting extreme phenotypes (EP) for genotyping in the reference population, or pre-selecting SNPs to construct low-density marker panels for the candidates. Three methods of pre-selecting SNPs were studied: by absolute effects (SE), by single marker analysis (SMA), and by fixed intervals of sequence number (EL). The results showed that using EP was a feasible way to reduce genotyping costs in the reference population. Heritability did not seem to have an obvious influence on the predictive abilities estimated by EP. Using SMA was the most feasible way to reduce genotyping costs in the candidates. In addition, the combination of EP and SMA also showed good results, especially for REW. We also describe how to apply the new methods in genomic selection and compare genotyping costs before and after their use. Our study may offer a reference not only for aquatic genomic breeding but also for genomic prediction in other species, including livestock and plants.
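
    Two of the cost-saving ideas in this abstract are easy to picture in code: genotyping only individuals from the phenotypic tails, and thinning a marker panel at fixed intervals. The sketch below is a rough illustration under assumptions of its own: the tail fraction, the interval, the function names, and the phenotype values are hypothetical, and the study's actual selection rules may differ.

```python
import numpy as np

def select_extreme_phenotypes(phenotypes, fraction):
    """Return sorted indices of the lowest and highest `fraction` of
    individuals, i.e. the two phenotypic tails chosen for genotyping."""
    phenotypes = np.asarray(phenotypes, dtype=float)
    n_tail = max(1, int(round(fraction * phenotypes.size)))
    order = np.argsort(phenotypes)  # indices from smallest to largest value
    return np.sort(np.concatenate([order[:n_tail], order[-n_tail:]]))

def select_snps_fixed_interval(n_snps, step):
    """Return the indices of every `step`-th SNP in sequence order."""
    return np.arange(0, n_snps, step)

# Hypothetical phenotypes for ten fish (e.g. a weight trait, arbitrary units).
phenotypes = [5.1, 3.2, 9.8, 4.4, 7.0, 1.5, 8.3, 6.6, 2.9, 5.9]
genotyped = select_extreme_phenotypes(phenotypes, fraction=0.2)
low_density_panel = select_snps_fixed_interval(n_snps=10, step=4)
```

    With a tail fraction of 0.2, four of the ten individuals would be genotyped, and the thinned panel keeps three of ten markers, which is where the cost saving comes from.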

  10. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra......- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping...... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...

  11. Assessing Predictive Properties of Genome-Wide Selection in Soybeans

    Directory of Open Access Journals (Sweden)

    Alencar Xavier

    2016-08-01

    Full Text Available Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max L. Merr.). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set.

  12. Assessing Predictive Properties of Genome-Wide Selection in Soybeans.

    Science.gov (United States)

    Xavier, Alencar; Muir, William M; Rainey, Katy Martin

    2016-08-09

    Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max L. Merr.). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set. Copyright © 2016 Xavier et al.

  13. Whole-Genome Resequencing of Experimental Populations Reveals Polygenic Basis of Egg-Size Variation in Drosophila melanogaster.

    Science.gov (United States)

    Jha, Aashish R; Miles, Cecelia M; Lippert, Nodia R; Brown, Christopher D; White, Kevin P; Kreitman, Martin

    2015-10-01

    Complete genome resequencing of populations holds great promise in deconstructing complex polygenic traits to elucidate molecular and developmental mechanisms of adaptation. Egg size is a classic adaptive trait in insects, birds, and other taxa, but its highly polygenic architecture has prevented high-resolution genetic analysis. We used replicated experimental evolution in Drosophila melanogaster and whole-genome sequencing to identify consistent signatures of polygenic egg-size adaptation. A generalized linear-mixed model revealed reproducible allele frequency differences between replicated experimental populations selected for large and small egg volumes at approximately 4,000 single nucleotide polymorphisms (SNPs). Several hundred distinct genomic regions contain clusters of these SNPs and have lower heterozygosity than the genomic background, consistent with selection acting on polymorphisms in these regions. These SNPs are also enriched among genes expressed in Drosophila ovaries and many of these genes have well-defined functions in Drosophila oogenesis. Additional genes regulating egg development, growth, and cell size show evidence of directional selection as genes regulating these biological processes are enriched for highly differentiated SNPs. Genetic crosses performed with a subset of candidate genes demonstrated that these genes influence egg size, at least in the large genetic background. These findings confirm the highly polygenic architecture of this adaptive trait, and suggest the involvement of many novel candidate genes in regulating egg size. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Maximizing crossbred performance through purebred genomic selection

    DEFF Research Database (Denmark)

    Esfandyari, Hadi; Sørensen, Anders Christian; Bijma, Piter

    2015-01-01

    Background In livestock production, many animals are crossbred, with two distinct advantages: heterosis and breed complementarity. Genomic selection (GS) can be used to select purebred parental lines for crossbred performance (CP). Dominance being the likely genetic basis of heterosis, explicitly...

  15. Performance of Genomic Selection in Mice

    OpenAIRE

    Legarra, Andrés; Robert-Granié, Christèle; Manfredi, Eduardo; Elsen, Jean-Michel

    2008-01-01

    Selection plans in plant and animal breeding are driven by genetic evaluation. Recent developments suggest using massive genetic marker information, known as “genomic selection.” There is little evidence of its performance, though. We empirically compared three strategies for selection: (1) use of pedigree and phenotypic information, (2) use of genomewide markers and phenotypic information, and (3) the combination of both. We analyzed four traits from a heterogeneous mouse population (http://...

  16. Practical Approaches for Detecting Selection in Microbial Genomes.

    Science.gov (United States)

    Hedge, Jessica; Wilson, Daniel J

    2016-02-01

    Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

  17. Integrated genomics of Mucorales reveals novel therapeutic targets

    Science.gov (United States)

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  18. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Full Text Available Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbp, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to their postulated sister Dikarya.

  19. Whole-genome resequencing of honeybee drones to detect genomic selection in a population managed for royal jelly.

    Science.gov (United States)

    Wragg, David; Marti-Marimon, Maria; Basso, Benjamin; Bidanel, Jean-Pierre; Labarthe, Emmanuelle; Bouchez, Olivier; Le Conte, Yves; Vignal, Alain

    2016-06-03

    Four main evolutionary lineages of A. mellifera have been described including eastern Europe (C) and western and northern Europe (M). Many apiculturists prefer bees from the C lineage due to their docility and high productivity. In France, the routine importation of bees from the C lineage has resulted in the widespread admixture of bees from the M lineage. The haplodiploid nature of the honeybee Apis mellifera, and its small genome size, permits affordable and extensive genomics studies. As a pilot study of a larger project to characterise French honeybee populations, we sequenced 60 drones sampled from two commercial populations managed for the production of honey and royal jelly. Results indicate a C lineage origin, whilst mitochondrial analysis suggests two drones originated from the O lineage. Analysis of heterozygous SNPs identified potential copy number variants near to genes encoding odorant binding proteins and several cytochrome P450 genes. Signatures of selection were detected using the hapFLK haplotype-based method, revealing several regions under putative selection for royal jelly production. The framework developed during this study will be applied to a broader sampling regime, allowing the genetic diversity of French honeybees to be characterised in detail.

  20. Genome-Wide Association Study Reveals Natural Variations Contributing to Drought Resistance in Crops

    Directory of Open Access Journals (Sweden)

    Hongwei Wang

    2017-06-01

    Full Text Available Crops are often cultivated in regions where they will face environmental adversities, resulting in substantial yield loss which can ultimately lead to food and societal problems. Thus, significant efforts have been made to breed stress tolerant cultivars in an attempt to minimize these problems and to produce more stability with respect to crop yields across broad geographies. Since stress tolerance is a complex and multi-genic trait, advancements with classical breeding approaches have been challenging. On the other hand, molecular breeding, which is based on transgenics, marker-assisted selection and genome editing technologies, holds great promise to enable farmers to better cope with these challenges. However, identification of the key genetic components underlying the trait is critical and will serve as the foundation for future crop genetic improvement. Recently, genome-wide association studies have made significant contributions to facilitate the discovery of natural variation contributing to stress tolerance in crops. From these studies, the identified loci can serve as targets for genomic selection or editing to enable the molecular design of new cultivars. Here, we summarize research progress on this issue and focus on the genetic basis of drought tolerance as revealed by genome-wide association studies and quantitative trait loci mapping. Although many favorable loci have been identified, elucidation of their molecular mechanisms contributing to increased stress tolerance still remains a challenge. Thus, continuous efforts are still required to functionally dissect this complex trait through comprehensive approaches, such as systems biology studies. It is expected that proper application of the acquired knowledge will enable the development of stress tolerant cultivars, allowing agricultural production to become more sustainable under dynamic environmental conditions.

  1. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

    Science.gov (United States)

    Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  2. Practical Approaches for Detecting Selection in Microbial Genomes.

    Directory of Open Access Journals (Sweden)

    Jessica Hedge

    2016-02-01

    Full Text Available Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

  3. Evaluation of genomic selection for replacement strategies using selection index theory.

    Science.gov (United States)

    Calus, M P L; Bijma, P; Veerkamp, R F

    2015-09-01

    Our objective was to investigate the economic effect of prioritizing heifers for replacement at the herd level based on genomic estimated breeding values, and to compute break-even genotyping costs across a wide range of scenarios. Specifically, we aimed to determine the optimal proportion of preselection based on parent average information for all scenarios considered. Considered replacement strategies include a range of different selection intensities by considering different numbers of heifers available for replacement (15-45 in a herd with 100 dairy cows) as well as different replacement rates (15-40%). Use of conventional versus sexed semen was considered, where the latter resulted in having twice as many heifers available for replacement. The baseline scenario relies on prioritization of replacement heifers based on parent average. The first alternative scenario involved genomic selection of heifers, considering that all heifers were genotyped. The benefits of genomic selection in this scenario were computed using a simple formula that only requires the number of lactating animals, the difference in accuracy between parent average and genomic selection (GS), and the selection intensity as input. When all heifers were genotyped, using GS for replacement of heifers was beneficial in most scenarios for current genotyping prices, provided some room exists for selection, in the sense that at least 2 more heifers are available than needed for replacement. In those scenarios, minimum break-even genotyping costs were equal to half the economic value of a standard deviation of the breeding goal. The second alternative scenario involved a preselection based on parent average, followed by GS among all the preselected heifers. It was in almost all cases beneficial to genotype all heifers when conventional semen was used (i.e., to do no preselection). The optimal proportion of preselection based on parent average was at least 0.63 when sexed semen was used. Use of sexed
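
    The record above mentions a formula whose inputs are the selection intensity and the accuracy difference between parent average and GS. The paper's own formula is not reproduced in this listing, so the following is only a minimal sketch built on the standard response-to-selection relation R = i × r × σ. The function names and the example numbers are hypothetical, and selection intensity is derived here by assuming truncation selection on a normally distributed criterion.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardized selection differential i for truncation selection.

    Assumes a normally distributed selection criterion and a proportion
    strictly between 0 and 1 (inv_cdf is undefined at the boundaries).
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(threshold) / proportion_selected


def extra_gain_sd(proportion_selected, acc_gs, acc_pa):
    """Extra response per selected heifer, in SD units of the breeding goal.

    Follows from R = i * r * sigma applied twice: i * (r_GS - r_PA).
    """
    return selection_intensity(proportion_selected) * (acc_gs - acc_pa)


# Hypothetical scenario: 30 heifers needed out of 45 available,
# with illustrative (not published) accuracies for GS and parent average.
gain = extra_gain_sd(30 / 45, acc_gs=0.70, acc_pa=0.45)
print(f"extra gain per selected heifer: {gain:.3f} SD of the breeding goal")
```

    Multiplying this gain by the economic value of one standard deviation of the breeding goal would give a monetary benefit that can be weighed against genotyping costs. How the paper aggregates it over the herd is not stated here.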

  4. Camelid genomes reveal evolution and adaptation to desert environments.

    Science.gov (United States)

    Wu, Huiguang; Guang, Xuanmin; Al-Fageeh, Mohamed B; Cao, Junwei; Pan, Shengkai; Zhou, Huanmin; Zhang, Li; Abutarboush, Mohammed H; Xing, Yanping; Xie, Zhiyuan; Alshanqeeti, Ali S; Zhang, Yanru; Yao, Qiulin; Al-Shomrani, Badr M; Zhang, Dong; Li, Jiang; Manee, Manee M; Yang, Zili; Yang, Linfeng; Liu, Yiyi; Zhang, Jilin; Altammami, Musaad A; Wang, Shenyuan; Yu, Lili; Zhang, Wenbin; Liu, Sanyang; Ba, La; Liu, Chunxia; Yang, Xukui; Meng, Fanhua; Wang, Shaowei; Li, Lu; Li, Erli; Li, Xueqiong; Wu, Kaifeng; Zhang, Shu; Wang, Junyi; Yin, Ye; Yang, Huanming; Al-Swailem, Abdulaziz M; Wang, Jun

    2014-10-21

    Bactrian camel (Camelus bactrianus), dromedary (Camelus dromedarius) and alpaca (Vicugna pacos) are economically important livestock. Although the Bactrian camel and dromedary are large, typically arid-desert-adapted mammals, alpacas are adapted to plateaus. Here we present high-quality genome sequences of these three species. Our analysis reveals the demographic history of these species since the Tortonian Stage of the Miocene and uncovers a striking correlation between large fluctuations in population size and geological time boundaries. Comparative genomic analysis reveals complex features related to desert adaptations, including fat and water metabolism, stress responses to heat, aridity, intense ultraviolet radiation and choking dust. Transcriptomic analysis of Bactrian camels further reveals unique osmoregulation, osmoprotection and compensatory mechanisms for water reservation underpinned by high blood glucose levels. We hypothesize that these physiological mechanisms represent kidney evolutionary adaptations to the desert environment. This study advances our understanding of camelid evolution and the adaptation of camels to arid-desert environments.

  5. Accuracy of genomic selection in European maize elite breeding populations.

    Science.gov (United States)

    Zhao, Yusheng; Gowda, Manje; Liu, Wenxin; Würschum, Tobias; Maurer, Hans P; Longin, Friedrich H; Ranc, Nicolas; Reif, Jochen C

    2012-03-01

    Genomic selection is a promising breeding strategy for rapid improvement of complex traits. The objective of our study was to investigate the prediction accuracy of genomic breeding values through cross validation. The study was based on experimental data of six segregating populations from a half-diallel mating design with 788 testcross progenies from an elite maize breeding program. The plants were intensively phenotyped in multi-location field trials and fingerprinted with 960 SNP markers. We used random regression best linear unbiased prediction in combination with fivefold cross validation. The prediction accuracy across populations was higher for grain moisture (0.90) than for grain yield (0.58). The accuracy of genomic selection realized for grain yield corresponds to the precision of phenotyping in unreplicated field trials in 3-4 locations. Because up to three generations per year are feasible in maize, selection gain per unit time is high and, consequently, genomic selection holds great promise for maize breeding programs.
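
    The combination of random regression BLUP and fivefold cross validation described above can be illustrated with a small sketch. It is not the study's pipeline. The data are simulated, the ridge penalty `lam` is a fixed hypothetical value rather than one derived from variance components, and accuracy is taken simply as the correlation between predicted and observed values, which may differ from the definition used in the paper.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_fit(X, y, lam):
    """Ridge estimates of marker effects on centred data: (X'X + lam*I) b = X'y."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    xtx = [[sum(r[j] * r[k] for r in Xc) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * (yi - y_mean) for r, yi in zip(Xc, y)) for j in range(p)]
    return y_mean, col_means, solve(xtx, xty)


def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


def cv_accuracy(X, y, lam=1.0, folds=5, seed=0):
    """Predict every individual from a model trained on the other folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    predicted = [0.0] * len(y)
    for f in range(folds):
        test = set(idx[f::folds])
        train = [i for i in idx if i not in test]
        y_mean, col_means, beta = ridge_fit(
            [X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            predicted[i] = y_mean + sum(
                b * (g - c) for b, g, c in zip(beta, X[i], col_means))
    return pearson(predicted, y)


def simulate(n=100, p=20, seed=1):
    """Simulated genotypes coded 0/1/2 and phenotypes with additive marker effects."""
    rng = random.Random(seed)
    effects = [rng.gauss(0.0, 1.0) for _ in range(p)]
    X = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0.0, 1.0) for row in X]
    return X, y


X, y = simulate()
print(f"fivefold cross-validated accuracy: {cv_accuracy(X, y):.2f}")
```

    Each individual is predicted exactly once by a model that never saw its phenotype, which is what keeps the reported correlation from being inflated by overfitting.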

  6. Long- and short-term selective forces on malaria parasite genomes

    KAUST Repository

    Nygaard, Sanne; Braunstein, Alexander; Malsen, Gareth; Van Dongen, Stijn; Gardner, Paul P.; Krogh, Anders; Otto, Thomas D.; Pain, Arnab; Berriman, Matthew; McAuliffe, Jon; Dermitzakis, Emmanouil T.; Jeffares, Daniel C.

    2010-01-01

    of these genomes. Although evolutionary processes have a significant impact on malaria control, the selective pressures within Plasmodium genomes are poorly understood, particularly in the non-protein-coding portion of the genome. We use evolutionary methods

  7. Genomic selection in maritime pine.

    Science.gov (United States)

    Isik, Fikret; Bartholomé, Jérôme; Farjat, Alfredo; Chancerel, Emilie; Raffin, Annie; Sanchez, Leopoldo; Plomion, Christophe; Bouffier, Laurent

    2016-01-01

    A two-generation maritime pine (Pinus pinaster Ait.) breeding population (n=661) was genotyped using 2500 SNP markers. The extent of linkage disequilibrium and utility of genomic selection for growth and stem straightness improvement were investigated. The overall intra-chromosomal linkage disequilibrium was r² = 0.01. Linkage disequilibrium corrected for genomic relationships derived from markers was smaller (rV² = 0.006). Genomic BLUP, Bayesian ridge regression and Bayesian LASSO regression statistical models were used to obtain genomic estimated breeding values. Two validation methods (random sampling 50% of the population and 10% of the progeny generation as validation sets) were used with 100 replications. The average predictive ability across statistical models and validation methods was about 0.49 for stem sweep, and 0.47 and 0.43 for total height and tree diameter, respectively. The sensitivity analysis suggested that prior densities (variance explained by markers) had little or no discernible effect on posterior means (residual variance) in Bayesian prediction models. Sampling from the progeny generation for model validation increased the predictive ability of markers for tree diameter and stem sweep but not for total height. The results are promising despite low linkage disequilibrium and low marker coverage of the genome (∼1.39 markers/cM).
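
    For readers unfamiliar with the linkage disequilibrium statistic quoted above, a common estimator for unphased SNP data is the squared correlation between allele counts at two loci. The sketch below shows only that uncorrected statistic. The relationship-corrected version reported in the abstract is not implemented, and the function name and toy genotypes are hypothetical.

```python
def ld_r_squared(g1, g2):
    """Squared Pearson correlation between allele counts (0/1/2) at two SNPs.

    Returns NaN when either marker is monomorphic, since the correlation
    is undefined without variation.
    """
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return float("nan")
    return cov * cov / (v1 * v2)


# Toy genotypes for four trees at three hypothetical SNPs.
snp_a = [0, 2, 0, 2]
snp_b = [0, 0, 2, 2]   # varies independently of snp_a in this sample
snp_c = [2, 0, 2, 0]   # mirror image of snp_a

print(ld_r_squared(snp_a, snp_b))
print(ld_r_squared(snp_a, snp_c))
```

    Averaging this value over all marker pairs on the same chromosome would give an overall intra-chromosomal figure comparable in kind to the one reported.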

  8. Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity.

    Directory of Open Access Journals (Sweden)

    Stephen R Doyle

    2017-07-01

    Full Text Available Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana, exposed to more than a decade of regular ivermectin treatment, have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread. Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR. This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. We argue that ivermectin response is a polygenically-determined quantitative trait (QT) whereby identical or related molecular pathways but not necessarily individual genes are likely to determine the extent of ivermectin response in different parasite populations. Furthermore, we propose that genetic

  9. Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity

    Science.gov (United States)

    Nana-Djeunga, Hugues C.; Kengne-Ouafo, Jonas A.; Pion, Sébastien D. S.; Bopda, Jean; Kamgno, Joseph; Wanji, Samuel; Che, Hua; Kuesel, Annette C.; Walker, Martin; Basáñez, Maria-Gloria; Boakye, Daniel A.; Osei-Atweneboana, Mike Y.; Boussinesq, Michel; Prichard, Roger K.; Grant, Warwick N.

    2017-01-01

    Background: Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana, exposed to more than a decade of regular ivermectin treatment, have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread. Methodology/Principal findings: Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR. Conclusions/Significance: This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. We argue that ivermectin response is a polygenically-determined quantitative trait (QT) whereby identical or related molecular pathways but not necessarily individual genes are likely to determine the extent of ivermectin response in different

  10. Commonalities in Development of Pure Breeds and Population Isolates Revealed in the Genome of the Sardinian Fonni's Dog

    Science.gov (United States)

    Dreger, Dayna L.; Davis, Brian W.; Cocco, Raffaella; Sechi, Sara; Di Cerbo, Alessandro; Parker, Heidi G.; Polli, Michele; Marelli, Stefano P.; Crepaldi, Paola; Ostrander, Elaine A.

    2016-01-01

    The island inhabitants of Sardinia have long been a focus for studies of complex human traits due to their unique ancestral background and population isolation reflecting geographic and cultural restriction. Population isolates share decreased genomic diversity, increased linkage disequilibrium, and increased inbreeding coefficients. In many regions, dogs and humans have been exposed to the same natural and artificial forces of environment, growth, and migration. Distinct dog breeds have arisen through human-driven selection of characteristics to meet an ideal standard of appearance and function. The Fonni’s Dog, an endemic dog population on Sardinia, has not been subjected to an intensive system of artificial selection, but rather has developed alongside the human population of Sardinia, influenced by geographic isolation and unregulated selection based on its environmental adaptation and aptitude for owner-desired behaviors. Through analysis of 28 dog breeds, represented with whole-genome sequences from 13 dogs and ∼170,000 genome-wide single nucleotide variants from 155 dogs, we have produced a genomic illustration of the Fonni’s Dog. Genomic patterns confirm within-breed similarity, while population and demographic analyses provide spatial identity of Fonni’s Dog to other Mediterranean breeds. Investigation of admixture and fixation indices reveals insights into the involvement of Fonni’s Dogs in breed development throughout the Mediterranean. We describe how characteristics of population isolates are reflected in dog breeds that have undergone artificial selection, and are mirrored in the Fonni’s Dog through traditional isolating factors that affect human populations. Lastly, we show that the genetic history of Fonni’s Dog parallels demographic events in local human populations. PMID:27519604

  11. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.

    Science.gov (United States)

    Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K

    2013-12-17

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.

  12. [Genomic selection and its application].

    Science.gov (United States)

    Li, Heng-De; Bao, Zhen-Min; Sun, Xiao-Wen

    2011-12-01

    Selective breeding is very important in agricultural production, and breeding value estimation is the core of selective breeding. With the development of genetic markers, especially high-throughput genotyping technology, it has become possible to estimate breeding values at the genome level, i.e. genomic selection (GS). In this review, the methods of GS are categorized into two groups: one predicts genomic estimated breeding values (GEBV) based on allele effects, such as least squares, random regression - best linear unbiased prediction (RR-BLUP), Bayes and principal component analysis; the other predicts GEBV with a genetic relationship matrix, which is constructed from high-throughput genetic markers and then used to predict GEBV through a linear mixed model, i.e. GBLUP. The basic principles of these methods are also introduced according to these two groups. Factors affecting GS accuracy include marker type and density, haplotype length, the size of the reference population, the extent of marker-QTL linkage, and so on. Among the methods of GS, Bayes and GBLUP are usually more accurate than the others, and least squares is the least accurate. GBLUP is time-efficient and can combine pedigree with genotypic information; hence it is superior to other methods. Although progress has been made in GS, some challenges remain, for example united breeding, long-term genetic gain with GS, and distinguishing markers that contribute to the traits from those that do not. GS has been applied in animal and plant breeding practice and also has the potential to predict genetic predisposition in humans and to study evolutionary dynamics. GS, which is more precise than the traditional method, is a breakthrough in measuring genetic relationships. Therefore, GS will be a revolutionary event in the history of animal and plant breeding.
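
    The second group of methods above starts from a marker-based genetic relationship matrix. The review abstract does not say which construction it has in mind, so the sketch below assumes one widely used form: genotypes coded 0/1/2 are centred by twice the allele frequency, and the cross-products are scaled by 2Σp(1−p). The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Marker-based relationship matrix G = Z Z' / (2 * sum p_j (1 - p_j)).

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2).
    Allele frequencies are estimated from the same sample. At least one
    marker must be polymorphic, otherwise the scaling term is zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Centre each marker by its expected allele count 2p.
    Z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(Z[i], Z[k])) / scale for k in range(n)]
            for i in range(n)]


# Three hypothetical individuals genotyped at three markers.
G = genomic_relationship_matrix([[0, 1, 2],
                                 [2, 1, 0],
                                 [1, 1, 1]])
for row in G:
    print([round(value, 3) for value in row])
```

    In a GBLUP analysis this matrix would take the place of the pedigree-based relationship matrix in the mixed model. Fitting that model needs variance component estimates and is beyond this sketch.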

  13. Minipig and beagle animal model genomes aid species selection in pharmaceutical discovery and development

    Energy Technology Data Exchange (ETDEWEB)

    Vamathevan, Jessica J., E-mail: jessica.j.vamathevan@gsk.com [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom); Hall, Matthew D.; Hasan, Samiul; Woollard, Peter M. [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom); Xu, Meng; Yang, Yulan; Li, Xin; Wang, Xiaoli [BGI-Shenzhen, Shenzhen (China); Kenny, Steve [Safety Assessment, PTS, GlaxoSmithKline, Ware (United Kingdom); Brown, James R. [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Collegeville, PA (United States); Huxley-Jones, Julie [UK Platform Technology Sciences (PTS) Operations and Planning, PTS, GlaxoSmithKline, Stevenage (United Kingdom); Lyon, Jon; Haselden, John [Safety Assessment, PTS, GlaxoSmithKline, Ware (United Kingdom); Min, Jiumeng [BGI-Shenzhen, Shenzhen (China); Sanseau, Philippe [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom)

    2013-07-15

    Improving drug attrition remains a challenge in pharmaceutical discovery and development. A major cause of early attrition is the demonstration of safety signals which can negate any therapeutic index previously established. Safety attrition needs to be put in context of clinical translation (i.e. human relevance) and is negatively impacted by differences between animal models and humans. In order to minimize such an impact, an earlier assessment of pharmacological target homology across animal model species will enhance understanding of the context of animal safety signals and aid species selection during later regulatory toxicology studies. Here we sequenced the genomes of the Sus scrofa Göttingen minipig and the Canis familiaris beagle, two widely used animal species in regulatory safety studies. Comparative analyses of these new genomes with other key model organisms, namely mouse, rat, cynomolgus macaque, rhesus macaque, two related breeds (S. scrofa Duroc and C. familiaris boxer) and human, reveal considerable variation in gene content. Key genes in toxicology and metabolism studies, such as the UGT2 family, CYP2D6, and SLCO1A2, displayed unique duplication patterns. Comparisons of 317 known human drug targets revealed surprising variation such as species-specific positive selection, duplication and higher occurrences of pseudogenized targets in beagle (41 genes) relative to minipig (19 genes). These data will facilitate the more effective use of animals in biomedical research.

    Highlights:
    • Genomes of the minipig and beagle dog, two species used in pharmaceutical studies.
    • First systematic comparative genome analysis of human and six experimental animals.
    • Key drug toxicology genes display unique duplication patterns across species.
    • Comparison of 317 drug targets show species-specific evolutionary patterns.

  14. Minipig and beagle animal model genomes aid species selection in pharmaceutical discovery and development

    International Nuclear Information System (INIS)

    Vamathevan, Jessica J.; Hall, Matthew D.; Hasan, Samiul; Woollard, Peter M.; Xu, Meng; Yang, Yulan; Li, Xin; Wang, Xiaoli; Kenny, Steve; Brown, James R.; Huxley-Jones, Julie; Lyon, Jon; Haselden, John; Min, Jiumeng; Sanseau, Philippe

    2013-01-01

    Improving drug attrition remains a challenge in pharmaceutical discovery and development. A major cause of early attrition is the demonstration of safety signals which can negate any therapeutic index previously established. Safety attrition needs to be put in context of clinical translation (i.e. human relevance) and is negatively impacted by differences between animal models and humans. In order to minimize such an impact, an earlier assessment of pharmacological target homology across animal model species will enhance understanding of the context of animal safety signals and aid species selection during later regulatory toxicology studies. Here we sequenced the genomes of the Sus scrofa Göttingen minipig and the Canis familiaris beagle, two widely used animal species in regulatory safety studies. Comparative analyses of these new genomes with other key model organisms, namely mouse, rat, cynomolgus macaque, rhesus macaque, two related breeds (S. scrofa Duroc and C. familiaris boxer) and human, reveal considerable variation in gene content. Key genes in toxicology and metabolism studies, such as the UGT2 family, CYP2D6, and SLCO1A2, displayed unique duplication patterns. Comparisons of 317 known human drug targets revealed surprising variation such as species-specific positive selection, duplication and higher occurrences of pseudogenized targets in beagle (41 genes) relative to minipig (19 genes). These data will facilitate the more effective use of animals in biomedical research.

    Highlights:
    • Genomes of the minipig and beagle dog, two species used in pharmaceutical studies.
    • First systematic comparative genome analysis of human and six experimental animals.
    • Key drug toxicology genes display unique duplication patterns across species.
    • Comparison of 317 drug targets show species-specific evolutionary patterns

  15. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    Science.gov (United States)

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  16. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    Directory of Open Access Journals (Sweden)

    Fagen Li

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of the E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  17. Genomic selection improves response to selection in resilience by exploiting genotype by environment interactions

    Directory of Open Access Journals (Sweden)

    Han Mulder

    2016-10-01

    Genotype by environment interactions (GxE) are very common in livestock and hamper genetic improvement. On the other hand, GxE is a source of genetic variation: genetic variation in response to environment, e.g. environmental perturbations such as heat stress or disease. In livestock breeding, there is a tendency to ignore GxE because of increased complexity of models for genetic evaluations and lack of accuracy in extreme environments. GxE, however, creates opportunities to increase resilience of animals towards environmental perturbations. The main aim of the paper is to investigate to what extent GxE can be exploited with traditional and genomic selection methods. Furthermore, we investigated the benefit of reaction norm models compared to conventional methods ignoring GxE. The questions were addressed with selection index theory. GxE was modelled according to a linear reaction norm model in which the environmental gradient is the contemporary group mean. Economic values were based on linear and non-linear profit equations. Accuracies of environment-specific (G)EBV were highest in intermediate environments and lowest in extreme environments. Reaction norm models had higher accuracies of (G)EBV in extreme environments than conventional models ignoring GxE. Genomic selection always resulted in higher response to selection in all environments than sib or progeny testing schemes. The increase in response with genomic selection was between 9% and 140% compared to sib testing and between 11% and 114% compared to progeny testing when the reference population consisted of 1 million animals across all environments. When the aim was to decrease environmental sensitivity, the response in slope of the reaction norm model with genomic selection was between 1.09 and 319 times larger than with sib or progeny testing and in the right direction, in contrast to sib and progeny testing, which still increased environmental sensitivity. This shows that genomic selection
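
To make the linear reaction norm in the abstract above concrete, the following is a minimal sketch in which an animal's expected value is an intercept plus a slope times the environmental gradient. The function name, the two hypothetical animals and all numbers are illustrative assumptions, not values from the paper.

```python
def reaction_norm(intercept, slope, environments):
    """Expected value of one animal along an environmental gradient.

    intercept: value in the average environment (gradient = 0)
    slope: environmental sensitivity (change per unit of the gradient)
    environments: gradient values, e.g. contemporary group means
                  expressed as deviations from the overall mean
    """
    return [intercept + slope * e for e in environments]


# Hypothetical gradient: poor, average and good environments.
environments = [-2.0, 0.0, 2.0]

# Two hypothetical animals: one robust (flat slope), one sensitive (steep slope).
robust = reaction_norm(10.0, 0.1, environments)
sensitive = reaction_norm(10.5, 1.0, environments)

for e, r, s in zip(environments, robust, sensitive):
    print(f"environment {e:+.1f}: robust {r:.2f}, sensitive {s:.2f}")
```

In this toy case the two animals re-rank across the gradient: the robust animal is better in the poor environment and the sensitive animal is better in the good one, which is the kind of GxE a model ignoring the slope cannot represent.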

  18. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    Science.gov (United States)

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  19. Selective recruitment of nuclear factors to productively replicating herpes simplex virus genomes.

    Science.gov (United States)

    Dembowski, Jill A; DeLuca, Neal A

    2015-05-01

    Much of the HSV-1 life cycle is carried out in the cell nucleus, including the expression, replication, repair, and packaging of viral genomes. Viral proteins, as well as cellular factors, play essential roles in these processes. Isolation of proteins on nascent DNA (iPOND) was developed to label and purify cellular replication forks. We adapted aspects of this method to label viral genomes to both image, and purify replicating HSV-1 genomes for the identification of associated proteins. Many viral and cellular factors were enriched on viral genomes, including factors that mediate DNA replication, repair, chromatin remodeling, transcription, and RNA processing. As infection proceeded, packaging and structural components were enriched to a greater extent. Among the more abundant proteins that copurified with genomes were the viral transcription factor ICP4 and the replication protein ICP8. Furthermore, all seven viral replication proteins were enriched on viral genomes, along with cellular PCNA and topoisomerases, while other cellular replication proteins were not detected. The chromatin-remodeling complexes present on viral genomes included the INO80, SWI/SNF, NURD, and FACT complexes, which may prevent chromatinization of the genome. Consistent with this conclusion, histones were not readily recovered with purified viral genomes, and imaging studies revealed an underrepresentation of histones on viral genomes. RNA polymerase II, the mediator complex, TFIID, TFIIH, and several other transcriptional activators and repressors were also affinity purified with viral DNA. The presence of INO80, NURD, SWI/SNF, mediator, TFIID, and TFIIH components is consistent with previous studies in which these complexes copurified with ICP4. Therefore, ICP4 is likely involved in the recruitment of these key cellular chromatin remodeling and transcription factors to viral genomes. Taken together, iPOND is a valuable method for the study of viral genome dynamics during infection and

  20. CpG islands undermethylation in human genomic regions under selective pressure.

    Directory of Open Access Journals (Sweden)

    Sergio Cocozza

    DNA methylation at CpG islands (CGIs) is one of the most intensively studied epigenetic mechanisms. It is fundamental for cellular differentiation and control of transcriptional potential. DNA methylation is also involved in several processes that are central to evolutionary biology, including phenotypic plasticity and evolvability. In this study, we explored the relationship between CpG island methylation and signatures of selective pressure in Homo sapiens, using a computational biology approach. By analyzing methylation data of 25 cell lines from the Encyclopedia of DNA Elements (ENCODE) Consortium, we compared the DNA methylation of CpG islands in genomic regions under selective pressure with the methylation of CpG islands in the remaining part of the genome. To define genomic regions under selective pressure, we used three different methods, each oriented to provide distinct information about selective events. Independently of the method and of the cell type used, we found evidence of undermethylation of CGIs in human genomic regions under selective pressure. Additionally, by analyzing SNP frequency in CpG islands, we demonstrated that CpG islands in regions under selective pressure show lower genetic variation. Our findings suggest that the CpG islands in regions under selective pressure seem to be somehow more "protected" from methylation when compared with other regions of the genome.

  1. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    Science.gov (United States)

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotypes) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome with 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship than B. nigra. Principal components analysis and development of a phylogenetic tree indicated maternal ancestors of three allotetraploid species in U's triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome. © 2015 German Botanical Society and The Royal Botanical Society of the Netherlands.

  2. Selection on Optimal Haploid Value Increases Genetic Gain and Preserves More Genetic Diversity Relative to Genomic Selection

    OpenAIRE

    Daetwyler, Hans D.; Hayden, Matthew J.; Spangenberg, German C.; Hayes, Ben J.

    2015-01-01

    Doubled haploids are routinely created and phenotypically selected in plant breeding programs to accelerate the breeding cycle. Genomic selection, which makes use of both phenotypes and genotypes, has been shown to further improve genetic gain through prediction of performance before or without phenotypic characterization of novel germplasm. Additional opportunities exist to combine genomic prediction methods with the creation of doubled haploids. Here we propose an extension to genomic selec...

  3. Prediction of malting quality traits in barley based on genome-wide marker data to assess the potential of genomic selection.

    Science.gov (United States)

    Schmidt, Malthe; Kollers, Sonja; Maasberg-Prelle, Anja; Großer, Jörg; Schinkel, Burkhard; Tomerius, Alexandra; Graner, Andreas; Korzun, Viktor

    2016-02-01

    Genomic prediction of malting quality traits in barley shows the potential of applying genomic selection to improve selection for malting quality and speed up the breeding process. Genomic selection has been applied to various plant species, mostly for yield or yield-related traits such as grain dry matter yield or thousand kernel weight, and improvement of resistances against diseases. Quality traits have not been the main scope of analysis for genomic selection, but have rather been addressed by marker-assisted selection. In this study, the potential to apply genomic selection to twelve malting quality traits in two commercial breeding programs of spring and winter barley (Hordeum vulgare L.) was assessed. Phenotypic means were calculated combining multilocational field trial data from 3 or 4 years, depending on the trait investigated. Three to five locations were available in each of these years. Heritabilities for malting traits ranged between 0.50 and 0.98. Predictive abilities (PA), as derived from cross validation, ranged from 0.14 to 0.58 for spring barley and from 0.40 to 0.80 for winter barley. Small training sets were shown to be sufficient to obtain useful PAs, possibly due to the narrow genetic base in this breeding material. Deployment of genomic selection in malting barley breeding clearly has the potential to reduce cost-intensive phenotyping for quality traits, increase selection intensity and shorten breeding cycles.
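
The predictive ability (PA) reported above is the correlation between predicted and observed phenotypes in lines held out during cross validation. The sketch below illustrates that calculation on simulated data. The abstract does not state which prediction model was used, so ridge regression on marker genotypes is swapped in here as an assumption, and every name, parameter and data value is hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression of phenotypes on centered marker genotypes."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mean) for the marker effects.
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta


def predictive_ability(X, y, k=5, lam=1.0, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    correlations = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        x_mean, y_mean, beta = ridge_fit(X[train], y[train], lam)
        predicted = y_mean + (X[test] - x_mean) @ beta
        correlations.append(np.corrcoef(predicted, y[test])[0, 1])
    return float(np.mean(correlations))


# Simulated data (illustrative only): 200 lines, 50 markers coded 0/1/2.
rng = np.random.default_rng(1)
X = rng.binomial(2, 0.5, size=(200, 50)).astype(float)
effects = rng.normal(size=50)
y = X @ effects + rng.normal(scale=2.0, size=200)

pa = predictive_ability(X, y)
print(f"predictive ability: {pa:.2f}")
```

Because the simulated trait is highly heritable and has few markers, the PA here is much higher than would be expected for real malting quality data; the point is only the fold-wise train, predict and correlate structure.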

  4. Patterns of positive selection in six Mammalian genomes

    DEFF Research Database (Denmark)

    Kosiol, Carolin; Vinar, Tomás; da Fonseca, Rute R

    2008-01-01

    Genome-wide scans for positively selected genes (PSGs) in mammals have provided insight into the dynamics of genome evolution, the genetic basis of differences between species, and the functions of individual genes. However, previous scans have been limited in power and accuracy owing to small...... several new lineage- and clade-specific tests to be applied. Of approximately 16,500 human genes with high-confidence orthologs in at least two other species, 400 genes showed significant evidence of positive selection (FDR... showed evidence of positive selection on particular lineages or clades. As in previous studies, the identified PSGs were enriched for roles in defense/immunity, chemosensory perception, and reproduction, but enrichments were also evident for more specific functions, such as complement-mediated immunity...

  5. Genomic Selection Improves Heat Tolerance in Dairy Cattle

    Science.gov (United States)

    Garner, J. B.; Douglas, M. L.; Williams, S. R. O; Wales, W. J.; Marett, L. C.; Nguyen, T. T. T.; Reich, C. M.; Hayes, B. J.

    2016-01-01

    Dairy products are a key source of valuable proteins and fats for many millions of people worldwide. Dairy cattle are highly susceptible to heat-stress induced decline in milk production, and as the frequency and duration of heat-stress events increases, the long term security of nutrition from dairy products is threatened. Identification of dairy cattle more tolerant of heat stress conditions would be an important progression towards breeding better adapted dairy herds to future climates. Breeding for heat tolerance could be accelerated with genomic selection, using genome wide DNA markers that predict tolerance to heat stress. Here we demonstrate the value of genomic predictions for heat tolerance in cohorts of Holstein cows predicted to be heat tolerant and heat susceptible using controlled-climate chambers simulating a moderate heatwave event. Not only was the heat challenge stimulated decline in milk production less in cows genomically predicted to be heat-tolerant, physiological indicators such as rectal and intra-vaginal temperatures had reduced increases over the 4 day heat challenge. This demonstrates that genomic selection for heat tolerance in dairy cattle is a step towards securing a valuable source of nutrition and improving animal welfare facing a future with predicted increases in heat stress events. PMID:27682591

  6. Goals and hurdles for a successful implementation of genomic selection in breeding programme for selected annual and perennial crops.

    Science.gov (United States)

    Jonas, Elisabeth; de Koning, Dirk Jan

    Genomic Selection is an important topic in quantitative genetics and breeding. Not only does it allow the full use of current molecular genetic technologies, it stimulates also the development of new methods and models. Genomic selection, if fully implemented in commercial farming, should have a major impact on the productivity of various agricultural systems. But suggested approaches need to be applicable in commercial breeding populations. Many of the published research studies focus on methodologies. We conclude from the reviewed publications, that a stronger focus on strategies for the implementation of genomic selection in advanced breeding lines, introduction of new varieties, hybrids or multi-line crosses is needed. Efforts to find solutions for a better prediction and integration of environmental influences need to continue within applied breeding schemes. Goals of the implementation of genomic selection into crop breeding should be carefully defined and crop breeders in the private sector will play a substantial part in the decision-making process. However, the lack of published results from studies within, or in collaboration with, private companies diminishes the knowledge on the status of genomic selection within applied breeding programmes. Studies on the implementation of genomic selection in plant breeding need to evaluate models and methods with an enhanced emphasis on population-specific requirements and production environments. Adaptation of methods to breeding schemes or changes to breeding programmes for a better integration of genomic selection strategies are needed across species. More openness with a continuous exchange will contribute to successes.

  7. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle; Stephens, Timothy G.; González-Pech, Raúl; Beltran, Victor H.; Lapeyre, Bruno; Bongaerts, Pim; Cooke, Ira; Bourne, David G.; Forêt, Sylvain; Miller, David John; van Oppen, Madeleine J. H.; Voolstra, Christian R.; Ragan, Mark A.; Chan, Cheong Xin

    2017-01-01

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world's coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding of Symbiodinium biology and the coral-algal symbiosis.

  8. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle

    2017-10-06

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world's coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding of Symbiodinium biology and the coral-algal symbiosis.

  9. Relaxation of selective constraints causes independent selenoprotein extinction in insect genomes.

    Directory of Open Access Journals (Sweden)

    Charles E Chapple

    BACKGROUND: Selenoproteins are a diverse family of proteins notable for the presence of the 21st amino acid, selenocysteine. Until very recently, all metazoan genomes investigated encoded selenoproteins, and these proteins had therefore been believed to be essential for animal life. Challenging this assumption, recent comparative analyses of insect genomes have revealed that some insect genomes appear to have lost selenoprotein genes. METHODOLOGY/PRINCIPAL FINDINGS: In this paper we investigate in detail the fate of selenoproteins, and that of selenoprotein factors, in all available arthropod genomes. We use a variety of in silico comparative genomics approaches to look for known selenoprotein genes and factors involved in selenoprotein biosynthesis. We have found that five insect species have completely lost the ability to encode selenoproteins and that selenoprotein loss in these species, although so far confined to the Endopterygota infraclass, cannot be attributed to a single evolutionary event, but rather to multiple, independent events. Loss of selenoproteins and selenoprotein factors is usually coupled to the deletion of the entire no-longer functional genomic region, rather than to sequence degradation and consequent pseudogenisation. Such dynamics of gene extinction are consistent with the high rate of genome rearrangements observed in Drosophila. We have also found that, while many selenoprotein factors are concomitantly lost with the selenoproteins, others are present and conserved in all investigated genomes, irrespective of whether they code for selenoproteins or not, suggesting that they are involved in additional, non-selenoprotein related functions. CONCLUSIONS/SIGNIFICANCE: Selenoproteins have been independently lost in several insect species, possibly as a consequence of the relaxation in insects of the selective constraints acting across metazoans to maintain selenoproteins. The dispensability of selenoproteins in insects may

  10. Genome-wide scan of gastrointestinal nematode resistance in closed Angus population selected for minimized influence of MHC.

    Science.gov (United States)

    Kim, Eui-Soo; Sonstegard, Tad S; da Silva, Marcos V G B; Gasbarre, Louis C; Van Tassell, Curtis P

    2015-01-01

    Genetic markers associated with parasite indicator traits are ideal targets for study of marker assisted selection aimed at controlling infections that reduce herd use of anthelminthics. For this study, we collected gastrointestinal (GI) nematode fecal egg count (FEC) data from post-weaning animals of an Angus resource population challenged to a 26-week natural exposure on pasture. In all, data from 487 animals were collected over a 16-year period between 1992 and 2007, most of which were selected for a specific DRB1 allele to reduce the influence of potential allelic variant effects of the MHC locus. A genome-wide association study (GWAS) based on BovineSNP50 genotypes revealed six genomic regions located on bovine chromosomes 3, 5, 8, 15 and 27, which were significantly associated (-log10 p=4.3) with Box-Cox transformed mean FEC (BC-MFEC). DAVID analysis of the genes within the significant genomic regions suggested a correlation between our results and annotation for genes involved in inflammatory response to infection. Furthermore, ROH and selection signature analyses provided strong evidence that the genomic regions associated with BC-MFEC have not been affected by local autozygosity or recent experimental selection. These findings provide useful information for parasite resistance prediction for young grazing cattle and suggest new candidate gene targets for development of disease-modifying therapies or future studies of host response to GI parasite infection.
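
Two quantities in the abstract above lend themselves to a short worked sketch: the Box-Cox transformation applied to mean FEC, and the p-value that corresponds to the reported significance level of -log10 p = 4.3. The transformation parameter and the example counts below are hypothetical, since the abstract does not report them, and the function name is illustrative.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value.

    For lam == 0 the transform is the natural log; otherwise it is
    (x**lam - 1) / lam. Zero counts would need an offset beforehand;
    whether and how the study applied one is not stated in the abstract.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical mean fecal egg counts and a hypothetical lambda.
mean_fec = [4.0, 25.0, 100.0]
transformed = [box_cox(x, 0.5) for x in mean_fec]
print(transformed)

# The reported threshold -log10 p = 4.3 expressed as a p-value.
threshold_p = 10 ** -4.3
print(f"p-value threshold: {threshold_p:.2e}")
```

With lambda = 0.5 the transform compresses the right tail of skewed counts, which is the usual motivation for applying it before a linear association model.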

  11. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    Science.gov (United States)

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.

  12. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

    Science.gov (United States)

    Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Heimberg, Alysha M.; Jansen, Hans J.; McCleary, Ryan J. R.; Kerkkamp, Harald M. E.; Vos, Rutger A.; Guerreiro, Isabel; Calvete, Juan J.; Wüster, Wolfgang; Woods, Anthony E.; Logan, Jessica M.; Harrison, Robert A.; Castoe, Todd A.; de Koning, A. P. Jason; Pollock, David D.; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B.; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S.; Ribeiro, José M. C.; Arntzen, Jan W.; van den Thillart, Guido E. E. J. M.; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P.; Spaink, Herman P.; Duboule, Denis; McGlinn, Edwina; Kini, R. Manjunatha; Richardson, Michael K.

    2013-01-01

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection. PMID:24297900

  13. GGRaSP: A R-package for selecting representative genomes using Gaussian mixture models.

    Science.gov (United States)

    Clarke, Thomas H; Brinkac, Lauren M; Sutton, Granger; Fouts, Derrick E

    2018-04-14

    The vast number of available sequenced bacterial genomes occasionally exceeds the facilities of comparative genomic methods or is dominated by a single outbreak strain, and thus a diverse and representative subset is required. Generation of the reduced subset currently requires a priori supervised clustering and sequence-only selection of medoid genomic sequences, independent of any additional genome metrics or strain attributes. The GGRaSP R-package described below generates a reduced subset of genomes that prioritizes maintaining genomes of interest to the user as well as minimizing the loss of genetic variation. The package also allows for unsupervised clustering by modeling the genomic relationships using a Gaussian Mixture Model to select an appropriate cluster threshold. We demonstrate the capabilities of GGRaSP by generating a reduced list of 315 genomes from a genomic dataset of 4600 Escherichia coli genomes, prioritizing selection by type strain and by genome completeness. GGRaSP is available at https://github.com/JCVenterInstitute/ggrasp/. Contact: tclarke@jcvi.org. Supplementary data are available at the GitHub site.
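
The abstract above refers to selecting medoid genomes from clusters. As a minimal sketch of that one step, the code below picks, for each cluster, the genome with the smallest total distance to the other members. It is not GGRaSP's implementation or API: the function name, the distance matrix and the cluster labels are hypothetical, and the Gaussian mixture thresholding and user-defined prioritization described in the abstract are not reproduced.

```python
def cluster_medoids(dist, labels):
    """Return {cluster label: index of the medoid genome}.

    dist: square, symmetric matrix (list of lists) of pairwise distances
    labels: cluster label for each genome, in the same order as dist
    The medoid is the member with the smallest summed distance to all
    members of its own cluster; ties go to the lowest index.
    """
    medoids = {}
    for lab in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == lab]
        medoids[lab] = min(
            members, key=lambda i: sum(dist[i][j] for j in members)
        )
    return medoids


# Hypothetical pairwise distances among five genomes in two clusters.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.8],
    [0.2, 0.1, 0.0, 0.9, 0.8],
    [0.9, 0.9, 0.9, 0.0, 0.1],
    [0.8, 0.8, 0.8, 0.1, 0.0],
]
labels = [0, 0, 0, 1, 1]

medoids = cluster_medoids(dist, labels)
print(medoids)  # {0: 1, 1: 3}
```

A prioritized variant could break ties, or override the medoid, using a user-supplied ranking such as type strain status or genome completeness, which is the kind of preference the abstract describes.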

  14. Comparative Genomics Reveals the Diversity of Restriction-Modification Systems and DNA Methylation Sites in Listeria monocytogenes.

    Science.gov (United States)

    Chen, Poyin; den Bakker, Henk C; Korlach, Jonas; Kong, Nguyet; Storey, Dylan B; Paxinos, Ellen E; Ashby, Meredith; Clark, Tyson; Luong, Khai; Wiedmann, Martin; Weimer, Bart C

    2017-02-01

    which manifests as gastroenteritis, meningoencephalitis, and abortion. Among Salmonella, Escherichia coli, Campylobacter, and Listeria, which cause the most prevalent foodborne illnesses, infection by L. monocytogenes carries the highest mortality rate. The ability of L. monocytogenes to regulate its response to various harsh environments enables its persistence and transmission. Small-scale comparisons of L. monocytogenes focusing solely on genome contents reveal a highly syntenic genome yet fail to address the observed diversity in phenotypic regulation. This study provides a large-scale comparison of 302 L. monocytogenes isolates, revealing the importance of the epigenome and restriction-modification systems as major determinants of L. monocytogenes phylogenetic grouping and subsequent phenotypic expression. Further examination of virulence genes of select outbreak strains reveals an unprecedented diversity in methylation statuses despite high degrees of genome conservation. Copyright © 2017 American Society for Microbiology.

  15. Genome patterns of selection and introgression of haplotypes in natural populations of the house mouse (Mus musculus).

    Directory of Open Access Journals (Sweden)

    Fabian Staubach

    Full Text Available General parameters of selection, such as the frequency and strength of positive selection in natural populations or the role of introgression, are still insufficiently understood. The house mouse (Mus musculus) is a particularly well-suited model system to approach such questions, since it has a defined history of splits into subspecies and populations and since extensive genome information is available. We have used high-density single-nucleotide polymorphism (SNP) typing arrays to assess genomic patterns of positive selection and introgression of alleles in two natural populations of each of the subspecies M. m. domesticus and M. m. musculus. Applying different statistical procedures, we find a large number of regions subject to apparent selective sweeps, indicating frequent positive selection on rare alleles or novel mutations. Genes in the regions include well-studied imprinted loci (e.g. Plagl1/Zac1), homologues of human genes involved in adaptations (e.g. alpha-amylase genes) or in genetic diseases (e.g. Huntingtin and Parkin). Haplotype matching between the two subspecies reveals a large number of haplotypes that show patterns of introgression from specific populations of the respective other subspecies, with at least 10% of the genome being affected by partial or full introgression. Using neutral simulations for comparison, we find that the size and the fraction of introgressed haplotypes are not compatible with a pure migration or incomplete lineage sorting model. Hence, it appears that introgressed haplotypes can rise in frequency due to positive selection and thus can contribute to the adaptive genomic landscape of natural populations. Our data support the notion that natural genomes are subject to complex adaptive processes, including the introgression of haplotypes from other differentiated populations or species at a larger scale than previously assumed for animals. This implies that some of the admixture found in inbred strains of mice

  16. The Population Genomics of Sunflowers and Genomic Determinants of Protein Evolution Revealed by RNAseq

    Directory of Open Access Journals (Sweden)

    Loren H. Rieseberg

    2012-10-01

    Full Text Available Few studies have investigated the causes of evolutionary rate variation among plant nuclear genes, especially in recently diverged species still capable of hybridizing in the wild. The recent advent of Next Generation Sequencing (NGS) permits investigation of genome wide rates of protein evolution and the role of selection in generating and maintaining divergence. Here, we use individual whole-transcriptome sequencing (RNAseq) to refine our understanding of the population genomics of wild species of sunflowers (Helianthus spp.) and the factors that affect rates of protein evolution. We aligned 35 GB of transcriptome sequencing data and identified 433,257 polymorphic sites (SNPs) in a reference transcriptome comprising 16,312 genes. Using SNP markers, we identified strong population clustering largely corresponding to the three species analyzed here (Helianthus annuus, H. petiolaris, H. debilis), with one distinct early generation hybrid. Then, we calculated the proportions of adaptive substitution fixed by selection (alpha) and identified gene ontology categories with elevated values of alpha. The “response to biotic stimulus” category had the highest mean alpha across the three interspecific comparisons, implying that natural selection imposed by other organisms plays an important role in driving protein evolution in wild sunflowers. Finally, we examined the relationship between protein evolution (dN/dS ratio) and several genomic factors predicted to co-vary with protein evolution (gene expression level, divergence and specificity, genetic divergence [FST], and nucleotide diversity pi). We find that variation in rates of protein divergence was correlated with gene expression level and specificity, consistent with results from a broad range of taxa and timescales. This would in turn imply that these factors govern protein evolution both at a microevolutionary and macroevolutionary timescale. Our results contribute to a general understanding of the

  17. Genome Scan for Selection in Structured Layer Chicken Populations Exploiting Linkage Disequilibrium Information.

    Directory of Open Access Journals (Sweden)

    Mahmood Gholami

    Full Text Available Increasing interest is being placed on the detection of genes, or genomic regions, that have been targeted by selection because identifying signatures of selection can lead to a better understanding of genotype-phenotype relationships. A common strategy for the detection of selection signatures is to compare samples from distinct populations and to search for genomic regions with outstanding genetic differentiation. The aim of this study was to detect selective signatures in layer chicken populations using a recently proposed approach, hapFLK, which exploits linkage disequilibrium information while accounting appropriately for the hierarchical structure of populations. We performed the analysis on 70 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red), genotyped for approximately 1 million SNPs. We found a total of 41 and 107 regions with outstanding differentiation or similarity using hapFLK and its single SNP counterpart FLK, respectively. Annotation of selection signature regions revealed various genes and QTL corresponding to production traits, for which layer breeds were selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated an interesting gene associated with the dark brown feather color mutational phenotype in chickens (SOX10). We compared FST, FLK and hapFLK and demonstrated that exploiting linkage disequilibrium information and accounting for hierarchical population structure decreased the false detection rate.

  18. Background selection as baseline for nucleotide variation across the Drosophila genome.

    Directory of Open Access Journals (Sweden)

    Josep M Comeron

    2014-06-01

    Full Text Available The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms at 100-kb scale, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and, thus, local BGS effects that may differ between extant and past phases. Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and

  19. Accelerating the Switchgrass (Panicum virgatum L.) Breeding Cycle Using Genomic Selection Approaches

    Science.gov (United States)

    Lipka, Alexander E.; Lu, Fei; Cherney, Jerome H.; Buckler, Edward S.; Casler, Michael D.; Costich, Denise E.

    2014-01-01

    Switchgrass (Panicum virgatum L.) is a perennial grass undergoing development as a biofuel feedstock. One of the most important factors hindering breeding efforts in this species is the need for accurate measurement of biomass yield on a per-hectare basis. Genomic selection on simple-to-measure traits that approximate biomass yield has the potential to significantly speed up the breeding cycle. Recent advances in switchgrass genomic and phenotypic resources are now making it possible to evaluate the potential of genomic selection of such traits. We leveraged these resources to study the ability of three widely-used genomic selection models to predict phenotypic values of morphological and biomass quality traits in an association panel consisting of predominantly northern adapted upland germplasm. High prediction accuracies were obtained for most of the traits, with standability having the highest ten-fold cross validation prediction accuracy (0.52). Moreover, the morphological traits generally had higher prediction accuracies than the biomass quality traits. Nevertheless, our results suggest that the quality of current genomic and phenotypic resources available for switchgrass is sufficiently high for genomic selection to significantly impact breeding efforts for biomass yield. PMID:25390940

  20. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C.P.; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C.; Laukaitis, Christina M.; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A.; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J.; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M.; Odom, Duncan T.; Flicek, Paul

    2018-01-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. PMID:29563166

  1. Selective recruitment of nuclear factors to productively replicating herpes simplex virus genomes.

    Directory of Open Access Journals (Sweden)

    Jill A Dembowski

    2015-05-01

    Full Text Available Much of the HSV-1 life cycle is carried out in the cell nucleus, including the expression, replication, repair, and packaging of viral genomes. Viral proteins, as well as cellular factors, play essential roles in these processes. Isolation of proteins on nascent DNA (iPOND) was developed to label and purify cellular replication forks. We adapted aspects of this method to label viral genomes to both image and purify replicating HSV-1 genomes for the identification of associated proteins. Many viral and cellular factors were enriched on viral genomes, including factors that mediate DNA replication, repair, chromatin remodeling, transcription, and RNA processing. As infection proceeded, packaging and structural components were enriched to a greater extent. Among the more abundant proteins that copurified with genomes were the viral transcription factor ICP4 and the replication protein ICP8. Furthermore, all seven viral replication proteins were enriched on viral genomes, along with cellular PCNA and topoisomerases, while other cellular replication proteins were not detected. The chromatin-remodeling complexes present on viral genomes included the INO80, SWI/SNF, NURD, and FACT complexes, which may prevent chromatinization of the genome. Consistent with this conclusion, histones were not readily recovered with purified viral genomes, and imaging studies revealed an underrepresentation of histones on viral genomes. RNA polymerase II, the mediator complex, TFIID, TFIIH, and several other transcriptional activators and repressors were also affinity purified with viral DNA. The presence of INO80, NURD, SWI/SNF, mediator, TFIID, and TFIIH components is consistent with previous studies in which these complexes copurified with ICP4. Therefore, ICP4 is likely involved in the recruitment of these key cellular chromatin remodeling and transcription factors to viral genomes. Taken together, iPOND is a valuable method for the study of viral genome dynamics

  2. Genomic analysis and selected molecular pathways in rare cancers

    International Nuclear Information System (INIS)

    Liu, Stephen V; Lenkiewicz, Elizabeth; Evers, Lisa; Holley, Tara; Kiefer, Jeffrey; Demeure, Michael J; Ramanathan, Ramesh K; Von Hoff, Daniel D; Barrett, Michael T; Ruiz, Christian; Glatz, Katharina; Bubendorf, Lukas; Eng, Cathy

    2012-01-01

    It is widely accepted that many cancers arise as a result of an acquired genomic instability and the subsequent evolution of tumor cells with variable patterns of selected and background aberrations. The presence and behaviors of distinct neoplastic cell populations within a patient's tumor may underlie multiple clinical phenotypes in cancers. A goal of many current cancer genome studies is the identification of recurring selected driver events that can be advanced for the development of personalized therapies. Unfortunately, in the majority of rare tumors, this type of analysis can be particularly challenging. Large series of specimens for analysis are simply not available, allowing recurring patterns to remain hidden. In this paper, we highlight the use of DNA content-based flow sorting to identify and isolate DNA-diploid and DNA-aneuploid populations from tumor biopsies as a strategy to comprehensively study the genomic composition and behaviors of individual cancers in a series of rare solid tumors: intrahepatic cholangiocarcinoma, anal carcinoma, adrenal leiomyosarcoma, and pancreatic neuroendocrine tumors. We propose that the identification of highly selected genomic events in distinct tumor populations within each tumor can identify candidate driver events that can facilitate the development of novel, personalized treatment strategies for patients with cancer. (paper)

  3. Selfing for the design of genomic selection experiments in biparental plant populations.

    Science.gov (United States)

    McClosky, Benjamin; LaCombe, Jason; Tanksley, Steven D

    2013-11-01

    Self-fertilization (selfing) is commonly used for population development in plant breeding, and it is well established that selfing increases genetic variance between lines, thus increasing response to phenotypic selection. Furthermore, numerous studies have explored how selfing can be deployed to maximal benefit in the context of traditional plant breeding programs (Cornish in Heredity 65:201-211, 1990a, Heredity 65:213-220, 1990b; Liu et al. in Theor Appl Genet 109:370-376, 2004; Pooni and Jinks in Heredity 54:255-260, 1985). However, the impact of selfing on response to genomic selection has not been explored. In the current study we examined how selfing impacts the two key aspects of genomic selection: GEBV prediction (training) and selection response. We reach the following conclusions: (1) On average, selfing increases genomic selection gains by more than 70%. (2) The gains in genomic selection response attributable to selfing hold over a wide range of population sizes (100-500), heritabilities (0.2-0.8), and selection intensities (0.01-0.1). However, the benefits of selfing are dramatically reduced as the number of QTLs drops below 20. (3) The major cause of the improved response to genomic selection with selfing is an increase in the occurrence of superior genotypes, not improved GEBV predictions. While performance of the training population improves with selfing (especially with low heritability and small population sizes), the magnitude of these improvements is relatively small compared with improvements observed in the selection population. To illustrate the value of these insights, we propose a practical genomic selection scheme that substantially shortens the number of generations required to fully capture the benefits of selfing. Specifically, we provide simulation evidence that indicates the proposed scheme matches or exceeds the selection gains observed in advanced populations (i.e. F8 and doubled haploid) across a broad range of

  4. Genome wide selection in Citrus breeding.

    Science.gov (United States)

    Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A

    2016-10-17

    Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArT_seq™ (diversity arrays technology) molecular markers and their effects on phenotypes were predicted using the random regression - best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding as it can predict phenotypes early and accurately, and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.
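
    As a rough illustration of the kind of marker-based prediction summarized in this record, the sketch below fits shrunken marker effects by ridge regression and reports predictive ability as the correlation between predicted and observed phenotypes. It is not the authors' pipeline: the function names, the fixed shrinkage parameter `lam`, and the simulated genotypes are all assumptions made for illustration, and a real rr-BLUP analysis would normally estimate the shrinkage from variance components rather than fix it by hand.

```python
import numpy as np


def ridge_marker_effects(Z, y, lam):
    """Estimate marker effects u by ridge regression.

    Z   : (n_individuals, n_markers) genotype matrix (e.g. coded 0/1/2)
    y   : (n_individuals,) phenotype vector
    lam : shrinkage parameter (assumed fixed here; hypothetical choice)

    Solves (Zc'Zc + lam * I) u = Zc'(y - mean(y)) on column-centred markers.
    Returns the phenotype mean, the marker column means, and the effects.
    """
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    n_markers = Zc.shape[1]
    lhs = Zc.T @ Zc + lam * np.eye(n_markers)
    u = np.linalg.solve(lhs, Zc.T @ (y - mu))
    return mu, col_means, u


def predict(Z_new, mu, col_means, u):
    """Predict phenotypes for new genotypes from fitted marker effects."""
    return mu + (Z_new - col_means) @ u


def predictive_ability(y_obs, y_pred):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])


if __name__ == "__main__":
    # Simulated data only; sizes are arbitrary and not taken from the study.
    rng = np.random.default_rng(0)
    Z = rng.integers(0, 3, size=(180, 500)).astype(float)
    true_u = rng.normal(0.0, 0.1, size=500)
    y = 10.0 + Z @ true_u + rng.normal(0.0, 2.0, size=180)

    train, test = np.arange(150), np.arange(150, 180)
    mu, col_means, u = ridge_marker_effects(Z[train], y[train], lam=100.0)
    r = predictive_ability(y[test], predict(Z[test], mu, col_means, u))
    print(f"predictive ability on held-out individuals: {r:.2f}")
```

    Holding out individuals, as in the demo, is what makes the correlation a measure of prediction rather than of fit; larger values of `lam` shrink the estimated effects more strongly toward zero.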

  5. Phylogeny of Banana Streak Virus reveals recent and repetitive endogenization in the genome of its banana host (Musa sp.).

    Science.gov (United States)

    Gayral, Philippe; Iskra-Caruana, Marie-Line

    2009-07-01

    Banana streak virus (BSV) is a plant dsDNA pararetrovirus (family Caulimoviridae, genus badnavirus). Although integration is not an essential step in the BSV replication cycle, the nuclear genome of banana (Musa sp.) contains BSV endogenous pararetrovirus sequences (BSV EPRVs). Some BSV EPRVs are infectious by reconstituting a functional viral genome. Recent studies revealed a large molecular diversity of episomal BSV viruses (i.e., nonintegrated) while others focused on BSV EPRV sequences only. In this study, the evolutionary history of badnavirus integration in banana was inferred from phylogenetic relationships between BSV and BSV EPRVs. The relative evolution rates and selective pressures (d(N)/d(S) ratio) were also compared between endogenous and episomal viral sequences. At least 27 recent independent integration events occurred after the divergence of three banana species, indicating that viral integration is a recent and frequent phenomenon. Relaxation of selective pressure on badnaviral sequences that experienced neutral evolution after integration in the plant genome was recorded. Additionally, a significant decrease (35%) in the EPRV evolution rate was observed compared to BSV, reflecting the difference in the evolution rate between episomal dsDNA viruses and plant genome. The comparison of our results with the evolution rate of the Musa genome and other reverse-transcribing viruses suggests that EPRVs play an active role in episomal BSV diversity and evolution.

  6. Association Mapping and the Genomic Consequences of Selection in Sunflower

    Science.gov (United States)

    Mandel, Jennifer R.; Nambeesan, Savithri; Bowers, John E.; Marek, Laura F.; Ebert, Daniel; Rieseberg, Loren H.; Knapp, Steven J.; Burke, John M.

    2013-01-01

    The combination of large-scale population genomic analyses and trait-based mapping approaches has the potential to provide novel insights into the evolutionary history and genome organization of crop plants. Here, we describe the detailed genotypic and phenotypic analysis of a sunflower (Helianthus annuus L.) association mapping population that captures nearly 90% of the allelic diversity present within the cultivated sunflower germplasm collection. We used these data to characterize overall patterns of genomic diversity and to perform association analyses on plant architecture (i.e., branching) and flowering time, successfully identifying numerous associations underlying these agronomically and evolutionarily important traits. Overall, we found variable levels of linkage disequilibrium (LD) across the genome. In general, islands of elevated LD correspond to genomic regions underlying traits that are known to have been targeted by selection during the evolution of cultivated sunflower. In many cases, these regions also showed significantly elevated levels of differentiation between the two major sunflower breeding groups, consistent with the occurrence of divergence due to strong selection. One of these regions, which harbors a major branching locus, spans a surprisingly long genetic interval (ca. 25 cM), indicating the occurrence of an extended selective sweep in an otherwise recombinogenic interval. PMID:23555290

  7. Genomic Comparisons Reveal Microevolutionary Differences in Mycobacterium abscessus Subspecies

    Directory of Open Access Journals (Sweden)

    Joon L. Tan

    2017-10-01

    Full Text Available Mycobacterium abscessus, a rapid-growing non-tuberculous mycobacterium, has been the cause of sporadic and outbreak infections world-wide. The subspecies in M. abscessus complex (M. abscessus, M. massiliense, and M. bolletii) are associated with different biologic and pathogenic characteristics and are known to be among the most frequently isolated opportunistic pathogens from clinical material. To date, the evolutionary forces that could have contributed to these biological and clinical differences are still unclear. We compared genome data from 243 M. abscessus strains downloaded from the NCBI ftp Refseq database to understand how the microevolutionary processes of homologous recombination and positive selection influenced the diversification of the M. abscessus complex at the subspecies level. The three subspecies are clearly separated in the Minimum Spanning Tree. Their MUMi-based genomic distances support the separation of M. massiliense and M. bolletii into two subspecies. Maximum Likelihood analysis through dN/dS (the ratio of number of non-synonymous substitutions per non-synonymous site, to the number of synonymous substitutions per synonymous site) identified distinct genes in each subspecies that could have been affected by positive selection during evolution. The results of genome-wide alignment based on concatenated locally-collinear blocks suggest that (a) recombination has affected the M. abscessus complex more than mutation and positive selection; (b) recombination occurred more frequently in M. massiliense than in the other two subspecies; and (c) the recombined segments in the three subspecies have come from different intra-species and inter-species origins. The results lead to the identification of possible gene sets that could have been responsible for the subspecies-specific features and suggest independent evolution among the three subspecies, with recombination playing a more significant role than positive selection in the

  8. Genomic Comparisons Reveal Microevolutionary Differences in Mycobacterium abscessus Subspecies

    Science.gov (United States)

    Tan, Joon L.; Ng, Kee P.; Ong, Chia S.; Ngeow, Yun F.

    2017-01-01

    Mycobacterium abscessus, a rapid-growing non-tuberculous mycobacterium, has been the cause of sporadic and outbreak infections world-wide. The subspecies in M. abscessus complex (M. abscessus, M. massiliense, and M. bolletii) are associated with different biologic and pathogenic characteristics and are known to be among the most frequently isolated opportunistic pathogens from clinical material. To date, the evolutionary forces that could have contributed to these biological and clinical differences are still unclear. We compared genome data from 243 M. abscessus strains downloaded from the NCBI ftp Refseq database to understand how the microevolutionary processes of homologous recombination and positive selection influenced the diversification of the M. abscessus complex at the subspecies level. The three subspecies are clearly separated in the Minimum Spanning Tree. Their MUMi-based genomic distances support the separation of M. massiliense and M. bolletii into two subspecies. Maximum Likelihood analysis through dN/dS (the ratio of number of non-synonymous substitutions per non-synonymous site, to the number of synonymous substitutions per synonymous site) identified distinct genes in each subspecies that could have been affected by positive selection during evolution. The results of genome-wide alignment based on concatenated locally-collinear blocks suggest that (a) recombination has affected the M. abscessus complex more than mutation and positive selection; (b) recombination occurred more frequently in M. massiliense than in the other two subspecies; and (c) the recombined segments in the three subspecies have come from different intra-species and inter-species origins. The results lead to the identification of possible gene sets that could have been responsible for the subspecies-specific features and suggest independent evolution among the three subspecies, with recombination playing a more significant role than positive selection in the diversification
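
    The dN/dS ratio defined in this record can be illustrated with a small calculation. The sketch below is a simple counting-style estimate with a Jukes-Cantor correction for multiple substitutions, not the Maximum Likelihood analysis the abstract describes; the function names are hypothetical, and the site and difference counts are assumed to have been obtained beforehand from a codon alignment.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites p for multiple hits.

    Uses the Jukes-Cantor formula d = -3/4 * ln(1 - 4p/3), which is
    only defined for p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from pre-computed counts.

    nonsyn_diffs / nonsyn_sites : non-synonymous differences and sites
    syn_diffs / syn_sites       : synonymous differences and sites
    All four counts are assumed inputs; counting them from codons is
    outside the scope of this sketch.
    """
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n, d_s, d_n / d_s


if __name__ == "__main__":
    # Made-up counts for illustration only.
    d_n, d_s, omega = dn_ds(10, 700, 10, 300)
    print(f"dN={d_n:.4f} dS={d_s:.4f} dN/dS={omega:.2f}")
```

    A ratio above 1 is conventionally read as a sign of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection.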

  9. Genomic Selection in Plant Breeding: Methods, Models, and Perspectives.

    Science.gov (United States)

    Crossa, José; Pérez-Rodríguez, Paulino; Cuevas, Jaime; Montesinos-López, Osval; Jarquín, Diego; de Los Campos, Gustavo; Burgueño, Juan; González-Camacho, Juan M; Pérez-Elizalde, Sergio; Beyene, Yoseph; Dreisigacker, Susanne; Singh, Ravi; Zhang, Xuecai; Gowda, Manje; Roorkiwal, Manish; Rutkoski, Jessica; Varshney, Rajeev K

    2017-11-01

    Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Genome-wide scan of gastrointestinal nematode resistance in closed Angus population selected for minimized influence of MHC.

    Directory of Open Access Journals (Sweden)

    Eui-Soo Kim

    Full Text Available Genetic markers associated with parasite indicator traits are ideal targets for study of marker assisted selection aimed at controlling infections that reduce herd use of anthelminthics. For this study, we collected gastrointestinal (GI) nematode fecal egg count (FEC) data from post-weaning animals of an Angus resource population challenged to a 26 week natural exposure on pasture. In all, data from 487 animals was collected over a 16 year period between 1992 and 2007, most of which were selected for a specific DRB1 allele to reduce the influence of potential allelic variant effects of the MHC locus. A genome-wide association study (GWAS) based on BovineSNP50 genotypes revealed six genomic regions located on bovine Chromosomes 3, 5, 8, 15 and 27, which were significantly associated (-log10 p=4.3) with Box-Cox transformed mean FEC (BC-MFEC). DAVID analysis of the genes within the significant genomic regions suggested a correlation between our results and annotation for genes involved in inflammatory response to infection. Furthermore, ROH and selection signature analyses provided strong evidence that the genomic regions associated with BC-MFEC have not been affected by local autozygosity or recent experimental selection. These findings provide useful information for parasite resistance prediction for young grazing cattle and suggest new candidate gene targets for development of disease-modifying therapies or future studies of host response to GI parasite infection.

  11. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Background: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods: Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results: Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30% to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions: Our study demonstrated true levels of genetic

  12. Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

    Directory of Open Access Journals (Sweden)

    Beverly E. Flood

    2016-05-01

    Full Text Available The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsr

  13. A genome-wide scan for selection signatures in Nellore cattle.

    Science.gov (United States)

    Somavilla, A L; Sonstegard, T S; Higa, R H; Rosa, A N; Siqueira, F; Silva, L O C; Torres Júnior, R A A; Coutinho, L L; Mudadu, M A; Alencar, M M; Regitano, L C A

    2014-12-01

    Brazilian Nellore cattle (Bos indicus) have been selected for growth traits for more than four decades. In recent years, reproductive and meat quality traits have become more important because of increasing consumption, exports and consumer demand. The identification of genome regions altered by artificial selection can potentially permit a better understanding of the biology of specific phenotypes that are useful for the development of tools designed to increase selection efficiency. Therefore, the aims of this study were to detect evidence of recent selection signatures in Nellore cattle using extended haplotype homozygosity methodology and BovineHD marker genotypes (>777,000 single nucleotide polymorphisms) as well as to identify corresponding genes underlying these signals. Thirty-one significant regions (P meat quality, fatty acid profiles and immunity. In addition, 545 genes were identified in regions harboring selection signatures. Within this group, 58 genes were associated with growth, muscle and adipose tissue metabolism, reproductive traits or the immune system. Using relative extended haplotype homozygosity to analyze high-density single nucleotide polymorphism marker data allowed for the identification of regions potentially under artificial selection pressure in the Nellore genome, which might be used to better understand autozygosity and the effects of selection on the Nellore genome. © 2014 Stichting International Foundation for Animal Genetics.

  14. Simultaneous improvement of grain yield and protein content in durum wheat by different phenotypic indices and genomic selection.

    Science.gov (United States)

    Rapp, M; Lein, V; Lacoudre, F; Lafferty, J; Müller, E; Vida, G; Bozhanova, V; Ibraliu, A; Thorwarth, P; Piepho, H P; Leiser, W L; Würschum, T; Longin, C F H

    2018-06-01

    Simultaneous improvement of protein content and grain yield by index selection is possible but its efficiency largely depends on the weighting of the single traits. The genetic architecture of these indices is similar to that of the primary traits. Grain yield and protein content are of major importance in durum wheat breeding, but their negative correlation has hampered their simultaneous improvement. To account for this in wheat breeding, the grain protein deviation (GPD) and the protein yield were proposed as targets for selection. The aim of this work was to investigate the potential of different indices to simultaneously improve grain yield and protein content in durum wheat and to evaluate their genetic architecture towards genomics-assisted breeding. To this end, we investigated two different durum wheat panels comprising 159 and 189 genotypes, which were tested in multiple field locations across Europe and genotyped by a genotyping-by-sequencing approach. The phenotypic analyses revealed significant genetic variances for all traits and heritabilities of the phenotypic indices that were in a similar range as those of grain yield and protein content. The GPD showed a high and positive correlation with protein content, whereas protein yield was highly and positively correlated with grain yield. Thus, selecting for a high GPD would mainly increase the protein content whereas a selection based on protein yield would mainly improve grain yield, but a combination of both indices allows this selection to be balanced. The genome-wide association mapping revealed a complex genetic architecture for all traits with most QTL having small effects and being detected only in one germplasm set, thus limiting the potential of marker-assisted selection for trait improvement. By contrast, genome-wide prediction appeared promising but its performance strongly depends on the relatedness between training and prediction sets.
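
    The two indices named in this abstract can be sketched numerically. The abstract does not give their formulas, so the definitions below are assumptions based on common usage: GPD is taken as the residual from a linear regression of protein content on grain yield, and protein yield as grain yield multiplied by protein content. Function names, units and data values are hypothetical.

```python
import numpy as np


def grain_protein_deviation(grain_yield, protein_content):
    """Assumed definition: residual of protein content regressed on grain yield."""
    y = np.asarray(grain_yield, dtype=float)
    p = np.asarray(protein_content, dtype=float)
    slope, intercept = np.polyfit(y, p, 1)  # least-squares line p = intercept + slope * y
    return p - (intercept + slope * y)


def protein_yield(grain_yield, protein_content):
    """Assumed definition: yield (t/ha) times protein content (%), giving t/ha of protein."""
    y = np.asarray(grain_yield, dtype=float)
    p = np.asarray(protein_content, dtype=float)
    return y * p / 100.0


# Made-up genotype means showing the negative yield-protein relationship.
yield_t_ha = [4.0, 5.0, 6.0, 7.0, 8.0]
protein_pct = [15.2, 14.1, 13.9, 12.6, 12.2]

gpd = grain_protein_deviation(yield_t_ha, protein_pct)
py = protein_yield(yield_t_ha, protein_pct)
print(np.round(gpd, 3))
print(np.round(py, 3))
```

    Under these assumed definitions, a genotype with positive GPD has more protein than its yield level predicts, which is why selecting on GPD mainly shifts protein content.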

  15. A note on mate allocation for dominance handling in genomic selection

    Directory of Open Access Journals (Sweden)

    Toro Miguel A

    2010-08-01

    Full Text Available Abstract Estimation of non-additive genetic effects in animal breeding is important because it increases the accuracy of breeding value prediction and the value of mate allocation procedures. With the advent of genomic selection these ideas should be revisited. The objective of this study was to quantify the efficiency of including dominance effects and practising mate allocation under a whole-genome evaluation scenario. Four strategies of selection, carried out during five generations, were compared by simulation techniques. In the first scenario (MS), individuals were selected based on their own phenotypic information. In the second (GSA), they were selected based on the prediction generated by the Bayes A method of whole-genome evaluation under an additive model. In the third (GSD), the model was expanded to include dominance effects. These three scenarios used random mating to construct future generations, whereas in the fourth one (GSD + MA), matings were optimized by simulated annealing. The advantage of GSD over GSA ranges from 9% to 14% of the expected response and, in addition, using mate allocation (GSD + MA) provides an additional response ranging from 6% to 22%. However, mate selection can improve the expected genetic response over random mating only in the first generation of selection. Furthermore, the efficiency of genomic selection is eroded after a few generations of selection; thus, a continued collection of phenotypic data and re-evaluation will be required.

  16. Comparison of 26 sphingomonad genomes reveals diverse environmental adaptations and biodegradative capabilities

    DEFF Research Database (Denmark)

    Aylward, Frank O.; McDonald, Bradon R.; Adams, Sandra M.

    2013-01-01

    to the genus Sphingobium. Our pan-genomic analysis of sphingomonads reveals numerous species-specific open reading frames (ORFs) but few signatures of genus-specific cores. The organization and coding potential of the sphingomonad genomes appear to be highly variable, and plasmid-mediated gene transfer...... and chromosome-plasmid recombination, together with prophage- and transposon-mediated rearrangements, appear to play prominent roles in the genome evolution of this group. We find that many of the sphingomonad genomes encode numerous oxygenases and glycoside hydrolases, which are likely responsible...... a basis for understanding the ecological strategies employed by sphingomonads and their role in environmental nutrient cycling....

  17. Genome-scale detection of positive selection in nine primates predicts human-virus evolutionary conflicts.

    Science.gov (United States)

    van der Lee, Robin; Wiel, Laurens; van Dam, Teunis J P; Huynen, Martijn A

    2017-10-13

    Hotspots of rapid genome evolution hold clues about human adaptation. We present a comparative analysis of nine whole-genome sequenced primates to identify high-confidence targets of positive selection. We find strong statistical evidence for positive selection in 331 protein-coding genes (3%), pinpointing 934 adaptively evolving codons (0.014%). Our new procedure is stringent and reveals substantial artefacts (20% of initial predictions) that have inflated previous estimates. The final 331 positively selected genes (PSG) are strongly enriched for innate and adaptive immunity, secreted and cell membrane proteins (e.g. pattern recognition, complement, cytokines, immune receptors, MHC, Siglecs). We also find evidence for positive selection in reproduction and chromosome segregation (e.g. centromere-associated CENPO, CENPT), apolipoproteins, smell/taste receptors and mitochondrial proteins. Focusing on the virus-host interaction, we retrieve most evolutionary conflicts known to influence antiviral activity (e.g. TRIM5, MAVS, SAMHD1, tetherin) and predict 70 novel cases through integration with virus-human interaction data. Protein structure analysis further identifies positive selection in the interaction interfaces between viruses and their cellular receptors (CD4-HIV; CD46-measles, adenoviruses; CD55-picornaviruses). Finally, primate PSG consistently show high sequence variation in human exomes, suggesting ongoing evolution. Our curated dataset of positive selection is a rich source for studying the genetics underlying human (antiviral) phenotypes. Procedures and data are available at https://github.com/robinvanderlee/positive-selection. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Genomic selection needs to be carefully assessed to meet specific requirements in livestock breeding programs

    Directory of Open Access Journals (Sweden)

    Elisabeth Jonas

    2015-02-01

    Full Text Available Genomic selection is a promising development in agriculture, aiming at improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection in breeding programs is highly desirable not only for the common good, but also for the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating genomic selection into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken and fish. It outlines tasks to help understand the possible consequences of applying genomic information in

  19. Classic selective sweeps revealed by massive sequencing in cattle.

    Directory of Open Access Journals (Sweden)

    Saber Qanbari

    2014-02-01

    Full Text Available Human-driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern cattle. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying domestication-related genes that ultimately may help to further genetically improve this economically important animal. To this end, we employed a panel of more than 15 million autosomal SNPs identified from re-sequencing of 43 Fleckvieh animals. We mainly applied two somewhat complementary statistics, the integrated Haplotype Homozygosity Score (iHS), reflecting primarily ongoing selection, and the Composite of Likelihood Ratio (CLR), having the most power to detect completed selection after fixation of the advantageous allele. We find 106 candidate selection regions, many of which harbor genes related to phenotypes relevant in domestication, such as coat coloring pattern, neurobehavioral functioning and sensory perception including KIT, MITF, MC1R, NRG4, Erbb4, TMEM132D and TAS2R16, among others. To further investigate the relationship between genes with signatures of selection and genes identified in QTL mapping studies, we use a sample of 3062 animals to perform four genome-wide association analyses using appearance traits, body size and somatic cell count. We show that regions associated with coat coloring significantly (P<0.0001) overlap with the candidate selection regions, suggesting that the selection signals we identify are associated with traits known to be affected by selection during domestication. Results also provide further evidence regarding the complexity of the genetics underlying coat coloring in cattle. This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to identify specific regions targeted by selection during speciation, domestication and

  20. Genomic insights into the Acidobacteria reveal strategies for their success in terrestrial environments

    Science.gov (United States)

    Trojan, Daniela; Roux, Simon; Herbold, Craig; Rattei, Thomas; Woebken, Dagmar

    2018-01-01

    Summary Members of the phylum Acidobacteria are abundant and ubiquitous across soils. We performed a large‐scale comparative genome analysis spanning subdivisions 1, 3, 4, 6, 8 and 23 (n = 24) with the goal of identifying features that help explain their prevalence in soils and understand their ecophysiology. Our analysis revealed that bacteriophage integration events along with transposable and mobile elements influenced the structure and plasticity of these genomes. Low‐ and high‐affinity respiratory oxygen reductases were detected in multiple genomes, suggesting the capacity for growing across different oxygen gradients. Among many genomes, the capacity to use a diverse collection of carbohydrates, as well as inorganic and organic nitrogen sources (such as via extracellular peptidases), was detected – both advantageous traits in environments with fluctuating nutrient availability. We also identified multiple soil acidobacteria with the potential to scavenge atmospheric concentrations of H2, now encompassing mesophilic soil strains within subdivisions 1 and 3, in addition to a previously identified thermophilic strain in subdivision 4. This large‐scale acidobacteria genome analysis reveals traits that provide genomic, physiological and metabolic versatility, presumably allowing flexibility in the challenging and fluctuating soil environment. PMID:29327410

  1. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA; de Vos, M.; Louw, GE; van der Merwe, RG; Dippenaar, A.; Streicher, EM; Abdallah, A. M.; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; van Helden, PD; Warren, RM; Pain, Arnab

    2015-01-01

    Our study demonstrated true levels of genetic diversity within an M. tuberculosis population and showed that genetic diversity may be re-defined when a selective pressure, such as drug exposure, is imposed on M. tuberculosis populations during the course of infection. This suggests that the genome of M. tuberculosis is more dynamic than previously thought, suggesting preparedness to respond to a changing environment.

  2. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul

    2018-04-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. © 2018 Thybert et al.; Published by Cold Spring Harbor Laboratory Press.

  3. Experimental assessment of the accuracy of genomic selection in sugarcane.

    Science.gov (United States)

    Gouy, M; Rousselle, Y; Bastianelli, D; Lecomte, P; Bonnal, L; Roques, D; Efile, J-C; Rocher, S; Daugrois, J; Toubi, L; Nabeneza, S; Hervouet, C; Telismart, H; Denis, M; Thong-Chane, A; Glaszmann, J C; Hoarau, J-Y; Nibouche, S; Costet, L

    2013-10-01

    Sugarcane cultivars are interspecific hybrids with an aneuploid, highly heterozygous polyploid genome. The complexity of the sugarcane genome is the main obstacle to the use of marker-assisted selection in sugarcane breeding. Given the promising results of recent studies of plant genomic selection, we explored the feasibility of genomic selection in this complex polyploid crop. Genetic values were predicted in two independent panels, each composed of 167 accessions representing sugarcane genetic diversity worldwide. Accessions were genotyped with 1,499 DArT markers. One panel was phenotyped in Reunion Island and the other in Guadeloupe. Ten traits concerning sugar and bagasse contents, digestibility and composition of the bagasse, plant morphology, and disease resistance were used. We used four statistical predictive models: Bayesian LASSO, ridge regression, reproducing kernel Hilbert space, and partial least squares regression. The accuracy of the predictions was assessed through the correlation between observed and predicted genetic values by cross validation within each panel and between the two panels. We observed equivalent accuracy among the four predictive models for a given trait, and marked differences were observed among traits. Depending on the trait concerned, within-panel cross validation yielded median correlations ranging from 0.29 to 0.62 in the Reunion Island panel and from 0.11 to 0.5 in the Guadeloupe panel. Cross validation between panels yielded correlations ranging from 0.13 for smut resistance to 0.55 for brix. These correlation levels are promising for future implementations. Our results provide the first validation of genomic selection in sugarcane.
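
    The accuracy measure described in this abstract, the correlation between observed and cross-validated predicted values, can be sketched for one of the four models (ridge regression). This is a minimal illustration on simulated data, not the authors' analysis: the fold count, the penalty `lam`, the simulated marker matrix and the pooling of predictions across folds are all assumptions, and only the panel size of 167 accessions echoes the abstract.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge regression on centred data; returns marker effects and the centring terms."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean


def cross_validated_accuracy(X, y, lam=1.0, n_folds=5, seed=0):
    """Correlation between observed values and out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    predicted = np.empty(len(y))
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)  # every accession not in this fold
        beta, x_mean, y_mean = ridge_fit(X[train_idx], y[train_idx], lam)
        predicted[test_idx] = (X[test_idx] - x_mean) @ beta + y_mean
    return np.corrcoef(y, predicted)[0, 1]


# Simulated stand-in data: 167 accessions, 200 presence/absence markers.
rng = np.random.default_rng(1)
markers = rng.integers(0, 2, size=(167, 200)).astype(float)
effects = rng.normal(0.0, 1.0, size=200)
phenotype = markers @ effects + rng.normal(0.0, 5.0, size=167)

print(round(cross_validated_accuracy(markers, phenotype, lam=25.0), 2))
```

    Because each accession is predicted only by a model that never saw it, the correlation estimates how well marker data would rank untested material.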

  4. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Directory of Open Access Journals (Sweden)

    Eyal Elyashiv

    2016-08-01

    Full Text Available Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

  5. A Genomic Map of the Effects of Linked Selection in Drosophila.

    Science.gov (United States)

    Elyashiv, Eyal; Sattath, Shmuel; Hu, Tina T; Strutsovsky, Alon; McVicker, Graham; Andolfatto, Peter; Coop, Graham; Sella, Guy

    2016-08-01

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.
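
    The composite-likelihood idea in this abstract (per-site probabilities of polymorphism combined across sites) can be sketched in a few lines. The sketch treats sites as independent Bernoulli observations. The one-parameter model, in which the probability at a site equals a baseline `pi0` scaled by a supplied local reduction factor, is a hypothetical stand-in for the authors' far richer model of background selection and sweeps.

```python
import math


def composite_log_likelihood(site_probs, is_polymorphic):
    """Sum of per-site Bernoulli log-probabilities, treating sites as independent."""
    eps = 1e-12  # keep probabilities away from 0 and 1 so the logs stay finite
    total = 0.0
    for p, poly in zip(site_probs, is_polymorphic):
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if poly else math.log(1.0 - p)
    return total


def fit_baseline_diversity(reductions, is_polymorphic, grid):
    """Grid search for pi0 under the toy model P(polymorphic) = pi0 * reduction."""
    return max(
        grid,
        key=lambda pi0: composite_log_likelihood(
            [pi0 * b for b in reductions], is_polymorphic
        ),
    )


# Made-up data: two sites unaffected by linked selection, two with diversity halved.
reductions = [1.0, 1.0, 0.5, 0.5]
observed = [True, False, False, False]
grid = [k / 100 for k in range(1, 100)]

best_pi0 = fit_baseline_diversity(reductions, observed, grid)
print(best_pi0)
```

    The likelihood is called composite because linked sites are not truly independent; multiplying their probabilities anyway gives a tractable objective for estimating genome-wide parameters.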

  6. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    and genetic improvement were identified.CONCLUSIONS:Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes......BACKGROUND:Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  7. Targeted Genome Sequencing Reveals Varicella-Zoster Virus Open Reading Frame 12 Deletion.

    Science.gov (United States)

    Cohrs, Randall J; Lee, Katherine S; Beach, Addilynn; Sanford, Bridget; Baird, Nicholas L; Como, Christina; Graybill, Chiharu; Jones, Dallas; Tekeste, Eden; Ballard, Mitchell; Chen, Xiaomi; Yalacki, David; Frietze, Seth; Jones, Kenneth; Lenac Rovis, Tihana; Jonjić, Stipan; Haas, Jürgen; Gilden, Don

    2017-10-15

    The neurotropic herpesvirus varicella-zoster virus (VZV) establishes a lifelong latent infection in humans following primary infection. The low abundance of VZV nucleic acids in human neurons has hindered an understanding of the mechanisms that regulate viral gene transcription during latency. To overcome this critical barrier, we optimized a targeted capture protocol to enrich VZV DNA and cDNA prior to whole-genome/transcriptome sequence analysis. Since the VZV genome is remarkably stable, it was surprising to detect that VZV32, a VZV laboratory strain with no discernible growth defect in tissue culture, contained a 2,158-bp deletion in open reading frame (ORF) 12. Consequently, ORF 12 and 13 protein expression was abolished and Akt phosphorylation was inhibited. The discovery of the ORF 12 deletion, revealed through targeted genome sequencing analysis, points to the need to authenticate the VZV genome when the virus is propagated in tissue culture. IMPORTANCE Viruses isolated from clinical samples often undergo genetic modifications when cultured in the laboratory. Historically, VZV is among the most genetically stable herpesviruses, a notion supported by more than 60 complete genome sequences from multiple isolates and following multiple in vitro passages. However, application of enrichment protocols to targeted genome sequencing revealed the unexpected deletion of a significant portion of VZV ORF 12 following propagation in cultured human fibroblast cells. While the enrichment protocol did not introduce bias in either the virus genome or transcriptome, the findings indicate the need for authentication of VZV by sequencing when the virus is propagated in tissue culture. Copyright © 2017 American Society for Microbiology.

  8. Most of the benefits from genomic selection can be realised by genotyping a proportion of selection candidates

    DEFF Research Database (Denmark)

    Henryon, Mark; Berg, Peer; Sørensen, Anders Christian

    2012-01-01

    allocated to male and female candidates at ratios of 100:0, 75:25, 50:50, 25:75, and 0:100. For genotyped candidates, a direct-genomic value (DGV) was sampled with reliabilities 0.10, 0.50, and 0.90. Ten sires and 300 dams with the highest breeding values after genotyping were selected at each generation......We reasoned that there are diminishing marginal returns from genomic selection as the proportion of genotyped selection candidates is increased and breeding values based on a priori information are used to choose the candidates that are genotyped. We tested this premise by stochastic simulation...... of breeding schemes that resembled those used for pigs. We estimated rates of genetic gain and inbreeding realized by genomic selection in breeding schemes where candidates were phenotyped before genotyping and 0-100% of the candidates were genotyped based on predicted breeding values. Genotypings were...

  9. Genome-wide patterns of differentiation and spatially varying selection between postglacial recolonization lineages of Populus alba (Salicaceae), a widespread forest tree.

    Science.gov (United States)

    Stölting, Kai N; Paris, Margot; Meier, Cécile; Heinze, Berthold; Castiglione, Stefano; Bartha, Denes; Lexer, Christian

    2015-08-01

    Studying the divergence continuum in plants is relevant to fundamental and applied biology because of the potential to reveal functionally important genetic variation. In this context, whole-genome sequencing (WGS) provides the necessary rigour for uncovering footprints of selection. We resequenced populations of two divergent phylogeographic lineages of Populus alba (n = 48), thoroughly characterized by microsatellites (n = 317), and scanned their genomes for regions of unusually high allelic differentiation and reduced diversity using > 1.7 million single nucleotide polymorphisms (SNPs) from WGS. Results were confirmed by Sanger sequencing. On average, 9134 high-differentiation (≥ 4 standard deviations) outlier SNPs were uncovered between populations, 848 of which were shared by ≥ three replicate comparisons. Annotation revealed that 545 of these were located in 437 predicted genes. Twelve percent of differentiation outlier genome regions exhibited significantly reduced genetic diversity. Gene ontology (GO) searches were successful for 327 high-differentiation genes, and these were enriched for 63 GO terms. Our results provide a snapshot of the roles of 'hard selective sweeps' vs divergent selection of standing genetic variation in distinct postglacial recolonization lineages of P. alba. Thus, this study adds to our understanding of the mechanisms responsible for the origin of functionally relevant variation in temperate trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
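
    The outlier criterion quoted in this abstract (≥ 4 standard deviations) can be illustrated with a short sketch. It assumes a per-SNP differentiation value such as FST and flags SNPs lying at least four standard deviations above the genome-wide mean; the abstract does not state the statistic or the exact procedure, so the function name and data are hypothetical.

```python
import statistics


def differentiation_outliers(values, n_sd=4.0):
    """Indices of SNPs whose differentiation is at least n_sd SDs above the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # no variation, so nothing can be an outlier
    return [i for i, v in enumerate(values) if (v - mean) / sd >= n_sd]


# Made-up values: 99 weakly differentiated SNPs and one strongly differentiated SNP.
per_snp_differentiation = [0.05] * 99 + [0.90]
print(differentiation_outliers(per_snp_differentiation))
```

    Requiring the same SNP to be an outlier in several replicate population comparisons, as the abstract reports, guards against signals produced by chance in a single comparison.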

  10. AFLP genome scans suggest divergent selection on colour patterning in allopatric colour morphs of a cichlid fish.

    Science.gov (United States)

    Mattersdorfer, Karin; Koblmüller, Stephan; Sefc, Kristina M

    2012-07-01

    Genome scan-based tests for selection are directly applicable to natural populations to study the genetic and evolutionary mechanisms behind phenotypic differentiation. We conducted AFLP genome scans in three distinct geographic colour morphs of the cichlid fish Tropheus moorii to assess whether the extant, allopatric colour pattern differentiation can be explained by drift and to identify markers mapping to genomic regions possibly involved in colour patterning. The tested morphs occupy adjacent shore sections in southern Lake Tanganyika and are separated from each other by major habitat barriers. The genome scans revealed significant genetic structure between morphs, but a very low proportion of loci fixed for alternative AFLP alleles in different morphs. This high level of polymorphism within morphs suggested that colour pattern differentiation did not result exclusively from neutral processes. Outlier detection methods identified six loci with excess differentiation in the comparison between a bluish and a yellow-blotch morph and five different outlier loci in comparisons of each of these morphs with a red morph. As population expansions and the genetic structure of Tropheus make the outlier approach prone to false-positive signals of selection, we examined the correlation between outlier locus alleles and colour phenotypes in a genetic and phenotypic cline between two morphs. Distributions of allele frequencies at one outlier locus were indeed consistent with linkage to a colour locus. Despite the challenges posed by population structure and demography, our results encourage the cautious application of genome scans to studies of divergent selection in subdivided and recently expanded populations. © 2012 Blackwell Publishing Ltd.

  11. Prospects for genomic selection in cassava breeding

    Science.gov (United States)

    Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa in order to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on p...

  12. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    Energy Technology Data Exchange (ETDEWEB)

    Ma, Li Jun; van der Does, H. C.; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Jose; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Wolochuk, Charles; Xie, Xiaohui; Xu, Jin Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald; Goff, Steven; Hammond-Kossack, Kim; Hilburn, Karen; Hua-Van, Aurelie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. C.; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, Barbara G.; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2010-03-18

    Fusarium species are among the most important phytopathogenic and toxigenic fungi, having significant impact on crop production and animal health. Distinctively, members of the F. oxysporum species complex exhibit wide host range but discontinuously distributed host specificity, reflecting remarkable genetic adaptability. To understand the molecular underpinnings of diverse phenotypic traits and their evolution in Fusarium, we compared the genomes of three economically important and phylogenetically related, yet phenotypically diverse plant-pathogenic species, F. graminearum, F. verticillioides and F. oxysporum f. sp. lycopersici. Our analysis revealed greatly expanded lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes, accounting for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity. Experimentally, we demonstrate for the first time the transfer of two LS chromosomes between strains of F. oxysporum, resulting in the conversion of a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in the F. oxysporum species complex, putting the evolution of fungal pathogenicity into a new perspective.

  13. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact on structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  14. Genomic selection needs to be carefully assessed to meet specific requirements in livestock breeding programs.

    Science.gov (United States)

    Jonas, Elisabeth; de Koning, Dirk-Jan

    2015-01-01

    Genomic selection is a promising development in agriculture, aiming at improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection (GS) in breeding programs is highly desirable not only for the common good, but also for the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating GS into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken, and fish. It outlines tasks to help understand the possible consequences of applying genomic information in breeding scenarios.

  15. Differential metabolism of Mycoplasma species as revealed by their genomes

    Directory of Open Access Journals (Sweden)

    Fabricio B.M. Arraes

    2007-01-01

    Full Text Available The annotation and comparative analyses of the genomes of Mycoplasma synoviae and Mycoplasma hyopneumoniae, as well as of other Mollicutes (a group of bacteria devoid of a rigid cell wall), have set the grounds for a global understanding of their metabolism and infection mechanisms. According to the annotation data, M. synoviae and M. hyopneumoniae are able to perform glycolytic metabolism, but do not possess the enzymatic machinery for citrate and glyoxylate cycles, gluconeogenesis and the pentose phosphate pathway. Both can synthesize ATP by lactic fermentation, but only M. synoviae can convert acetaldehyde to acetate. Also, our genome analysis revealed that M. synoviae and M. hyopneumoniae are not expected to synthesize polysaccharides, but they can take up a variety of carbohydrates via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). Our data showed that these two organisms are unable to synthesize purine and pyrimidine de novo, since they only possess the sequences which encode salvage pathway enzymes. Comparative analyses of M. synoviae and M. hyopneumoniae with other Mollicutes have revealed differential genes in the former two genomes coding for enzymes that participate in carbohydrate, amino acid and nucleotide metabolism and host-pathogen interaction. The identification of these metabolic pathways will provide a better understanding of the biology and pathogenicity of these organisms.

  16. Natural selection and the distribution of identity-by-descent in the human genome

    DEFF Research Database (Denmark)

    Albrechtsen, Anders; Moltke, Ida; Nielsen, Rasmus

    2010-01-01

    There has recently been considerable interest in detecting natural selection in the human genome. Selection will usually tend to increase identity-by-descent (IBD) among individuals in a population, and many methods for detecting recent and ongoing positive selection indirectly take advantage...... of this. In this article we show that excess IBD sharing is a general property of natural selection and we show that this fact makes it possible to detect several types of selection including a type that is otherwise difficult to detect: selection acting on standing genetic variation. Motivated by this......, we use a recently developed method for identifying IBD sharing among individuals from genome-wide data to scan populations from the new HapMap phase 3 project for regions with excess IBD sharing in order to identify regions in the human genome that have been under strong, very recent selection...

  17. Allele frequency changes due to hitch-hiking in genomic selection programs

    DEFF Research Database (Denmark)

    Liu, Huiming; Sørensen, Anders Christian; Meuwissen, Theo H E

    2014-01-01

    of inbreeding due to changes in allele frequencies and hitch-hiking. This study aimed at understanding the impact of using long-term genomic selection on changes in allele frequencies, genetic variation and the level of inbreeding. Methods Selection was performed in simulated scenarios with a population of 400......-BLUP, Genomic BLUP and Bayesian Lasso. Changes in allele frequencies at QTL, markers and linked neutral loci were investigated for the different selection criteria and different scenarios, along with the loss of favourable alleles and the rate of inbreeding measured by pedigree and runs of homozygosity. Results...

  18. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    Science.gov (United States)

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  19. Selective constraint on noncoding regions of hominid genomes.

    Directory of Open Access Journals (Sweden)

    Eliot C Bush

    2005-12-01

    Full Text Available An important challenge for human evolutionary biology is to understand the genetic basis of human-chimpanzee differences. One influential idea holds that such differences depend, to a large extent, on adaptive changes in gene expression. An important step in assessing this hypothesis involves gaining a better understanding of selective constraint on noncoding regions of hominid genomes. In noncoding sequence, functional elements are frequently small and can be separated by large nonfunctional regions. For this reason, constraint in hominid genomes is likely to be patchy. Here we use conservation in more distantly related mammals and amniotes as a way of identifying small sequence windows that are likely to be functional. We find that putatively functional noncoding elements defined in this manner are subject to significant selective constraint in hominids.

  1. Targets of balancing selection in the human genome

    DEFF Research Database (Denmark)

    Andrés, Aida M; Hubisz, Melissa J; Indap, Amit

    2009-01-01

    Balancing selection is potentially an important biological force for maintaining advantageous genetic diversity in populations, including variation that is responsible for long-term adaptation to the environment. By serving as a means to maintain genetic variation, it may be particularly relevant...... to maintaining phenotypic variation in natural populations. Nevertheless, its prevalence and specific targets in the human genome remain largely unknown. We have analyzed the patterns of diversity and divergence of 13,400 genes in two human populations using an unbiased single-nucleotide polymorphism data set......, a genome-wide approach, and a method that incorporates demography in neutrality tests. We identified an unbiased catalog of genes with signatures of long-term balancing selection, which includes immunity genes as well as genes encoding keratins and membrane channels; the catalog also shows enrichment...

  2. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Science.gov (United States)

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  4. Improving the baking quality of bread wheat by genomic selection in early generations.

    Science.gov (United States)

    Michel, Sebastian; Kummer, Christian; Gallee, Martin; Hellinger, Jakob; Ametz, Christian; Akgöl, Batuhan; Epure, Doru; Löschenberger, Franziska; Buerstmayr, Hermann

    2018-02-01

    Genomic selection shows great promise for pre-selecting lines with superior bread baking quality in early generations, 3 years ahead of labour-intensive, time-consuming, and costly quality analysis. The genetic improvement of baking quality is one of the grand challenges in wheat breeding as the assessment of the associated traits often involves time-consuming, labour-intensive, and costly testing forcing breeders to postpone sophisticated quality tests to the very last phases of variety development. The prospect of genomic selection for complex traits like grain yield has been shown in numerous studies, and might thus be also an interesting method to select for baking quality traits. Hence, we focused in this study on the accuracy of genomic selection for laborious and expensive to phenotype quality traits as well as its selection response in comparison with phenotypic selection. More than 400 genotyped wheat lines were, therefore, phenotyped for protein content, dough viscoelastic and mixing properties related to baking quality in multi-environment trials 2009-2016. The average prediction accuracy across three independent validation populations was r = 0.39 and could be increased to r = 0.47 by modelling major QTL as fixed effects as well as employing multi-trait prediction models, which resulted in an acceptable prediction accuracy for all dough rheological traits (r = 0.38-0.63). Genomic selection can furthermore be applied 2-3 years earlier than direct phenotypic selection, and the estimated selection response was nearly twice as high in comparison with indirect selection by protein content for baking quality related traits. This considerable advantage of genomic selection could accordingly support breeders in their selection decisions and aid in efficiently combining superior baking quality with grain yield in newly developed wheat varieties.

  5. Comparison of methods used to identify superior individuals in genomic selection in plant breeding.

    Science.gov (United States)

    Bhering, L L; Junqueira, V S; Peixoto, L A; Cruz, C D; Laviola, B G

    2015-09-10

    The aim of this study was to evaluate different methods used in genomic selection, and to identify those that select a higher proportion of individuals with superior genotypes. F2 populations of different sizes (100, 200, 500, and 1000 individuals) were simulated with 10 replications each. Each genome consisted of 10 linkage groups (LG) of 100 cM, with 100 equally spaced markers per linkage group; 200 markers, the first 20 of each LG, controlled the trait. Genetic and phenotypic values were simulated assuming a binomial distribution of effects for each LG and the absence of dominance. For phenotypic values, heritabilities of 20, 50, and 80% were considered. To compare methodologies, the analysis processing time, the coefficient of coincidence (selection of 5, 10, and 20% of superior individuals), and the Spearman correlation between true genetic values and the genomic values predicted by each methodology were determined. The three methodologies differed statistically in processing time: rrBLUP was the fastest and Bayesian LASSO the slowest. Spearman correlation revealed that the rrBLUP and GBLUP methodologies were equivalent, and Bayesian LASSO provided the lowest correlation values. Similar results were obtained for the coincidence among the individuals selected, in which Bayesian LASSO differed statistically and presented a lower value than the other methodologies. Therefore, for the scenarios evaluated, rrBLUP is the best methodology for the selection of genetically superior individuals.
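
The two comparison criteria named in this abstract, the coefficient of coincidence and the Spearman correlation, can be sketched as follows. This is an illustrative reading rather than the authors' code: the coincidence is assumed to be the share of individuals selected on predicted values that would also be selected on true genetic values, and the Spearman formula used here assumes no tied values. The function names are hypothetical.

```python
def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def coincidence(true_values, predicted_values, fraction):
    """Share of the top `fraction` chosen on predicted values that also
    belongs to the top `fraction` on true values (assumed definition)."""
    n = len(true_values)
    n_sel = max(1, round(n * fraction))

    def top(values):
        best_first = sorted(range(n), key=lambda i: values[i], reverse=True)
        return set(best_first[:n_sel])

    return len(top(true_values) & top(predicted_values)) / n_sel
```

Calling `coincidence(true_values, predicted_values, 0.05)`, and likewise with 0.10 and 0.20, corresponds to the three selection intensities mentioned in the abstract.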

  6. Accuracy of Genomic Selection in a Rice Synthetic Population Developed for Recurrent Selection Breeding.

    Science.gov (United States)

    Grenier, Cécile; Cao, Tuong-Vi; Ospina, Yolima; Quintero, Constanza; Châtel, Marc Henri; Tohme, Joe; Courtois, Brigitte; Ahmadi, Nourollah

    2015-01-01

    Genomic selection (GS) is a promising strategy for enhancing genetic gain. We investigated the accuracy of genomic estimated breeding values (GEBV) in four inter-related synthetic populations that underwent several cycles of recurrent selection in an upland rice-breeding program. A total of 343 S2:4 lines extracted from those populations were phenotyped for flowering time, plant height, grain yield and panicle weight, and genotyped with an average density of one marker per 44.8 kb. The relative effect of the linkage disequilibrium (LD) and minor allele frequency (MAF) thresholds for selecting markers, the relative size of the training population (TP) and of the validation population (VP), the selected trait and the genomic prediction models (frequentist and Bayesian) on the accuracy of GEBVs was investigated in 540 cross validation experiments with 100 replicates. The effect of kinship between the training and validation populations was tested in an additional set of 840 cross validation experiments with a single genomic prediction model. LD was high (average r² = 0.59 at 25 kb) and decreased slowly, distribution of allele frequencies at individual loci was markedly skewed toward unbalanced frequencies (MAF average value 15.2% and median 9.6%), and differentiation between the four synthetic populations was low (FST ≤ 0.06). The accuracy of GEBV across all cross validation experiments ranged from 0.12 to 0.54 with an average of 0.30. Significant differences in accuracy were observed among the different levels of each factor investigated. Phenotypic traits had the biggest effect, and the size of the incidence matrix had the smallest. Significant first degree interaction was observed for GEBV accuracy between traits and all the other factors studied, and between prediction models and LD, MAF and composition of the TP. The potential of GS to accelerate genetic gain and breeding options to increase the accuracy of predictions are discussed.
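
The cross validation design described in this abstract, in which a model is trained on a training population and GEBV accuracy is measured in a validation population, can be sketched as below. This is a generic illustration under stated assumptions, not the study's pipeline: ridge regression on marker genotypes stands in for the unnamed frequentist model, accuracy is taken as the Pearson correlation between GEBV and observed phenotype, and the function names, penalty `lam` and fold count are hypothetical choices.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Estimate marker effects by ridge regression on centred genotypes."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    lhs = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(lhs, Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean


def cross_validated_accuracy(X, y, lam=1.0, n_folds=5, seed=0):
    """Mean Pearson correlation between GEBV and phenotype over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accuracies = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        beta, x_mean, y_mean = ridge_marker_effects(X[train], y[train], lam)
        gebv = (X[fold] - x_mean) @ beta + y_mean  # predictions for the VP
        accuracies.append(np.corrcoef(gebv, y[fold])[0, 1])
    return float(np.mean(accuracies))
```

Changing `n_folds` alters the relative sizes of the training and validation populations, which is one of the factors the abstract reports varying.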

  7. Accuracy of multi-trait genomic selection using different methods

    NARCIS (Netherlands)

    Calus, M.P.L.; Veerkamp, R.F.

    2011-01-01

    Background Genomic selection has become a very important tool in animal genetics and is rapidly emerging in plant genetics. It holds the promise to be particularly beneficial to select for traits that are difficult or expensive to measure, such as traits that are measured in one environment and

  8. Chemical rationale for selection of isolates for genome sequencing

    DEFF Research Database (Denmark)

    Rank, Christian; Larsen, Thomas Ostenfeld; Frisvad, Jens Christian

    The advances in gene sequencing will in the near future enable researchers to affordably acquire the full genomes of handpicked isolates. We here present a method to evaluate the chemical potential of an entire species and select representatives for genome sequencing. The selection criteria for new...... strains to be sequenced can be manifold, but for studying the functional phenotype, using a metabolome based approach offers a cheap and rapid assessment of critical strains to cover the chemical diversity. We have applied this methodology on the complex A. flavus/A. oryzae group. Though these two species...... are in principle identical, they represent two different phenotypes. This is clearly presented through a correspondence analysis of selected extrolites, in which the subtle chemical differences are visually dispersed. The results point to a handful of strains, which, if sequenced, will likely enhance our...

  9. A Ranking Approach to Genomic Selection.

    Science.gov (United States)

    Blondel, Mathieu; Onogi, Akio; Iwata, Hiroyoshi; Ueda, Naonori

    2015-01-01

    Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual's breeding value for a particular trait of interest, i.e., as a regression problem. To assess predictive accuracy of the model, the Pearson correlation between observed and predicted trait values was used. In this paper, we propose to formulate GS as the problem of ranking individuals according to their breeding value. Our proposed framework allows us to employ machine learning methods for ranking which had previously not been considered in the GS literature. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. Therefore, NDCG reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value. We conducted a comparison of 10 existing regression methods and 3 new ranking methods on 6 datasets, consisting of 4 plant species and 25 traits. Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy. RKHS regression and RankSVM also achieve good accuracy when used with an RBF kernel. Traditional regression methods such as Bayesian lasso, wBSR and BayesC were found less suitable for ranking. Pearson correlation was found to correlate poorly with NDCG. Our study suggests two important messages. First, ranking methods are a promising research direction in GS. Second, NDCG can be a useful evaluation measure for GS.
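
The normalized discounted cumulative gain (NDCG) mentioned in this abstract can be sketched as follows. The abstract does not give the exact gain function the authors used, so this sketch assumes the common information-retrieval form with a linear gain (the non-negative true breeding value itself) and a log2 position discount; `dcg_at_k` and `ndcg_at_k` are illustrative names.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain of the first k items, log2 discount."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(true_values, predicted_scores, k):
    """NDCG@k: DCG of the model's ranking divided by the DCG of the ideal
    ranking. Assumes true_values are non-negative gains."""
    order = sorted(range(len(true_values)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked_gains = [true_values[i] for i in order]
    ideal_gains = sorted(true_values, reverse=True)
    ideal = dcg_at_k(ideal_gains, k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0
```

Because the discount shrinks with rank position, misplacing an individual with a high breeding value near the top costs more than an error lower in the list, which is the property the abstract highlights.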

  10. Landscape genomics: natural selection drives the evolution of mitogenome in penguins

    OpenAIRE

    Ramos, Barbara; González-Acuña, Daniel; Loyola, David E.; Johnson, Warren E.; Parker, Patricia G.; Massaro, Melanie; Dantas, Gisele P. M.; Miranda, Marcelo D.; Vianna, Juliana A.

    2018-01-01

    Background: Mitochondria play a key role in the balance of energy and heat production, and therefore the mitochondrial genome is under natural selection by environmental temperature and food availability, since starvation can generate more efficient coupling of energy production. However, selection over mitochondrial DNA (mtDNA) genes has usually been evaluated at the population level. We sequenced by NGS 12 mitogenomes and with four published genomes, assessed genetic variation in ten penguin...

  11. Genome-wide detection and characterization of positive selection in human populations.

    Science.gov (United States)

    Sabeti, Pardis C; Varilly, Patrick; Fry, Ben; Lohmueller, Jason; Hostetter, Elizabeth; Cotsapas, Chris; Xie, Xiaohui; Byrne, Elizabeth H; McCarroll, Steven A; Gaudet, Rachelle; Schaffner, Stephen F; Lander, Eric S; Frazer, Kelly A; Ballinger, Dennis G; Cox, David R; Hinds, David A; Stuve, Laura L; Gibbs, Richard A; Belmont, John W; Boudreau, Andrew; Hardenbol, Paul; Leal, Suzanne M; Pasternak, Shiran; Wheeler, David A; Willis, Thomas D; Yu, Fuli; Yang, Huanming; Zeng, Changqing; Gao, Yang; Hu, Haoran; Hu, Weitao; Li, Chaohua; Lin, Wei; Liu, Siqi; Pan, Hao; Tang, Xiaoli; Wang, Jian; Wang, Wei; Yu, Jun; Zhang, Bo; Zhang, Qingrun; Zhao, Hongbin; Zhao, Hui; Zhou, Jun; Gabriel, Stacey B; Barry, Rachel; Blumenstiel, Brendan; Camargo, Amy; Defelice, Matthew; Faggart, Maura; Goyette, Mary; Gupta, Supriya; Moore, Jamie; Nguyen, Huy; Onofrio, Robert C; Parkin, Melissa; Roy, Jessica; Stahl, Erich; Winchester, Ellen; Ziaugra, Liuda; Altshuler, David; Shen, Yan; Yao, Zhijian; Huang, Wei; Chu, Xun; He, Yungang; Jin, Li; Liu, Yangfan; Shen, Yayun; Sun, Weiwei; Wang, Haifeng; Wang, Yi; Wang, Ying; Xiong, Xiaoyan; Xu, Liang; Waye, Mary M Y; Tsui, Stephen K W; Xue, Hong; Wong, J Tze-Fei; Galver, Luana M; Fan, Jian-Bing; Gunderson, Kevin; Murray, Sarah S; Oliphant, Arnold R; Chee, Mark S; Montpetit, Alexandre; Chagnon, Fanny; Ferretti, Vincent; Leboeuf, Martin; Olivier, Jean-François; Phillips, Michael S; Roumy, Stéphanie; Sallée, Clémentine; Verner, Andrei; Hudson, Thomas J; Kwok, Pui-Yan; Cai, Dongmei; Koboldt, Daniel C; Miller, Raymond D; Pawlikowska, Ludmila; Taillon-Miller, Patricia; Xiao, Ming; Tsui, Lap-Chee; Mak, William; Song, You Qiang; Tam, Paul K H; Nakamura, Yusuke; Kawaguchi, Takahisa; Kitamoto, Takuya; Morizono, Takashi; Nagashima, Atsushi; Ohnishi, Yozo; Sekine, Akihiro; Tanaka, Toshihiro; Tsunoda, Tatsuhiko; Deloukas, Panos; Bird, Christine P; Delgado, Marcos; Dermitzakis, Emmanouil T; Gwilliam, Rhian; Hunt, Sarah; Morrison, Jonathan; Powell, Don; Stranger, Barbara E; Whittaker, Pamela; Bentley, David R; Daly, Mark J; de Bakker, Paul I W; Barrett, Jeff; Chretien, Yves R; Maller, Julian; McCarroll, Steve; Patterson, Nick; Pe'er, Itsik; Price, Alkes; Purcell, Shaun; Richter, Daniel J; Sabeti, Pardis; Saxena, Richa; Schaffner, Stephen F; Sham, Pak C; Varilly, Patrick; Altshuler, David; Stein, Lincoln D; Krishnan, Lalitha; Smith, Albert Vernon; Tello-Ruiz, Marcela K; Thorisson, Gudmundur A; Chakravarti, Aravinda; Chen, Peter E; Cutler, David J; Kashuk, Carl S; Lin, Shin; Abecasis, Gonçalo R; Guan, Weihua; Li, Yun; Munro, Heather M; Qin, Zhaohui Steve; Thomas, Daryl J; McVean, Gilean; Auton, Adam; Bottolo, Leonardo; Cardin, Niall; Eyheramendy, Susana; Freeman, Colin; Marchini, Jonathan; Myers, Simon; Spencer, Chris; Stephens, Matthew; Donnelly, Peter; Cardon, Lon R; Clarke, Geraldine; Evans, David M; Morris, Andrew P; Weir, Bruce S; Tsunoda, Tatsuhiko; Johnson, Todd A; Mullikin, James C; Sherry, Stephen T; Feolo, Michael; Skol, Andrew; Zhang, Houcan; Zeng, Changqing; Zhao, Hui; Matsuda, Ichiro; Fukushima, Yoshimitsu; Macer, Darryl R; Suda, Eiko; Rotimi, Charles N; Adebamowo, Clement A; Ajayi, Ike; Aniagwu, Toyin; Marshall, Patricia A; Nkwodimmah, Chibuzor; Royal, Charmaine D M; Leppert, Mark F; Dixon, Missy; Peiffer, Andy; Qiu, Renzong; Kent, Alastair; Kato, Kazuto; Niikawa, Norio; Adewole, Isaac F; Knoppers, Bartha M; Foster, Morris W; Clayton, Ellen Wright; Watkin, Jessica; Gibbs, Richard A; Belmont, John W; Muzny, Donna; Nazareth, Lynne; Sodergren, Erica; Weinstock, George M; Wheeler, David A; Yakub, Imtaz; Gabriel, Stacey B; Onofrio, Robert C; Richter, Daniel J; Ziaugra, Liuda; Birren, Bruce W; Daly, Mark J; Altshuler, David; Wilson, Richard K; Fulton, Lucinda L; Rogers, Jane; Burton, John; Carter, Nigel P; Clee, Christopher M; Griffiths, Mark; Jones, Matthew C; McLay, Kirsten; Plumb, Robert W; Ross, Mark T; Sims, Sarah K; Willey, David L; Chen, Zhu; Han, Hua; Kang, Le; Godbout, Martin; Wallenburg, John C; L'Archevêque, Paul; Bellemare, Guy; Saeki, Koji; Wang, Hongguang; An, Daochang; Fu, Hongbo; Li, Qing; Wang, Zhen; Wang, Renwu; Holden, Arthur L; Brooks, Lisa D; McEwen, Jean E; Guyer, Mark S; Wang, Vivian Ota; Peterson, Jane L; Shi, Michael; Spiegel, Jack; Sung, Lawrence M; Zacharia, Lynn F; Collins, Francis S; Kennedy, Karen; Jamieson, Ruth; Stewart, John

    2007-10-18

    With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2). We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection, and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus, in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation, in Europe; and EDAR and EDA2R, both involved in development of hair follicles, in Asia.

  12. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  13. Relaxation of Selective Constraints Causes Independent Selenoprotein Extinction in Insect Genomes

    OpenAIRE

    Chapple, Charles E.; Guigó, Roderic

    2008-01-01

    BACKGROUND: Selenoproteins are a diverse family of proteins notable for the presence of the 21st amino acid, selenocysteine. Until very recently, all metazoan genomes investigated encoded selenoproteins, and these proteins had therefore been believed to be essential for animal life. Challenging this assumption, recent comparative analyses of insect genomes have revealed that some insect genomes appear to have lost selenoprotein genes. METHODOLOGY/PRINCIPAL FINDINGS: In this paper we investiga...

  14. Cow genotyping strategies for genomic selection in a small dairy cattle population.

    Science.gov (United States)

    Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A

    2017-01-01

    This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets including 197 de-regressed proofs of genotyped bulls, with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlations between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better compared with the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds. Copyright © 2017 American Dairy Science Association. Published by

  15. Neolithic and Medieval virus genomes reveal complex evolution of Hepatitis B.

    Science.gov (United States)

    Krause-Kyora, Ben; Susat, Julian; Key, Felix M; Kühnert, Denise; Bosse, Esther; Immel, Alexander; Rinne, Christoph; Kornell, Sabin-Christin; Yepes, Diego; Franzenburg, Sören; Heyne, Henrike O; Meier, Thomas; Lösch, Sandra; Meller, Harald; Friederich, Susanne; Nicklisch, Nicole; Alt, Kurt W; Schreiber, Stefan; Tholey, Andreas; Herbig, Alexander; Nebel, Almut; Krause, Johannes

    2018-05-10

    The hepatitis B virus (HBV) is one of the most widespread human pathogens known today, yet its origin and evolutionary history are still unclear and controversial. Here, we report the analysis of three ancient HBV genomes recovered from human skeletons found at three different archaeological sites in Germany. We reconstructed two Neolithic and one medieval HBV genomes by de novo assembly from shotgun DNA sequencing data. Additionally, we observed HBV-specific peptides using paleo-proteomics. Our results show that HBV has circulated in the European population for at least 7000 years. The Neolithic HBV genomes show a high genomic similarity to each other. In a phylogenetic network, they do not group with any human-associated HBV genome and are most closely related to those infecting African non-human primates. These ancient virus forms appear to represent distinct lineages that have no close relatives today and possibly went extinct. Our results reveal the great potential of ancient DNA from human skeletons for studying the long-term evolution of blood-borne viruses. © 2018, Krause-Kyora et al.

  16. Genomic selection accuracy using multi-family prediction models in a wheat breeding program

    Science.gov (United States)

    Genomic selection (GS) uses genome-wide molecular marker data to predict the genetic value of selection candidates in breeding programs. In plant breeding, the ability to produce large numbers of progeny per cross allows GS to be conducted within each family. However, this approach requires phenotyp...

  17. Genome-wide characterization of genetic variants and putative regions under selection in meat and egg-type chicken lines.

    Science.gov (United States)

    Boschiero, Clarissa; Moreira, Gabriel Costa Monteiro; Gheyas, Almas Ara; Godoy, Thaís Fernanda; Gasparin, Gustavo; Mariani, Pilar Drummond Sampaio Corrêa; Paduan, Marcela; Cesar, Aline Silva Mello; Ledur, Mônica Corrêa; Coutinho, Luiz Lehmann

    2018-01-25

    Meat and egg-type chickens have been selected for several generations for different traits. Artificial and natural selection for different phenotypes can change the frequency of genetic variants, leaving particular genomic footprints throughout the genome. Thus, the aims of this study were to sequence 28 chickens from two Brazilian lines (meat and white egg-type) and use this information to characterize genome-wide genetic variations, identify putative regions under selection using the Fst method, and find putative pathways under selection. A total of 13.93 million SNPs and 1.36 million INDELs were identified, with more variants detected from the broiler (meat-type) line. Although most were located in non-coding regions, we identified 7255 intolerant non-synonymous SNPs, 512 stopgain/loss SNPs, 1381 frameshift and 1094 non-frameshift INDELs that may alter protein functions. Genes harboring intolerant non-synonymous SNPs affected metabolic pathways related mainly to reproduction and endocrine systems in the white-egg layer line, and lipid metabolism and metabolic diseases in the broiler line. Fst analysis in sliding windows, using SNPs and INDELs separately, identified over 300 putative regions of selection overlapping with more than 250 genes. For the first time in chicken, INDEL variants were considered for selection signature analysis, showing a high level of correlation in results between SNP and INDEL data. The putative regions of selection signatures revealed interesting candidate genes and pathways related to important phenotypic traits in chicken, such as lipid metabolism, growth, reproduction, and cardiac development. In this study, the Fst method was applied to identify high confidence putative regions under selection, providing novel insights into selection footprints that can help elucidate the functional mechanisms underlying different phenotypic traits relevant to meat and egg-type chicken lines. In addition, we generated a large catalog of line-specific and common
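    The sliding-window Fst scan mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it uses the basic Wright-style estimator (H_T - H_S) / H_T for two equally weighted populations at biallelic sites, combines sites within a window as a ratio of sums, and the names `fst_components` and `windowed_fst` are hypothetical. The abstract does not say which estimator, window size, or step was used.

```python
def fst_components(p1, p2):
    """Return (H_T - H_S, H_T) for one biallelic site.

    p1 and p2 are the frequencies of the same allele in two populations,
    which are weighted equally here (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return h_total - h_within, h_total


def windowed_fst(positions, freqs1, freqs2, window, step):
    """Fst per window as a ratio of summed numerators and denominators.

    Windows cover [start, start + window) and advance by `step` base pairs.
    A window with no informative site yields None.
    """
    results = []
    for start in range(0, max(positions) + 1, step):
        num = den = 0.0
        for pos, p1, p2 in zip(positions, freqs1, freqs2):
            if start <= pos < start + window:
                n, d = fst_components(p1, p2)
                num += n
                den += d
        results.append((start, num / den if den > 0 else None))
    return results
```

    Windows whose Fst falls in the upper tail of the genome-wide distribution would then be flagged as putative regions under selection; the cutoff is a study-specific choice.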

  18. Constraints on genome dynamics revealed from gene distribution among the Ralstonia solanacearum species.

    Directory of Open Access Journals (Sweden)

    Pierre Lefeuvre

    Because it is suspected that gene content may partly explain host adaptation and ecology of pathogenic bacteria, it is important to study factors affecting genome composition and its evolution. While recent genomic advances have revealed extremely large pan-genomes for some bacterial species, it remains difficult to predict to what extent the gene pool is accessible within or transferable between populations. As genomes bear imprints of the history of the organisms, gene distribution pattern analyses should provide insights into the forces and factors at play in the shaping and maintaining of bacterial genomes. In this study, we revisited the data obtained from a previous CGH microarray analysis in order to assess the genomic plasticity of the R. solanacearum species complex. Gene distribution analyses demonstrated the remarkably dispersed genome of R. solanacearum with more than half of the genes being accessory. From the reconstruction of the ancestral genome compositions, we were able to infer the number of gene gain and loss events along the phylogeny. Analyses of gene movement patterns reveal that factors associated with gene function, genomic localization and ecology delineate gene flow patterns. While the chromosome displayed lower rates of movement, the megaplasmid was clearly associated with hot-spots of gene gain and loss. Gene function was also confirmed to be an essential factor in gene gain and loss dynamics with significant differences in movement patterns between different COG categories. Finally, analyses of gene distribution highlighted possible highways of horizontal gene transfer. Due to sampling and design bias, we can only speculate on factors at play in this gene movement dynamic. Further studies examining precise conditions that favor gene transfer would provide invaluable insights into the fate of bacteria, species delineation and the emergence of successful pathogens.

  19. Analysis of nuclear and organellar genomes of Plasmodium knowlesi in humans reveals ancient population structure and recent recombination among host-specific subpopulations

    KAUST Repository

    Diez Benavente, Ernest

    2017-09-18

    The macaque parasite Plasmodium knowlesi is a significant concern in Malaysia where cases of human infection are increasing. Parasites infecting humans originate from genetically distinct subpopulations associated with the long-tailed (Macaca fascicularis (Mf)) or pig-tailed macaques (Macaca nemestrina (Mn)). We used a new high-quality reference genome to re-evaluate previously described subpopulations among human and macaque isolates from Malaysian-Borneo and Peninsular-Malaysia. Nuclear genomes were dimorphic, as expected, but new evidence of chromosomal-segment exchanges between subpopulations was found. A large segment on chromosome 8 originating from the Mn subpopulation and containing genes encoding proteins expressed in mosquito-borne parasite stages, was found in Mf genotypes. By contrast, non-recombining organelle genomes partitioned into 3 deeply branched lineages, unlinked with nuclear genomic dimorphism. Subpopulations which diverged in isolation have re-connected, possibly due to deforestation and disruption of wild macaque habitats. The resulting genomic mosaics reveal traits selected by host-vector-parasite interactions in a setting of ecological transition.

  20. Analysis of nuclear and organellar genomes of Plasmodium knowlesi in humans reveals ancient population structure and recent recombination among host-specific subpopulations

    KAUST Repository

    Diez Benavente, Ernest; Florez de Sessions, Paola; Moon, Robert W.; Holder, Anthony A.; Blackman, Michael J.; Roper, Cally; Drakeley, Christopher J.; Pain, Arnab; Sutherland, Colin J.; Hibberd, Martin L.; Campino, Susana; Clark, Taane G.

    2017-01-01

    The macaque parasite Plasmodium knowlesi is a significant concern in Malaysia where cases of human infection are increasing. Parasites infecting humans originate from genetically distinct subpopulations associated with the long-tailed (Macaca fascicularis (Mf)) or pig-tailed macaques (Macaca nemestrina (Mn)). We used a new high-quality reference genome to re-evaluate previously described subpopulations among human and macaque isolates from Malaysian-Borneo and Peninsular-Malaysia. Nuclear genomes were dimorphic, as expected, but new evidence of chromosomal-segment exchanges between subpopulations was found. A large segment on chromosome 8 originating from the Mn subpopulation and containing genes encoding proteins expressed in mosquito-borne parasite stages, was found in Mf genotypes. By contrast, non-recombining organelle genomes partitioned into 3 deeply branched lineages, unlinked with nuclear genomic dimorphism. Subpopulations which diverged in isolation have re-connected, possibly due to deforestation and disruption of wild macaque habitats. The resulting genomic mosaics reveal traits selected by host-vector-parasite interactions in a setting of ecological transition.

  1. Evolution of small prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    David José Martínez-Cano

    2015-01-01

    As revealed by genome sequencing, the biology of prokaryotes with reduced genomes is strikingly diverse. These include free-living prokaryotes with ~800 genes as well as endosymbiotic bacteria with as few as ~140 genes. Comparative genomics is revealing the evolutionary mechanisms that led to these small genomes. In the case of free-living prokaryotes, natural selection directly favored genome reduction, while in the case of endosymbiotic prokaryotes neutral processes played a more prominent role. However, new experimental data suggest that selective processes may be in operation as well for endosymbiotic prokaryotes, at least during the first stages of genome reduction. Endosymbiotic prokaryotes have evolved diverse strategies for living with reduced gene sets inside a host-defined medium. These include utilization of host-encoded functions (some of them coded by genes acquired by gene transfer from the endosymbiont and/or other bacteria); metabolic complementation between co-symbionts; and forming consortiums with other bacteria within the host. Recent genome sequencing projects of intracellular mutualistic bacteria showed that previously believed universal evolutionary trends like reduced G+C content and conservation of genome synteny are not always present in highly reduced genomes. Finally, the simplified molecular machinery of some of these organisms with small genomes may be used to aid in the design of artificial minimal cells. Here we review recent genomic discoveries of the biology of prokaryotes endowed with small gene sets and discuss the evolutionary mechanisms that have been proposed to explain their peculiar nature.

  2. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans

    DEFF Research Database (Denmark)

    Raghavan, Maanasa; Skoglund, Pontus; Graf, Kelly E.

    2014-01-01

    The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians, there is no consensus with regard to which specific Old World populations they are closest to. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal'ta in south-central Siberia, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic... that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.

  3. Adaptations to a Subterranean Environment and Longevity Revealed by the Analysis of Mole Rat Genomes

    Directory of Open Access Journals (Sweden)

    Xiaodong Fang

    2014-09-01

    Subterranean mammals spend their lives in dark, unventilated environments that are rich in carbon dioxide and ammonia and low in oxygen. Many of these animals are also long-lived and exhibit reduced aging-associated diseases, such as neurodegenerative disorders and cancer. We sequenced the genome of the Damaraland mole rat (DMR, Fukomys damarensis) and improved the genome assembly of the naked mole rat (NMR, Heterocephalus glaber). Comparative genome analyses, along with the transcriptomes of related subterranean rodents, revealed candidate molecular adaptations for subterranean life and longevity, including a divergent insulin peptide, expression of oxygen-carrying globins in the brain, prevention of high CO2-induced pain perception, and enhanced ammonia detoxification. Juxtaposition of the genomes of DMR and other more conventional animals with the genome of NMR revealed several truly exceptional NMR features: unusual thermogenesis, an aberrant melatonin system, pain insensitivity, and unique processing of 28S rRNA. Together, these genomes and transcriptomes extend our understanding of subterranean adaptations, stress resistance, and longevity.

  4. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two... was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  5. Distribution of triclosan-resistant genes in major pathogenic microorganisms revealed by metagenome and genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    Raees Khan

    The substantial use of triclosan (TCS) has been aimed to kill pathogenic bacteria, but TCS resistance seems to be prevalent in microbial species and limited knowledge exists about TCS resistance determinants in a majority of pathogenic bacteria. We aimed to evaluate the distribution of TCS resistance determinants in major pathogenic bacteria (N = 231) and to assess the enrichment of potentially pathogenic genera in TCS contaminated environments. A TCS-resistant gene (TRG) database was constructed and experimentally validated to predict TCS resistance in major pathogenic bacteria. Genome-wide in silico analysis was performed to define the distribution of TCS-resistant determinants in major pathogens. Microbiome analysis of TCS contaminated soil samples was also performed to investigate the abundance of TCS-resistant pathogens. We experimentally confirmed that TCS resistance could be accurately predicted using genome-wide in silico analysis against TRG database. Predicted TCS resistant phenotypes were observed in all of the tested bacterial strains (N = 17), and heterologous expression of selected TCS resistant genes from those strains conferred expected levels of TCS resistance in an alternative host Escherichia coli. Moreover, genome-wide analysis revealed that potential TCS resistance determinants were abundant among the majority of human-associated pathogens (79%) and soil-borne plant pathogenic bacteria (98%). These included a variety of enoyl-acyl carrier protein reductase (ENRs) homologues, AcrB efflux pumps, and ENR substitutions. FabI ENR, which is the only known effective target for TCS, was either co-localized with other TCS resistance determinants or had TCS resistance-associated substitutions. Furthermore, microbiome analysis revealed that pathogenic genera with intrinsic TCS-resistant determinants exist in TCS contaminated environments. We conclude that TCS may not be as effective against the majority of bacterial pathogens as previously

  6. Distribution of triclosan-resistant genes in major pathogenic microorganisms revealed by metagenome and genome-wide analysis

    Science.gov (United States)

    Khan, Raees; Roy, Nazish; Choi, Kihyuck

    2018-01-01

    The substantial use of triclosan (TCS) has been aimed to kill pathogenic bacteria, but TCS resistance seems to be prevalent in microbial species and limited knowledge exists about TCS resistance determinants in a majority of pathogenic bacteria. We aimed to evaluate the distribution of TCS resistance determinants in major pathogenic bacteria (N = 231) and to assess the enrichment of potentially pathogenic genera in TCS contaminated environments. A TCS-resistant gene (TRG) database was constructed and experimentally validated to predict TCS resistance in major pathogenic bacteria. Genome-wide in silico analysis was performed to define the distribution of TCS-resistant determinants in major pathogens. Microbiome analysis of TCS contaminated soil samples was also performed to investigate the abundance of TCS-resistant pathogens. We experimentally confirmed that TCS resistance could be accurately predicted using genome-wide in silico analysis against TRG database. Predicted TCS resistant phenotypes were observed in all of the tested bacterial strains (N = 17), and heterologous expression of selected TCS resistant genes from those strains conferred expected levels of TCS resistance in an alternative host Escherichia coli. Moreover, genome-wide analysis revealed that potential TCS resistance determinants were abundant among the majority of human-associated pathogens (79%) and soil-borne plant pathogenic bacteria (98%). These included a variety of enoyl-acyl carrier protein reductase (ENRs) homologues, AcrB efflux pumps, and ENR substitutions. FabI ENR, which is the only known effective target for TCS, was either co-localized with other TCS resistance determinants or had TCS resistance-associated substitutions. Furthermore, microbiome analysis revealed that pathogenic genera with intrinsic TCS-resistant determinants exist in TCS contaminated environments. We conclude that TCS may not be as effective against the majority of bacterial pathogens as previously presumed

  7. Genomic signatures of geographic isolation and natural selection in coral reef fishes.

    Science.gov (United States)

    Gaither, Michelle R; Bernal, Moisés A; Coleman, Richard R; Bowen, Brian W; Jones, Shelley A; Simison, W Brian; Rocha, Luiz A

    2015-04-01

    The drivers of speciation remain among the most controversial topics in evolutionary biology. Initially, Darwin emphasized natural selection as a primary mechanism of speciation, but the architects of the modern synthesis largely abandoned that view in favour of divergence by geographic isolation. The balance between selection and isolation is still at the forefront of the evolutionary debate, especially for the world's tropical oceans where biodiversity is high, but isolating barriers are few. Here, we identify the drivers of speciation in Pacific reef fishes of the genus Acanthurus by comparative genome scans of two peripheral populations that split from a large Central-West Pacific lineage at roughly the same time. Mitochondrial sequences indicate that populations in the Hawaiian Archipelago and the Marquesas Islands became isolated approximately 0.5 Ma. The Hawaiian lineage is morphologically indistinguishable from the widespread Pacific form, but the Marquesan form is recognized as a distinct species that occupies an unusual tropical ecosystem characterized by upwelling, turbidity, temperature fluctuations, algal blooms and little coral cover. An analysis of 3737 SNPs reveals a strong signal of selection at the Marquesas, with 59 loci under disruptive selection including an opsin Rh2 locus. While both the Hawaiian and Marquesan populations indicate signals of drift, the former shows a weak signal of selection that is comparable with populations in the Central-West Pacific. This contrast between closely related lineages reveals one population diverging due primarily to geographic isolation and genetic drift, and the other achieving taxonomic species status under the influence of selection. © 2015 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  8. Comparative Genomics of Methanopyrus sp. SNP6 and KOL6 Revealing Genomic Regions of Plasticity Implicated in Extremely Thermophilic Profiles

    Directory of Open Access Journals (Sweden)

    Zhiliang Yu

    2017-07-01

    Methanopyrus spp. are usually isolated from harsh niches characterized by high osmotic pressure and extreme temperature. However, the molecular mechanisms of their environmental adaptation are poorly understood. Archaeal species are commonly considered primitive organisms, and the evolutionary placement of archaea is a fundamental and intriguing scientific question. We sequenced the genomes of Methanopyrus strains SNP6 and KOL6, isolated from the Atlantic and Iceland, respectively. Comparative genomic analysis revealed genetic diversity and instability implicated in niche adaptation, including a number of transporter- and integrase/transposase-related genes. Pan-genome analysis also defined the gene pool of Methanopyrus spp., in addition to a ~120-kb genomic region of plasticity impacting cognate genomic architecture. We believe that Methanopyrus genomics could facilitate efficient investigation and recognition of diverse archaeal phylogenetic patterns, as well as improve understanding of the biological roles and significance of these versatile microbes.

  9. Genome-wide genetic diversity and differentially selected regions among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep.

    Directory of Open Access Journals (Sweden)

    Lifan Zhang

    Sheep are among the major economically important livestock species worldwide because the animals produce milk, wool, skin, and meat. In the present study, the Illumina OvineSNP50 BeadChip was used to investigate genetic diversity and genome selection among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds from the United States. After quality-control filtering of SNPs (single nucleotide polymorphisms), we used 48,026 SNPs for analysis, including 46,850 SNPs on autosomes that were in Hardy-Weinberg equilibrium and 1,176 SNPs on chromosome X. Phylogenetic analysis based on all 46,850 SNPs clearly separated Suffolk from Rambouillet, Columbia, Polypay, and Targhee, which was not surprising as Rambouillet contributed to the synthesis of the latter three breeds. Based on pair-wise estimates of F(ST), significant genetic differentiation appeared between Suffolk and Rambouillet (F(ST) = 0.1621), while Rambouillet and Targhee had the closest relationship (F(ST) = 0.0681). A scan of the genome revealed 45 and 41 differentially selected regions (DSRs) between Suffolk and Rambouillet and among Rambouillet-related breed populations, respectively. Our data indicated that regions 13 and 24 between Suffolk and Rambouillet might be good candidates for evaluating breed differences. Furthermore, the ovine genome v3.1 assembly was used as a reference to link functionally known homologous genes to economically important traits covered by these differentially selected regions. In brief, our present study provides a comprehensive genome-wide view of within- and between-breed genetic differentiation, biodiversity, and evolution among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds. These results may provide new guidance for the synthesis of new breeds with different breeding objectives.
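
    The pair-wise F(ST) values quoted in this abstract measure how much allele frequencies differ between two breeds. The record does not say which estimator the authors used, so the following is only a minimal sketch of one textbook form, Wright's F(ST) = (H_T - H_S) / H_T computed as a ratio of sums over biallelic SNPs, assuming equal weighting of the two populations. The function name `wright_fst` and the example frequencies are hypothetical.

```python
def wright_fst(freqs_pop1, freqs_pop2):
    """Wright's F_ST for two populations from per-SNP allele frequencies.

    Assumption: biallelic SNPs, equal population weights, and a
    ratio-of-sums across loci. This is an illustrative estimator,
    not necessarily the one used in the study.
    """
    numerator = 0.0    # sum over SNPs of (H_T - H_S)
    denominator = 0.0  # sum over SNPs of H_T
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)                     # pooled heterozygosity
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-pop
        numerator += h_total - h_within
        denominator += h_total
    # No polymorphism anywhere: differentiation is undefined, report 0.
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies at three SNPs in two breeds.
    breed_a = [0.20, 0.55, 0.90]
    breed_b = [0.80, 0.50, 0.85]
    print(round(wright_fst(breed_a, breed_b), 4))
```

    Identical frequencies give 0 and fixed differences give 1, so values such as 0.16 versus 0.07 can be read as stronger versus weaker differentiation on that scale.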

  10. A genomic survey of positive selection in Burkholderia pseudomallei provides insights into the evolution of accidental virulence.

    Directory of Open Access Journals (Sweden)

    Tannistha Nandi

    2010-04-01

    Certain environmental microorganisms can cause severe human infections, even in the absence of an obvious requirement for transition through an animal host for replication ("accidental virulence"). To understand this process, we compared eleven isolate genomes of Burkholderia pseudomallei (Bp), a tropical soil microbe and causative agent of the human and animal disease melioidosis. We found evidence for the existence of several new genes in the Bp reference genome, identifying 282 novel genes supported by at least two independent lines of supporting evidence (mRNA transcripts, database homologs, and presence of ribosomal binding sites) and 81 novel genes supported by all three lines. Within the Bp core genome, 211 genes exhibited significant levels of positive selection (4.5%), distributed across many cellular pathways including carbohydrate and secondary metabolism. Functional experiments revealed that certain positively selected genes might enhance mammalian virulence by interacting with host cellular pathways or utilizing host nutrients. Evolutionary modifications improving Bp environmental fitness may thus have indirectly facilitated the ability of Bp to colonize and survive in mammalian hosts. These findings improve our understanding of the pathogenesis of melioidosis, and establish Bp as a model system for studying the genetics of accidental virulence.

  11. A genome scan for positive selection in thoroughbred horses.

    Science.gov (United States)

    Gu, Jingjing; Orr, Nick; Park, Stephen D; Katz, Lisa M; Sulimova, Galina; MacHugh, David E; Hill, Emmeline W

    2009-06-02

    Thoroughbred horses have been selected for exceptional racing performance, resulting in system-wide structural and functional adaptations contributing to elite athletic phenotypes. Because selection has been recent and intense in a closed population that stems from a small number of founder animals, Thoroughbreds represent a unique population within which to identify genomic contributions to exercise-related traits. Employing a population genetics-based hitchhiking mapping approach, we performed a genome scan using 394 autosomal and X chromosome microsatellite loci and identified positively selected loci in the extreme tail-ends of the empirical distributions for (1) deviations from expected heterozygosity (Ewens-Watterson test) in Thoroughbred (n = 112) and (2) global differentiation among four geographically diverse horse populations (F(ST)). We found positively selected genomic regions in Thoroughbred enriched for phosphoinositide-mediated signalling (3.2-fold enrichment; P [...] Thoroughbred athletic phenotype. We report for the first time candidate athletic-performance genes within regions targeted by selection in Thoroughbred horses that are principally responsible for fatty acid oxidation, increased insulin sensitivity and muscle strength: ACSS1 (acyl-CoA synthetase short-chain family member 1), ACTA1 (actin, alpha 1, skeletal muscle), ACTN2 (actinin, alpha 2), ADHFE1 (alcohol dehydrogenase, iron containing, 1), MTFR1 (mitochondrial fission regulator 1), PDK4 (pyruvate dehydrogenase kinase, isozyme 4) and TNC (tenascin C). Understanding the genetic basis for exercise adaptation will be crucial for the identification of genes within the complex molecular networks underlying obesity and its consequential pathologies, such as type 2 diabetes. Therefore, we propose Thoroughbred as a novel in vivo large animal model for understanding molecular protection against metabolic disease.

  12. Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

    Science.gov (United States)

    Barnes, Kayla G; Weedall, Gareth D; Ndula, Miranda; Irving, Helen; Mzihalowa, Themba; Hemingway, Janet; Wondji, Charles S

    2017-02-01

    Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.

  13. Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

    Directory of Open Access Journals (Sweden)

    Kayla G Barnes

    2017-02-01

    Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.

  14. Detecting Genomic Signatures of Natural Selection with Principal Component Analysis: Application to the 1000 Genomes Data.

    Science.gov (United States)

    Duforet-Frebourg, Nicolas; Luu, Keurcien; Laval, Guillaume; Bazin, Eric; Blum, Michael G B

    2016-04-01

    To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis (PCA). We show that the common FST index of genetic differentiation between populations can be viewed as the proportion of variance explained by the principal components. Considering the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we use the 1000 Genomes data (phase 1), comprising 850 individuals from Africa, Asia, and Europe. The number of genetic variants is on the order of 36 million, obtained with a low-coverage sequencing depth (3×). The correlations between genetic variation and each principal component provide well-known targets for positive selection (EDAR, SLC24A5, SLC45A2, DARC), and also new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and noncoding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). An additional analysis of European data shows that a genome scan based on PCA retrieves classical examples of local adaptation even when there are no well-defined populations. PCA-based statistics, implemented in the PCAdapt R package and the PCAdapt fast open-source software, retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing projects, especially when defining populations is difficult. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
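
    The idea of ranking variants by their correlation with a principal component can be illustrated with a small sketch. This is not the PCAdapt implementation: it assumes a tiny genotype matrix of allele counts (0/1/2), finds only the first principal component by power iteration, and scores each SNP by its squared Pearson correlation with that component. All function names and the toy data are hypothetical.

```python
import math


def first_pc_scores(genotypes, iters=200):
    """Individual scores on the first principal component (power iteration)."""
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # n x n Gram matrix of the column-centered genotypes.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[k]))
             for k in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def squared_correlation(x, y):
    """Squared Pearson correlation; 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def pc1_scan(genotypes):
    """Per-SNP squared correlation with the first principal component."""
    scores = first_pc_scores(genotypes)
    m = len(genotypes[0])
    return [squared_correlation([row[j] for row in genotypes], scores)
            for j in range(m)]


if __name__ == "__main__":
    # Toy data: 6 individuals x 4 SNPs; only SNP 0 separates the two
    # (unlabeled) groups of three individuals.
    toy = [
        [2, 1, 0, 1],
        [2, 0, 1, 1],
        [2, 1, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 1],
        [0, 1, 1, 0],
    ]
    for snp, r2 in enumerate(pc1_scan(toy)):
        print(snp, round(r2, 3))
```

    No population labels enter the computation, which mirrors the abstract's point that candidate variants can be found without predefined populations. A real scan would use several components, standardized genotypes, and a calibrated test statistic.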

  15. Performance comparison of two efficient genomic selection methods (gsbay & MixP) applied in aquacultural organisms

    Science.gov (United States)

    Su, Hailin; Li, Hengde; Wang, Shi; Wang, Yangfan; Bao, Zhenmin

    2017-02-01

    Genomic selection is increasingly popular in animal and plant breeding industries around the world, as it can be applied early in life without impacting selection candidates. The objective of this study was to bring the advantages of genomic selection to scallop breeding. Two genomic selection tools, MixP and gsbay, were applied to the genomic evaluation of simulated data and Zhikong scallop (Chlamys farreri) field data. The results were compared with those of the widely applied genomic best linear unbiased prediction (GBLUP) method. Our results showed that both MixP and gsbay could accurately estimate single-nucleotide polymorphism (SNP) marker effects, and thereby could be applied to the analysis of genomic estimated breeding values (GEBV). In simulated data from different scenarios, the accuracy of GEBV ranged from 0.20 to 0.78 with MixP, from 0.21 to 0.67 with gsbay, and from 0.21 to 0.61 with GBLUP. Estimates made by MixP and gsbay were expected to be more reliable than those made by GBLUP. Predictions made by gsbay were more robust, while computation with MixP was much faster, especially when dealing with large-scale data. These results suggest that the algorithms implemented in MixP and gsbay are both feasible for genomic selection in scallop breeding, and that more genotype data will be necessary to produce genomic estimated breeding values with higher accuracy for the industry.

  16. Accuracy of Genomic Selection in a Rice Synthetic Population Developed for Recurrent Selection Breeding.

    Directory of Open Access Journals (Sweden)

    Cécile Grenier

    Genomic selection (GS) is a promising strategy for enhancing genetic gain. We investigated the accuracy of genomic estimated breeding values (GEBV) in four inter-related synthetic populations that underwent several cycles of recurrent selection in an upland rice-breeding program. A total of 343 S2:4 lines extracted from those populations were phenotyped for flowering time, plant height, grain yield and panicle weight, and genotyped with an average density of one marker per 44.8 kb. The relative effect of the linkage disequilibrium (LD) and minor allele frequency (MAF) thresholds for selecting markers, the relative size of the training population (TP) and of the validation population (VP), the selected trait and the genomic prediction models (frequentist and Bayesian) on the accuracy of GEBVs was investigated in 540 cross validation experiments with 100 replicates. The effect of kinship between the training and validation populations was tested in an additional set of 840 cross validation experiments with a single genomic prediction model. LD was high (average r2 = 0.59 at 25 kb) and decreased slowly, distribution of allele frequencies at individual loci was markedly skewed toward unbalanced frequencies (MAF average value 15.2% and median 9.6%), and differentiation between the four synthetic populations was low (FST ≤0.06). The accuracy of GEBV across all cross validation experiments ranged from 0.12 to 0.54 with an average of 0.30. Significant differences in accuracy were observed among the different levels of each factor investigated. Phenotypic traits had the biggest effect, and the size of the incidence matrix had the smallest. Significant first degree interaction was observed for GEBV accuracy between traits and all the other factors studied, and between prediction models and LD, MAF and composition of the TP. The potential of GS to accelerate genetic gain and breeding options to increase the accuracy of predictions are discussed.
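
    The accuracy figures in this abstract come from fitting a prediction model on a training population and correlating predicted with observed values in a validation population. The sketch below shows that workflow under stated assumptions: ridge regression on marker effects stands in for the study's frequentist and Bayesian models, the shrinkage parameter `lam` is picked by hand, and the genotypes, effects, and function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ridge(z, y, lam):
    """Estimate marker effects by solving (Z'Z + lam*I) b = Z'y on centered data."""
    n, m = len(z), len(z[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in z) / n for j in range(m)]
    zc = [[row[j] - col_means[j] for j in range(m)] for row in z]
    yc = [v - y_mean for v in y]
    lhs = [[sum(r[j] * r[k] for r in zc) + (lam if j == k else 0.0)
            for k in range(m)] for j in range(m)]
    rhs = [sum(r[j] * yc[i] for i, r in enumerate(zc)) for j in range(m)]
    return y_mean, col_means, solve(lhs, rhs)


def predict(z, model):
    """Genomic estimated breeding values (plus mean) for new genotypes."""
    y_mean, col_means, effects = model
    return [y_mean + sum((row[j] - col_means[j]) * effects[j]
                         for j in range(len(effects))) for row in z]


def pearson(x, y):
    """Pearson correlation, used here as the prediction accuracy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    true_effects = [1.0, -0.5, 2.0]  # hypothetical marker effects

    def phenotype(row):
        return 10 + sum(g * e for g, e in zip(row, true_effects))

    train = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 1, 1], [0, 2, 1], [2, 0, 2]]
    valid = [[0, 0, 0], [2, 1, 2], [1, 2, 0]]
    model = fit_ridge(train, [phenotype(r) for r in train], lam=0.01)
    accuracy = pearson(predict(valid, model), [phenotype(r) for r in valid])
    print(round(accuracy, 3))
```

    The toy phenotypes are noise-free, so the accuracy is close to 1. With real phenotypes, environmental noise and imperfect marker coverage pull the correlation down toward the 0.12 to 0.54 range reported above.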

  17. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  18. High-throughput phenotyping and genomic selection: the frontiers of crop breeding converge.

    Science.gov (United States)

    Cabrera-Bosquet, Llorenç; Crossa, José; von Zitzewitz, Jarislav; Serret, María Dolors; Araus, José Luis

    2012-05-01

    Genomic selection (GS) and high-throughput phenotyping have recently been captivating the interest of the crop breeding community from both the public and private sectors world-wide. Both approaches promise to revolutionize the prediction of complex traits, including growth, yield and adaptation to stress. Whereas high-throughput phenotyping may help to improve understanding of crop physiology, most powerful techniques for high-throughput field phenotyping are empirical rather than analytical and comparable to genomic selection. Despite the fact that the two methodological approaches represent the extremes of what is understood as the breeding process (phenotype versus genome), they both consider the targeted traits (e.g. grain yield, growth, phenology, plant adaptation to stress) as a black box instead of dissecting them as a set of secondary traits (i.e. physiological) putatively related to the target trait. Both GS and high-throughput phenotyping have in common their empirical approach enabling breeders to use genome profile or phenotype without understanding the underlying biology. This short review discusses the main aspects of both approaches and focuses on the case of genomic selection of maize flowering traits and near-infrared spectroscopy (NIRS) and plant spectral reflectance as high-throughput field phenotyping methods for complex traits such as crop growth and yield. © 2012 Institute of Botany, Chinese Academy of Sciences.

  19. Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis

    Science.gov (United States)

    Stata, Matt; Wang, Wei; White, Merlin M.; Moncalvo, Jean-Marc

    2018-01-01

    Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target the digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. PMID:29764946

  20. Effects of genomic selection on genetic improvement, inbreeding, and merit of young versus proven bulls

    NARCIS (Netherlands)

    Roos, de A.P.W.; Schrooten, C.; Veerkamp, R.F.; Arendonk, van J.A.M.

    2011-01-01

    Genomic selection has the potential to revolutionize dairy cattle breeding because young animals can be accurately selected as parents, leading to a much shorter generation interval and higher rates of genetic gain. The aims of this study were to assess the effects of genomic selection and reduction

  1. Deciphering the Cryptic Genome: Genome-wide Analyses of the Rice Pathogen Fusarium fujikuroi Reveal Complex Regulation of Secondary Metabolism and Novel Metabolites

    Science.gov (United States)

    Studt, Lena; Niehaus, Eva-Maria; Espino, Jose J.; Huß, Kathleen; Michielse, Caroline B.; Albermann, Sabine; Wagner, Dominik; Bergner, Sonja V.; Connolly, Lanelle R.; Fischer, Andreas; Reuter, Gunter; Kleigrewe, Karin; Bald, Till; Wingfield, Brenda D.; Ophir, Ron; Freeman, Stanley; Hippler, Michael; Smith, Kristina M.; Brown, Daren W.; Proctor, Robert H.; Münsterkötter, Martin; Freitag, Michael; Humpf, Hans-Ulrich; Güldener, Ulrich; Tudzynski, Bettina

    2013-01-01

    The fungus Fusarium fujikuroi causes “bakanae” disease of rice due to its ability to produce gibberellins (GAs), but it is also known for producing harmful mycotoxins. However, the genetic capacity for the whole arsenal of natural compounds and their role in the fungus' interaction with rice remained unknown. Here, we present a high-quality genome sequence of F. fujikuroi that was assembled into 12 scaffolds corresponding to the 12 chromosomes described for the fungus. We used the genome sequence along with ChIP-seq, transcriptome, proteome, and HPLC-FTMS-based metabolome analyses to identify the potential secondary metabolite biosynthetic gene clusters and to examine their regulation in response to nitrogen availability and plant signals. The results indicate that expression of most but not all gene clusters correlate with proteome and ChIP-seq data. Comparison of the F. fujikuroi genome to those of six other fusaria revealed that only a small number of gene clusters are conserved among these species, thus providing new insights into the divergence of secondary metabolism in the genus Fusarium. Noteworthy, GA biosynthetic genes are present in some related species, but GA biosynthesis is limited to F. fujikuroi, suggesting that this provides a selective advantage during infection of the preferred host plant rice. Among the genome sequences analyzed, one cluster that includes a polyketide synthase gene (PKS19) and another that includes a non-ribosomal peptide synthetase gene (NRPS31) are unique to F. fujikuroi. The metabolites derived from these clusters were identified by HPLC-FTMS-based analyses of engineered F. fujikuroi strains overexpressing cluster genes. In planta expression studies suggest a specific role for the PKS19-derived product during rice infection. Thus, our results indicate that combined comparative genomics and genome-wide experimental analyses identified novel genes and secondary metabolites that contribute to the evolutionary success of F

  2. Will genomic selection be a practical method for plant breeding?

    Science.gov (United States)

    Nakaya, Akihiro; Isobe, Sachiko N

    2012-11-01

    Genomic selection or genome-wide selection (GS) has been highlighted as a new approach for marker-assisted selection (MAS) in recent years. GS is a form of MAS that selects favourable individuals based on genomic estimated breeding values. Previous studies have suggested the utility of GS, especially for capturing small-effect quantitative trait loci, but GS has not become a popular methodology in the field of plant breeding, possibly because there is insufficient information available on GS for practical use. In this review, GS is discussed from a practical breeding viewpoint. Statistical approaches employed in GS are briefly described, before the recent progress in GS studies is surveyed. GS practices in plant breeding are then reviewed before future prospects are discussed. Statistical concepts used in GS are discussed with genetic models and variance decomposition, heritability, breeding value and linear model. Recent progress in GS studies is reviewed with a focus on empirical studies. For the practice of GS in plant breeding, several specific points are discussed including linkage disequilibrium, feature of populations and genotyped markers and breeding scheme. Currently, GS is not perfect, but it is a potent, attractive and valuable approach for plant breeding. This method will be integrated into many practical breeding programmes in the near future with further advances and the maturing of its theory.

  3. Whole-genome modeling accurately predicts quantitative traits, as revealed in plants.

    OpenAIRE

    Tatarinova, Tatiana; Shin, Min-Gyoung; Marjoram, Paul; Nuzhdin, Sergey; Triska, Martin; Rickauer, Martina; Nikolsky, Yuri; Mazurier, Melanie; Gentzbittel, Laurent; Ben, Cecile

    2016-01-01

    Many adaptive events in natural populations, as well as response to artificial selection, are caused by polygenic action. Under selective pressure, the adaptive traits can quickly respond via small allele frequency shifts spread across numerous loci. We hypothesize that a large proportion of current phenotypic variation between individuals may be best explained by population admixture. We thus consider the complete, genome-wide universe of genetic variability, spread across several ancestral ...

  4. Trait variation and genetic diversity in a banana genomic selection training population

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31–35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents. PMID:28586365

  5. Trait variation and genetic diversity in a banana genomic selection training population.

    Directory of Open Access Journals (Sweden)

    Moses Nyine

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  6. Trait variation and genetic diversity in a banana genomic selection training population.

    Science.gov (United States)

    Nyine, Moses; Uwimana, Brigitte; Swennen, Rony; Batte, Michael; Brown, Allan; Christelová, Pavla; Hřibová, Eva; Lorenzen, Jim; Doležel, Jaroslav

    2017-01-01

    Banana (Musa spp.) is an important crop in the African Great Lakes region in terms of income and food security, with the highest per capita consumption worldwide. Pests, diseases and climate change hamper sustainable production of bananas. New breeding tools with increased crossbreeding efficiency are being investigated to breed for resistant, high yielding hybrids of East African Highland banana (EAHB). These include genomic selection (GS), which will benefit breeding through increased genetic gain per unit time. Understanding trait variation and the correlation among economically important traits is an essential first step in the development and selection of suitable GS models for banana. In this study, we tested the hypothesis that trait variations in bananas are not affected by cross combination, cycle, field management and their interaction with genotype. A training population created using EAHB breeding material and its progeny was phenotyped in two contrasting conditions. A high level of correlation among vegetative and yield related traits was observed. Therefore, genomic selection models could be developed for traits that are easily measured. It is likely that the predictive ability of traits that are difficult to phenotype will be similar to less difficult traits they are highly correlated with. Genotype response to cycle and field management practices varied greatly with respect to traits. Yield related traits accounted for 31-35% of principal component variation under low and high input field management conditions. Resistance to Black Sigatoka was stable across cycles but varied under different field management depending on the genotype. The best cross combination was 1201K-1xSH3217 based on selection response (R) of hybrids. Genotyping using simple sequence repeat (SSR) markers revealed that the training population was genetically diverse, reflecting a complex pedigree background, which was mostly influenced by the male parents.

  7. Selective whole genome amplification for resequencing target microbial species from complex natural samples.

    Science.gov (United States)

    Leichty, Aaron R; Brisson, Dustin

    2014-10-01

    Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which makes them ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes and a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes. With SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster, up to 70% of high-throughput resequencing reads mapped to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
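
    The motif criterion described in this abstract (common in the target genome, rare in the background) can be illustrated with a minimal sketch. It is not the authors' primer-design procedure: the function names, the pseudocount, and the toy sequences are assumptions made for illustration, and real primer selection would also have to consider both strands, primer spacing, and binding properties.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_motifs(target, background, k=4, pseudocount=1.0):
    """Rank target k-mers by target frequency divided by background frequency.

    Frequencies are per k-mer window, so genomes of different sizes are
    comparable. The pseudocount (an assumption of this sketch) avoids
    division by zero for motifs absent from the background.
    """
    t_counts = kmer_counts(target, k)
    b_counts = kmer_counts(background, k)
    t_windows = max(len(target) - k + 1, 1)
    b_windows = max(len(background) - k + 1, 1)
    scored = []
    for motif, t_n in t_counts.items():
        t_freq = t_n / t_windows
        b_freq = (b_counts.get(motif, 0) + pseudocount) / b_windows
        scored.append((motif, t_freq / b_freq))
    # Highest enrichment first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Toy sequences, invented for illustration only.
target_seq = "ACGTACGTACGTACGT"
background_seq = "TTTTTTTTTTTTTTTTTTTTACGT"
ranked = rank_motifs(target_seq, background_seq, k=4)
print(ranked[:3])
```

    In this toy case the motif ACGT is frequent in the target but also occurs in the background, so motifs found only in the target rank above it.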

  8. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes

    Science.gov (United States)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  9. PSP: rapid identification of orthologous coding genes under positive selection across multiple closely related prokaryotic genomes.

    Science.gov (United States)

    Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2013-12-27

    With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaptation to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, designated PSP (Positive Selection analysis for Prokaryotic genomes), for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at multiple levels by assignments and enrichments of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to do genome-scale analysis for evolutionary selection across multiple prokaryotic genomes rapidly and easily, and identify the genes undergoing positive selection, which may play key roles in host-pathogen interactions and/or environmental adaptation.

  10. A privacy-preserving solution for compressed storage and selective retrieval of genomic data.

    Science.gov (United States)

    Huang, Zhicong; Ayday, Erman; Lin, Huang; Aiyar, Raeka S; Molyneaux, Adam; Xu, Zhenyu; Fellay, Jacques; Steinmetz, Lars M; Hubaux, Jean-Pierre

    2016-12-01

    In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients' complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data. © 2016 Huang et al.; Published by Cold Spring Harbor Laboratory Press.

  11. Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears.

    Science.gov (United States)

    Liu, Shiping; Lorenzen, Eline D; Fumagalli, Matteo; Li, Bo; Harris, Kelley; Xiong, Zijun; Zhou, Long; Korneliussen, Thorfinn Sand; Somel, Mehmet; Babbitt, Courtney; Wray, Greg; Li, Jianwen; He, Weiming; Wang, Zhuo; Fu, Wenjing; Xiang, Xueyan; Morgan, Claire C; Doherty, Aoife; O'Connell, Mary J; McInerney, James O; Born, Erik W; Dalén, Love; Dietz, Rune; Orlando, Ludovic; Sonne, Christian; Zhang, Guojie; Nielsen, Rasmus; Willerslev, Eske; Wang, Jun

    2014-05-08

    Polar bears are uniquely adapted to life in the High Arctic and have undergone drastic physiological changes in response to Arctic climates and a hyper-lipid diet of primarily marine mammal prey. We analyzed 89 complete genomes of polar bear and brown bear using population genomic modeling and show that the species diverged only 479-343 thousand years BP. We find that genes on the polar bear lineage have been under stronger positive selection than in brown bears; nine of the top 16 genes under strong positive selection are associated with cardiomyopathy and vascular disease, implying important reorganization of the cardiovascular system. One of the genes showing the strongest evidence of selection, APOB, encodes the primary lipoprotein component of low-density lipoprotein (LDL); functional mutations in APOB may explain how polar bears are able to cope with life-long elevated LDL levels that are associated with high risk of heart disease in humans. Copyright © 2014 Elsevier Inc. All rights reserved.

  12. Genome-wide detection of selection and other evolutionary forces

    DEFF Research Database (Denmark)

    Xu, Zhuofei; Zhou, Rui

    2015-01-01

    As is well known, pathogenic microbes evolve rapidly to escape from the host immune system and antibiotics. Genetic variations among microbial populations occur frequently during the long-term pathogen–host evolutionary arms race, and individual mutation beneficial for the fitness can be fixed...... to scan genome-wide alignments for evidence of positive Darwinian selection, recombination, and other evolutionary forces operating on the coding regions. In this chapter, we describe an integrative analysis pipeline and its application to tracking featured evolutionary trajectories on the genome...

  13. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Science.gov (United States)

    Huang, Jian; Zhang, Chunmei; Zhao, Xing; Fei, Zhangjun; Wan, KangKang; Zhang, Zhong; Pang, Xiaoming; Yin, Xiao; Bai, Yang; Sun, Xiaoqing; Gao, Lizhi; Li, Ruiqiang; Zhang, Jinbo; Li, Xingang

    2016-12-01

    Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  14. Comparative Analyses of Nonpathogenic, Opportunistic, and Totally Pathogenic Mycobacteria Reveal Genomic and Biochemical Variabilities and Highlight the Survival Attributes of Mycobacterium tuberculosis

    Science.gov (United States)

    Singh, Yadvir; Kohli, Sakshi; Ahmad, Javeed; Ehtesham, Nasreen Z.; Tyagi, Anil K.

    2014-01-01

    Mycobacterial evolution involves various processes, such as genome reduction, gene cooption, and critical gene acquisition. Our comparative genome size analysis of 44 mycobacterial genomes revealed that the nonpathogenic (NP) genomes were bigger than those of opportunistic (OP) or totally pathogenic (TP) mycobacteria, with the TP genomes being smaller yet variable in size; their genomic plasticity reflected their ability to evolve and survive under various environmental conditions. From the 44 mycobacterial species, 13 species, representing TP, OP, and NP, were selected for genomic-relatedness analyses. Analysis of homologous protein-coding genes shared between Mycobacterium indicus pranii (NP), Mycobacterium intracellulare ATCC 13950 (OP), and Mycobacterium tuberculosis H37Rv (TP) revealed that 4,995 (i.e., ~95%) M. indicus pranii proteins have homology with M. intracellulare, whereas the homologies among M. indicus pranii, M. intracellulare ATCC 13950, and M. tuberculosis H37Rv were significantly lower. A total of 4,153 (~79%) M. indicus pranii proteins and 4,093 (~79%) M. intracellulare ATCC 13950 proteins exhibited homology with the M. tuberculosis H37Rv proteome, while 3,301 (~82%) and 3,295 (~82%) M. tuberculosis H37Rv proteins showed homology with M. indicus pranii and M. intracellulare ATCC 13950 proteomes, respectively. Comparative metabolic pathway analyses of TP/OP/NP mycobacteria showed enzymatic plasticity between M. indicus pranii (NP) and M. intracellulare ATCC 13950 (OP), Mycobacterium avium 104 (OP), and M. tuberculosis H37Rv (TP). Mycobacterium tuberculosis seems to have acquired novel alternate pathways with possible roles in metabolism, host-pathogen interactions, virulence, and intracellular survival, and by implication some of these could be potential drug targets. PMID:25370496

  15. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection.

    Science.gov (United States)

    Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong

    2012-01-01

    Acupuncture has been practiced in China for thousands of years as part of Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect, at the molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and the sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provides evidence for the specificity of acupoints. Our result demonstrates that metabolic profiling might be a promising method to

  16. Selection for long and short sleep duration in Drosophila melanogaster reveals the complex genetic network underlying natural variation in sleep.

    Science.gov (United States)

    Harbison, Susan T; Serrano Negron, Yazmin L; Hansen, Nancy F; Lobell, Amanda S

    2017-12-01

    Why do some individuals need more sleep than others? Forward mutagenesis screens in flies using engineered mutations have established a clear genetic component to sleep duration, revealing mutants that convey very long or short sleep. Whether such extreme long or short sleep could exist in natural populations was unknown. We applied artificial selection for high and low night sleep duration to an outbred population of Drosophila melanogaster for 13 generations. At the end of the selection procedure, night sleep duration diverged by 9.97 hours in the long and short sleeper populations, and 24-hour sleep was reduced to 3.3 hours in the short sleepers. Neither long nor short sleeper lifespan differed appreciably from controls, suggesting few physiological consequences to being an extreme long or short sleeper. Whole genome sequence data from seven generations of selection revealed several hundred thousand changes in allele frequencies at polymorphic loci across the genome. Combining the data from long and short sleeper populations across generations in a logistic regression implicated 126 polymorphisms in 80 candidate genes, and we confirmed three of these genes and a larger genomic region with mutant and chromosomal deficiency tests, respectively. Many of these genes could be connected in a single network based on previously known physical and genetic interactions. Candidate genes have known roles in several classic, highly conserved developmental and signaling pathways: EGFR, Wnt, Hippo, and MAPK. The involvement of highly pleiotropic pathway genes suggests that sleep duration in natural populations can be influenced by a wide variety of biological processes, which may be why the purpose of sleep has been so elusive.

  17. Selection for long and short sleep duration in Drosophila melanogaster reveals the complex genetic network underlying natural variation in sleep.

    Directory of Open Access Journals (Sweden)

    Susan T Harbison

    2017-12-01

    Why do some individuals need more sleep than others? Forward mutagenesis screens in flies using engineered mutations have established a clear genetic component to sleep duration, revealing mutants that convey very long or short sleep. Whether such extreme long or short sleep could exist in natural populations was unknown. We applied artificial selection for high and low night sleep duration to an outbred population of Drosophila melanogaster for 13 generations. At the end of the selection procedure, night sleep duration diverged by 9.97 hours in the long and short sleeper populations, and 24-hour sleep was reduced to 3.3 hours in the short sleepers. Neither long nor short sleeper lifespan differed appreciably from controls, suggesting few physiological consequences to being an extreme long or short sleeper. Whole genome sequence data from seven generations of selection revealed several hundred thousand changes in allele frequencies at polymorphic loci across the genome. Combining the data from long and short sleeper populations across generations in a logistic regression implicated 126 polymorphisms in 80 candidate genes, and we confirmed three of these genes and a larger genomic region with mutant and chromosomal deficiency tests, respectively. Many of these genes could be connected in a single network based on previously known physical and genetic interactions. Candidate genes have known roles in several classic, highly conserved developmental and signaling pathways: EGFR, Wnt, Hippo, and MAPK. The involvement of highly pleiotropic pathway genes suggests that sleep duration in natural populations can be influenced by a wide variety of biological processes, which may be why the purpose of sleep has been so elusive.

  18. Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

    Science.gov (United States)

    Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

    2016-01-01

    Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome sizes of LM115 and LM41 were 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter adverse conditions. Seven antibiotic and efflux pump related genes, which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, penicillin, and macrolides, were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potentials from minimally processed food warrants close attention from both the healthcare and food industries.

  19. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  20. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  1. Comparative Genomics of Smut Pathogens: Insights From Orphans and Positively Selected Genes Into Host Specialization

    Directory of Open Access Journals (Sweden)

    Juliana Benevenuto

    2018-04-01

    Host specialization is a key evolutionary process for the diversification and emergence of new pathogens. However, the molecular determinants of host range are poorly understood. Smut fungi are biotrophic pathogens that have distinct and narrow host ranges based on largely unknown genetic determinants. Hence, we aimed to expand comparative genomics analyses of smut fungi by including more species infecting different hosts and to define orphans and positively selected genes to gain further insights into the genetic basis of host specialization. We analyzed nine lineages of smut fungi isolated from eight crop and non-crop hosts: maize, barley, sugarcane, wheat, oats, Zizania latifolia (Manchurian rice), Echinochloa colona (a wild grass), and Persicaria sp. (a wild dicot plant). We assembled two new genomes: Ustilago hordei (strain Uhor01) isolated from oats and U. tritici (strain CBS 119.19) isolated from wheat. The smut genomes were of small sizes, ranging from 18.38 to 24.63 Mb. U. hordei species experienced genome expansions due to the proliferation of transposable elements and the amount of these elements varied among the two strains. Phylogenetic analysis confirmed that Ustilago is not a monophyletic genus and, furthermore, detected misclassification of the U. tritici specimen. The comparison between smut pathogens of crop and non-crop hosts did not reveal distinct signatures, suggesting that host domestication did not play a dominant role in shaping the evolution of smuts. We found that host specialization in smut fungi likely has a complex genetic basis: different functional categories were enriched in orphans and lineage-specific selected genes. The diversification and gain/loss of effector genes are probably the most important determinants of host specificity.

  2. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome.

    Science.gov (United States)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui; Kim, Su Yeon; Korneliussen, Thorfinn; Vinckenbosch, Nicolas; Tian, Geng; Huerta-Sanchez, Emilia; Feder, Alison F; Grarup, Niels; Jørgensen, Torben; Jiang, Tao; Witte, Daniel R; Sandbæk, Annelli; Hellmann, Ines; Lauritzen, Torsten; Hansen, Torben; Pedersen, Oluf; Wang, Jun; Nielsen, Rasmus

    2011-10-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.

  3. Uninformative polymorphisms bias genome scans for signatures of selection

    Directory of Open Access Journals (Sweden)

    Roesti Marius

    2012-06-01

    Background: With the establishment of high-throughput sequencing technologies and new methods for rapid and extensive single nucleotide polymorphism (SNP) discovery, marker-based genome scans in search of signatures of divergent selection between populations occupying ecologically distinct environments are becoming increasingly popular. Methods and Results: On the basis of genome-wide SNP marker data generated by RAD sequencing of lake and stream stickleback populations, we show that the outcome of such studies can be systematically biased if markers with a low minor allele frequency are included in the analysis. The reason is that these ‘uninformative’ polymorphisms lack the adequate potential to capture signatures of drift and hitchhiking, the focal processes in ecological genome scans. Bias associated with uninformative polymorphisms is not eliminated by just avoiding technical artifacts in the data (PCR and sequencing errors), as a high proportion of SNPs with a low minor allele frequency is a general biological feature of natural populations. Conclusions: We suggest that uninformative markers should be excluded from genome scans based on empirical criteria derived from careful inspection of the data, and that these criteria should be reported explicitly. Together, this should increase the quality and comparability of genome scans, and hence promote our understanding of the processes driving genomic differentiation.
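
    The exclusion step recommended in this abstract can be sketched as a simple minor allele frequency (MAF) filter. This is a minimal illustration, not the authors' procedure: the function names, the SNP names, the allele counts, and the 0.1 threshold are all assumptions, and the abstract itself asks that the cutoff be derived empirically from the data and reported.

```python
def minor_allele_frequency(ref_count, alt_count):
    """Return the frequency of the rarer allele at a biallelic SNP."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("SNP has no allele observations")
    return min(ref_count, alt_count) / total


def split_by_maf(allele_counts, min_maf):
    """Partition SNP names into (kept, dropped) using a MAF threshold.

    allele_counts maps a SNP name to (ref_count, alt_count), pooled over
    the populations being compared (a pooling choice assumed here).
    """
    kept, dropped = [], []
    for name, (ref, alt) in allele_counts.items():
        if minor_allele_frequency(ref, alt) >= min_maf:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Toy data: names and counts are invented for illustration only.
counts = {"snp_a": (50, 50), "snp_b": (98, 2), "snp_c": (70, 30)}
kept, dropped = split_by_maf(counts, min_maf=0.1)
print("kept:", kept)
print("dropped:", dropped)
```

    Returning the dropped markers alongside the kept ones makes it easy to report how many SNPs the chosen criterion removed.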

  4. Different selective pressures lead to different genomic outcomes as newly-formed hybrid yeasts evolve

    Directory of Open Access Journals (Sweden)

    Piotrowski Jeff S

    2012-04-01

    Full Text Available Abstract Background: Interspecific hybridization occurs in every eukaryotic kingdom. While hybrid progeny are frequently at a selective disadvantage, in some instances their increased genome size and complexity may result in greater stress resistance than their ancestors, which can be adaptively advantageous at the edges of their ancestors' ranges. While this phenomenon has been repeatedly documented in the field, the response of hybrid populations to long-term selection has not often been explored in the lab. To fill this knowledge gap we crossed the two most distantly related members of the Saccharomyces sensu stricto group, S. cerevisiae and S. uvarum, and established a mixed population of homoploid and aneuploid hybrids to study how different types of selection impact hybrid genome structure. Results: As temperature was raised incrementally from 31°C to 46.5°C over 500 generations of continuous culture, selection favored loss of the S. uvarum genome, although the kinetics of genome loss differed among independent replicates. Temperature-selected isolates exhibited greater inherent and induced thermal tolerance than parental species and founding hybrids, and also exhibited ethanol resistance. In contrast, as exogenous ethanol was increased from 0% to 14% over 500 generations of continuous culture, selection favored euploid S. cerevisiae x S. uvarum hybrids. Ethanol-selected isolates were more ethanol tolerant than S. uvarum and one of the founding hybrids, but did not exhibit resistance to temperature stress. Relative to parental and founding hybrids, temperature-selected strains showed heritable differences in cell wall structure in the forms of increased resistance to zymolyase digestion and Micafungin, which targets cell wall biosynthesis. Conclusions: This is the first study to show experimentally that the genomic fate of newly-formed interspecific hybrids depends on the type of selection they encounter during the course of evolution.

  5. Comparison of analyses of the QTLMAS XIII common dataset. I: genomic selection

    NARCIS (Netherlands)

    Bastiaansen, J.W.M.; Bink, M.C.A.M.; Coster, A.; Maliepaard, C.A.; Calus, M.P.L.

    2010-01-01

    Background - Genomic selection, the use of markers across the whole genome, receives increasing amounts of attention and is having more and more impact on breeding programs. Development of statistical and computational methods to estimate breeding values based on markers is a very active area of

  6. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV) is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  7. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  8. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication

    Science.gov (United States)

    Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L.; Searle, Steven M. J.; Minx, Patrick; Hillier, LaDeana W.; Koboldt, Daniel C.; Davis, Brian W.; Driscoll, Carlos A.; Barr, Christina S.; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W. C.; Hahn, Matthew W.; Menotti-Raymond, Marilyn; O’Brien, Stephen J.; Wilson, Richard K.; Lyons, Leslie A.; Murphy, William J.; Warren, Wesley C.

    2014-01-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  9. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    Science.gov (United States)

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-02

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae.

  10. Adaptation of maize to temperate climates: mid-density genome-wide association genetics and diversity patterns reveal key genomic regions, with a major contribution of the Vgt2 (ZCN8) locus.

    Directory of Open Access Journals (Sweden)

    Sophie Bouchet

    Full Text Available The migration of maize from tropical to temperate climates was accompanied by a dramatic evolution in flowering time. To gain insight into the genetic architecture of this adaptive trait, we conducted a 50K SNP-based genome-wide association and diversity investigation on a panel of tropical and temperate American and European representatives. Eighteen genomic regions were associated with flowering time. The number of early alleles cumulated along these regions was highly correlated with flowering time. Polymorphism in the vicinity of the ZCN8 gene, which is the closest maize homologue to Arabidopsis major flowering time (FT) gene, had the strongest effect. This polymorphism is in the vicinity of the causal factor of Vgt2 QTL. Diversity was lower, whereas differentiation and LD were higher for associated loci compared to the rest of the genome, which is consistent with selection acting on flowering time during maize migration. Selection tests also revealed supplementary loci that were highly differentiated among groups and not associated with flowering time in our panel, whereas they were in other linkage-based studies. This suggests that allele fixation led to a lack of statistical power when structure and relatedness were taken into account in a linear mixed model. Complementary designs and analysis methods are necessary to unravel the architecture of complex traits. Based on linkage disequilibrium (LD) estimates corrected for population structure, we concluded that the number of SNPs genotyped should be at least doubled to capture all QTLs contributing to the genetic architecture of polygenic traits in this panel. These results show that maize flowering time is controlled by numerous QTLs of small additive effect and that strong polygenic selection occurred under cool climatic conditions. They should contribute to more efficient genomic predictions of flowering time and facilitate the dissemination of diverse maize genetic resources under a wide

  11. Accounting for linkage disequilibrium in genome scans for selection without individual genotypes: The local score approach.

    Science.gov (United States)

    Fariello, María Inés; Boitard, Simon; Mercier, Sabine; Robelin, David; Faraut, Thomas; Arnould, Cécile; Recoquillay, Julien; Bouchez, Olivier; Salin, Gérald; Dehais, Patrice; Gourichon, David; Leroux, Sophie; Pitel, Frédérique; Leterrier, Christine; SanCristobal, Magali

    2017-07-01

    Detecting genomic footprints of selection is an important step in the understanding of evolution. Accounting for linkage disequilibrium in genome scans increases detection power, but haplotype-based methods require individual genotypes and are not applicable on pool-sequenced samples. We propose to take advantage of the local score approach to account for linkage disequilibrium in genome scans for selection, cumulating (possibly small) signals from single markers over a genomic segment, to clearly pinpoint a selection signal. Using computer simulations, we demonstrate that this approach detects selection with higher power than several state-of-the-art single-marker, windowing or haplotype-based approaches. We illustrate this on two benchmark data sets including individual genotypes, for which we obtain similar results with the local score and one haplotype-based approach. Finally, we apply the local score approach to Pool-Seq data obtained from a divergent selection experiment on behaviour in quail and obtain precise and biologically coherent selection signals: while competing methods fail to highlight any clear selection signature, our method detects several regions involving genes known to act on social responsiveness or autistic traits. Although we focus here on the detection of positive selection from multiple population data, the local score approach is general and can be applied to other genome scans for selection or other genomewide analyses such as GWAS. © 2017 John Wiley & Sons Ltd.
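    The idea of cumulating small single-marker signals over a segment can be sketched with a Lindley-type recursion. This is one common way to compute a local score and is an assumption here, since the abstract gives no formula; the tuning parameter xi, the function names and the toy p-values are hypothetical, and no significance threshold is computed.

```python
import math
from typing import List, Sequence, Tuple


def lindley_process(p_values: Sequence[float], xi: float = 1.0) -> List[float]:
    """Running local score along ordered markers.

    Each marker contributes -log10(p) - xi, so only p-values below 10**-xi
    add to the score; the running sum is reset to zero when it goes negative.
    """
    h = 0.0
    path = []
    for p in p_values:
        h = max(0.0, h + (-math.log10(p) - xi))
        path.append(h)
    return path


def best_segment(p_values: Sequence[float], xi: float = 1.0) -> Tuple[int, int, float]:
    """Return (start index, end index, peak score) of the highest excursion."""
    path = lindley_process(p_values, xi)
    peak = max(path)
    end = path.index(peak)
    start = end
    # Walk back to the first marker after the last reset to zero.
    while start > 0 and path[start - 1] > 0:
        start -= 1
    return start, end, peak


# Toy p-values for seven consecutive markers (made up for illustration).
p_values = [0.5, 0.3, 0.01, 0.02, 0.005, 0.4, 0.6]
start, end, peak = best_segment(p_values, xi=1.0)
print(start, end, round(peak, 3))
```

    In this toy series, three moderately small neighbouring p-values combine into a single peak over markers 2 to 4, even though none of them would stand out strongly on its own.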

  12. Accuracy and responses of genomic selection on key traits in apple breeding

    NARCIS (Netherlands)

    Muranty, Hélène; Troggio, Michela; Sadok, Ben Inès; Rifaï, Al Mehdi; Auwerkerken, Annemarie; Banchi, E.; Velasco, Riccardo; Stevanato, P.; Weg, van de W.E.; Guardo, Di M.; Kumar, S.; Laurens, François; Bink, M.C.A.M.

    2015-01-01

    The application of genomic selection in fruit tree crops is expected to enhance breeding efficiency by increasing prediction accuracy, increasing selection intensity and decreasing generation interval. The objectives of this study were to assess the accuracy of prediction and selection response in

  13. Accuracy of genomic selection in biparental populations of flax (Linum usitatissimum L.)

    Directory of Open Access Journals (Sweden)

    Frank M. You

    2016-08-01

    Full Text Available Flax is an important economic crop for seed oil and stem fiber. Phenotyping of traits such as seed yield, seed quality, stem fiber yield, and quality characteristics is expensive and time consuming. Genomic selection (GS) refers to a breeding approach aimed at selecting preferred individuals based on genomic estimated breeding values predicted by a statistical model based on the relationship between phenotypes and genome-wide genetic markers. We evaluated the prediction accuracy of GS (rMP) and the efficiency of GS relative to phenotypic selection (RE) for three GS models: ridge regression best linear unbiased prediction (RR-BLUP), Bayesian LASSO (BL), and Bayesian ridge regression (BRR), for seed yield, oil content, iodine value, linoleic, and linolenic acid content with a full and a common set of genome-wide simple sequence repeat markers in each of three biparental populations. The three GS models generated similar rMP and RE, while BRR displayed a higher coefficient of determination (R2) of the fitted models than did RR-BLUP or BL. The mean rMP and RE varied for traits with different heritabilities and was affected by the genetic variation of the traits in the populations. GS for seed yield generated a mean RE of 1.52 across populations and marker sets, a value significantly superior to that for direct phenotypic selection. Our empirical results provide the first validation of GS in flax and demonstrate that GS could increase genetic gain per unit time for linseed breeding. Further studies for selection of training populations and markers are warranted.

  14. Nomadic lifestyle of Lactobacillus plantarum revealed by comparative genomics of 54 strains isolated from different habitats.

    Science.gov (United States)

    Martino, Maria Elena; Bayjanov, Jumamurat R; Caffrey, Brian E; Wels, Michiel; Joncour, Pauline; Hughes, Sandrine; Gillet, Benjamin; Kleerebezem, Michiel; van Hijum, Sacha A F T; Leulier, François

    2016-12-01

    The ability of bacteria to adapt to diverse environmental conditions is well-known. The process of bacterial adaptation to a niche has been linked to large changes in the genome content, showing that many bacterial genomes reflect the constraints imposed by their habitat. However, some highly versatile bacteria are found in diverse habitats that share almost nothing in common. Lactobacillus plantarum is a lactic acid bacterium that is found in a large variety of habitats. With the aim of unravelling the link between evolution and ecological versatility of L. plantarum, we analysed the genomes of 54 L. plantarum strains isolated from different environments. Comparative genome analysis identified a high level of genomic diversity and plasticity among the strains analysed. Phylogenomic and functional divergence studies coupled with gene-trait matching analyses revealed a mixed distribution of the strains, which was uncoupled from their environmental origin. Our findings revealed the absence of specific genomic signatures marking adaptations of L. plantarum towards the diverse habitats it is associated with. This suggests fundamentally similar trends of genome evolution in L. plantarum, which occur in a manner that is apparently uncoupled from ecological constraint and reflects the nomadic lifestyle of this species. © 2016 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  15. Parallel or convergent evolution in human population genomic data revealed by genotype networks.

    Science.gov (United States)

    R Vahdati, Ali; Wagner, Andreas

    2016-08-02

    Genotype networks are representations of genetic variation data that are complementary to phylogenetic trees. A genotype network is a graph whose nodes are genotypes (DNA sequences) with the same broadly defined phenotype. Two nodes are connected if they differ in some minimal way, e.g., in a single nucleotide. We analyze human genome variation data from the 1,000 genomes project, and construct haploid genotype (haplotype) networks for 12,235 protein coding genes. The structure of these networks varies widely among genes, indicating different patterns of variation despite a shared evolutionary history. We focus on those genes whose genotype networks show many cycles, which can indicate homoplasy, i.e., parallel or convergent evolution, on the sequence level. For 42 genes, the observed number of cycles is so large that it cannot be explained by either chance homoplasy or recombination. When analyzing possible explanations, we discovered evidence for positive selection in 21 of these genes and, in addition, a potential role for constrained variation and purifying selection. Balancing selection plays at most a small role. The 42 genes with excess cycles are enriched in functions related to immunity and response to pathogens. Genotype networks are representations of genetic variation data that can help understand unusual patterns of genomic variation.
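    The construction described in this abstract (nodes are haplotypes, edges join haplotypes that differ at a single nucleotide) can be sketched as follows. Counting cycles as the number of independent cycles, edges minus nodes plus connected components, is an assumption chosen for simplicity; the study's own cycle statistic may be defined differently, and the toy haplotypes are invented.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def build_network(haplotypes: Iterable[str]) -> Tuple[List[str], List[Edge]]:
    """Nodes are distinct haplotypes; edges join pairs differing at one site."""
    nodes = sorted(set(haplotypes))
    edges = [
        (a, b)
        for a, b in combinations(nodes, 2)
        if len(a) == len(b) and hamming(a, b) == 1
    ]
    return nodes, edges


def count_components(nodes: List[str], edges: List[Edge]) -> int:
    """Connected components via a small union-find structure."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def independent_cycles(nodes: List[str], edges: List[Edge]) -> int:
    """Cyclomatic number: edges - nodes + connected components."""
    return len(edges) - len(nodes) + count_components(nodes, edges)


# Toy haplotypes: AA, AT, TA, TT form a square; GG is isolated.
nodes, edges = build_network(["AA", "AT", "TA", "TT", "GG", "AA"])
print(len(nodes), len(edges), independent_cycles(nodes, edges))
```

    The square formed by AA, AT, TT and TA is the smallest cycle such a network can contain: the same change appears on two different backgrounds, which is the kind of pattern the abstract associates with homoplasy or recombination.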

  16. Genome-wide comparative analysis of codon usage bias and codon context patterns among cyanobacterial genomes.

    Science.gov (United States)

    Prabha, Ratna; Singh, Dhananjaya P; Sinha, Swati; Ahmad, Khurshid; Rai, Anil

    2017-04-01

    With the increasing accumulation of genomic sequence information of prokaryotes, the study of codon usage bias has gained renewed attention. The purpose of this study was to examine codon selection pattern within and across cyanobacterial species belonging to diverse taxonomic orders and habitats. We performed detailed comparative analysis of cyanobacterial genomes with respect to codon bias. Our analysis reflects that in cyanobacterial genomes, A- and/or T-ending codons were used predominantly in the genes whereas G- and/or C-ending codons were largely avoided. Variation in the codon context usage of cyanobacterial genes corresponded to the clustering of cyanobacteria as per their GC content. Analysis of codon adaptation index (CAI) and synonymous codon usage order (SCUO) revealed that the majority of genes are associated with low codon bias. Codon selection pattern in cyanobacterial genomes reflected compositional constraints as the major influencing factor. We also found that, although mutational constraint may play some role in affecting codon usage bias in cyanobacteria, compositional constraint in terms of genomic GC composition, coupled with environmental factors, affected codon selection pattern in cyanobacterial genomes. Copyright © 2016 Elsevier B.V. All rights reserved.
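    The contrast between A/T-ending and G/C-ending codons can be made concrete by tallying third-position nucleotides in a coding sequence. This is a minimal sketch with hypothetical function names and an invented sequence; it does not reproduce the CAI or SCUO analyses mentioned in the abstract.

```python
from collections import Counter


def third_position_counts(cds: str) -> Counter:
    """Count the nucleotide found at the third position of each codon."""
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3")
    return Counter(seq[i + 2] for i in range(0, len(seq), 3))


def gc3_fraction(cds: str) -> float:
    """Fraction of codons that end in G or C."""
    counts = third_position_counts(cds)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


# Invented five-codon sequence: ATG GCT AAA GAT TTC
cds = "ATGGCTAAAGATTTC"
print(dict(third_position_counts(cds)), gc3_fraction(cds))
```

    For this sequence two of five codons end in G or C, so the G/C-ending fraction is 0.4 and the A/T-ending fraction is 0.6.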

  17. Genomic prediction for Nordic Red Cattle using one-step and selection index blending

    DEFF Research Database (Denmark)

    Guosheng, Su; Madsen, Per; Nielsen, Ulrik Sander

    2012-01-01

    This study investigated the accuracy of direct genomic breeding values (DGV) using a genomic BLUP model, genomic enhanced breeding values (GEBV) using a one-step blending approach, and GEBV using a selection index blending approach for 15 traits of Nordic Red Cattle. The data comprised 6,631 bulls...... genotyped and nongenotyped bulls for one-step blending, and to scale DGV and its expected reliability in the selection index blending. Weighting (scaling) factors had a small influence on reliabilities of GEBV, but a large influence on the variation of GEBV. Based on the validation analyses, averaged over...... the 15 traits, the reliability of DGV for bulls without daughter records was 11.0 percentage points higher than the reliability of conventional pedigree index. Further gain of 0.9 percentage points was achieved by combining information from conventional pedigree index using the selection index blending...

  18. A genome scan for positive selection in thoroughbred horses.

    Directory of Open Access Journals (Sweden)

    Jingjing Gu

    2009-06-01

    Full Text Available Thoroughbred horses have been selected for exceptional racing performance resulting in system-wide structural and functional adaptations contributing to elite athletic phenotypes. Because selection has been recent and intense in a closed population that stems from a small number of founder animals, Thoroughbreds represent a unique population within which to identify genomic contributions to exercise-related traits. Employing a population genetics-based hitchhiking mapping approach we performed a genome scan using 394 autosomal and X chromosome microsatellite loci and identified positively selected loci in the extreme tail-ends of the empirical distributions for (1) deviations from expected heterozygosity (Ewens-Watterson test) in Thoroughbred (n = 112) and (2) global differentiation among four geographically diverse horse populations (FST). We found positively selected genomic regions in Thoroughbred enriched for phosphoinositide-mediated signalling (3.2-fold enrichment; P<0.01), insulin receptor signalling (5.0-fold enrichment; P<0.01) and lipid transport (2.2-fold enrichment; P<0.05) genes. We found a significant overrepresentation of sarcoglycan complex (11.1-fold enrichment; P<0.05) and focal adhesion pathway (1.9-fold enrichment; P<0.01) genes highlighting the role for muscle strength and integrity in the Thoroughbred athletic phenotype. We report for the first time candidate athletic-performance genes within regions targeted by selection in Thoroughbred horses that are principally responsible for fatty acid oxidation, increased insulin sensitivity and muscle strength: ACSS1 (acyl-CoA synthetase short-chain family member 1), ACTA1 (actin, alpha 1, skeletal muscle), ACTN2 (actinin, alpha 2), ADHFE1 (alcohol dehydrogenase, iron containing, 1), MTFR1 (mitochondrial fission regulator 1), PDK4 (pyruvate dehydrogenase kinase, isozyme 4) and TNC (tenascin C). Understanding the genetic basis for exercise adaptation will be crucial for the identification of genes

  19. GWA Mapping of Anthocyanin Accumulation Reveals Balancing Selection of MYB90 in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Johanna A Bac-Molenaar

    Full Text Available Induction of anthocyanin accumulation by osmotic stress was assessed in 360 accessions of Arabidopsis thaliana. A wide range of natural variation, with phenotypes ranging from green to completely red/purple rosettes, was observed. A genome wide association (GWA) mapping approach revealed that sequence diversity in a small 15 kb region on chromosome 1 explained 40% of the variation observed. Sequence and expression analyses of alleles of the candidate gene MYB90 identified a causal polymorphism at amino acid (AA) position 210 of this transcription factor of the anthocyanin biosynthesis pathway. This amino acid discriminates the two most frequent alleles of MYB90. Both alleles are present in a substantial part of the population, suggesting balancing selection between these two alleles. Analysis of the geographical origin of the studied accessions suggests that the macro climate is not the driving force behind positive or negative selection for anthocyanin accumulation. An important role for local climatic conditions is, therefore, suggested. This study emphasizes that GWA mapping is a powerful approach to identify alleles that are under balancing selection pressure in nature.

  20. The Jujube Genome Provides Insights into Genome Evolution and the Domestication of Sweetness/Acidity Taste in Fruit Trees.

    Directory of Open Access Journals (Sweden)

    Jian Huang

    2016-12-01

    Full Text Available Jujube (Ziziphus jujuba Mill.) belongs to the Rhamnaceae family and is a popular fruit tree species with immense economic and nutritional value. Here, we report a draft genome of the dry jujube cultivar 'Junzao' and the genome resequencing of 31 geographically diverse accessions of cultivated and wild jujubes (Ziziphus jujuba var. spinosa). Comparative analysis revealed that the genome of 'Dongzao', a fresh jujube, was ~86.5 Mb larger than that of the 'Junzao', partially due to the recent insertions of transposable elements in the 'Dongzao' genome. We constructed eight proto-chromosomes of the common ancestor of Rhamnaceae and Rosaceae, two sister families in the order Rosales, and elucidated the evolutionary processes that have shaped the genome structures of modern jujubes. Population structure analysis revealed the complex genetic background of jujubes resulting from extensive hybridizations between jujube and its wild relatives. Notably, several key genes that control fruit organic acid metabolism and sugar content were identified in the selective sweep regions. We also identified S-locus genes controlling gametophytic self-incompatibility and investigated haplotype patterns of the S locus in the jujube genomes, which would provide a guideline for parent selection for jujube crossbreeding. This study provides valuable genomic resources for jujube improvement, and offers insights into jujube genome evolution and its population structure and domestication.

  1. Selective Gene Delivery for Integrating Exogenous DNA into Plastid and Mitochondrial Genomes Using Peptide-DNA Complexes.

    Science.gov (United States)

    Yoshizumi, Takeshi; Oikawa, Kazusato; Chuah, Jo-Ann; Kodama, Yutaka; Numata, Keiji

    2018-05-14

    Selective gene delivery into organellar genomes (mitochondrial and plastid genomes) has been limited because of a lack of appropriate platform technology, even though these organelles are essential for metabolite and energy production. Techniques for selective organellar modification are needed to functionally improve organelles and produce transplastomic/transmitochondrial plants. However, no method for mitochondrial genome modification has yet been established for multicellular organisms including plants. Likewise, modification of plastid genomes has been limited to a few plant species and algae. In the present study, we developed ionic complexes of fusion peptides containing organellar targeting signal and plasmid DNA for selective delivery of exogenous DNA into the plastid and mitochondrial genomes of intact plants. This is the first report of exogenous DNA being integrated into the mitochondrial genomes of not only plants, but also multicellular organisms in general. This fusion peptide-mediated gene delivery system is a breakthrough platform for both plant organellar biotechnology and gene therapy for mitochondrial diseases in animals.

  2. The impacts of drift and selection on genomic evolution in insects

    Directory of Open Access Journals (Sweden)

    K. Jun Tong

    2017-04-01

    Full Text Available Genomes evolve through a combination of mutation, drift, and selection, all of which act heterogeneously across genes and lineages. This leads to differences in branch-length patterns among gene trees. Genes that yield trees with the same branch-length patterns can be grouped together into clusters. Here, we propose a novel phylogenetic approach to explain the factors that influence the number and distribution of these gene-tree clusters. We apply our method to a genomic dataset from insects, an ancient and diverse group of organisms. We find some evidence that when drift is the dominant evolutionary process, each cluster tends to contain a large number of fast-evolving genes. In contrast, strong negative selection leads to many distinct clusters, each of which contains only a few slow-evolving genes. Our work, although preliminary in nature, illustrates the use of phylogenetic methods to shed light on the factors driving rate variation in genomic evolution.

  3. Sister Dehalobacter Genomes Reveal Specialization in Organohalide Respiration and Recent Strain Differentiation Likely Driven by Chlorinated Substrates

    Directory of Open Access Journals (Sweden)

    Shuiquan Tang

    2016-02-01

    Full Text Available The genomes of two closely related Dehalobacter strains (strain CF and strain DCA) were assembled from the metagenome of an anaerobic enrichment culture that reductively dechlorinates chloroform (CF), 1,1,1-trichloroethane (1,1,1-TCA) and 1,1-dichloroethane (1,1-DCA). The 3.1 Mbp genomes of strain CF (that dechlorinates CF and 1,1,1-TCA) and strain DCA (that dechlorinates 1,1-DCA) each contain 17 putative reductive dehalogenase homologous (rdh) genes. These two genomes were systematically compared to three other available organohalide-respiring Dehalobacter genomes (Dehalobacter restrictus strain PER-K23, Dehalobacter sp. strain E1 and Dehalobacter sp. strain UNSWDHB), and to the genomes of Dehalococcoides mccartyi strain 195 and Desulfitobacterium hafniense strain Y51. This analysis compared 42 different metabolic and physiological categories. The genomes of strains CF and DCA share 90% overall average nucleotide identity and greater than 99.8% identity over a 2.9 Mbp alignment that excludes large insertions, indicating that these genomes differentiated from a close common ancestor. This differentiation was likely driven by selection pressures around two orthologous reductive dehalogenase genes, cfrA and dcrA, that code for the enzymes that reduce CF or 1,1,1-TCA and 1,1-DCA. The many reductive dehalogenase genes found in the five Dehalobacter genomes cluster into two small conserved regions and were often associated with Crp/Fnr transcriptional regulators. Specialization is on-going on a strain-specific basis, as some strains but not others have lost essential genes in the Wood-Ljungdahl (strain E1) and corrinoid biosynthesis pathways (strains E1 and PER-K23). The gene encoding phosphoserine phosphatase, which catalyzes the last step of serine biosynthesis, is missing from all five Dehalobacter genomes, yet D. restrictus can grow without serine, suggesting an alternative or unrecognized biosynthesis route exists. In contrast to Dehalococcoides mccartyi

  4. Signatures of selection in the Iberian honey bee: a genome wide approach using single nucleotide polymorphisms (SNPs)

    OpenAIRE

    Chavez-Galarza, Julio; Johnston, J. Spencer; Azevedo, João; Muñoz, Irene; De la Rúa, Pilar; Patton, John C.; Pinto, M. Alice

    2011-01-01

    Dissecting genome-wide (expansions, contractions, admixture) from genome-specific effects (selection) is a goal of central importance in evolutionary biology because it leads to more robust inferences of demographic history and to identification of adaptive divergence. The publication of the honey bee genome and the development of high-density SNP genotyping provide us with powerful tools, allowing us to identify signatures of selection in the honey bee genome. These signatur...

  5. The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

    Directory of Open Access Journals (Sweden)

    David B. Neale

    2017-09-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.

  6. The iSelect 9 K SNP analysis revealed polyploidization induced revolutionary changes and intense human selection causing strong haplotype blocks in wheat.

    Science.gov (United States)

    Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong

    2017-01-30

    A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. A total of 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, which formed 878 haplotype blocks. There were more blocks in the B genome, but the average block size was significantly (P ... polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.

  7. Prediction of Cacao (Theobroma cacao) Resistance to Moniliophthora spp. Diseases via Genome-Wide Association Analysis and Genomic Selection.

    Science.gov (United States)

    McElroy, Michel S; Navarro, Alberto J R; Mustiga, Guiliana; Stack, Conrad; Gezan, Salvador; Peña, Geover; Sarabia, Widem; Saquicela, Diego; Sotomayor, Ignacio; Douglas, Gavin M; Migicovsky, Zoë; Amores, Freddy; Tarqui, Omar; Myles, Sean; Motamayor, Juan C

    2018-01-01

    Cacao (Theobroma cacao) is a globally important crop, and its yield is severely restricted by disease. Two of the most damaging diseases, witches' broom disease (WBD) and frosty pod rot disease (FPRD), are caused by a pair of related fungi: Moniliophthora perniciosa and Moniliophthora roreri, respectively. Resistant cultivars are the most effective long-term strategy to address Moniliophthora diseases, but efficiently generating resistant and productive new cultivars will require robust methods for screening germplasm before field testing. Marker-assisted selection (MAS) and genomic selection (GS) provide two potential avenues for predicting the performance of new genotypes, potentially increasing the selection gain per unit time. To test the effectiveness of these two approaches, we performed a genome-wide association study (GWAS) and GS on three related populations of cacao in Ecuador genotyped with a 15K single nucleotide polymorphism (SNP) microarray for three measures of WBD infection (vegetative broom, cushion broom, and chirimoya pod), one of FPRD (monilia pod) and two productivity traits (total fresh weight of pods and % healthy pods produced). GWAS yielded several SNPs associated with disease resistance in each population, but none were significantly correlated with the same trait in other populations. Genomic selection, using one population as a training set to estimate the phenotypes of the remaining two (composed of different families), varied among traits, from a mean prediction accuracy of 0.46 (vegetative broom) to 0.15 (monilia pod), and varied between training populations. Simulations demonstrated that selecting seedlings using GWAS markers alone generates no improvement over selecting at random, but that GS improves the selection process significantly. Our results suggest that the GWAS markers discovered here are not sufficiently predictive across diverse germplasm to be useful for MAS, but that using all markers in a GS framework holds

  8. Prediction of Cacao (Theobroma cacao) Resistance to Moniliophthora spp. Diseases via Genome-Wide Association Analysis and Genomic Selection

    Directory of Open Access Journals (Sweden)

    Michel S. McElroy

    2018-03-01

    Cacao (Theobroma cacao) is a globally important crop, and its yield is severely restricted by disease. Two of the most damaging diseases, witches’ broom disease (WBD) and frosty pod rot disease (FPRD), are caused by a pair of related fungi: Moniliophthora perniciosa and Moniliophthora roreri, respectively. Resistant cultivars are the most effective long-term strategy to address Moniliophthora diseases, but efficiently generating resistant and productive new cultivars will require robust methods for screening germplasm before field testing. Marker-assisted selection (MAS) and genomic selection (GS) provide two potential avenues for predicting the performance of new genotypes, potentially increasing the selection gain per unit time. To test the effectiveness of these two approaches, we performed a genome-wide association study (GWAS) and GS on three related populations of cacao in Ecuador genotyped with a 15K single nucleotide polymorphism (SNP) microarray for three measures of WBD infection (vegetative broom, cushion broom, and chirimoya pod), one of FPRD (monilia pod) and two productivity traits (total fresh weight of pods and % healthy pods produced). GWAS yielded several SNPs associated with disease resistance in each population, but none were significantly correlated with the same trait in other populations. Genomic selection, using one population as a training set to estimate the phenotypes of the remaining two (composed of different families), varied among traits, from a mean prediction accuracy of 0.46 (vegetative broom) to 0.15 (monilia pod), and varied between training populations. Simulations demonstrated that selecting seedlings using GWAS markers alone generates no improvement over selecting at random, but that GS improves the selection process significantly. Our results suggest that the GWAS markers discovered here are not sufficiently predictive across diverse germplasm to be useful for MAS, but that using all markers in a GS framework holds

  9. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  10. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  11. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Science.gov (United States)

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  12. Genome-wide single-generation signatures of local selection in the panmictic European eel

    DEFF Research Database (Denmark)

    Pujolar, J. M.; Jacobsen, M. W.; Als, Thomas Damm

    2014-01-01

    Next-generation sequencing and the collection of genome-wide data allow identifying adaptive variation and footprints of directional selection. Using a large SNP data set from 259 RAD-sequenced European eel individuals (glass eels) from eight locations between 34 and 64°N, we examined the patterns...... of genome-wide genetic diversity across locations. We tested for local selection by searching for increased population differentiation using FST-based outlier tests and by testing for significant associations between allele frequencies and environmental variables. The overall low genetic differentiation...... with single-generation signatures of spatially varying selection acting on glass eels. After screening 50 354 SNPs, a total of 754 potentially locally selected SNPs were identified. Candidate genes for local selection constituted a wide array of functions, including calcium signalling, neuroactive ligand...

  13. Does selection against transcriptional interference shape retroelement-free regions in mammalian genomes?

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2008-01-01

    BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic...... in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect...... activity of LINEs has been identified previously. CONCLUSIONS/SIGNIFICANCE: Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome....

  14. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome

    DEFF Research Database (Denmark)

    Lohmueller, Kirk E; Albrechtsen, Anders; Li, Yingrui

    2011-01-01

    A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries...... these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination...... and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations...

  15. Identifying selected regions from heterozygosity and divergence using a light-coverage genomic dataset from two human populations.

    Directory of Open Access Journals (Sweden)

    Taras K Oleksyk

    2008-03-01

    When a selective sweep occurs in the chromosomal region around a target gene in two populations that have recently separated, it produces three dramatic genomic consequences: 1) decreased multi-locus heterozygosity in the region; 2) elevated or diminished genetic divergence (FST) of multiple polymorphic variants adjacent to the selected locus between the divergent populations, due to the alternative fixation of alleles; and 3) a consequent regional increase in the variance of FST (S²FST) for the same clustered variants, due to the increased alternative fixation of alleles in the loci surrounding the selection target. In the first part of our study, to search for potential targets of directional selection, we developed and validated a resampling-based computational approach; we then scanned an array of 31 different-sized moving windows of SNP variants (5-65 SNPs) across the human genome in a set of European and African American population samples with 183,997 SNP loci after correcting for the recombination rate variation. The analysis revealed 180 regions of recent selection with very strong evidence in either population or both. In the second part of our study, we compared the newly discovered putative regions to those sites previously postulated in the literature, using methods based on inspecting patterns of linkage disequilibrium, population divergence and other methodologies. The newly found regions were cross-validated with those found in nine other studies that have searched for selection signals. Our study was replicated especially well in those regions confirmed by three or more studies. These validated regions were independently verified, using a combination of different methods and different databases in other studies, and should include fewer false positives.
The main strength of our analysis method compared to others is that it does not require dense genotyping and therefore can be used with data from population-based genome SNP scans

  16. Genomic characterisation of Wongabel virus reveals novel genes within the Rhabdoviridae.

    Science.gov (United States)

    Gubala, Aneta J; Proll, David F; Barnard, Ross T; Cowled, Chris J; Crameri, Sandra G; Hyatt, Alex D; Boyle, David B

    2008-06-20

    Viruses belonging to the family Rhabdoviridae infect a variety of different hosts, including insects, vertebrates and plants. Currently, there are approximately 200 ICTV-recognised rhabdoviruses isolated around the world. However, the majority remain poorly characterised and only a fraction have been definitively assigned to genera. The genomic and transcriptional complexity displayed by several of the characterised rhabdoviruses indicates large diversity and complexity within this family. To enable an improved taxonomic understanding of this family, it is necessary to gain further information about the poorly characterised members of this family. Here we present the complete genome sequence and predicted transcription strategy of Wongabel virus (WONV), a previously uncharacterised rhabdovirus isolated from biting midges (Culicoides austropalpalis) collected in northern Queensland, Australia. The 13,196 nucleotide genome of WONV encodes five typical rhabdovirus genes N, P, M, G and L. In addition, the WONV genome contains three genes located between the P and M genes (U1, U2, U3) and two open reading frames overlapping with the N and G genes (U4, U5). These five additional genes and their putative protein products appear to be novel, and their functions are unknown. Predictive analysis of the U5 gene product revealed characteristics typical of viroporins, and indicated structural similarities with the alpha-1 protein (putative viroporin) of viruses in the genus Ephemerovirus. Phylogenetic analyses of the N and G proteins of WONV indicated closest similarity with the avian-associated Flanders virus; however, the genomes of these two viruses are significantly diverged. WONV displays a novel and unique genome structure that has not previously been described for any animal rhabdovirus.

  17. Does selection against transcriptional interference shape retroelement-free regions in mammalian genomes?

    Directory of Open Access Journals (Sweden)

    Tobias Mourier

    BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic regions being intolerant to insertions of retroelements. The inadvertent transcriptional activity of retroelements may affect neighbouring genes, which in turn could be detrimental to an organism. We speculate that such retroelement transcription, or transcriptional interference, is a contributing factor in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect short interspersed elements (SINEs) to display very low levels of transcriptional interference. We find that genomic regions devoid of long interspersed elements (LINEs) are enriched for protein-coding genes, but that this is not the case for regions devoid of short interspersed elements (SINEs). This is expected if genes are subject to selection against transcriptional interference. We do not find microRNAs to be associated with genomic regions devoid of either SINEs or LINEs. We further observe an increased relative activity of genes overlapping LINE-free regions during early embryogenesis, where activity of LINEs has been identified previously. CONCLUSIONS/SIGNIFICANCE: Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome.

  18. The importance of identity-by-state information for the accuracy of genomic selection

    Directory of Open Access Journals (Sweden)

    Luan Tu

    2012-08-01

    Background: It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. Methods: The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. Results: We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was as or more reliable than that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. Conclusions: Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. Although, in principle, genomic selection based on identity-by-state does not require

  19. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

    Science.gov (United States)

    2013-01-01

    Background: Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for even the most skilful clinician to come up with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results: In the first stage of this research, five feature selection methods were proposed and tested on the oral cancer prognosis dataset. In the second stage, the model with the features selected by each feature selection method was tested on the proposed classifiers. Four types of classifiers were chosen, namely ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation was implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions: The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate their potential to become a significant prognostic signature in oral cancer studies. PMID:23725313

  20. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Background: Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv), has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results: We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, molecular features that may be required for pathogenicity were predicted, including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions: Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  1. Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs.

    Directory of Open Access Journals (Sweden)

    Adam H Freedman

    2016-03-01

    Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, a role that is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers.
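
    The idea of calibrating outlier calls against a demographic null, as described in the abstract above, can be sketched roughly as follows. This is only an illustrative, generic empirical FDR estimate and not necessarily the procedure the authors used: the function name `empirical_fdr`, the per-window statistics and the simulated null values are all hypothetical.

```python
def empirical_fdr(observed, null, threshold):
    """Estimate the FDR for calling windows with statistic >= threshold.

    observed: per-window statistics from the real genomes (hypothetical data).
    null: the same statistic from neutral simulations under a demographic
          model (hypothetical data).
    """
    hits = sum(1 for x in observed if x >= threshold)
    if hits == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    # Fraction of neutral windows that would pass the threshold by chance.
    null_rate = sum(1 for x in null if x >= threshold) / len(null)
    # Expected number of false calls among the observed windows.
    expected_false = null_rate * len(observed)
    return min(1.0, expected_false / hits)


# Hypothetical example: 10 observed windows, 100 simulated neutral windows.
observed_stats = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.85, 0.9, 0.95, 0.99]
null_stats = [0.05] * 95 + [0.9] * 5
fdr = empirical_fdr(observed_stats, null_stats, threshold=0.8)
print(fdr)
```

    In this toy setup a threshold is acceptable when the estimated FDR falls below a chosen target; a stricter threshold trades fewer calls for a lower expected share of false positives.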

  2. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    Energy Technology Data Exchange (ETDEWEB)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication with interspecies gene similarities as high as 95%. However it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei while 128 nonhypothetical orfs were unique (non-paralogous) amongst these species including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei most closely approximates the ancestral organizational state.

  3. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations into four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.

  4. Genome-wide mapping of infection-induced SINE RNAs reveals a role in selective mRNA export.

    Science.gov (United States)

    Karijolich, John; Zhao, Yang; Alla, Ravi; Glaunsinger, Britt

    2017-06-02

    Short interspersed nuclear elements (SINEs) are retrotransposons evolutionarily derived from endogenous RNA Polymerase III RNAs. Though SINE elements have undergone exaptation into gene regulatory elements, how transcribed SINE RNA impacts transcriptional and post-transcriptional regulation is largely unknown. This is partly due to a lack of information regarding which of the loci have transcriptional potential. Here, we present an approach (short interspersed nuclear element sequencing, SINE-seq), which selectively profiles RNA Polymerase III-derived SINE RNA, thereby identifying transcriptionally active SINE loci. Applying SINE-seq to monitor murine B2 SINE expression during a gammaherpesvirus infection revealed transcription from 28 270 SINE loci, with ∼50% of active SINE elements residing within annotated RNA Polymerase II loci. Furthermore, B2 RNA can form intermolecular RNA-RNA interactions with complementary mRNAs, leading to nuclear retention of the targeted mRNA via a mechanism involving p54nrb. These findings illuminate a pathway for the selective regulation of mRNA export during stress via retrotransposon activation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. TAL effector nucleases induce mutations at a pre-selected location in the genome of primary barley transformants

    DEFF Research Database (Denmark)

    Wendt, Toni; Holm, Preben Bach; Starker, Colby G

    2013-01-01

    , and their broad targeting range. Here we report the assembly of several TALENs for a specific genomic locus in barley. The cleavage activity of individual TALENs was first tested in vivo using a yeast-based, single-strand annealing assay. The most efficient TALEN was then selected for barley transformation....... Analysis of the resulting transformants showed that TALEN-induced double strand breaks led to the introduction of short deletions at the target site. Additional analysis revealed that each barley transformant contained a range of different mutations, indicating that mutations occurred independently...

  6. A draft de novo genome assembly for the northern bobwhite (Colinus virginianus) reveals evidence for a rapid decline in effective population size beginning in the Late Pleistocene.

    Directory of Open Access Journals (Sweden)

    Yvette A Halley

    Full Text Available Wild populations of northern bobwhites (Colinus virginianus; hereafter bobwhite) have declined across nearly all of their U.S. range, and despite their importance as an experimental wildlife model for ecotoxicology studies, no bobwhite draft genome assembly currently exists. Herein, we present a bobwhite draft de novo genome assembly with annotation, comparative analyses including genome-wide analyses of divergence with the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata) genomes, and coalescent modeling to reconstruct the demographic history of the bobwhite for comparison to other birds currently in decline (i.e., scarlet macaw; Ara macao). More than 90% of the assembled bobwhite genome was captured within 14,000 unique genes and proteins. Bobwhite analyses of divergence with the chicken and zebra finch genomes revealed many extremely conserved gene sequences, and evidence for lineage-specific divergence of noncoding regions. Coalescent models for reconstructing the demographic history of the bobwhite and the scarlet macaw provided evidence for population bottlenecks which were temporally coincident with human colonization of the New World, the late Pleistocene collapse of the megafauna, and the last glacial maximum. Demographic trends predicted for the bobwhite and the scarlet macaw also were concordant with how opposing natural selection strategies (i.e., skewness in the r-/K-selection continuum) would be expected to shape genome diversity and the effective population sizes in these species, which is directly relevant to future conservation efforts.

  7. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants

    Energy Technology Data Exchange (ETDEWEB)

    Rensing, Stefan A.; Lang, Daniel; Zimmer, Andreas D.; Terry, Astrid; Salamov, Asaf; Shapiro, Harris; Nishiyama, Tomoaki; Perroud, Pierre-Francois; Lindquist, Erika A.; Kamisugi, Yasuko; Tanahashi, Takako; Sakakibara, Keiko; Fujita, Tomomichi; Oishi, Kazuko; Shin, Tadasu; Kuroki, Yoko; Toyoda, Atsushi; Suzuki, Yutaka; Hashimoto, Shin-ichi; Yamaguchi, Kazuo; Sugano, Sumio; Kohara, Yuji; Fujiyama, Asao; Anterola, Aldwin; Aoki, Setsuyuki; Ashton, Neil; Barbazuk, W. Brad; Barker, Elizabeth; Bennetzen, Jeffrey L.; Blankenship, Robert; Cho, Sung Hyun; Dutcher, Susan K.; Estelle, Mark; Fawcett, Jeffrey A.; Gundlach, Heidrun; Hanada, Kousuke; Melkozernov, Alexander; Murata, Takashi; Nelson, David R.; Pils, Birgit; Prigge, Michael; Reiss, Bernd; Renner, Tanya; Rombauts, Stephane; Rushton, Paul J.; Sanderfoot, Anton; Schween, Gabriele; Shiu, Shin-Han; Stueber, Kurt; Theodoulou, Frederica L.; Tu, Hank; Van de Peer, Yves; Verrier, Paul J.; Waters, Elizabeth; Wood, Andrew; Yang, Lixing; Cove, David; Cuming, Andrew C.; Hasebe, Mitsuyasu; Lucas, Susan; Mishler, Brent D.; Reski, Ralf; Grigoriev, Igor V.; Quatrano, Ralph S.; Boore, Jeffrey L.

    2007-09-18

    We report the draft genome sequence of the model moss Physcomitrella patens and compare its features with those of flowering plants, from which it is separated by more than 400 million years, and unicellular aquatic algae. This comparison reveals genomic changes concomitant with the evolutionary movement to land, including a general increase in gene family complexity; loss of genes associated with aquatic environments (e.g., flagellar arms); acquisition of genes for tolerating terrestrial stresses (e.g., variation in temperature and water availability); and the development of the auxin and abscisic acid signaling pathways for coordinating multicellular growth and dehydration response. The Physcomitrella genome provides a resource for phylogenetic inferences about gene function and for experimental analysis of plant processes through this plant's unique facility for reverse genetics.

  8. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping.

    Directory of Open Access Journals (Sweden)

    Amaury Vaysse

    2011-10-01

    Full Text Available The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.

  9. Bayesian genomic selection: the effect of haplotype lengths and priors

    DEFF Research Database (Denmark)

    Villumsen, Trine Michelle; Janss, Luc

    2009-01-01

    Breeding values for animals with marker data are estimated using a genomic selection approach where data is analyzed using Bayesian multi-marker association models. Fourteen model scenarios with varying haplotype lengths, hyper parameter and prior distributions were compared to find the scenario ...

  10. Theory of microbial genome evolution

    Science.gov (United States)

    Koonin, Eugene

    Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% 'ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.

  11. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    Science.gov (United States)

    2014-01-01

    Background: Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results: We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions: Phylogenetic analysis shows that the HIV-1 population undergoes selection-driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  12. A genome-wide scan for signatures of directional selection in domesticated pigs.

    Science.gov (United States)

    Moon, Sunjin; Kim, Tae-Hun; Lee, Kyung-Tai; Kwak, Woori; Lee, Taeheon; Lee, Si-Woo; Kim, Myung-Jick; Cho, Kyuho; Kim, Namshin; Chung, Won-Hyong; Sung, Samsun; Park, Taesung; Cho, Seoae; Groenen, Martien A M; Nielsen, Rasmus; Kim, Yuseob; Kim, Heebal

    2015-02-25

    Animal domestication involved drastic phenotypic changes driven by strong artificial selection and also resulted in new populations of breeds, established by humans. This study aims to identify genes that show evidence of recent artificial selection during pig domestication. Whole-genome resequencing of 30 individual pigs from domesticated breeds, Landrace and Yorkshire, and 10 Asian wild boars at ~16-fold coverage was performed, resulting in over 4.3 million SNPs for 19,990 genes. We constructed a comprehensive genome map of directional selection by detecting selective sweeps using an F_ST-based approach that detects directional selection in lineages leading to the domesticated breeds and using a haplotype-based test that detects ongoing selective sweeps within the breeds. We show that candidate genes under selection are significantly enriched for loci implicated in quantitative traits important to pig reproduction and production. The candidate gene with the strongest signals of directional selection belongs to group III of the metabotropic glutamate receptors, known to affect brain functions associated with eating behavior, suggesting that loci under strong selection include loci involved in behavioral traits in domesticated pigs, including tameness. We show that a significant proportion of selection signatures coincide with loci that were previously inferred to affect phenotypic variation in pigs. We further identify functional enrichment related to behavior, such as signal transduction and neuronal activities, for those targets of selection during domestication in pigs.
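
    The F_ST-based scan mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook heterozygosity form F_ST = (H_T - H_S) / H_T for a biallelic SNP in two equally weighted populations; the abstract does not state which estimator or window scheme the authors used, so the function names and the windowing by SNP count below are hypothetical.

```python
def fst_from_frequencies(p1: float, p2: float) -> float:
    """F_ST for one biallelic SNP, given the allele frequency in two populations.

    Assumes equal weighting of the two populations (an illustrative choice).
    """
    # Mean expected heterozygosity within the two populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # site is monomorphic in both populations
    return (h_t - h_s) / h_t


def windowed_fst(freqs_a, freqs_b, window):
    """Mean per-SNP F_ST in consecutive, non-overlapping windows of `window` SNPs."""
    values = [fst_from_frequencies(a, b) for a, b in zip(freqs_a, freqs_b)]
    chunks = [values[i:i + window] for i in range(0, len(values), window)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


if __name__ == "__main__":
    # Hypothetical allele frequencies in a domestic breed and in wild boar.
    domestic = [0.9, 1.0, 0.5, 0.5]
    wild = [0.1, 0.0, 0.5, 0.5]
    print(windowed_fst(domestic, wild, window=2))
```

    Windows with unusually high values relative to the genome-wide distribution would be the candidate regions in a scan of this kind; real analyses typically also correct for sample size, which this sketch omits.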

  13. Integration of genomic information into sport horse breeding programs for optimization of accuracy of selection.

    Science.gov (United States)

    Haberland, A M; König von Borstel, U; Simianer, H; König, S

    2012-09-01

    Reliable selection criteria are required for young riding horses to increase genetic gain by increasing accuracy of selection and decreasing generation intervals. In this study, selection strategies incorporating genomic breeding values (GEBVs) were evaluated. Relevant stages of selection in sport horse breeding programs were analyzed by applying selection index theory. Results in terms of accuracies of indices (r_TI) and relative selection response indicated that information on single nucleotide polymorphism (SNP) genotypes considerably increases the accuracy of breeding values estimated for young horses without own or progeny performance. In a first scenario, the correlation between the breeding value estimated from the SNP genotype and the true breeding value (= accuracy of GEBV) was fixed to a relatively low value of r_mg = 0.5. For a low heritability trait (h² = 0.15), and an index for a young horse based only on information from both parents, additional genomic information doubles r_TI from 0.27 to 0.54. Including the conventional information source 'own performance' in the aforementioned index, additional SNP information increases r_TI by 40%. Thus, particularly with regard to traits of low heritability, genomic information can provide a tool for well-founded selection decisions early in life. In a further approach, different sources of breeding values (e.g. GEBV and estimated breeding values (EBVs) from different countries) were combined into an overall index when altering accuracies of EBVs and correlations between traits. In summary, we showed that genomic selection strategies have the potential to contribute to a substantial reduction in generation intervals in horse breeding programs.
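
    The doubling of accuracy quoted above can be reproduced with a short selection index calculation. The sketch below is an illustration under stated assumptions, not the authors' code: parents are taken to be unrelated and unselected, each with one own-performance record, and the candidate's genomic information is modelled as an extra trait with heritability 1 and genetic correlation r_mg with the breeding value. Index accuracy is then sqrt(g' P^-1 g / var_a), where P holds (co)variances among information sources and g their covariances with the true breeding value.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def index_accuracy(p, g, var_a=1.0):
    """Accuracy r_TI of a selection index: sqrt(g' P^-1 g / var_a)."""
    b = solve(p, g)  # index weights
    return sqrt(sum(bi * gi for bi, gi in zip(b, g)) / var_a)


h2, r_mg = 0.15, 0.5   # values quoted in the abstract
var_p = 1.0 / h2       # phenotypic variance when additive variance is 1

# Index 1: own-performance records of sire and dam only.
p_parents = [[var_p, 0.0],
             [0.0, var_p]]
g_parents = [0.5, 0.5]  # covariance of a parent's record with offspring breeding value
r_parents = index_accuracy(p_parents, g_parents)

# Index 2: the same, plus the candidate's own genomic information (assumed model).
p_genomic = [[var_p, 0.0, 0.5 * r_mg],
             [0.0, var_p, 0.5 * r_mg],
             [0.5 * r_mg, 0.5 * r_mg, 1.0]]
g_genomic = [0.5, 0.5, r_mg]
r_genomic = index_accuracy(p_genomic, g_genomic)

print(round(r_parents, 2), round(r_genomic, 2))  # 0.27 0.54
```

    Under these assumptions the parent-only index has accuracy sqrt(h²/2), which is why the gain from genomic information is proportionally largest for low-heritability traits.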

  14. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs

    Energy Technology Data Exchange (ETDEWEB)

    Curtis, Bruce A.; Tanifuji, Goro; Burki, Fabien; Gruber, Ansgar; Irimia, Manuel; Maruyama, Shinichiro; Arias, Maria C.; Ball, Steven G.; Gile, Gillian H.; Hirakawa, Yoshihisa; Hopkins, Julia F.; Kuo, Alan; Rensing, Stefan A.; Schmutz, Jeremy; Symeonidi, Aikaterini; Elias, Marek; Eveleigh, Robert J. M.; Herman, Emily K.; Klute, Mary J.; Nakayama, Takuro; Obornik, Miroslav; Reyes-Prieto, Adrian; Armbrust, E. Virginia; Aves, Stephen J.; Beiko, Robert G.; Coutinho, Pedro; Dacks, Joel B.; Durnford, Dion G.; Fast, Naomi M.; Green, Beverley R.; Grisdale, Cameron J.; Hempel, Franziska; Henrissat, Bernard; Hoppner, Marc P.; Ishida, Ken-Ichiro; Kim, Eunsoo; Koreny, Ludek; Kroth, Peter G.; Liu, Yuan; Malik, Shehre-Banoo; Maier, Uwe G.; McRose, Darcy; Mock, Thomas; Neilson, Jonathan A. D.; Onodera, Naoko T.; Poole, Anthony M.; Pritham, Ellen J.; Richards, Thomas A.; Rocap, Gabrielle; Roy, Scott W.; Sarai, Chihiro; Schaack, Sarah; Shirato, Shu; Slamovits, Claudio H.; Spencer, Davie F.; Suzuki, Shigekatsu; Worden, Alexandra Z.; Zauner, Stefan; Barry, Kerrie; Bell, Callum; Bharti, Arvind K.; Crow, John A.; Grimwood, Jane; Kramer, Robin; Lindquist, Erika; Lucas, Susan; Salamov, Asaf; McFadden, Geoffrey I.; Lane, Christopher E.; Keeling, Patrick J.; Gray, Michael W.; Grigoriev, Igor V.; Archibald, John M.

    2012-08-10

    Cryptophyte and chlorarachniophyte algae are transitional forms in the widespread secondary endosymbiotic acquisition of photosynthesis by engulfment of eukaryotic algae. Unlike most secondary plastid-bearing algae, miniaturized versions of the endosymbiont nuclei (nucleomorphs) persist in cryptophytes and chlorarachniophytes. To determine why, and to address other fundamental questions about eukaryote-eukaryote endosymbiosis, we sequenced the nuclear genomes of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans. Both genomes have 21,000 protein genes and are intron rich, and B. natans exhibits unprecedented alternative splicing for a single-celled organism. Phylogenomic analyses and subcellular targeting predictions reveal extensive genetic and biochemical mosaicism, with both host- and endosymbiont-derived genes servicing the mitochondrion, the host cell cytosol, the plastid and the remnant endosymbiont cytosol of both algae. Mitochondrion-to-nucleus gene transfer still occurs in both organisms but plastid-to-nucleus and nucleomorph-to-nucleus transfers do not, which explains why a small residue of essential genes remains locked in each nucleomorph.

  15. Draft whole genome sequence of groundnut stem rot fungus Athelia rolfsii revealing genetic architect of its pathogenicity and virulence.

    Science.gov (United States)

    Iquebal, M A; Tomar, Rukam S; Parakhia, M V; Singla, Deepak; Jaiswal, Sarika; Rathod, V M; Padhiyar, S M; Kumar, Neeraj; Rai, Anil; Kumar, Dinesh

    2017-07-13

    Groundnut (Arachis hypogaea L.) is an important oilseed crop whose production is strongly constrained by stem rot disease, caused by the fungus Athelia rolfsii, which leads to 25-80% loss in productivity. As chemical and biological strategies for combating this fungus are not very effective, genome sequencing can reveal virulence- and pathogenicity-related genes for a better understanding of the host-parasite interaction. We report a draft assembly of the Athelia rolfsii genome of ~73 Mb comprising 8919 contigs. Annotation analysis revealed 16830 genes which are involved in fungicide resistance, virulence and pathogenicity, along with putative effector and lethal genes. Secretome analysis revealed CAZY genes representing 1085 enzymatic genes, glycoside hydrolases, carbohydrate esterases, carbohydrate-binding modules, auxiliary activities, glycosyl transferases and polysaccharide lyases. Repeat analysis revealed 11171 SSRs, LTR, GYPSY and COPIA elements. Comparative analysis with other existing ascomycotina genomes predicted conserved domain families of WD40, CYP450, Pkinase and ABC transporter, giving insight into the evolution of pathogenicity and virulence. This study would help in understanding pathogenicity and virulence at the molecular level and in the development of new combating strategies. Such an approach is imperative in the endeavour to find genome-based solutions for stem rot disease management, leading to better productivity of the groundnut crop in tropical regions of the world.

  16. Genetic association of marbling score with intragenic nucleotide variants at selection signals of the bovine genome.

    Science.gov (United States)

    Ryu, J; Lee, C

    2016-04-01

    Selection signals of Korean cattle might be attributed largely to artificial selection for meat quality. Rapidly increased intragenic markers of newly annotated genes in the bovine genome would help overcome limited findings of genetic markers associated with meat quality at the selection signals in a previous study. The present study examined genetic associations of marbling score (MS) with intragenic nucleotide variants at selection signals of Korean cattle. A total of 39 092 nucleotide variants of 407 Korean cattle were utilized in the association analysis. A total of 129 variants were selected within newly annotated genes in the bovine genome. Their genetic associations were analyzed using the mixed model with random polygenic effects based on identical-by-state genetic relationships among animals in order to control for spurious associations produced by population structure. Genetic associations of MS were found (Pdirectional selection for greater MS and remain selection signals in the bovine genome. Further studies of fine mapping would be useful to incorporate favorable alleles in marker-assisted selection for MS of Korean cattle.

  17. Genomic Selection Accuracy using Multifamily Prediction Models in a Wheat Breeding Program

    Directory of Open Access Journals (Sweden)

    Elliot L. Heffner

    2011-03-01

    Full Text Available Genomic selection (GS) uses genome-wide molecular marker data to predict the genetic value of selection candidates in breeding programs. In plant breeding, the ability to produce large numbers of progeny per cross allows GS to be conducted within each family. However, this approach requires phenotypes of lines from each cross before conducting GS. This will prolong the selection cycle and may result in lower gains per year than approaches that estimate marker-effects with multiple families from previous selection cycles. In this study, phenotypic selection (PS), conventional marker-assisted selection (MAS), and GS prediction accuracy were compared for 13 agronomic traits in a population of 374 winter wheat (Triticum aestivum L.) advanced-cycle breeding lines. A cross-validation approach that trained and validated prediction accuracy across years was used to evaluate effects of model selection, training population size, and marker density in the presence of genotype × environment interactions (G×E). The average prediction accuracies using GS were 28% greater than with MAS and were 95% as accurate as PS. For net merit, the average accuracy across six selection indices for GS was 14% greater than for PS. These results provide empirical evidence that multifamily GS could increase genetic gain per unit time and cost in plant breeding.

  18. Population genomic scan for candidate signatures of balancing selection to guide antigen characterization in malaria parasites.

    Science.gov (United States)

    Amambua-Ngwa, Alfred; Tetteh, Kevin K A; Manske, Magnus; Gomez-Escobar, Natalia; Stewart, Lindsay B; Deerhake, M Elizabeth; Cheeseman, Ian H; Newbold, Christopher I; Holder, Anthony A; Knuepfer, Ellen; Janha, Omar; Jallow, Muminatou; Campino, Susana; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Conway, David J

    2012-01-01

    Acquired immunity in vertebrates maintains polymorphisms in endemic pathogens, leading to identifiable signatures of balancing selection. To comprehensively survey for genes under such selection in the human malaria parasite Plasmodium falciparum, we generated paired-end short-read sequences of parasites in clinical isolates from an endemic Gambian population, which were mapped to the 3D7 strain reference genome to yield high-quality genome-wide coding sequence data for 65 isolates. A minority of genes did not map reliably, including the hypervariable var, rifin, and stevor families, but 5,056 genes (90.9% of all in the genome) had >70% sequence coverage with minimum read depth of 5 for at least 50 isolates, of which 2,853 genes contained 3 or more single nucleotide polymorphisms (SNPs) for analysis of polymorphic site frequency spectra. Against an overall background of negatively skewed frequencies, as expected from historical population expansion combined with purifying selection, the outlying minority of genes with signatures indicating exceptionally intermediate frequencies were identified. Comparing genes with different stage-specificity, such signatures were most common in those with peak expression at the merozoite stage that invades erythrocytes. Members of clag, PfMC-2TM, surfin, and msp3-like gene families were highly represented, the strongest signature being in the msp3-like gene PF10_0355. Analysis of msp3-like transcripts in 45 clinical and 11 laboratory adapted isolates grown to merozoite-containing schizont stages revealed surprisingly low expression of PF10_0355. In diverse clonal parasite lines the protein product was expressed in a minority of mature schizonts (<1% in most lines and ∼10% in clone HB3), and eight sub-clones of HB3 cultured separately had an intermediate spectrum of positive frequencies (0.9 to 7.5%), indicating phase variable expression of this polymorphic antigen. This and other identified targets of balancing selection are now

  19. Population genomic scan for candidate signatures of balancing selection to guide antigen characterization in malaria parasites.

    Directory of Open Access Journals (Sweden)

    Alfred Amambua-Ngwa

    Full Text Available Acquired immunity in vertebrates maintains polymorphisms in endemic pathogens, leading to identifiable signatures of balancing selection. To comprehensively survey for genes under such selection in the human malaria parasite Plasmodium falciparum, we generated paired-end short-read sequences of parasites in clinical isolates from an endemic Gambian population, which were mapped to the 3D7 strain reference genome to yield high-quality genome-wide coding sequence data for 65 isolates. A minority of genes did not map reliably, including the hypervariable var, rifin, and stevor families, but 5,056 genes (90.9% of all in the genome) had >70% sequence coverage with minimum read depth of 5 for at least 50 isolates, of which 2,853 genes contained 3 or more single nucleotide polymorphisms (SNPs) for analysis of polymorphic site frequency spectra. Against an overall background of negatively skewed frequencies, as expected from historical population expansion combined with purifying selection, the outlying minority of genes with signatures indicating exceptionally intermediate frequencies were identified. Comparing genes with different stage-specificity, such signatures were most common in those with peak expression at the merozoite stage that invades erythrocytes. Members of clag, PfMC-2TM, surfin, and msp3-like gene families were highly represented, the strongest signature being in the msp3-like gene PF10_0355. Analysis of msp3-like transcripts in 45 clinical and 11 laboratory adapted isolates grown to merozoite-containing schizont stages revealed surprisingly low expression of PF10_0355. In diverse clonal parasite lines the protein product was expressed in a minority of mature schizonts (<1% in most lines and ∼10% in clone HB3), and eight sub-clones of HB3 cultured separately had an intermediate spectrum of positive frequencies (0.9 to 7.5%), indicating phase variable expression of this polymorphic antigen. This and other identified targets of balancing

  20. Genetic variation architecture of mitochondrial genome reveals the differentiation in Korean landrace and weedy rice

    OpenAIRE

    Wei Tong; Qiang He; Yong-Jin Park

    2017-01-01

    Mitochondrial genome variations have been detected despite the overall conservation of this gene content, which has been valuable for plant population genetics and evolutionary studies. Here, we describe mitochondrial variation architecture and our performance of a phylogenetic dissection of Korean landrace and weedy rice. A total of 4,717 variations across the mitochondrial genome were identified adjunct with 10 wild rice. Genetic diversity assessment revealed that wild rice has higher nucle...

  1. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods.

    Science.gov (United States)

    Heidaritabar, M; Vereijken, A; Muir, W M; Meuwissen, T; Cheng, H; Megens, H-J; Groenen, M A M; Bastiaansen, J W M

    2014-12-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the entire chicken genome, we compared the impact of GS and traditional BLUP (best linear unbiased prediction) selection methods applied side-by-side in three different lines of egg-laying chickens. Differences were demonstrated between methods, both at the level and genomic distribution of allele frequency changes. In all three lines, the average allele frequency changes were larger with GS, 0.056, 0.064 and 0.066, compared with BLUP, 0.044, 0.045 and 0.036 for lines B1, B2 and W1, respectively. With BLUP, 35 selected regions (empirical P selected regions were identified. Empirical thresholds for local allele frequency changes were determined from gene dropping, and differed considerably between GS (0.167-0.198) and BLUP (0.105-0.126). Between lines, the genomic regions with large changes in allele frequencies showed limited overlap. Our results show that GS applies selection pressure much more locally than BLUP, resulting in larger allele frequency changes. With these results, novel insights into the nature of selection on quantitative traits have been gained and important questions regarding the long-term impact of GS are raised. The rapid changes to a part of the genetic architecture, while another part may not be selected, at least in the short term, require careful consideration, especially when selection occurs before phenotypes are observed.

  2. Impact of reduced marker set estimation of genomic relationship matrices on genomic selection for feed efficiency in Angus cattle

    Directory of Open Access Journals (Sweden)

    Northcutt Sally L

    2010-04-01

    Full Text Available Abstract Background: Molecular estimates of breeding value are expected to increase selection response due to improvements in the accuracy of selection and a reduction in generation interval, particularly for traits that are difficult or expensive to record or are measured late in life. Several statistical methods for incorporating molecular data into breeding value estimation have been proposed; however, most studies have utilized simulated data in which the generated linkage disequilibrium may not represent the targeted livestock population. A genomic relationship matrix was developed for 698 Angus steers and 1,707 Angus sires using 41,028 single nucleotide polymorphisms and breeding values were estimated using feed efficiency phenotypes (average daily feed intake, residual feed intake, and average daily gain) recorded on the steers. The number of SNPs needed to accurately estimate a genomic relationship matrix was evaluated in this population. Results: Results were compared to estimates produced from pedigree-based mixed model analysis of 862 Angus steers with 34,864 identified paternal relatives but no female ancestors. Estimates of additive genetic variance and breeding value accuracies were similar for AFI and RFI using the numerator and genomic relationship matrices despite fewer animals in the genomic analysis. Bootstrap analyses indicated that 2,500-10,000 markers are required for robust estimation of genomic relationship matrices in cattle. Conclusions: This research shows that breeding values and their accuracies may be estimated for commercially important sires for traits recorded in experimental populations without the need for pedigree data to establish identity by descent between members of the commercial and experimental populations when at least 2,500 SNPs are available for the generation of a genomic relationship matrix.
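    The abstract does not say how the genomic relationship matrix was built. As a minimal sketch, assuming the widely used VanRaden (2008) first method, G = ZZ' / (2 Σ p(1 − p)), the matrix can be computed from genotypes coded as 0, 1 or 2 copies of an allele. The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Sketch of a VanRaden-style genomic relationship matrix.

    genotypes: list of rows (one per animal), each a list of allele
    counts (0, 1 or 2) per SNP. Assumed method, not the study's own code.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Allele frequency of the counted allele at each SNP.
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_snps)
    ]
    # Scaling factor: twice the sum of p * (1 - p) over SNPs.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by twice the allele frequency.
    centered = [
        [row[j] - 2 * freqs[j] for j in range(n_snps)] for row in genotypes
    ]
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[k])) / scale
            for k in range(n_animals)
        ]
        for i in range(n_animals)
    ]


# Toy data: 3 animals genotyped at 3 SNPs (hypothetical values).
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
G = genomic_relationship_matrix(toy_genotypes)
```

    Because allele frequencies are estimated from the same animals, each row of G sums to zero in this sketch. With real data the number of SNPs would be in the thousands, in line with the 2,500-10,000 markers the abstract reports as sufficient.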

  3. The genomic landscape shaped by selection on transposable elements across 18 mouse strains.

    Science.gov (United States)

    Nellåker, Christoffer; Keane, Thomas M; Yalcin, Binnaz; Wong, Kim; Agam, Avigail; Belgard, T Grant; Flint, Jonathan; Adams, David J; Frankel, Wayne N; Ponting, Chris P

    2012-06-15

    Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected. Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.

  4. Comparative Genomics and Transcriptomics Analyses Reveal Divergent Lifestyle Features of Nematode Endoparasitic Fungus Hirsutella minnesotensis

    Science.gov (United States)

    Lai, Yiling; Liu, Keke; Zhang, Xinyu; Zhang, Xiaoling; Li, Kuan; Wang, Niuniu; Shu, Chi; Wu, Yunpeng; Wang, Chengshu; Bushley, Kathryn E.; Xiang, Meichun; Liu, Xingzhong

    2014-01-01

    Hirsutella minnesotensis [Ophiocordycipitaceae (Hypocreales, Ascomycota)] is a dominant endoparasitic fungus whose conidia adhere to and penetrate the secondary stage juveniles of soybean cyst nematode. Its genome was de novo sequenced and compared with five entomopathogenic fungi in the Hypocreales and three nematode-trapping fungi in the Orbiliales (Ascomycota). The genome of H. minnesotensis is 51.4 Mb and encodes 12,702 genes enriched with transposable elements up to 32%. Phylogenomic analysis revealed that H. minnesotensis diverged from entomopathogenic fungi in Hypocreales. The genome of H. minnesotensis resembles those of entomopathogenic fungi in having fewer genes encoding lectins for adhesion and glycoside hydrolases for cellulose degradation, but differs from those of nematode-trapping fungi in possessing more genes for protein degradation, signal transduction, and secondary metabolism. These results indicate that H. minnesotensis has evolved a different mechanism for nematode endoparasitism compared with nematode-trapping fungi. Transcriptomics analyses for the time-scale parasitism revealed upregulation of lectins, secreted proteases and the genes for biosynthesis of secondary metabolites that could be putatively involved in host surface adhesion, cuticle degradation, and host manipulation. Genome and transcriptome analyses provided comprehensive understanding of the evolution and lifestyle of nematode endoparasitism. PMID:25359922

  5. The American cranberry mitochondrial genome reveals the presence of selenocysteine (tRNA-Sec and SECIS) insertion machinery in land plants

    Science.gov (United States)

    The American cranberry (Vaccinium macrocarpon Ait.) mitochondrial genome was assembled and reconstructed from whole genome 454 Roche GS-FLX and Illumina shotgun sequences. Compared with other Asterids, the reconstruction of the genome revealed an average size mitochondrion (459,678 nt) with comparat...

  6. Genomic and environmental selection patterns in two distinct lettuce crop–wild hybrid crosses

    Science.gov (United States)

    Hartman, Yorike; Uwimana, Brigitte; Hooftman, Danny A P; Schranz, Michael E; van de Wiel, Clemens C M; Smulders, Marinus J M; Visser, Richard G F; van Tienderen, Peter H

    2013-01-01

    Genomic selection patterns and hybrid performance influence the chance that crop (trans)genes can spread to wild relatives. We measured fitness(-related) traits in two different field environments employing two different crop–wild crosses of lettuce. We performed quantitative trait loci (QTL) analyses and estimated the fitness distribution of early- and late-generation hybrids. We detected consistent results across field sites and crosses for a fitness QTL at linkage group 7, where a selective advantage was conferred by the wild allele. Two fitness QTL were detected on linkage group 5 and 6, which were unique to one of the crop–wild crosses. Average hybrid fitness was lower than the fitness of the wild parent, but several hybrid lineages outperformed the wild parent, especially in a novel habitat for the wild type. In early-generation hybrids, this may partly be due to heterosis effects, whereas in late-generation hybrids transgressive segregation played a major role. The study of genomic selection patterns can identify crop genomic regions under negative selection across multiple environments and cultivar–wild crosses that might be applicable in transgene mitigation strategies. At the same time, results were cultivar-specific, so that a case-by-case environmental risk assessment is still necessary, decreasing its general applicability. PMID:23789025

  7. Genomic and environmental selection patterns in two distinct lettuce crop-wild hybrid crosses.

    Science.gov (United States)

    Hartman, Yorike; Uwimana, Brigitte; Hooftman, Danny A P; Schranz, Michael E; van de Wiel, Clemens C M; Smulders, Marinus J M; Visser, Richard G F; van Tienderen, Peter H

    2013-06-01

    Genomic selection patterns and hybrid performance influence the chance that crop (trans)genes can spread to wild relatives. We measured fitness(-related) traits in two different field environments employing two different crop-wild crosses of lettuce. We performed quantitative trait loci (QTL) analyses and estimated the fitness distribution of early- and late-generation hybrids. We detected consistent results across field sites and crosses for a fitness QTL at linkage group 7, where a selective advantage was conferred by the wild allele. Two fitness QTL were detected on linkage group 5 and 6, which were unique to one of the crop-wild crosses. Average hybrid fitness was lower than the fitness of the wild parent, but several hybrid lineages outperformed the wild parent, especially in a novel habitat for the wild type. In early-generation hybrids, this may partly be due to heterosis effects, whereas in late-generation hybrids transgressive segregation played a major role. The study of genomic selection patterns can identify crop genomic regions under negative selection across multiple environments and cultivar-wild crosses that might be applicable in transgene mitigation strategies. At the same time, results were cultivar-specific, so that a case-by-case environmental risk assessment is still necessary, decreasing its general applicability.

  8. The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea

    NARCIS (Netherlands)

    Olsen, Jeanine; Rouzé, Pierre; Verhelst, Bram; Lin, Yao-Cheng; Bayer, Till; Collen, Jonas; Dattolo, Emanuela; De Paoli, Emanuele; Dittami, Simon; Maumus, Florian; Michel, Gurvan; Kersting, Anna; Lauritano, Chiara; Lohaus, Rolf; Töpel, Mats; Tonon, Thierry; Vanneste, Kevin; Amirebrahimi, Mojgan; Brakel, Janina; Boström, Christoffer; Chovatia, Mansi; Grimwood, Jane; Jenkins, Jerry W; Jueterbock, Alexander; Mraz, Amy; Stam, Wytze T; Tice, Hope; Bornberg-Bauer, Erich; Green, Pamela J; Pearson, Gareth A; Procaccini, Gabriele; Duarte, Carlos M; Schmutz, Jeremy; Reusch, Thorsten B H; Van de Peer, Yves

    2016-01-01

    Seagrasses colonized the sea on at least three independent occasions to form the basis of one of the most productive and widespread coastal ecosystems on the planet. Here we report the genome of Zostera marina (L.), the first, to our knowledge, marine angiosperm to be fully sequenced. This reveals

  9. Natural selection among Eurasians at genomic regions associated with HIV-1 control

    Directory of Open Access Journals (Sweden)

    Allison David B

    2011-06-01

    Full Text Available Abstract Background: HIV susceptibility and pathogenicity exhibit both interindividual and intergroup variability. The etiology of intergroup variability is still poorly understood, and could be partly linked to genetic differences among racial/ethnic groups. These genetic differences may be traceable to different regimes of natural selection in the 60,000 years since the human radiation out of Africa. Here, we examine population differentiation and haplotype patterns at several loci identified through genome-wide association studies on HIV-1 control, as determined by viral-load setpoint, in European and African-American populations. We use genome-wide data from the Human Genome Diversity Project, consisting of 53 world-wide populations, to compare measures of FST and relative extended haplotype homozygosity (REHH) at these candidate loci to the rest of the respective chromosome. Results: We find that the Europe-Middle East and Europe-South Asia pairwise FST in the most strongly associated region are elevated compared to most pairwise comparisons with the sub-Saharan African group, which exhibit very low FST. We also find genetic signatures of recent positive selection (higher REHH) at these associated regions among all groups except for sub-Saharan Africans and Native Americans. This pattern is consistent with one in which genetic differentiation, possibly due to diversifying/positive selection, occurred at these loci among Eurasians. Conclusions: These findings are concordant with those from earlier studies suggesting recent evolutionary change at immunity-related genomic regions among Europeans, and shed light on the potential genetic and evolutionary origin of population differences in HIV-1 control.
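    To illustrate what a pairwise FST measures, here is a minimal sketch using Wright's definition, FST = (HT − HS) / HT, for one biallelic locus in two populations of equal weight. The abstract does not state which estimator the study used, so the function names, the estimator and the example frequencies are assumptions.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def wright_fst(p1, p2):
    """Wright's FST for two equally weighted populations.

    p1, p2: frequency of the same allele in each population.
    """
    # Mean within-population heterozygosity.
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    # Heterozygosity of the pooled population.
    h_t = heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0  # locus is fixed for the same allele in both populations
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, not taken from the study.
fst_divergent = wright_fst(0.1, 0.9)
fst_identical = wright_fst(0.5, 0.5)
print(f"divergent: {fst_divergent:.2f}, identical: {fst_identical:.2f}")
```

    Strongly differing allele frequencies give a high FST, while identical frequencies give zero, which is the sense in which the abstract contrasts "elevated" and "very low" FST.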

  10. Genomic prediction applied to high-biomass sorghum for bioenergy production.

    Science.gov (United States)

    de Oliveira, Amanda Avelar; Pastina, Maria Marta; de Souza, Vander Filipe; da Costa Parrella, Rafael Augusto; Noda, Roberto Willians; Simeone, Maria Lúcia Ferreira; Schaffert, Robert Eugene; de Magalhães, Jurandir Vieira; Damasceno, Cynthia Maria Borges; Margarido, Gabriel Rodrigues Alves

    2018-01-01

    The increasing cost of energy and finite oil and gas reserves have created a need to develop alternative fuels from renewable sources. Due to its abiotic stress tolerance and annual cultivation, high-biomass sorghum (Sorghum bicolor L. Moench) shows potential as a bioenergy crop. Genomic selection is a useful tool for accelerating genetic gains and could restructure plant breeding programs by enabling early selection and reducing breeding cycle duration. This work aimed at predicting breeding values via genomic selection models for 200 sorghum genotypes comprising landrace accessions and breeding lines from biomass and saccharine groups. These genotypes were divided into two sub-panels, according to breeding purpose. We evaluated the following phenotypic biomass traits: days to flowering, plant height, fresh and dry matter yield, and fiber, cellulose, hemicellulose, and lignin proportions. Genotyping by sequencing yielded more than 258,000 single-nucleotide polymorphism markers, which revealed population structure between subpanels. We then fitted and compared genomic selection models BayesA, BayesB, BayesCπ, BayesLasso, Bayes Ridge Regression and random regression best linear unbiased predictor. The resulting predictive abilities varied little between the different models, but substantially between traits. Different scenarios of prediction showed the potential of using genomic selection results between sub-panels and years, although the genotype by environment interaction negatively affected accuracies. Functional enrichment analyses performed with the marker-predicted effects suggested several interesting associations, with potential for revealing biological processes relevant to the studied quantitative traits. This work shows that genomic selection can be successfully applied in biomass sorghum breeding programs.

  11. Landscape genomics: natural selection drives the evolution of mitogenome in penguins.

    Science.gov (United States)

    Ramos, Barbara; González-Acuña, Daniel; Loyola, David E; Johnson, Warren E; Parker, Patricia G; Massaro, Melanie; Dantas, Gisele P M; Miranda, Marcelo D; Vianna, Juliana A

    2018-01-16

    Mitochondria play a key role in the balance of energy and heat production, and therefore the mitochondrial genome is under natural selection by environmental temperature and food availability, since starvation can generate more efficient coupling of energy production. However, selection over mitochondrial DNA (mtDNA) genes has usually been evaluated at the population level. We sequenced by NGS 12 mitogenomes and with four published genomes, assessed genetic variation in ten penguin species distributed from the equator to Antarctica. Signatures of selection of 13 mitochondrial protein-coding genes were evaluated by comparing among species within and among genera (Spheniscus, Pygoscelis, Eudyptula, Eudyptes and Aptenodytes). The genetic data were correlated with environmental data obtained through remote sensing (sea surface temperature [SST], chlorophyll levels [Chl] and a combination of SST and Chl [COM]) through the distribution of these species. We identified the complete mtDNA genomes of several penguin species, including ND6 and 8 tRNAs on the light strand and 12 protein coding genes, 14 tRNAs and two rRNAs positioned on the heavy strand. The highest diversity was found in NADH dehydrogenase genes and the lowest in COX genes. The lowest evolutionary divergence among species was between Humboldt (Spheniscus humboldti) and Galapagos (S. mendiculus) penguins (0.004), while the highest was observed between little penguin (Eudyptula minor) and Adélie penguin (Pygoscelis adeliae) (0.097). We identified a signature of purifying selection (Ka/Ks penguins. In contrast, COX1 had a signature of strong negative selection. ND4 Ka/Ks ratios were highly correlated with SST (Mantel, p-value: 0.0001; GLM, p-value: 0.00001) and thus may be related to climate adaptation throughout penguin speciation. These results identify mtDNA candidate genes under selection which could be involved in broad-scale adaptations of penguins to their environment. Such knowledge may be

  12. A korarchaeal genome reveals insights into the evolution of the Archaea

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain J; Elkins, James G.; Podar, Mircea; Graham, David E.; Makarova, Kira S.; Wolf, Yuri; Randau, Lennart; Hedlund, Brian P.; Brochier-Armanet, Celine; Kunin, Victor; Anderson, Iain; Lapidus, Alla; Goltsman, Eugene; Barry, Kerrie; Koonin, Eugene V.; Hugenholtz, Phil; Kyrpides, Nikos; Wanner, Gerhard; Richardson, Paul; Keller, Martin; Stetter, Karl O.

    2008-06-05

    The candidate division Korarchaeota comprises a group of uncultivated microorganisms that, by their small subunit rRNA phylogeny, may have diverged early from the major archaeal phyla Crenarchaeota and Euryarchaeota. Here, we report the initial characterization of a member of the Korarchaeota with the proposed name, "Candidatus Korarchaeum cryptofilum," which exhibits an ultrathin filamentous morphology. To investigate possible ancestral relationships between deep-branching Korarchaeota and other phyla, we used whole-genome shotgun sequencing to construct a complete composite korarchaeal genome from enriched cells. The genome was assembled into a single contig 1.59 Mb in length with a G + C content of 49%. Of the 1,617 predicted protein-coding genes, 1,382 (85%) could be assigned to a revised set of archaeal Clusters of Orthologous Groups (COGs). The predicted gene functions suggest that the organism relies on a simple mode of peptide fermentation for carbon and energy and lacks the ability to synthesize de novo purines, CoA, and several other cofactors. Phylogenetic analyses based on conserved single genes and concatenated protein sequences positioned the korarchaeote as a deep archaeal lineage with an apparent affinity to the Crenarchaeota. However, the predicted gene content revealed that several conserved cellular systems, such as cell division, DNA replication, and tRNA maturation, resemble the counterparts in the Euryarchaeota. In light of the known composition of archaeal genomes, the Korarchaeota might have retained a set of cellular features that represents the ancestral archaeal form.

  13. A Korarchael Genome Reveals Insights into the Evolution of the Archaea

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla; Elkins, James G.; Podar, Mircea; Graham, David E.; Makarova, Kira S.; Wolf, Yuri; Randau, Lennart; Hedlund, Brian P.; Brochier-Armanet, Celine; Kunin, Victor; Anderson, Iain; Lapidus, Alla; Goltsman, Eugene; Barry, Kerrie; Koonin, Eugene V.; Hugenholtz, Phil; Kyrpides, Nikos; Wanner, Gerhard; Richardson, Paul; Keller, Martin; Stetter, Karl O.

    2008-01-07

    The candidate division Korarchaeota comprises a group of uncultivated microorganisms that, by their small subunit rRNA phylogeny, may have diverged early from the major archaeal phyla Crenarchaeota and Euryarchaeota. Here, we report the initial characterization of a member of the Korarchaeota with the proposed name, "Candidatus Korarchaeum cryptofilum," which exhibits an ultrathin filamentous morphology. To investigate possible ancestral relationships between deep-branching Korarchaeota and other phyla, we used whole-genome shotgun sequencing to construct a complete composite korarchaeal genome from enriched cells. The genome was assembled into a single contig 1.59 Mb in length with a G + C content of 49%. Of the 1,617 predicted protein-coding genes, 1,382 (85%) could be assigned to a revised set of archaeal Clusters of Orthologous Groups (COGs). The predicted gene functions suggest that the organism relies on a simple mode of peptide fermentation for carbon and energy and lacks the ability to synthesize de novo purines, CoA, and several other cofactors. Phylogenetic analyses based on conserved single genes and concatenated protein sequences positioned the korarchaeote as a deep archaeal lineage with an apparent affinity to the Crenarchaeota. However, the predicted gene content revealed that several conserved cellular systems, such as cell division, DNA replication, and tRNA maturation, resemble the counterparts in the Euryarchaeota. In light of the known composition of archaeal genomes, the Korarchaeota might have retained a set of cellular features that represents the ancestral archaeal form.

  14. Genome-Wide Association Study of Major Agronomic Traits Related to Domestication in Peanut

    Directory of Open Access Journals (Sweden)

    Xingguo Zhang

    2017-09-01

    Full Text Available Peanut (Arachis hypogaea) consists of two subspecies, hypogaea and fastigiata, and has been cultivated worldwide for hundreds of years. Here, 158 peanut accessions were selected to dissect the molecular footprint of agronomic traits related to domestication using the specific-locus amplified fragment sequencing (SLAF-seq) method. Then, a total of 17,338 high-quality single nucleotide polymorphisms (SNPs) in the whole peanut genome were revealed. Eleven agronomic traits in 158 peanut accessions were subsequently analyzed using genome-wide association studies (GWAS). Candidate genes responsible for corresponding traits were then analyzed in genomic regions surrounding the peak SNPs, and 1,429 genes were found within 200 kb windows centered on GWAS-identified peak SNPs related to domestication. Highly differentiated genomic regions were observed between hypogaea and fastigiata accessions using FST values and sequence diversity (π) ratios. Among the 1,429 genes, 662 were located on chromosome A3, suggesting the presence of major selective sweeps caused by artificial selection during long domestication. These findings provide a promising insight into the complicated genetic architecture of domestication-related traits in peanut, and reveal whole-genome SNP markers of beneficial candidate genes for marker-assisted selection (MAS) in future breeding programs.

  15. Whole genome resequencing reveals natural target site preferences of transposable elements in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Raquel S Linheiro

    Full Text Available Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence that we used to decode the natural target preferences of 22 families of transposable element in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes.

  16. The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

    Energy Technology Data Exchange (ETDEWEB)

    Merchant, Sabeeha S

    2007-04-09

    Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the 120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

  17. Selective Sweep Analysis in the Genomes of the 91-R and 91-C Drosophila melanogaster Strains Reveals Few of the ‘Usual Suspects’ in Dichlorodiphenyltrichloroethane (DDT) Resistance

    Science.gov (United States)

    Steele, Laura D.; Coates, Brad; Valero, M. Carmen; Sun, Weilin; Seong, Keon Mook; Muir, William M.; Clark, John M.; Pittendrigh, Barry R.

    2015-01-01

    Adaptation of insect phenotypes for survival after exposure to xenobiotics can result from selection at multiple loci with additive genetic effects. To the authors’ knowledge, no selective sweep analysis has been performed to identify such loci in highly dichlorodiphenyltrichloroethane (DDT) resistant insects. Here we compared a highly DDT resistant phenotype in the Drosophila melanogaster (Drosophila) 91-R strain to the DDT susceptible 91-C strain, both of common origin. Whole genome re-sequencing data from pools of individuals were generated separately for 91-R and 91-C, and mapped to the reference Drosophila genome assembly (v. 5.72). Thirteen major and three minor effect chromosome intervals with reduced nucleotide diversity (π) were identified only in the 91-R population. Estimates of Tajima's D (D) showed corresponding evidence of directional selection in these same genome regions of 91-R; however, no similar reductions in π or D estimates were detected in 91-C. An overabundance of non-synonymous relative to synonymous protein-coding changes was identified in putative open reading frames associated with 91-R. Except for NinaC and Cyp4g1, none of the identified genes were the ‘usual suspects’ previously observed to be associated with DDT resistance. Additionally, up-regulated ATP-binding cassette transporters have been previously associated with DDT resistance; however, here we identified a structurally altered MDR49 candidate resistance gene. The remaining fourteen genes have not previously been shown to be associated with DDT resistance. These results suggest hitherto unknown mechanisms of DDT resistance, most of which have been overlooked in previous transcriptional studies, with some genes having orthologs in mammals. PMID:25826265
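    The two statistics named in this abstract, nucleotide diversity (π) and Tajima's D, can be sketched with the textbook formulas for a set of aligned individual sequences. The study worked from pooled re-sequencing data, which calls for pool-aware estimators, so the code below is only an illustration under simpler assumptions; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences (pi, summed over sites)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def tajimas_d(seqs):
    """Tajima's D (Tajima 1989) for equal-length aligned sequences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between pi and Watterson's estimator, scaled by its SD.
    return (nucleotide_diversity(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment in which every variant is a singleton (hypothetical data).
toy_alignment = ["AAAA", "TAAA", "ATAA", "AATA"]
pi = nucleotide_diversity(toy_alignment)
d = tajimas_d(toy_alignment)
```

    An excess of rare variants, as in this toy alignment, pulls π below Watterson's estimate and makes D negative, which is the kind of pattern expected after a selective sweep.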

  18. Genome characterization of the selected long- and short-sleep mouse lines.

    Science.gov (United States)

    Dowell, Robin; Odell, Aaron; Richmond, Phillip; Malmer, Daniel; Halper-Stromberg, Eitan; Bennett, Beth; Larson, Colin; Leach, Sonia; Radcliffe, Richard A

    2016-12-01

    The Inbred Long- and Short-Sleep (ILS, ISS) mouse lines were selected for differences in acute ethanol sensitivity using the loss of righting response (LORR) as the selection trait. The lines show an over tenfold difference in LORR and, along with a recombinant inbred panel derived from them (the LXS), have been widely used to dissect the genetic underpinnings of acute ethanol sensitivity. Here we have sequenced the genomes of the ILS and ISS to investigate the DNA variants that contribute to their sensitivity difference. We identified ~2.7 million high-confidence SNPs and small indels and ~7000 structural variants between the lines; variants were found to occur in 6382 annotated genes. Using a hidden Markov model, we were able to reconstruct the genome-wide ancestry patterns of the eight inbred progenitor strains from which the ILS and ISS were derived, and found that quantitative trait loci that have been mapped for LORR were slightly enriched for DNA variants. Finally, by mapping and quantifying RNA-seq reads from the ILS and ISS to their strain-specific genomes rather than to the reference genome, we found a substantial improvement in a differential expression analysis between the lines. This work will help in identifying and characterizing the DNA sequence variants that contribute to the difference in ethanol sensitivity between the ILS and ISS and will also aid in accurate quantification of RNA-seq data generated from the LXS RIs.

  19. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we identified the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  20. Genome-wide prediction of traits with different genetic architecture through efficient variable selection.

    Science.gov (United States)

    Wimmer, Valentin; Lehermeier, Christina; Albrecht, Theresa; Auinger, Hans-Jürgen; Wang, Yu; Schön, Chris-Carolin

    2013-10-01

    In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.
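    The contrast drawn here between variable selection (LASSO) and methods that spread effects across the genome (RR-BLUP) can be seen in a deliberately simplified setting. Assuming an orthonormal marker design and the penalized objectives ½‖y − Xb‖² + λ‖b‖₁ (LASSO) and ½‖y − Xb‖² + (λ/2)‖b‖² (ridge), each least-squares marker effect is shrunk independently. The function names, effect values and λ below are hypothetical, and real marker data are far from orthonormal.

```python
def ridge_shrink(effect, lam):
    """Ridge estimate under an orthonormal design: proportional shrinkage."""
    return effect / (1.0 + lam)


def lasso_shrink(effect, lam):
    """LASSO estimate under an orthonormal design: soft thresholding."""
    magnitude = abs(effect) - lam
    if magnitude <= 0.0:
        return 0.0  # small effects are removed from the model entirely
    return magnitude if effect > 0 else -magnitude


# Hypothetical least-squares marker effects and penalty.
ols_effects = [2.0, 0.3, -0.1, -1.5]
lam = 0.5

ridge_effects = [ridge_shrink(b, lam) for b in ols_effects]
lasso_effects = [lasso_shrink(b, lam) for b in ols_effects]

print("ridge:", [round(b, 3) for b in ridge_effects])
print("lasso:", lasso_effects)
```

    Ridge keeps every marker with a reduced effect, whereas LASSO sets the two small effects exactly to zero. That difference is why LASSO-type methods are expected to help mainly when few loci matter and the sample is large enough to tell them apart.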

  1. Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors.

    Science.gov (United States)

    Kiu, Raymond; Caim, Shabhonam; Alexander, Sarah; Pachori, Purnima; Hall, Lindsay J

    2017-01-01

    Clostridium perfringens is an important cause of animal and human infections; however, information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains, which included 51 public and 5 newly sequenced and annotated genomes, using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an "open" pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed presence of well-characterised C. perfringens-associated exotoxin genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation between encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by presence of prophage genomes, and notably absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensin genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge on this medically and veterinary important pathogen.

  2. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-09-01

    Full Text Available Abstract Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements

  3. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
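
    The figures reported in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and the "implied" region lengths and genome size are back-calculations from rounded published values, not quantities reported by the authors.

```python
# Consistency check of the polymorphism counts and densities quoted in the abstract.

contig_length_bp = 495_833_423      # total physical length of the BAC contigs
total_polymorphisms = 171_792       # SNPs + indels mapped on 12 chromosomes
euchromatin_polymorphisms = 30_930
heterochromatin_polymorphisms = 140_862

# The two regional counts should add up to the genome-wide total.
counts_add_up = (
    euchromatin_polymorphisms + heterochromatin_polymorphisms == total_polymorphisms
)

# Average spacing between polymorphisms across the contigs.
bp_per_polymorphism = contig_length_bp / total_polymorphisms

# Region lengths implied by the per-region densities (1 per 3,565 bp and 1 per 2,737 bp).
implied_euchromatin_bp = euchromatin_polymorphisms * 3_565
implied_heterochromatin_bp = heterochromatin_polymorphisms * 2_737
implied_total_bp = implied_euchromatin_bp + implied_heterochromatin_bp

# Genome size implied by "covering 65.3% of the entire genome" (approximate).
implied_genome_bp = contig_length_bp / 0.653

print("counts add up:", counts_add_up)
print("bp per polymorphism:", round(bp_per_polymorphism))
print("implied contig length from regional densities:", implied_total_bp)
print("implied genome size (Mb):", round(implied_genome_bp / 1e6, 1))
```

    The genome-wide density works out to about one polymorphism per 2,886 bp, matching the abstract, and the contig length implied by the regional densities agrees with the reported total to within rounding.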

  4. Genetic improvement of Pacific white shrimp (Penaeus (Litopenaeus) vannamei): perspectives for genomic selection

    Directory of Open Access Journals (Sweden)

    Héctor Castillo-Juárez

    2015-03-01

    Full Text Available The use of breeding programs for the Pacific white shrimp (Penaeus (Litopenaeus) vannamei) based on mixed linear models with pedigreed data is described. The application of these classic breeding methods has yielded continuous progress of great value in increasing the profitability of the shrimp industry in several countries. Recent advances in areas such as shrimp genomics will allow for the development of new breeding programs in the near future that will increase genetic progress. In particular, these novel techniques may help increase resistance to specific emerging diseases, which is today a very important component of shrimp breeding programs. Thanks to increased selection accuracy, simulated genetic advance using genomic selection for survival to a disease challenge was up to 2.6 times that of phenotypic sib selection.

  5. The role of parasite-driven selection in shaping landscape genomic structure in red grouse (Lagopus lagopus scotica).

    Science.gov (United States)

    Wenzel, Marius A; Douglas, Alex; James, Marianne C; Redpath, Steve M; Piertney, Stuart B

    2016-01-01

    Landscape genomics promises to provide novel insights into how neutral and adaptive processes shape genome-wide variation within and among populations. However, there has been little emphasis on examining whether individual-based phenotype-genotype relationships derived from approaches such as genome-wide association (GWAS) manifest themselves as a population-level signature of selection in a landscape context. The two may prove irreconcilable as individual-level patterns become diluted by high levels of gene flow and complex phenotypic or environmental heterogeneity. We illustrate this issue with a case study that examines the role of the highly prevalent gastrointestinal nematode Trichostrongylus tenuis in shaping genomic signatures of selection in red grouse (Lagopus lagopus scotica). Individual-level GWAS involving 384 SNPs has previously identified five SNPs that explain variation in T. tenuis burden. Here, we examine whether these same SNPs display population-level relationships between T. tenuis burden and genetic structure across a small-scale landscape of 21 sites with heterogeneous parasite pressure. Moreover, we identify adaptive SNPs showing signatures of directional selection using F(ST) outlier analysis and relate population- and individual-level patterns of multilocus neutral and adaptive genetic structure to T. tenuis burden. The five candidate SNPs for parasite-driven selection were neither associated with T. tenuis burden on a population level, nor under directional selection. Similarly, there was no evidence of parasite-driven selection in SNPs identified as candidates for directional selection. We discuss these results in the context of red grouse ecology and highlight the broader consequences for the utility of landscape genomics approaches for identifying signatures of selection. © 2015 John Wiley & Sons Ltd.

  6. Whole Genome Analyses of a Well-Differentiated Liposarcoma Reveals Novel SYT1 and DDR2 Rearrangements

    Science.gov (United States)

    Egan, Jan B.; Barrett, Michael T.; Champion, Mia D.; Middha, Sumit; Lenkiewicz, Elizabeth; Evers, Lisa; Francis, Princy; Schmidt, Jessica; Shi, Chang-Xin; Van Wier, Scott; Badar, Sandra; Ahmann, Gregory; Kortuem, K. Martin; Boczek, Nicole J.; Fonseca, Rafael; Craig, David W.; Carpten, John D.; Borad, Mitesh J.; Stewart, A. Keith

    2014-01-01

    Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498, which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dasatinib that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2. PMID:24505276

  7. Whole genome analyses of a well-differentiated liposarcoma reveals novel SYT1 and DDR2 rearrangements.

    Directory of Open Access Journals (Sweden)

    Jan B Egan

    Full Text Available Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498, which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dasatinib that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2.

  8. Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853

    KAUST Repository

    Cao, Huiluo

    2017-06-12

    Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the

  9. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  10. Genomic Characterisation of the Indigenous Irish Kerry Cattle Breed

    Science.gov (United States)

    Browett, Sam; McHugo, Gillian; Richardson, Ian W.; Magee, David A.; Park, Stephen D. E.; Fahey, Alan G.; Kearney, John F.; Correia, Carolina N.; Randhawa, Imtiaz A. S.; MacHugh, David E.

    2018-01-01

    Kerry cattle are an endangered landrace heritage breed of cultural importance to Ireland. In the present study we have used genome-wide SNP array data to evaluate genomic diversity within the Kerry population and between Kerry cattle and other European breeds. Patterns of genetic differentiation and gene flow among breeds using phylogenetic trees with ancestry graphs highlighted historical gene flow from the British Shorthorn breed into the ancestral population of modern Kerry cattle. Principal component analysis (PCA) and genetic clustering emphasised the genetic distinctiveness of Kerry cattle relative to comparator British and European cattle breeds. Modelling of genetic effective population size (Ne) revealed a demographic trend of diminishing Ne over time and that recent estimated Ne values for the Kerry breed may be less than the threshold for sustainable genetic conservation. In addition, analysis of genome-wide autozygosity (FROH) showed that genomic inbreeding has increased significantly during the 20 years between 1992 and 2012. Finally, signatures of selection revealed genomic regions subject to natural and artificial selection as Kerry cattle adapted to the climate, physical geography and agro-ecology of southwest Ireland. PMID:29520297
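
    The genomic inbreeding measure mentioned in this abstract, FROH, is commonly defined as the fraction of the autosomal genome that lies in runs of homozygosity (ROH). The sketch below illustrates that definition only. The ROH segments, the autosome length, and the minimum segment length are hypothetical values chosen for illustration, and the function name is not from the study.

```python
# Sketch of the usual F_ROH definition: total ROH length / autosomal genome length.
# All input values below are hypothetical.

def froh(roh_segments, autosome_length_bp, min_length_bp=1_000_000):
    """Return the fraction of the autosome covered by ROH of at least min_length_bp.

    roh_segments: list of (start_bp, end_bp) tuples for one individual.
    """
    total_roh_bp = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return total_roh_bp / autosome_length_bp


# Hypothetical individual: three homozygous runs, one of them too short to count.
segments = [
    (0, 5_000_000),             # 5.0 Mb, counted
    (10_000_000, 10_500_000),   # 0.5 Mb, below the assumed 1 Mb threshold
    (20_000_000, 32_500_000),   # 12.5 Mb, counted
]
autosome_length_bp = 2_500_000_000  # hypothetical autosomal length

value = froh(segments, autosome_length_bp)
print(f"F_ROH = {value:.4f}")
```

    The minimum-length threshold matters in practice because short homozygous runs can arise by chance; the cut-off used here is an assumption, not the one applied to the Kerry cattle data.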

  11. Genomic Characterisation of the Indigenous Irish Kerry Cattle Breed

    Directory of Open Access Journals (Sweden)

    Sam Browett

    2018-02-01

    Full Text Available Kerry cattle are an endangered landrace heritage breed of cultural importance to Ireland. In the present study we have used genome-wide SNP array data to evaluate genomic diversity within the Kerry population and between Kerry cattle and other European breeds. Patterns of genetic differentiation and gene flow among breeds using phylogenetic trees with ancestry graphs highlighted historical gene flow from the British Shorthorn breed into the ancestral population of modern Kerry cattle. Principal component analysis (PCA) and genetic clustering emphasised the genetic distinctiveness of Kerry cattle relative to comparator British and European cattle breeds. Modelling of genetic effective population size (Ne) revealed a demographic trend of diminishing Ne over time and that recent estimated Ne values for the Kerry breed may be less than the threshold for sustainable genetic conservation. In addition, analysis of genome-wide autozygosity (FROH) showed that genomic inbreeding has increased significantly during the 20 years between 1992 and 2012. Finally, signatures of selection revealed genomic regions subject to natural and artificial selection as Kerry cattle adapted to the climate, physical geography and agro-ecology of southwest Ireland.

  12. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was

  13. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts

    KAUST Repository

    Otto, Thomas D.

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host–parasite interface may have mediated host switching.

  14. Genomewide variation in an introgression line of rice-Zizania revealed by whole-genome re-sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen-Hui Wang

    Full Text Available BACKGROUND: Hybridization between genetically diverged organisms is known as an important avenue that drives plant genome evolution. The possible outcomes of hybridization would be the occurrences of genetic instabilities in the resultant hybrids. It remained under-investigated however whether pollination by alien pollens of a closely related but sexually "incompatible" species could evoke genomic changes and to what extent it may result in phenotypic novelties in the derived progenies. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we have re-sequenced the genomes of Oryza sativa ssp. japonica cv. Matsumae and one of its derived introgressants, RZ35, which was obtained from an introgressive hybridization between Matsumae and Zizania latifolia Griseb. In general, 131 million 90 base pair (bp) paired-end reads were generated which covered 13.2 and 21.9 folds of the Matsumae and RZ35 genomes, respectively. Relative to Matsumae, a total of 41,724 homozygous single nucleotide polymorphisms (SNPs) and 17,839 homozygous insertions/deletions (indels) were identified in RZ35, of which 3,797 SNPs were nonsynonymous mutations. Furthermore, rampant mobilization of transposable elements (TEs) was found in the RZ35 genome. The results of pathogen inoculation revealed that RZ35 exhibited enhanced resistance to blast relative to Matsumae. Notably, one nonsynonymous mutation was found in the known blast resistance gene Pid3/Pi25 and real-time quantitative (q)RT-PCR analysis revealed constitutive up-regulation of its expression, suggesting both altered function and expression of Pid3/Pi25 may be responsible for the enhanced resistance to rice blast by RZ35. CONCLUSIONS/SIGNIFICANCE: Our results demonstrate that introgressive hybridization by Zizania has provoked genomewide, extensive genomic changes in the rice genome, some of which have resulted in important phenotypic novelties. These findings suggest that introgressive hybridization by alien pollens of even a

  15. Genome Sequencing Reveals the Potential of Achromobacter sp. HZ01 for Bioremediation

    Directory of Open Access Journals (Sweden)

    Yue-Hui Hong

    2017-08-01

    Full Text Available Petroleum pollution is a severe environmental issue. Comprehensively revealing the genetic backgrounds of hydrocarbon-degrading microorganisms contributes to developing effective methods for bioremediation of crude oil-polluted environments. Marine bacterium Achromobacter sp. HZ01 is capable of degrading hydrocarbons and producing biosurfactants. In this study, the draft genome (5.5 Mbp) of strain HZ01 has been obtained by Illumina sequencing, containing 5,162 predicted genes. Genome annotation shows that “amino acid metabolism” is the most abundant metabolic pathway. Strain HZ01 is not capable of using some common carbohydrates as the sole carbon sources, because it contains few genes associated with carbohydrate transport and lacks some important enzymes related to glycometabolism. It contains abundant proteins directly related to petroleum hydrocarbon degradation. AlkB hydroxylase and its homologs were not identified. It harbors a complete enzyme system of terminal oxidation pathway for n-alkane degradation, which may be initiated by cytochrome P450. The enzymes involved in the catechol pathway are relatively complete for the degradation of aromatic compounds. This bacterium lacks several essential enzymes for methane oxidation, and Baeyer-Villiger monooxygenase involved in the subterminal oxidation pathway and cycloalkane degradation was not identified. These results suggest that strain HZ01 degrades n-alkanes via the terminal oxidation pathway, degrades aromatic compounds primarily via the catechol pathway and cannot perform methane oxidation or cycloalkane degradation. Additionally, strain HZ01 possesses abundant genes related to the metabolism of secondary metabolites, including some genes involved in biosurfactant (such as glycolipids and lipopeptides) synthesis. The genome analysis also reveals its genetic basis for nitrogen metabolism, antibiotic resistance, regulatory responses to environmental changes, cell motility

  16. Both selective and neutral processes drive GC content evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Cagliani Rachele

    2008-03-01

    Full Text Available Abstract Background: Mammalian genomes consist of regions differing in GC content, referred to as isochores or GC-content domains. The scientific debate is still open as to whether such compositional heterogeneity is a selected or neutral trait. Results: Here we analyze SNP allele frequencies, retrotransposon insertion polymorphisms (RIPs), as well as fixed substitutions accumulated in the human lineage since its divergence from chimpanzee to indicate that biased gene conversion (BGC) has been playing a role in within-genome GC content variation. Yet, a distinct contribution to GC content evolution is accounted for by a selective process. Accordingly, we searched for independent evidences that GC content distribution does not conform to neutral expectations. Indeed, after correcting for possible biases, we show that intron GC content and size display isochore-specific correlations. Conclusion: We consider that the more parsimonious explanation for our results is that GC content is subjected to the action of both weak selection and BGC in the human genome with features such as nucleosome positioning or chromatin conformation possibly representing the final target of selective processes. This view might reconcile previous contrasting findings and add some theoretical background to recent evidences suggesting that GC content domains display different behaviors with respect to highly regulated biological processes such as developmentally-stage related gene expression and programmed replication timing during neural stem cell differentiation.

  17. Genome and metagenome enabled analyses reveal new insight into the global biogeography and potential urea utilization in marine Thaumarchaeota.

    Science.gov (United States)

    Ahlgren, N.; Parada, A. E.; Fuhrman, J. A.

    2016-02-01

    Marine Thaumarchaea are an abundant, important group of marine microbial communities as they fix carbon, oxidize ammonium, and thus contribute to key N and C cycles in the oceans. From an enrichment culture, we have sequenced the complete genome of a new Thaumarchaeota strain, SPOT01. Analysis of this genome and other Thaumarchaeal genomes contributes new insight into its role in N cycling and clarifies the broader biogeography of marine Thaumarchaeal genera. Phylogenomics of Thaumarchaeota genomes reveals coherent separation into clusters roughly equivalent to the genus level, and SPOT01 represents a new genus of marine Thaumarchaea. Competitive fragment recruitment of globally distributed metagenomes from TARA, Ocean Sampling Day, and those generated from a station off California shows that the SPOT01 genus is often the most abundant genus, especially where total Thaumarchaea are most abundant in the overall community. The SPOT01 genome contains urease genes allowing it to use an alternative form of N. Genomic and metagenomic analyses also reveal that among planktonic genomes and populations, the urease genes in general are more frequently found in members of the SPOT01 genus and another genus dominant in deep waters; thus we predict these two genera contribute most significantly to urea utilization among marine Thaumarchaea. Recruitment also revealed broader biogeographic and ecological patterns of the putative genera. The SPOT01 genus was most abundant at colder temperatures (45 degrees). The genus containing Nitrosopumilus maritimus had the highest temperature range, and the genus containing Candidatus Nitrosopelagicus brevis was typically most abundant at intermediate temperatures and intermediate latitudes (35-45 degrees). Together these genome and metagenome enabled analyses provide significant new insight into the ecology and biogeochemical contributions of marine archaea.

  18. Genomic regions under selection in crop-wild hybrids of lettuce: implications for crop breeding and environmental risk assessment

    NARCIS (Netherlands)

    Hartman, Y.

    2012-01-01

    The results of this thesis show that the probability of introgression of a putative transgene to wild relatives indeed depends strongly on the insertion location of the transgene. The study of genomic selection patterns can identify crop genomic regions under negative selection in multiple

  19. Insights into recent and ancient trends in the co-evolution of Earth and life as revealed by microbial genomics

    Science.gov (United States)

    Anderson, R. E.; Huber, J. A.; Parsons, C.; Stüeken, E.

    2017-12-01

    Since the origin of life over 4 billion years ago, life has fundamentally altered the habitability of Earth. Similarly, the environment molds the evolutionary trajectory of life itself through natural selection. Microbial genomes retain a "memory" of the co-evolution of life and Earth and can be analyzed to better understand trends and events in both the recent and distant past. To examine evolutionary trends in the more recent past, we have used metagenomic analyses to investigate which environmental factors play the strongest role in driving the evolution of microbes in deep-sea hydrothermal vents, which are thought to have been important habitats in the earliest stages of life's evolution. We have shown that microbial populations in a deep, basalt-hosted system appear to be under stronger purifying selection than populations inhabiting a cooler serpentinizing system less than 20 km away, suggesting that environmental context and geochemistry have an important impact on evolutionary rates and trends. We also found evidence that viruses play an important role in driving evolution in these habitats. Changing environmental conditions may also affect long-term evolutionary trends in Earth's distant past, as revealed by comparative genomics. By reconciling phylogenetic trees for microbial species with trees of metabolic genes, we can determine approximately when crucial metabolic genes began to spread across the tree of life through horizontal gene transfer. Using these methods, we conducted an analysis of the relative timing of the spread of genes related to the nitrogen cycle. 
Our results indicate that the rate of horizontal gene transfer for important genes related to denitrification increased after the Great Oxidation Event, concurrent with geochemical evidence for increasing availability of nitrate, suggesting that the oxygenation of the atmosphere and surface ocean may have been an important determining factor for the spread of denitrification genes across the

  20. Selection for silage yield and composition did not affect genomic diversity within the Wisconsin Quality Synthetic maize population.

    Science.gov (United States)

    Lorenz, Aaron J; Beissinger, Timothy M; Silva, Renato Rodrigues; de Leon, Natalia

    2015-02-02

    Maize silage is forage of high quality and yield, and represents the second most important use of maize in the United States. The Wisconsin Quality Synthetic (WQS) maize population has undergone five cycles of recurrent selection for silage yield and composition, resulting in a genetically improved population. The application of high-density molecular markers allows breeders and geneticists to identify important loci through association analysis and selection mapping, as well as to monitor changes in the distribution of genetic diversity across the genome. The objectives of this study were to identify loci controlling variation for maize silage traits through association analysis and the assessment of selection signatures and to describe changes in the genomic distribution of gene diversity through selection and genetic drift in the WQS recurrent selection program. We failed to find any significant marker-trait associations using the historical phenotypic data from WQS breeding trials combined with 17,719 high-quality, informative single nucleotide polymorphisms. Likewise, no strong genomic signatures were left by selection on silage yield and quality in the WQS despite genetic gain for these traits. These results could be due to the genetic complexity underlying these traits, or the role of selection on standing genetic variation. Variation in loss of diversity through drift was observed across the genome. Some large regions experienced much greater loss in diversity than what is expected, suggesting limited recombination combined with small populations in recurrent selection programs could easily lead to fixation of large swaths of the genome. Copyright © 2015 Lorenz et al.

  1. Nuclease Target Site Selection for Maximizing On-target Activity and Minimizing Off-target Effects in Genome Editing

    Science.gov (United States)

    Lee, Ciaran M; Cradick, Thomas J; Fine, Eli J; Bao, Gang

    2016-01-01

    The rapid advancement in targeted genome editing using engineered nucleases such as ZFNs, TALENs, and CRISPR/Cas9 systems has resulted in a suite of powerful methods that allows researchers to target any genomic locus of interest. A complementary set of design tools has been developed to aid researchers with nuclease design, target site selection, and experimental validation. Here, we review the various tools available for target selection in designing engineered nucleases, and for quantifying nuclease activity and specificity, including web-based search tools and experimental methods. We also elucidate challenges in target selection, especially in predicting off-target effects, and discuss future directions in precision genome editing and its applications. PMID:26750397

  2. Comparative Genomics of the Herbivore Gut Symbiont Lactobacillus reuteri Reveals Genetic Diversity and Lifestyle Adaptation

    Directory of Open Access Journals (Sweden)

    Jie Yu

    2018-06-01

    Lactobacillus reuteri is a catalase-negative, Gram-positive, non-motile, obligately heterofermentative bacterial species that has been used as a model to describe the ecology and evolution of vertebrate gut symbionts. However, the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of herbivores remain unknown. Therefore, 16 L. reuteri strains isolated from goat, sheep, cow, and horse in Inner Mongolia, China were sequenced in this study. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to the different hosts based on 21 published genomic sequences. Genome size, G + C content, and average nucleotide identity values of the L. reuteri strains from different hosts indicated that the strains have broad genetic diversity. The pan-genome of 37 L. reuteri strains contained 8,680 gene families, and the core genome contained 726 gene families. A total of 92,270 nucleotide mutation sites were discovered among 37 L. reuteri strains, and all core genes displayed a Ka/Ks ratio much lower than 1, suggesting strong purifying selective pressure (negative selection). A highly robust maximum likelihood tree based on the core genes showed that the herbivore isolates were divided into three clades; clades A and B contained most of the herbivore isolates and were more closely related to human isolates and vastly distinct from clade C. Some functional genes may be attributable to the host specificity of the herbivore, omnivore, and sourdough groups. Moreover, the numbers of genes encoding cell surface proteins and active carbohydrate enzymes were host-specific. This study provides new insight into the adaptation of L. reuteri to the intestinal habitat of herbivores, suggesting that the genomic diversity of L. reuteri from different ecological origins is closely associated with their living environment.

  3. Genomic selection to improve livestock production in developing countries with a focus on India

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc

    2015-01-01

    Global livestock production has increased substantially during the last decades, in both number of animals and productivity. Meanwhile, the human population is projected to reach 9.6 billion by 2050, and most of the projected increase takes place in developing countries. Rapid population growth will increase the demand for food as well as animal products, particularly in emerging economic giants like India. Moreover, urbanization has a considerable impact on patterns of food consumption in general and on demand for livestock products in particular, and the increased income growth led... production (OPU-IVP) of embryos will have a considerable impact in the future. This paper attempts to provide basic concepts of using genomic tools for livestock production, with a focus on genomic prediction and selection methods, and discusses the potential application of genomic selection to increase...

  4. Transcriptional profiling in response to terminal drought stress reveals differential responses along the wheat genome

    Directory of Open Access Journals (Sweden)

    Ferrari Francesco

    2009-06-01

    Background: Water stress during grain filling has a marked effect on grain yield, leading to a reduced endosperm cell number and thus sink capacity to accumulate dry matter. The bread wheat cultivar Chinese Spring (CS), a Chinese Spring terminal deletion line (CS_5AL-10) and the durum wheat cultivar Creso were subjected to transcriptional profiling after exposure to mild and severe drought stress at the grain filling stage to find evidence of differential stress responses associated with different wheat genome regions. Results: The transcriptome analysis of Creso, CS and its deletion line revealed 8,552 non-redundant probe sets with different expression levels, mainly due to the comparisons between the two species. The drought treatments modified the expression of 3,056 probe sets. Besides a set of genes showing a similar drought response in Creso and CS, cluster analysis revealed several drought response features that can be associated with the different genomic structure of Creso, CS and CS_5AL-10. Some drought-related genes were expressed at a lower level (or not expressed) in Creso (which lacks the D genome) or in the CS_5AL-10 deletion line compared to CS. The chromosome location of a set of these genes was confirmed by PCR-based mapping on the D genome (or the 5AL-10 region). Many clusters were characterized by different levels of expression in Creso, CS and CS_5AL-10, suggesting that the different genome organization of the three genotypes may affect plant adaptation to stress. Clusters with similar expression trends were grouped and functionally classified to mine the biological meaning of their activation or repression. Genes involved in ABA, proline, glycine-betaine and sorbitol pathways were found to be up-regulated by drought stress. Furthermore, the enhanced expression of a set of transposons and retrotransposons was detected in CS_5AL-10. Conclusion: Bread and durum wheat genotypes were characterized by a different physiological reaction to water

  5. Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome.

    Directory of Open Access Journals (Sweden)

    Keyan Zhao

    2010-05-01

    The domestication of Asian rice (Oryza sativa) was a complex process punctuated by episodes of introgressive hybridization among and between subpopulations. Deep genetic divergence between the two main varietal groups (Indica and Japonica) suggests domestication from at least two distinct wild populations. However, genetic uniformity surrounding key domestication genes across divergent subpopulations suggests cultural exchange of genetic material among ancient farmers. In this study, we utilize a novel 1,536 SNP panel genotyped across 395 diverse accessions of O. sativa to study genome-wide patterns of polymorphism, to characterize population structure, and to infer the introgression history of domesticated Asian rice. Our population structure analyses support the existence of five major subpopulations (indica, aus, tropical japonica, temperate japonica and GroupV) consistent with previous analyses. Our introgression analysis shows that most accessions exhibit some degree of admixture, with many individuals within a population sharing the same introgressed segment due to artificial selection. Admixture mapping and association analysis of amylose content and grain length illustrate the potential for dissecting the genetic basis of complex traits in domesticated plant populations. Genes in these regions control a myriad of traits including plant stature, blast resistance, and amylose content. These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.

  6. Comparative Genomics of a Plant-Pathogenic Fungus, Pyrenophora tritici-repentis, Reveals Transduplication and the Impact of Repeat Elements on Pathogenicity and Population Divergence

    Energy Technology Data Exchange (ETDEWEB)

    Manning, Viola A.; Pandelova, Iovanna; Dhillon, Braham; Wilhelm, Larry J.; Goodwin, Stephen B.; Berlin, Aaron M.; Figueroa, Melania; Freitag, Michael; Hane, James K.; Henrissat, Bernard; Holman, Wade H.; Kodira, Chinnappa D.; Martin, Joel; Oliver, Richard P.; Robbertse, Barbara; Schackwitz, Wendy; Schwartz, David C.; Spatafora, Joseph W.; Turgeon, B. Gillian; Yandava, Chandri; Young, Sarah; Zhou, Shiguo; Zeng, Qiandong; Grigoriev, Igor V.; Ma, Li-Jun; Ciuffetti, Lynda M.

    2012-08-16

    Pyrenophora tritici-repentis is a necrotrophic fungus causal to the disease tan spot of wheat, whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.

  7. Genomic profiling of plasmablastic lymphoma using array comparative genomic hybridization (aCGH): revealing significant overlapping genomic lesions with diffuse large B-cell lymphoma

    Directory of Open Access Journals (Sweden)

    Lu Xin-Yan

    2009-11-01

    Background: Plasmablastic lymphoma (PL) is a subtype of diffuse large B-cell lymphoma (DLBCL). Studies have suggested that tumors with PL morphology represent a group of neoplasms with clinicopathologic characteristics corresponding to different entities, including extramedullary plasmablastic tumors associated with plasma cell myeloma (PCM). The goal of the current study was to evaluate the genetic similarities and differences among PL, DLBCL (AIDS-related and non AIDS-related) and PCM using array-based comparative genomic hybridization. Results: Examination of genomic data in PL revealed that the most frequent segmental gains (> 40%) include: 1p36.11-1p36.33, 1p34.1-1p36.13, 1q21.1-1q23.1, 7q11.2-7q11.23, 11q12-11q13.2 and 22q12.2-22q13.3. This correlated with segmental gains occurring at high frequency in DLBCL (AIDS-related and non AIDS-related) cases. Some segmental gains and some segmental losses occurred in PL but not in the other types of lymphoma, suggesting that these foci may contain genes responsible for the differentiation of this lymphoma. Additionally, some segmental gains and some segmental losses occurred only in PL and AIDS-associated DLBCL, suggesting that these foci may be associated with HIV infection. Furthermore, some segmental gains and some segmental losses occurred only in PL and PCM, suggesting that these lesions may be related to plasmacytic differentiation. Conclusion: To the best of our knowledge, the current study represents the first genomic exploration of PL. The genomic aberration pattern of PL appears to be more similar to that of DLBCL (AIDS-related or non AIDS-related) than to PCM. Our findings suggest that PL may remain best classified as a subtype of DLBCL, at least at the genome level.

  8. Signatures of Selection in the Genomes of Commercial and Non-Commercial Chicken Breeds

    Science.gov (United States)

    Elferink, Martin G.; Megens, Hendrik-Jan; Vereijken, Addie; Hu, Xiaoxiang; Crooijmans, Richard P. M. A.; Groenen, Martien A. M.

    2012-01-01

    Identifying genomic regions that are affected by selection is important for understanding the domestication and selection history of the domesticated chicken, as well as the molecular pathways underlying phenotypic traits and breeding goals. While whole-genome approaches, either high-density SNP chips or massively parallel sequencing, have been successfully applied to identify evidence for selective sweeps in chicken, it has been difficult to distinguish patterns of selection from stochastic and breed-specific effects. Here we present a study to identify selective sweeps in a large number of chicken breeds (67 in total) using a high-density (58 K) SNP chip. We analyzed commercial chickens representing all major breeding goals. In addition, we analyzed non-commercial chicken diversity for almost all recognized traditional Dutch breeds and a selection of representative breeds from China. Based on their shared history or breeding goal, we grouped the breeds in silico into 14 breed groups. We identified 396 chromosomal regions that show suggestive evidence of selection in at least one breed group, with 26 of these regions showing strong evidence of selection. Of these 26 regions, 13 were previously described and 13 yield new candidate genes for performance traits in chicken. Our approach demonstrates the strength of including many different populations with similar, and breed groups with different, selection histories to reduce stochastic effects based on single populations. PMID:22384281

  9. Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins.

    Science.gov (United States)

    He, Jun; Xu, Jiaqi; Wu, Xiao-Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D; Spangler, Matthew L

    2018-04-01

    SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies, and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications on the design of LD SNP chips for imputation-enabled genomic prediction.
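
    The abstract above reports imputation accuracies of 96.2 to 98.2% but does not say how accuracy was scored. One common convention, assumed here purely for illustration, is the genotype concordance rate: the fraction of imputed calls that equal the observed calls. The function name and the toy genotypes (coded as 0/1/2 allele counts) are hypothetical and not taken from the study.

```python
def imputation_concordance(observed, imputed):
    """Return the fraction of genotype calls that match between two panels.

    observed, imputed: lists of per-animal lists of genotypes coded as
    0, 1 or 2 copies of the reference allele. An observed value of None
    marks a missing call and is skipped. This concordance definition is
    an assumption; the study may have used a different accuracy measure.
    """
    if len(observed) != len(imputed):
        raise ValueError("panels must contain the same number of animals")
    total = 0
    matches = 0
    for obs_row, imp_row in zip(observed, imputed):
        if len(obs_row) != len(imp_row):
            raise ValueError("animals must have the same number of SNPs")
        for obs, imp in zip(obs_row, imp_row):
            if obs is None:  # no observed call to compare against
                continue
            total += 1
            if obs == imp:
                matches += 1
    if total == 0:
        raise ValueError("no comparable genotype calls")
    return matches / total


# Toy data: two animals, four SNPs each; one of eight calls is wrong.
observed_panel = [[0, 1, 2, 1], [2, 2, 0, 1]]
imputed_panel = [[0, 1, 2, 2], [2, 2, 0, 1]]
accuracy = imputation_concordance(observed_panel, imputed_panel)
print(f"concordance = {accuracy:.3f}")
```

    A per-SNP version of the same count would show which markers carry the imputation errors, which is the quantity the abstract links to the loss in prediction accuracy for trait-associated SNPs.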

  10. A genomic portrait of haplotype diversity and signatures of selection in indigenous southern African populations.

    Directory of Open Access Journals (Sweden)

    Emile R Chimusa

    2015-03-01

    We report a study of genome-wide, dense SNP (∼ 900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian, and unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations, and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.

  11. A genomic portrait of haplotype diversity and signatures of selection in indigenous southern African populations.

    Science.gov (United States)

    Chimusa, Emile R; Meintjies, Ayton; Tchanga, Milaine; Mulder, Nicola; Seoighe, Cathal; Seioghe, Cathal; Soodyall, Himla; Ramesar, Rajkumar

    2015-03-01

    We report a study of genome-wide, dense SNP (∼ 900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian, and unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations, and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.

  12. Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors

    Directory of Open Access Journals (Sweden)

    Raymond Kiu

    2017-12-01

    Clostridium perfringens is an important cause of animal and human infections; however, information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains, which included 51 public and 5 newly sequenced and annotated genomes, using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an “open” pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed the presence of well-characterised C. perfringens-associated exotoxin genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation between encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by the presence of prophage genomes, and notably the absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensin genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge of this medically and veterinary important pathogen.

  13. Green evolution and dynamic adaptations revealed by genomes of the marine picoeukaryotes Micromonas

    Energy Technology Data Exchange (ETDEWEB)

    Worden, Alexandra Z.; Lee, Jae-Hyeok; Mock, Thomas; Rouze, Pierre; Simmons, Melinda P.; Aerts, Andrea L.; Allen, Andrew E.; Cuvelier, Marie L.; Derelle, Evelyne; Everett, Meredieht V.; Foulon, Elodie; Grimwood, Jane; Gundlach, Heidrun; Henrissat, Bernard; Napoli, Carolyn; McDonald, Sarah M.; Parker, Micaela S.; Rombauts, Stephane; Salamov, Asaf; von Dassow, Peter; Badger, Jonathan G.; Coutinho, Pedro M.; Demir, Elif; Dubchak, Inna; Gentemann, Chelle; Eikrem, Wenche; Gready, Jill E.; John, Uwe; Lanier, William; Lindquist, Erika A.; Lucas, Susan; Mayer, Kluas F. X.; Moreau, Herve; Not, Fabrice; Otillar, Robert; Panaud, Olivier; Pangilinan, Jasmyn; Paulsen, Ian; Piegu, Benoit; Poliakov, Aaron; Robbens, Steven; Schmutz, Jeremy; Roulza, Eve; Wyss, Tania; Zelensky, Alexander; Zhou, Kemin; Armbrust, E. Virginia; Bhattacharya, Debashish; Goodenough, Ursula W.; Van de Peer, Yves; Grigoriev, Igor V.

    2009-10-14

    Picoeukaryotes are a taxonomically diverse group of organisms less than 2 micrometers in diameter. Photosynthetic marine picoeukaryotes in the genus Micromonas thrive in ecosystems ranging from tropical to polar and could serve as sentinel organisms for biogeochemical fluxes of modern oceans during climate change. These broadly distributed primary producers belong to an anciently diverged sister clade to land plants. Although Micromonas isolates have high 18S ribosomal RNA gene identity, we found that genomes from two isolates shared only 90 percent of their predicted genes. Their independent evolutionary paths were emphasized by distinct riboswitch arrangements as well as the discovery of intronic repeat elements in one isolate, and in metagenomic data, but not in other genomes. Divergence appears to have been facilitated by selection and acquisition processes that actively shape the repertoire of genes that are mutually exclusive between the two isolates differently than the core genes. Analyses of the Micromonas genomes offer valuable insights into ecological differentiation and the dynamic nature of early plant evolution.

  14. Selection signatures in worldwide sheep populations.

    Science.gov (United States)

    Fariello, Maria-Ines; Servin, Bertrand; Tosser-Klopp, Gwenola; Rupp, Rachel; Moreno, Carole; San Cristobal, Magali; Boitard, Simon

    2014-01-01

    The diversity of populations in domestic species offers great opportunities to study genome response to selection. The recently published Sheep HapMap dataset is a great example of characterization of the worldwide genetic diversity in sheep. In this study, we re-analyzed the Sheep HapMap dataset to identify selection signatures in worldwide sheep populations. Compared to previous analyses, we made use of statistical methods that (i) take account of the hierarchical structure of sheep populations, (ii) make use of linkage disequilibrium information and (iii) focus specifically on either recent or older selection signatures. We show that this allows pinpointing several new selection signatures in the sheep genome and distinguishing those related to modern breeding objectives from those related to earlier post-domestication constraints. The newly identified regions, together with the ones previously identified, reveal the extensive genome response to selection on morphology, color and adaptation to new environments.

  15. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude. We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion. Using an invasive (W83) and the only available non-invasive P. gingivalis strain (AJW4) and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH) analysis. We identified 68 annotated and 51 hypothetical open reading frames (ORFs) that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases, often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982, was tested. The PG0185 (ragA) and PG0186 (ragB) mutants had 5.1×10^3-fold and 3.6×10^3-fold decreased in vitro invasion ability, respectively. The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for the P. gingivalis non-invasive phenotype.

  16. Genomic selection in plant breeding: from theory to practice.

    Science.gov (United States)

    Jannink, Jean-Luc; Lorenz, Aaron J; Iwata, Hiroyoshi

    2010-03-01

    We intuitively believe that the dramatic drop in the cost of DNA marker information we have experienced should have immediate benefits in accelerating the delivery of crop varieties with improved yield, quality and biotic and abiotic stress tolerance. But these traits are complex and affected by many genes, each with small effect. Traditional marker-assisted selection has been ineffective for such traits. The introduction of genomic selection (GS), however, has shifted that paradigm. Rather than seeking to identify individual loci significantly associated with a trait, GS uses all marker data as predictors of performance and consequently delivers more accurate predictions. Selection can be based on GS predictions, potentially leading to more rapid and lower cost gains from breeding. The objectives of this article are to review essential aspects of GS and summarize the important take-home messages from recent theoretical, simulation and empirical studies. We then look forward and consider research needs surrounding methodological questions and the implications of GS for long-term selection.
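
    The idea of using all markers jointly as predictors can be illustrated with a minimal ridge-regression sketch. This is not the authors' implementation: the function names `ridge_marker_effects` and `predict_gebv`, the penalty `lam`, and the toy genotype coding are assumptions made purely for illustration, and a real analysis would typically centre the marker codes, fit an intercept, and tune the penalty.

```python
def ridge_marker_effects(X, y, lam):
    """Estimate all marker effects at once by solving (X'X + lam*I) b = X'y.

    X   : list of rows, one per phenotyped line, one column per marker
    y   : list of phenotypes for the same lines
    lam : ridge penalty (hypothetical value chosen by the user)
    """
    n, m = len(X), len(X[0])
    # Build the m x m system matrix A = X'X + lam*I and right-hand side r = X'y.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    r = [sum(X[k][i] * y[k] for k in range(n)) for i in range(m)]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, m):
            f = A[row][col] / A[col][col]
            for j in range(col, m):
                A[row][j] -= f * A[col][j]
            r[row] -= f * r[col]
    # Back substitution.
    b = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * b[j] for j in range(i + 1, m))
        b[i] = (r[i] - s) / A[i][i]
    return b


def predict_gebv(X_new, b):
    """Predicted genetic value of each unphenotyped line: sum of marker effects."""
    return [sum(x * e for x, e in zip(row, b)) for row in X_new]


if __name__ == "__main__":
    # Toy training set: 3 lines x 3 markers (made-up numbers).
    X_train = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y_train = [2.0, 4.0, 6.0]
    effects = ridge_marker_effects(X_train, y_train, lam=1.0)
    print(predict_gebv([[1.0, 1.0, 0.0]], effects))
```

    No marker is tested for significance here; every marker receives a shrunken effect, and selection candidates are ranked by the sum of those effects, which is the contrast with traditional marker-assisted selection drawn in the abstract.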

  17. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Full Text Available Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  18. Illumina based whole mitochondrial genome of Junonia iphita reveals minor intraspecific variation

    Directory of Open Access Journals (Sweden)

    Catherine Vanlalruati

    2015-12-01

    Full Text Available In the present study, the near complete mitochondrial genome (mitogenome) of Junonia iphita (Lepidoptera: Nymphalidae: Nymphalinae) was determined to be 14,892 bp. The gene order and orientation are identical to those in other butterfly species. The phylogenetic tree constructed from the whole mitogenomes using the 13 protein coding genes (PCGs) defines the genetic relatedness of the two J. iphita species collected from two different regions. All the Junonia species clustered together, and were further subdivided into clade one consisting of J. almana and J. orithya and clade two comprising the two J. iphita which were collected from Indo and Indochinese subregions separated by a river barrier. Comparison between the two J. iphita sequences revealed minor variations, and Single Nucleotide Polymorphisms were identified at 51 sites amounting to 0.4% of the entire mitochondrial genome.

  19. Genome-wide analysis reveals the extent of EAV-HP integration in domestic chicken.

    Science.gov (United States)

    Wragg, David; Mason, Andrew S; Yu, Le; Kuo, Richard; Lawal, Raman A; Desta, Takele Taye; Mwacharo, Joram M; Cho, Chang-Yeon; Kemp, Steve; Burt, David W; Hanotte, Olivier

    2015-10-14

    EAV-HP is an ancient retrovirus pre-dating Gallus speciation, which continues to circulate in modern chicken populations, and led to the emergence of avian leukosis virus subgroup J causing significant economic losses to the poultry industry. We mapped EAV-HP integration sites in Ethiopian village chickens, a Silkie, Taiwan Country chicken, red junglefowl Gallus gallus and several inbred experimental lines using whole-genome sequence data. An average of 75.22 ± 9.52 integration sites per bird were identified, which collectively group into 279 intervals of which 5% are common to 90% of the genomes analysed and are suggestive of pre-domestication integration events. More than a third of intervals are specific to individual genomes, supporting active circulation of EAV-HP in modern chickens. Interval density is correlated with chromosome length (P < 2.31 × 10^-6), and 27% of intervals are located within 5 kb of a transcript. Functional annotation clustering of genes reveals enrichment for immune-related functions (P < 0.05). Our results illustrate a non-random distribution of EAV-HP in the genome, emphasising the importance it may have played in the adaptation of the species, and provide a platform from which to extend investigations on the co-evolutionary significance of endogenous retroviral genera with their hosts.

  20. Genome-Wide Comparative Functional Analyses Reveal Adaptations of Salmonella sv. Newport to a Plant Colonization Lifestyle

    Directory of Open Access Journals (Sweden)

    Marcos H. de Moraes

    2018-05-01

    Full Text Available Outbreaks of salmonellosis linked to the consumption of vegetables have been disproportionately associated with strains of serovar Newport. We tested the hypothesis that strains of sv. Newport have evolved unique adaptations to persistence in plants that are not shared by strains of other Salmonella serovars. We used a genome-wide mutant screen to compare growth in tomato fruit of a sv. Newport strain from an outbreak traced to tomatoes, and a sv. Typhimurium strain from animals. Most genes in the sv. Newport strain that were selected during persistence in tomatoes were shared with, and similarly selected in, the sv. Typhimurium strain. Many of their functions are linked to central metabolism, including amino acid biosynthetic pathways, iron acquisition, and maintenance of cell structure. One exception was a greater need for the core genes involved in purine metabolism in sv. Typhimurium than in sv. Newport. We discovered a gene, papA, that was unique to sv. Newport and contributed to the strain’s fitness in tomatoes. The papA gene was present in about 25% of sv. Newport Group III genomes and generally absent from other Salmonella genomes. Homologs of papA were detected in the genomes of Pantoea, Dickeya, and Pectobacterium, members of the Enterobacteriaceae family that can colonize both plants and animals.

  1. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    Directory of Open Access Journals (Sweden)

    Yunsheng Wang

    Full Text Available In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  2. Genome Sequencing of Museum Specimens Reveals Rapid Changes in the Genetic Composition of Honey Bees in California.

    Science.gov (United States)

    Cridland, Julie M; Ramirez, Santiago R; Dean, Cheryl A; Sciligo, Amber; Tsutsui, Neil D

    2018-02-01

    The western honey bee, Apis mellifera, is an enormously influential pollinator in both natural and managed ecosystems. In North America, this species has been introduced numerous times from a variety of different source populations in Europe and Africa. Since then, feral populations have expanded into many different environments across their broad introduced range. Here, we used whole genome sequencing of historical museum specimens and newly collected modern populations from California (USA) to analyze the impact of demography and selection on introduced populations during the past 105 years. We find that populations from both northern and southern California exhibit pronounced genetic changes, but have changed in different ways. In northern populations, honey bees underwent a substantial shift from western European to eastern European ancestry since the 1960s, whereas southern populations are dominated by the introgression of Africanized genomes during the past two decades. Additionally, we identify an isolated island population that has experienced comparatively little change over a large time span. Fine-scale comparison of different populations and time points also revealed SNPs that differ in frequency, highlighting a number of genes that may be important for recent adaptations in these introduced populations. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Genomic selection using indicator traits to reduce the environmental impact of milk production

    DEFF Research Database (Denmark)

    Hansen Axelsson, H; Fikse, W F; Kargo, Morten

    2013-01-01

    The aim of this simulation study was to test the hypothesis that phenotype information of specific indicator traits of environmental importance recorded on a small-scale can be implemented in breeding schemes with genomic selection to reduce the environmental impact of milk production. A stochastic...... was, however, best in the scenarios where the genetic correlation between IT and EI was ≥0.30 and the accuracy of direct genomic value was ≥0.40. The genetic gain in EI was 26 to 34% higher when indicator traits such as greenhouse gases in the breath of the cow and methane recorded in respiration...... of direct genomic values will be reasonably high...

  4. Mitochondrial genomes reveal recombination in the presumed asexual Fusarium oxysporum species complex.

    Science.gov (United States)

    Brankovics, Balázs; van Dam, Peter; Rep, Martijn; de Hoog, G Sybren; J van der Lee, Theo A; Waalwijk, Cees; van Diepeningen, Anne D

    2017-09-18

    The Fusarium oxysporum species complex (FOSC) contains several phylogenetic lineages. Phylogenetic studies identified two to three major clades within the FOSC. The mitochondrial sequences are highly informative phylogenetic markers, but have been mostly neglected due to technical difficulties. A total of 61 complete mitogenomes of FOSC strains were de novo assembled and annotated. Length variations and intron patterns support the separation of three phylogenetic species. The variable region of the mitogenome that is typical for the genus Fusarium shows two new variants in the FOSC. The variant typical for Fusarium is found in members of all three clades, while variant 2 is found in clades 2 and 3 and variant 3 only in clade 2. The extended set of loci analyzed using a new implementation of the genealogical concordance species recognition method support the identification of three phylogenetic species within the FOSC. Comparative analysis of the mitogenomes in the FOSC revealed ongoing mitochondrial recombination within, but not between phylogenetic species. The recombination indicates the presence of a parasexual cycle in F. oxysporum. The obstacles hindering the usage of the mitogenomes are resolved by using next generation sequencing and selective genome assemblers, such as GRAbB. Complete mitogenome sequences offer a stable basis and reference point for phylogenetic and population genetic studies.

  5. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

    Science.gov (United States)

    Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

    2015-01-01

    Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.

  6. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    Science.gov (United States)

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  7. Genome-wide maps of alkylation damage, repair, and mutagenesis in yeast reveal mechanisms of mutational heterogeneity.

    Science.gov (United States)

    Mao, Peng; Brown, Alexander J; Malc, Ewa P; Mieczkowski, Piotr A; Smerdon, Michael J; Roberts, Steven A; Wyrick, John J

    2017-10-01

    DNA base damage is an important contributor to genome instability, but how the formation and repair of these lesions is affected by the genomic landscape and contributes to mutagenesis is unknown. Here, we describe genome-wide maps of DNA base damage, repair, and mutagenesis at single nucleotide resolution in yeast treated with the alkylating agent methyl methanesulfonate (MMS). Analysis of these maps revealed that base excision repair (BER) of alkylation damage is significantly modulated by chromatin, with faster repair in nucleosome-depleted regions, and slower repair and higher mutation density within strongly positioned nucleosomes. Both the translational and rotational settings of lesions within nucleosomes significantly influence BER efficiency; moreover, this effect is asymmetric relative to the nucleosome dyad axis and is regulated by histone modifications. Our data also indicate that MMS-induced mutations at adenine nucleotides are significantly enriched on the nontranscribed strand (NTS) of yeast genes, particularly in BER-deficient strains, due to higher damage formation on the NTS and transcription-coupled repair of the transcribed strand (TS). These findings reveal the influence of chromatin on repair and mutagenesis of base lesions on a genome-wide scale and suggest a novel mechanism for transcription-associated mutation asymmetry, which is frequently observed in human cancers. © 2017 Mao et al.; Published by Cold Spring Harbor Laboratory Press.

  8. On causal roles and selected effects: our genome is mostly junk.

    Science.gov (United States)

    Doolittle, W Ford; Brunet, Tyler D P

    2017-12-05

    The idea that much of our genome is irrelevant to fitness (is not the product of positive natural selection at the organismal level) remains viable. Claims to the contrary, and specifically that the notion of "junk DNA" should be abandoned, are based on conflating meanings of the word "function". Recent estimates suggest that perhaps 90% of our DNA, though biochemically active, does not contribute to fitness in any sequence-dependent way, and possibly in no way at all. Comparisons to vertebrates with much larger and smaller genomes (the lungfish and the pufferfish) strongly align with such a conclusion, as they have done for the last half-century.

  9. Efficient Breeding by Genomic Mating.

    Science.gov (United States)

    Akdemir, Deniz; Sánchez, Julio I

    2016-01-01

    Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we proposed an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem, and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that the current approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.

  10. Optimizing the allocation of resources for genomic selection in one breeding cycle.

    Science.gov (United States)

    Riedelsheimer, Christian; Melchinger, Albrecht E

    2013-11-01

    We developed a universally applicable planning tool for optimizing the allocation of resources for one cycle of genomic selection in a biparental population. The framework combines selection theory with constrained numerical optimization and considers genotype × environment interactions. Genomic selection (GS) is increasingly implemented in plant breeding programs to increase selection gain but little is known about how to optimally allocate the resources under a given budget. We investigated this problem with model calculations by combining quantitative genetic selection theory with constrained numerical optimization. We assumed one selection cycle where both the training and prediction sets comprised doubled haploid (DH) lines from the same biparental population. Grain yield for testcrosses of maize DH lines was used as a model trait but all parameters can be adjusted in a freely available software implementation. An extension of the expected selection accuracy given by Daetwyler et al. (2008) was developed to correctly balance between the number of environments for phenotyping the training set and its population size in the presence of genotype × environment interactions. Under a small budget, genotyping costs mainly determine whether GS is superior to phenotypic selection. With increasing budget, flexibility in resource allocation increases greatly but selection gain levels off quickly, requiring balancing the number of populations with the budget spent for each population. The use of an index combining phenotypic and GS predicted values in the training set was especially beneficial under limited resources and large genotype × environment interactions. Once a sufficiently high selection accuracy is achieved in the prediction set, further selection gain can be achieved most efficiently by massively expanding its size. Thus, with increasing budget, reducing the costs for producing a DH line becomes increasingly crucial for successfully exploiting the

  11. Introgression of a Block of Genome Under Infinitesimal Selection.

    Science.gov (United States)

    Sachdeva, Himani; Barton, Nicholas H

    2018-06-12

    Adaptive introgression is common in nature and can be driven by selection acting on multiple, linked genes. We explore the effects of polygenic selection on introgression under the infinitesimal model with linkage. This model assumes that the introgressing block has an effectively infinite number of loci, each with an infinitesimal effect on the trait under selection. The block is assumed to introgress under directional selection within a native population that is genetically homogeneous. We use individual-based simulations and a branching process approximation to compute various statistics of the introgressing block, and explore how these depend on parameters such as the map length and initial trait value associated with the introgressing block, the genetic variability along the block, and the strength of selection. Our results show that the introgression dynamics of a block under infinitesimal selection are qualitatively different from the dynamics of neutral introgression. We also find that in the long run, surviving descendant blocks are likely to have intermediate lengths, and clarify how their length is shaped by the interplay between linkage and infinitesimal selection. Our results suggest that it may be difficult to distinguish the long-term introgression of a block of genome with a single strongly selected locus from the introgression of a block with multiple, tightly linked and weakly selected loci. Copyright © 2018, Genetics.

  12. A scan for positively selected genes in the genomes of humans and chimpanzees.

    Directory of Open Access Journals (Sweden)

    Rasmus Nielsen

    2005-06-01

    Full Text Available Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be involved in sensory perception or immune defenses. However, the group of genes that show the strongest evidence for positive selection also includes a surprising number of genes involved in tumor suppression and apoptosis, and of genes involved in spermatogenesis. We hypothesize that positive selection in some of these genes may be driven by genomic conflict due to apoptosis during spermatogenesis. Genes with maximal expression in the brain show little or no evidence for positive selection, while genes with maximal expression in the testis tend to be enriched with positively selected genes. Genes on the X chromosome also tend to show an elevated tendency for positive selection. We also present polymorphism data from 20 Caucasian Americans and 19 African Americans for the 50 annotated genes showing the strongest evidence for positive selection. The polymorphism analysis further supports the presence of positive selection in these genes by showing an excess of high-frequency derived nonsynonymous mutations.

  13. How to Make a Dolphin: Molecular Signature of Positive Selection in Cetacean Genome.

    Directory of Open Access Journals (Sweden)

    Mariana F Nery

    Full Text Available Cetaceans are unique in being the only mammals completely adapted to an aquatic environment. This adaptation has required complex changes and sometimes a complete restructuring of physiology, behavior and morphology. Identifying genes that have been subjected to selection pressure during cetacean evolution would greatly enhance our knowledge of the ways in which genetic variation in this mammalian order has been shaped by natural selection. Here, we performed a genome-wide scan for positive selection in the dolphin lineage. We employed models of codon substitution that account for variation of selective pressure over branches on the tree and across sites in a sequence. We analyzed 7,859 nuclear-coding ortholog genes and using a series of likelihood ratio tests (LRTs), we identified 376 genes (4.8%) with molecular signatures of positive selection in the dolphin lineage. We used the cow as the sister group and compared estimates of selection in the cetacean genome to this using the same methods. This allowed us to define which genes have been exclusively under positive selection in the dolphin lineage. The enrichment analysis found that the identified positively selected genes are significantly over-represented for three exclusive functional categories only in the dolphin lineage: segment specification, mesoderm development and system development. Of particular interest for cetacean adaptation to an aquatic life are the following GeneOntology targets under positive selection: genes related to kidney, heart, lung, eye, ear and nervous system development.
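
    A likelihood ratio test of the kind mentioned above compares the maximised log-likelihood of a null codon model with that of an alternative model allowing positive selection. The sketch below only illustrates the arithmetic: the function name `lrt_pvalue` and the example log-likelihood values are hypothetical, the chi-square reference distribution with 1 or 2 degrees of freedom is an assumption (the abstract does not state which null distribution was used), and no correction for testing thousands of genes is included.

```python
from math import erfc, exp, sqrt


def lrt_pvalue(lnl_null, lnl_alt, df=1):
    """Return (test statistic, p-value) for a likelihood ratio test.

    lnl_null : maximised log-likelihood of the null model
    lnl_alt  : maximised log-likelihood of the alternative model
    df       : difference in free parameters (only 1 or 2 handled here)
    """
    # The statistic is twice the log-likelihood difference; clamp at zero
    # because numerical optimisation can leave the alternative slightly worse.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-square survival function with 1 df, in closed form.
        return stat, erfc(sqrt(stat / 2.0))
    if df == 2:
        # Chi-square survival function with 2 df, in closed form.
        return stat, exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


if __name__ == "__main__":
    # Made-up log-likelihoods for a single gene.
    stat, p = lrt_pvalue(lnl_null=-2345.6, lnl_alt=-2340.1, df=1)
    print(f"2*dlnL = {stat:.3f}, p = {p:.4g}")
```

    In a genome-wide scan such a test would be repeated for each ortholog, and the resulting p-values would normally be adjusted for multiple comparisons before genes are called positively selected.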

  14. Natural selection shaped the rise and fall of passenger pigeon genomic diversity.

    Science.gov (United States)

    Murray, Gemma G R; Soares, André E R; Novak, Ben J; Schaefer, Nathan K; Cahill, James A; Baker, Allan J; Demboski, John R; Doll, Andrew; Da Fonseca, Rute R; Fulton, Tara L; Gilbert, M Thomas P; Heintzman, Peter D; Letts, Brandon; McIntosh, George; O'Connell, Brendan L; Peck, Mark; Pipes, Marie-Lorraine; Rice, Edward S; Santos, Kathryn M; Sohrweide, A Gregory; Vohr, Samuel H; Corbett-Detig, Russell B; Green, Richard E; Shapiro, Beth

    2017-11-17

    The extinct passenger pigeon was once the most abundant bird in North America, and possibly the world. Although theory predicts that large populations will be more genetically diverse, passenger pigeon genetic diversity was surprisingly low. To investigate this disconnect, we analyzed 41 mitochondrial and 4 nuclear genomes from passenger pigeons and 2 genomes from band-tailed pigeons, which are passenger pigeons' closest living relatives. Passenger pigeons' large population size appears to have allowed for faster adaptive evolution and removal of harmful mutations, driving a huge loss in their neutral genetic diversity. These results demonstrate the effect that selection can have on a vertebrate genome and contradict results that suggested that population instability contributed to this species's surprisingly rapid extinction. Copyright © 2017, American Association for the Advancement of Science.

  15. Sequence analysis of chromosome 1 revealed different selection patterns between Chinese wild mice and laboratory strains.

    Science.gov (United States)

    Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua

    2017-10-01

    Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detection of the signature of selection in genomic regions can provide insights for understanding the function of specific phenotypes. It is generally assumed that laboratory mice may experience intense artificial selection while wild mice experience more natural selection. However, the differences in selection signatures in the mouse genome and in the underlying genes between wild and laboratory mice remain unclear. In this study, we used two mouse populations, chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and mouse genome project (MGP) sequenced inbred strains, and two selection detection statistics, Fst and Tajima's D, to identify signatures of selection on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions, which encompass 7.215 Mb, were identified in the C1SLs by the Tajima's D approach. For the MGP, we identified nearly twice as many selection regions (243) as in the C1SLs, accounting for 13.27 Mb of Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems, which align well with the phenotypic differences between wild and laboratory mice. Therefore, the findings would be helpful for our understanding of the phenotypic differences between wild and laboratory mice and for applications using this new mouse resource (C1SLs) in further genetics studies.
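
    For readers unfamiliar with Tajima's D, the statistic contrasts the mean number of pairwise differences with the number of segregating sites scaled by a harmonic-number constant. The following is a minimal sketch of the standard formula (Tajima 1989), not the pipeline used in the study: the function name `tajimas_d` and the toy haplotypes are hypothetical, and the code assumes aligned, equal-length haplotype strings with no missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of aligned, equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences for a meaningful variance")
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites.
    S = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D = (pi - S/a1) divided by its estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    # Excess of rare variants (every variant is a singleton): D is negative.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
    # Variants at intermediate frequency: D is positive.
    print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

    In a scan such as the one described, the statistic would be computed in windows along the chromosome, with strongly negative windows read as candidate sweep regions; window size and thresholds are choices the abstract does not specify.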

  16. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  17. Genome editing reveals a role for OCT4 in human embryogenesis.

    Science.gov (United States)

    Fogarty, Norah M E; McCarthy, Afshan; Snijders, Kirsten E; Powell, Benjamin E; Kubikova, Nada; Blakeley, Paul; Lea, Rebecca; Elder, Kay; Wamaitha, Sissy E; Kim, Daesik; Maciulyte, Valdone; Kleinjung, Jens; Kim, Jin-Soo; Wells, Dagan; Vallier, Ludovic; Bertero, Alessandro; Turner, James M A; Niakan, Kathy K

    2017-10-05

    Despite their fundamental biological and clinical importance, the molecular mechanisms that regulate the first cell fate decisions in the human embryo are not well understood. Here we use CRISPR-Cas9-mediated genome editing to investigate the function of the pluripotency transcription factor OCT4 during human embryogenesis. We identified an efficient OCT4-targeting guide RNA using an inducible human embryonic stem cell-based system and microinjection of mouse zygotes. Using these refined methods, we efficiently and specifically targeted the gene encoding OCT4 (POU5F1) in diploid human zygotes and found that blastocyst development was compromised. Transcriptomics analysis revealed that, in POU5F1-null cells, gene expression was downregulated not only for extra-embryonic trophectoderm genes, such as CDX2, but also for regulators of the pluripotent epiblast, including NANOG. By contrast, Pou5f1-null mouse embryos maintained the expression of orthologous genes, and blastocyst development was established, but maintenance was compromised. We conclude that CRISPR-Cas9-mediated genome editing is a powerful method for investigating gene function in the context of human development.

  18. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential

  19. Strategies for implementing genomic selection in family-based aquaculture breeding schemes: double haploid sib test populations

    Directory of Open Access Journals (Sweden)

    Nirea Kahsay G

    2012-10-01

    Background: Simulation studies have shown that accuracy and genetic gain are increased in genomic selection schemes compared to traditional aquaculture sib-based schemes. In genomic selection, accuracy of selection can be maximized by increasing the precision of the estimation of SNP effects and by maximizing the relationships between test sibs and candidate sibs. Another means of increasing the accuracy of the estimation of SNP effects is to create individuals in the test population with extreme genotypes. The latter approach was studied here with creation of double haploids and use of non-random mating designs. Methods: Six alternative breeding schemes were simulated in which the design of the test population was varied: test sibs inherited maternal (Mat), paternal (Pat) or a mixture of maternal and paternal (MatPat) double haploid genomes, or test sibs were obtained by maximum coancestry mating (MaxC), minimum coancestry mating (MinC), or random (RAND) mating. Three thousand test sibs and 3000 candidate sibs were genotyped. The test sibs were recorded for a trait that could not be measured on the candidates and were used to estimate SNP effects. Selection was done by truncation on genome-wide estimated breeding values and 100 individuals were selected as parents each generation, equally divided between both sexes. Results: Results showed a 7 to 19% increase in selection accuracy and a 6 to 22% increase in genetic gain in the MatPat scheme compared to the RAND scheme. These increases were greater with lower heritabilities. Among all other scenarios, i.e. Mat, Pat, MaxC, and MinC, no substantial differences in selection accuracy and genetic gain were observed. Conclusions: In conclusion, a test population designed with a mixture of paternal and maternal double haploids, i.e. the MatPat scheme, increases substantially the accuracy of selection and genetic gain. This will be particularly interesting for traits that cannot be recorded on the

  20. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Genomic resources for the grapevine genome have recently expanded greatly, supporting growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced four table grape cultivars and compared them with the Pinot noir inbred line PN40024 genome as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  1. The mitochondrial genomes of Amphiascoides atopus and Schizopera knabeni (Harpacticoida: Miraciidae) reveal similarities between the copepod orders Harpacticoida and Poecilostomatoida.

    Science.gov (United States)

    Easton, Erin E; Darrow, Emily M; Spears, Trisha; Thistle, David

    2014-03-15

    Members of subclass Copepoda are abundant, diverse, and, as a result of their variety of ecological roles in marine and freshwater environments, important, but their phylogenetic interrelationships are unclear. Recent studies of arthropods have used gene arrangements in the mitochondrial (mt) genome to infer phylogenies, but for copepods, only seven complete mt genomes have been published. These data revealed several within-order and few among-order similarities. To increase the data available for comparisons, we sequenced the complete mt genome (13,831 base pairs) of Amphiascoides atopus and 10,649 base pairs of the mt genome of Schizopera knabeni (both in the family Miraciidae of the order Harpacticoida). Comparison of our data to those for Tigriopus japonicus (family Harpacticidae, order Harpacticoida) revealed similarities in gene arrangement among these three species that were consistent with those found within and among families of other copepod orders. Comparison of the mt genomes of our species with those known from other copepod orders revealed the arrangement of mt genes of our Harpacticoida species to be more similar to that of Sinergasilus polycolpus (order Poecilostomatoida) than to that of T. japonicus. The similarities between S. polycolpus and our species are the first to be noted across the boundaries of copepod orders and support the possibility that mt-gene arrangement might be used to infer copepod phylogenies. We also found that our two species had extremely truncated transfer RNAs and that gene overlaps occurred much more frequently than has been reported for other copepod mt genomes. Published by Elsevier B.V.

  2. Evaluation of genome-enabled selection for bacterial cold water disease resistance using progeny performance data in Rainbow Trout: Insights on genotyping methods and genomic prediction models

    Science.gov (United States)

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic br...

  3. Genomic analysis of codon usage shows influence of mutation pressure, natural selection, and host features on Marburg virus evolution.

    Science.gov (United States)

    Nasrullah, Izza; Butt, Azeem M; Tahir, Shifa; Idrees, Muhammad; Tong, Yigang

    2015-08-26

    The Marburg virus (MARV) has a negative-sense single-stranded RNA genome, belongs to the family Filoviridae, and is responsible for several outbreaks of highly fatal hemorrhagic fever. Codon usage patterns of viruses reflect a series of evolutionary changes that enable viruses to shape their survival rates and fitness toward the external environment and, most importantly, their hosts. To understand the evolution of MARV at the codon level, we report a comprehensive analysis of synonymous codon usage patterns in MARV genomes. Multiple codon analysis approaches and statistical methods were used to determine overall codon usage patterns, biases in codon usage, and the influence of various factors, including mutation pressure, natural selection, and its two hosts, Homo sapiens and Rousettus aegyptiacus. Nucleotide composition and relative synonymous codon usage (RSCU) analysis revealed that MARV shows mutation bias and prefers U- and A-ended codons to code amino acids. Effective number of codons analysis indicated that overall codon usage among MARV genomes is slightly biased. The Parity Rule 2 plot analysis showed that GC and AU nucleotides were not used proportionally, which accounts for the presence of natural selection. Codon usage patterns of MARV were also found to be influenced by its hosts. This indicates that MARV has evolved codon usage patterns that are specific to both of its hosts. Moreover, selection pressure from R. aegyptiacus on the MARV RSCU patterns was found to be dominant compared with that from H. sapiens. Overall, mutation pressure was found to be the most important and dominant force that shapes codon usage patterns in MARV. To our knowledge, this is the first detailed codon usage analysis of MARV and extends our understanding of the mechanisms that contribute to codon usage and evolution of MARV.
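
    The record above relies on relative synonymous codon usage (RSCU). As a minimal sketch, assuming the usual definition (observed count of a codon divided by the mean count across all synonymous codons for the same amino acid), RSCU for one coding sequence could be computed as follows. The function name `rscu`, the use of the standard genetic code, and the exclusion of Met, Trp, and stop codons are choices made for this example, not details taken from the study.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Group synonymous codons by amino acid.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def rscu(cds):
    """RSCU per codon for an in-frame coding sequence (hypothetical helper).

    RNA input is accepted (U is converted to T). Stop codons and
    single-codon amino acids (Met, Trp) are excluded, as are amino
    acids that do not occur in the sequence.
    """
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values = {}
    for aa, family in SYNONYMS.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        mean = total / len(family)
        for c in family:
            values[c] = counts[c] / mean
    return values


# Alanine has four synonymous codons; GCU is used 3 times out of 4 here.
example = rscu("GCUGCUGCUGCC")
print(example["GCT"], example["GCC"], example["GCA"])
```

    An RSCU above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks an under-used codon.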

  4. Whole genome PCR scanning reveals the syntenic genome structure of toxigenic Vibrio cholerae strains in the O1/O139 population.

    Directory of Open Access Journals (Sweden)

    Bo Pang

    Vibrio cholerae is commonly found in estuarine water systems. Toxigenic O1 and O139 V. cholerae strains have caused cholera epidemics and pandemics, whereas the nontoxigenic strains within these serogroups only occasionally lead to disease. To understand the differences in the genome and clonality between the toxigenic and nontoxigenic strains of V. cholerae serogroups O1 and O139, we employed a whole genome PCR scanning (WGPScanning) method, an rrn operon-mediated fragment rearrangement analysis and comparative genomic hybridization (CGH) to analyze the genome structure of different strains. WGPScanning in conjunction with CGH revealed that the genomic contents of the toxigenic strains were conservative, except for a few indels located mainly in mobile elements. Minor nucleotide variation in orthologous genes appeared to be the major difference between the toxigenic strains. rrn operon-mediated rearrangements were infrequent in El Tor toxigenic strains tested using I-CeuI digested pulsed-field gel electrophoresis (PFGE) analysis and PCR analysis based on flanking sequence of rrn operons. Using these methods, we found that the genomic structures of toxigenic El Tor and O139 strains were syntenic. The nontoxigenic strains exhibited more extensive sequence variations, but toxin coregulated pilus positive (TCP+) strains had a similar structure. TCP+ nontoxigenic strains could be subdivided into multiple lineages according to the TCP type, suggesting the existence of complex intermediates in the evolution of toxigenic strains. The data indicate that toxigenic O1 El Tor and O139 strains were derived from a single lineage of intermediates from complex clones in the environment. The nontoxigenic strains with non-El Tor type TCP may yet evolve into new epidemic clones after attaining toxigenic attributes.

  5. Optimization of multi-environment trials for genomic selection based on crop models.

    Science.gov (United States)

    Rincent, R; Kuhn, E; Monod, H; Oury, F-X; Rousset, M; Allard, V; Le Gouis, J

    2017-08-01

    We propose a statistical criterion to optimize multi-environment trials to predict genotype × environment interactions more efficiently, by combining crop growth models and genomic selection models. Genotype × environment interactions (GEI) are common in plant multi-environment trials (METs). In this context, models developed for genomic selection (GS) that refers to the use of genome-wide information for predicting breeding values of selection candidates need to be adapted. One promising way to increase prediction accuracy in various environments is to combine ecophysiological and genetic modelling thanks to crop growth models (CGM) incorporating genetic parameters. The efficiency of this approach relies on the quality of the parameter estimates, which depends on the environments composing this MET used for calibration. The objective of this study was to determine a method to optimize the set of environments composing the MET for estimating genetic parameters in this context. A criterion called OptiMET was defined to this aim, and was evaluated on simulated and real data, with the example of wheat phenology. The MET defined with OptiMET allowed estimating the genetic parameters with lower error, leading to higher QTL detection power and higher prediction accuracies. MET defined with OptiMET was on average more efficient than random MET composed of twice as many environments, in terms of quality of the parameter estimates. OptiMET is thus a valuable tool to determine optimal experimental conditions to best exploit MET and the phenotyping tools that are currently developed.

  6. Genomic Footprints in Selected and Unselected Beef Cattle Breeds in Korea.

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    Korean Hanwoo cattle have been subjected to intensive artificial selection over the past four decades to improve meat production traits. Another three cattle varieties very closely related to Hanwoo reside in Korea (Jeju Black and Brindle) and in China (Yanbian). These breeds have not been part of a breeding scheme to improve production traits. Here, we compare the selected Hanwoo against these similar but presumed to be unselected populations to identify genomic regions that have been under recent selection pressure due to the breeding program. Rsb statistics were used to contrast the genomes of Hanwoo versus a pooled sample of the three unselected population (UN). We identified 37 significant SNPs (FDR corrected) in the HW/UN comparison and 21 known protein coding genes were within 1 MB to the identified SNPs. These genes were previously reported to affect traits important for meat production (14 genes), reproduction including mammary gland development (3 genes), coat color (2 genes), and genes affecting behavioral traits in a broader sense (2 genes). We subsequently sequenced (Illumina HiSeq 2000 platform) 10 individuals of the brown Hanwoo and the Chinese Yanbian to identify SNPs within the candidate genomic regions. Based on allele frequency differences, haplotype structures, and literature research, we singled out one non-synonymous SNP in the APP gene (APP: c.569C>T, Ala199Val) and predicted the mutational effect on the protein structure. We found that protein-protein interactions might be impaired due to increased exposed hydrophobic surfaces of the mutated protein. The APP gene has also been reported to affect meat tenderness in pigs and obesity in humans. Meat tenderness has been linked to intramuscular fat content, which is one of the main breeding goals for brown Hanwoo, potentially supporting a causal influence of the herein described nsSNP in the APP gene.

  7. Genomic Footprints in Selected and Unselected Beef Cattle Breeds in Korea.

    Science.gov (United States)

    Lim, Dajeong; Strucken, Eva M; Choi, Bong Hwan; Chai, Han Ha; Cho, Yong Min; Jang, Gul Won; Kim, Tae-Hun; Gondro, Cedric; Lee, Seung Hwan

    2016-01-01

    Korean Hanwoo cattle have been subjected to intensive artificial selection over the past four decades to improve meat production traits. Another three cattle varieties very closely related to Hanwoo reside in Korea (Jeju Black and Brindle) and in China (Yanbian). These breeds have not been part of a breeding scheme to improve production traits. Here, we compare the selected Hanwoo against these similar but presumed to be unselected populations to identify genomic regions that have been under recent selection pressure due to the breeding program. Rsb statistics were used to contrast the genomes of Hanwoo versus a pooled sample of the three unselected population (UN). We identified 37 significant SNPs (FDR corrected) in the HW/UN comparison and 21 known protein coding genes were within 1 MB to the identified SNPs. These genes were previously reported to affect traits important for meat production (14 genes), reproduction including mammary gland development (3 genes), coat color (2 genes), and genes affecting behavioral traits in a broader sense (2 genes). We subsequently sequenced (Illumina HiSeq 2000 platform) 10 individuals of the brown Hanwoo and the Chinese Yanbian to identify SNPs within the candidate genomic regions. Based on allele frequency differences, haplotype structures, and literature research, we singled out one non-synonymous SNP in the APP gene (APP: c.569C>T, Ala199Val) and predicted the mutational effect on the protein structure. We found that protein-protein interactions might be impaired due to increased exposed hydrophobic surfaces of the mutated protein. The APP gene has also been reported to affect meat tenderness in pigs and obesity in humans. Meat tenderness has been linked to intramuscular fat content, which is one of the main breeding goals for brown Hanwoo, potentially supporting a causal influence of the herein described nsSNP in the APP gene.

  8. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genome of a Bti-resistant and a Bti-susceptible strains was surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  9. Genomic Analysis of Hepatitis B Virus Reveals Antigen State and Genotype as Sources of Evolutionary Rate Variation

    Science.gov (United States)

    Harrison, Abby; Lemey, Philippe; Hurles, Matthew; Moyes, Chris; Horn, Susanne; Pryor, Jan; Malani, Joji; Supuri, Mathias; Masta, Andrew; Teriboriki, Burentau; Toatu, Tebuka; Penny, David; Rambaut, Andrew; Shapiro, Beth

    2011-01-01

    Hepatitis B virus (HBV) genomes are small, semi-double-stranded DNA circular genomes that contain alternating overlapping reading frames and replicate through an RNA intermediary phase. This complex biology has presented a challenge to estimating an evolutionary rate for HBV, leading to difficulties resolving the evolutionary and epidemiological history of the virus. Here, we re-examine rates of HBV evolution using a novel data set of 112 within-host, transmission history (pedigree) and among-host genomes isolated over 20 years from the indigenous peoples of the South Pacific, combined with 313 previously published HBV genomes. We employ Bayesian phylogenetic approaches to examine several potential causes and consequences of evolutionary rate variation in HBV. Our results reveal rate variation both between genotypes and across the genome, as well as strikingly slower rates when genomes are sampled in the Hepatitis B e antigen positive state, compared to the e antigen negative state. This Hepatitis B e antigen rate variation was found to be largely attributable to changes during the course of infection in the preCore and Core genes and their regulatory elements. PMID:21765983

  10. Genomic analysis of natural selection and phenotypic variation in high-altitude Mongolians.

    Directory of Open Access Journals (Sweden)

    Jinchuan Xing

    Deedu (DU) Mongolians, who migrated from the Mongolian steppes to the Qinghai-Tibetan Plateau approximately 500 years ago, are challenged by environmental conditions similar to native Tibetan highlanders. Identification of adaptive genetic factors in this population could provide insight into coordinated physiological responses to this environment. Here we examine genomic and phenotypic variation in this unique population and present the first complete analysis of a Mongolian whole-genome sequence. High-density SNP array data demonstrate that DU Mongolians share genetic ancestry with other Mongolian as well as Tibetan populations, specifically in genomic regions related with adaptation to high altitude. Several selection candidate genes identified in DU Mongolians are shared with other Asian groups (e.g., EDAR), neighboring Tibetan populations (including high-altitude candidates EPAS1, PKLR, and CYP2E1), as well as genes previously hypothesized to be associated with metabolic adaptation (e.g., PPARG). Hemoglobin concentration, a trait associated with high-altitude adaptation in Tibetans, is at an intermediate level in DU Mongolians compared to Tibetans and Han Chinese at comparable altitude. Whole-genome sequence from a DU Mongolian (Tianjiao1) shows that about 2% of the genomic variants, including more than 300 protein-coding changes, are specific to this individual. Our analyses of DU Mongolians and the first Mongolian genome provide valuable insight into genetic adaptation to extreme environments.

  11. Whole genome sequencing revealed host adaptation-focused genomic plasticity of pathogenic Leptospira

    Science.gov (United States)

    Xu, Yinghua; Zhu, Yongzhang; Wang, Yuezhu; Chang, Yung-Fu; Zhang, Ying; Jiang, Xiugao; Zhuang, Xuran; Zhu, Yongqiang; Zhang, Jinlong; Zeng, Lingbing; Yang, Minjun; Li, Shijun; Wang, Shengyue; Ye, Qiang; Xin, Xiaofang; Zhao, Guoping; Zheng, Huajun; Guo, Xiaokui; Wang, Junzhi

    2016-01-01

    Leptospirosis, caused by pathogenic Leptospira spp., has recently been recognized as an emerging infectious disease worldwide. Despite its severity and global importance, knowledge about the molecular pathogenesis and virulence evolution of Leptospira spp. remains limited. Here we sequenced and analyzed 102 isolates representing global sources. High genomic variability was observed among different Leptospira species, which was attributed to massive gene gain and loss events allowing for adaptation to specific niche conditions and changing host environments. Horizontal gene transfer and gene duplication allowed the stepwise acquisition of virulence factors in pathogenic Leptospira that evolved from a recent common ancestor. More importantly, the abundant expansion of specific virulence-related protein families, such as metalloproteases-associated paralogs, was identified exclusively in pathogenic species, reflecting the importance of these protein families in the pathogenesis of leptospirosis. Our observations also indicated that positive selection played a crucial role in the adaptation of these bacteria to their hosts. These novel findings may lead to greater understanding of the global diversity and virulence evolution of Leptospira spp. PMID:26833181

  12. High overlap of CNVs and selection signatures revealed by varLD analyses of taurine and zebu cattle

    Science.gov (United States)

    Selection Signatures (SS) assessed through analysis of genomic data are being widely studied to discover population specific regions selected via artificial or natural selection. Different methodologies have been proposed for these analyses, each having specific limitations as to the age of the sele...

  13. Comparative Genomic Analysis of Clinical and Environmental Vibrio Vulnificus Isolates Revealed Biotype 3 Evolutionary Relationships

    Directory of Open Access Journals (Sweden)

    Yael Kotton

    2015-01-01

    In 1996 a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen erupted as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus based on 16 draft and complete genomes consisted of 3068 genes, representing between 59% and 78% of the whole genome of 16 strains. The accessory genome varied in size from 781 kbp to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and

  14. Island-Model Genomic Selection for Long-Term Genetic Improvement of Autogamous Crops.

    Science.gov (United States)

    Yabe, Shiori; Yamasaki, Masanori; Ebana, Kaworu; Hayashi, Takeshi; Iwata, Hiroyoshi

    2016-01-01

    Acceleration of genetic improvement of autogamous crops such as wheat and rice is necessary to increase cereal production in response to the global food crisis. Population and pedigree methods of breeding, which are based on inbred line selection, are used commonly in the genetic improvement of autogamous crops. These methods, however, produce a few novel combinations of genes in a breeding population. Recurrent selection promotes recombination among genes and produces novel combinations of genes in a breeding population, but it requires inaccurate single-plant evaluation for selection. Genomic selection (GS), which can predict genetic potential of individuals based on their marker genotype, might have high reliability of single-plant evaluation and might be effective in recurrent selection. To evaluate the efficiency of recurrent selection with GS, we conducted simulations using real marker genotype data of rice cultivars. Additionally, we introduced the concept of an "island model" inspired by evolutionary algorithms that might be useful to maintain genetic variation through the breeding process. We conducted GS simulations using real marker genotype data of rice cultivars to evaluate the efficiency of recurrent selection and the island model in an autogamous species. Results demonstrated the importance of producing novel combinations of genes through recurrent selection. An initial population derived from admixture of multiple bi-parental crosses showed larger genetic gains than a population derived from a single bi-parental cross in whole cycles, suggesting the importance of genetic variation in an initial population. The island-model GS better maintained genetic improvement in later generations than the other GS methods, suggesting that the island-model GS can utilize genetic variation in breeding and can retain alleles with small effects in the breeding population. The island-model GS will become a new breeding method that enhances the potential of genomic

  15. Island-Model Genomic Selection for Long-Term Genetic Improvement of Autogamous Crops.

    Directory of Open Access Journals (Sweden)

    Shiori Yabe

    Acceleration of genetic improvement of autogamous crops such as wheat and rice is necessary to increase cereal production in response to the global food crisis. Population and pedigree methods of breeding, which are based on inbred line selection, are used commonly in the genetic improvement of autogamous crops. These methods, however, produce few novel combinations of genes in a breeding population. Recurrent selection promotes recombination among genes and produces novel combinations of genes in a breeding population, but it requires inaccurate single-plant evaluation for selection. Genomic selection (GS), which can predict genetic potential of individuals based on their marker genotype, might have high reliability of single-plant evaluation and might be effective in recurrent selection. To evaluate the efficiency of recurrent selection with GS, we conducted simulations using real marker genotype data of rice cultivars. Additionally, we introduced the concept of an "island model" inspired by evolutionary algorithms that might be useful to maintain genetic variation through the breeding process. We conducted GS simulations using real marker genotype data of rice cultivars to evaluate the efficiency of recurrent selection and the island model in an autogamous species. Results demonstrated the importance of producing novel combinations of genes through recurrent selection. An initial population derived from admixture of multiple bi-parental crosses showed larger genetic gains than a population derived from a single bi-parental cross in whole cycles, suggesting the importance of genetic variation in an initial population. The island-model GS better maintained genetic improvement in later generations than the other GS methods, suggesting that the island-model GS can utilize genetic variation in breeding and can retain alleles with small effects in the breeding population. The island-model GS will become a new breeding method that enhances the

  16. The Burmese python genome reveals the molecular basis for extreme adaptation in snakes.

    Science.gov (United States)

    Castoe, Todd A; de Koning, A P Jason; Hall, Kathryn T; Card, Daren C; Schield, Drew R; Fujita, Matthew K; Ruggiero, Robert P; Degner, Jack F; Daza, Juan M; Gu, Wanjun; Reyes-Velasco, Jacobo; Shaney, Kyle J; Castoe, Jill M; Fox, Samuel E; Poole, Alex W; Polanco, Daniel; Dobry, Jason; Vandewege, Michael W; Li, Qing; Schott, Ryan K; Kapusta, Aurélie; Minx, Patrick; Feschotte, Cédric; Uetz, Peter; Ray, David A; Hoffmann, Federico G; Bogden, Robert; Smith, Eric N; Chang, Belinda S W; Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Richardson, Michael K; Mackessy, Stephen P; Bronikowski, Anne M; Yandell, Mark; Warren, Wesley C; Secor, Stephen M; Pollock, David D

    2013-12-17

    Snakes possess many extreme morphological and physiological adaptations. Identification of the molecular basis of these traits can provide novel understanding for vertebrate biology and medicine. Here, we study snake biology using the genome sequence of the Burmese python (Python molurus bivittatus), a model of extreme physiological and metabolic adaptation. We compare the python and king cobra genomes along with genomic samples from other snakes and perform transcriptome analysis to gain insights into the extreme phenotypes of the python. We discovered rapid and massive transcriptional responses in multiple organ systems that occur on feeding and coordinate major changes in organ size and function. Intriguingly, the homologs of these genes in humans are associated with metabolism, development, and pathology. We also found that many snake metabolic genes have undergone positive selection, which together with the rapid evolution of mitochondrial proteins, provides evidence for extensive adaptive redesign of snake metabolic pathways. Additional evidence for molecular adaptation and gene family expansions and contractions is associated with major physiological and phenotypic adaptations in snakes; genes involved are related to cell cycle, development, lungs, eyes, heart, intestine, and skeletal structure, including GRB2-associated binding protein 1, SSH, WNT16, and bone morphogenetic protein 7. Finally, changes in repetitive DNA content, guanine-cytosine isochore structure, and nucleotide substitution rates indicate major shifts in the structure and evolution of snake genomes compared with other amniotes. Phenotypic and physiological novelty in snakes seems to be driven by system-wide coordination of protein adaptation, gene expression, and changes in the structure of the genome.

  17. Genome-wide analyses reveal a role for peptide hormones in planarian germline development.

    Directory of Open Access Journals (Sweden)

    James J Collins

    Bioactive peptides (i.e., neuropeptides or peptide hormones) represent the largest class of cell-cell signaling molecules in metazoans and are potent regulators of neural and physiological function. In vertebrates, peptide hormones play an integral role in endocrine signaling between the brain and the gonads that controls reproductive development, yet few of these molecules have been shown to influence reproductive development in invertebrates. Here, we define a role for peptide hormones in controlling reproductive physiology of the model flatworm, the planarian Schmidtea mediterranea. Based on our observation that defective neuropeptide processing results in defects in reproductive system development, we employed peptidomic and functional genomic approaches to characterize the planarian peptide hormone complement, identifying 51 prohormone genes and validating 142 peptides biochemically. Comprehensive in situ hybridization analyses of prohormone gene expression revealed the unanticipated complexity of the flatworm nervous system and identified a prohormone specifically expressed in the nervous system of sexually reproducing planarians. We show that this member of the neuropeptide Y superfamily is required for the maintenance of mature reproductive organs and differentiated germ cells in the testes. Additionally, comparative analyses of our biochemically validated prohormones with the genomes of the parasitic flatworms Schistosoma mansoni and Schistosoma japonicum identified new schistosome prohormones and validated half of all predicted peptide-encoding genes in these parasites. These studies describe the peptide hormone complement of a flatworm on a genome-wide scale and reveal a previously uncharacterized role for peptide hormones in flatworm reproduction. Furthermore, they suggest new opportunities for using planarians as free-living models for understanding the reproductive biology of flatworm parasites.

  18. Non-Random Inversion Landscapes in Prokaryotic Genomes Are Shaped by Heterogeneous Selection Pressures.

    Science.gov (United States)

    Repar, Jelena; Warnecke, Tobias

    2017-08-01

    Inversions are a major contributor to structural genome evolution in prokaryotes. Here, using a novel alignment-based method, we systematically compare 1,651 bacterial and 98 archaeal genomes to show that inversion landscapes are frequently biased toward (symmetric) inversions around the origin-terminus axis. However, symmetric inversion bias is not a universal feature of prokaryotic genome evolution but varies considerably across clades. At the extremes, inversion landscapes in Bacillus-Clostridium and Actinobacteria are dominated by symmetric inversions, while there is little or no systematic bias favoring symmetric rearrangements in archaea with a single origin of replication. Within clades, we find strong but clade-specific relationships between symmetric inversion bias and different features of adaptive genome architecture, including the distance of essential genes to the origin of replication and the preferential localization of genes on the leading strand. We suggest that heterogeneous selection pressures have converged to produce similar patterns of structural genome evolution across prokaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size.

    Science.gov (United States)

    Oldeschulte, David L; Halley, Yvette A; Wilson, Miranda L; Bhattarai, Eric K; Brashear, Wesley; Hill, Joshua; Metz, Richard P; Johnson, Charles D; Rollins, Dale; Peterson, Markus J; Bickhart, Derek M; Decker, Jared E; Sewell, John F; Seabury, Christopher M

    2017-09-07

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15-20 KYA. Copyright © 2017 Oldeschulte et al.
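    The contiguity figures quoted in this abstract (N50 scaffold size, and the number of scaffolds needed to capture ≥90% of the assembly) can be illustrated with a minimal sketch. The function names `n50` and `scaffolds_to_cover` and the toy scaffold lengths are hypothetical and are not taken from the study's pipeline; the sketch only shows the usual definition of these statistics.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def scaffolds_to_cover(lengths, fraction=0.9):
    """Smallest number of the longest scaffolds that capture `fraction` of the assembly."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    target = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return count


# Toy scaffold lengths (arbitrary units), purely illustrative.
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_scaffolds)
toy_count = scaffolds_to_cover(toy_scaffolds, 0.9)
```

    On real assemblies the lengths would come from the scaffold FASTA; the toy list here only exercises the arithmetic.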

  20. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size

    Directory of Open Access Journals (Sweden)

    David L. Oldeschulte

    2017-09-01

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15–20 KYA.

  1. Short communication: Genomic selection in a crossbred cattle population using data from the Dairy Genetics East Africa Project.

    Science.gov (United States)

    Brown, A; Ojango, J; Gibson, J; Coffey, M; Okeyo, M; Mrode, R

    2016-09-01

    Due to the absence of accurate pedigree information, it has not been possible to implement genetic evaluations for crossbred cattle in African small-holder systems. Genomic selection techniques that do not rely on pedigree information could, therefore, be a useful alternative. The objective of this study was to examine the feasibility of using genomic selection techniques in a crossbred cattle population using data from Kenya provided by the Dairy Genetics East Africa Project. Genomic estimated breeding values for milk yield were estimated using 2 prediction methods, GBLUP and BayesC, and accuracies were calculated as the correlation between yield deviations and genomic breeding values included in the estimation process, mimicking the situation for young bulls. The accuracy of evaluation ranged from 0.28 to 0.41, depending on the validation population and prediction method used. No significant differences were found in accuracy between the 2 prediction methods. The results suggest that there is potential for implementing genomic selection for young bulls in crossbred small-holder cattle populations, and targeted genotyping and phenotyping should be pursued to facilitate this. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
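    The accuracy measure described in this abstract, a correlation between yield deviations and genomic estimated breeding values, can be sketched as a plain Pearson correlation. The helper name `pearson` and the numbers below are hypothetical illustrations; the study's GBLUP and BayesC models and its data are not reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Made-up yield deviations and genomic breeding values for a validation set.
yield_deviations = [1.0, 2.0, 3.0, 4.0]
genomic_breeding_values = [2.0, 4.0, 6.0, 8.0]
accuracy = pearson(yield_deviations, genomic_breeding_values)
```

    With real data the two lists would hold one value per validation animal; the perfectly linear toy values here are only a sanity check of the formula.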

  2. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  3. Next-Generation Sequencing Reveals the Impact of Repetitive DNA Across Phylogenetically Closely Related Genomes of Orobanchaceae

    Science.gov (United States)

    Piednoël, Mathieu; Aberer, Andre J.; Schneeweiss, Gerald M.; Macas, Jiri; Novak, Petr; Gundlach, Heidrun; Temsch, Eva M.; Renner, Susanne S.

    2013-01-01

    We used next-generation sequencing to characterize the genomes of nine species of Orobanchaceae of known phylogenetic relationships, different life forms, and including a polyploid species. The study species are the autotrophic, nonparasitic Lindenbergia philippensis, the hemiparasitic Schwalbea americana, and seven nonphotosynthetic parasitic species of Orobanche (Orobanche crenata, Orobanche cumana, Orobanche gracilis (tetraploid), and Orobanche pancicii) and Phelipanche (Phelipanche lavandulacea, Phelipanche purpurea, and Phelipanche ramosa). Ty3/Gypsy elements comprise 1.93%–28.34% of the nine genomes and Ty1/Copia elements comprise 8.09%–22.83%. When compared with L. philippensis and S. americana, the nonphotosynthetic species contain higher proportions of repetitive DNA sequences, perhaps reflecting relaxed selection on genome size in parasitic organisms. Among the parasitic species, those in the genus Orobanche have smaller genomes but higher proportions of repetitive DNA than those in Phelipanche, mostly due to a diversification of repeats and an accumulation of Ty3/Gypsy elements. Genome downsizing in the tetraploid O. gracilis probably led to sequence loss across most repeat types. PMID:22723303

  4. Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

    Science.gov (United States)

    Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

    2016-04-07

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. Copyright © 2016 Vrljicak et al.

  5. Ancestry variation and footprints of natural selection along the genome in Latin American populations.

    Science.gov (United States)

    Deng, Lian; Ruiz-Linares, Andrés; Xu, Shuhua; Wang, Sijia

    2016-02-18

    Latin American populations stem from the admixture of Europeans, Africans and Native Americans, which started over 400 years ago and lasted for several centuries. Extreme deviation over the genome-wide average in ancestry estimations at certain genomic locations could reflect recent natural selection. We evaluated the distribution of ancestry estimations using 678 genome-wide microsatellite markers in 249 individuals from 13 admixed populations across Latin America. We found significant deviations in ancestry estimations including three locations with more than 3.5 standard deviations from the genome-wide average: an excess of European ancestry at 1p36 and 14q32, and an excess of African ancestry at 6p22. Using simulations, we could show that at least the deviation at 6p22 was unlikely to result from genetic drift alone. By applying different linguistic groups as well as the most likely ancestral Native American populations as the ancestry, we showed that the choice of Native American ancestry could affect the local ancestry estimation. However, the signal at 6p22 consistently appeared in most of the analyses using various ancestral groups. This study provided important insights for recent natural selection in the context of the unique history of the New World and implications for disease mapping.
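    The outlier criterion mentioned in this abstract, loci whose ancestry estimate lies more than 3.5 standard deviations from the genome-wide average, can be illustrated with a short sketch. The function name `outlier_loci`, the locus labels, the ancestry values, and the use of the population standard deviation are all assumptions made for illustration, not details confirmed by the study.

```python
from statistics import mean, pstdev


def outlier_loci(ancestry_by_locus, threshold=3.5):
    """Return {locus: z-score} for loci deviating more than `threshold` SDs from the mean."""
    values = list(ancestry_by_locus.values())
    genome_mean = mean(values)
    genome_sd = pstdev(values)  # population SD; an assumption of this sketch
    if genome_sd == 0:
        return {}
    z_scores = {
        locus: (value - genome_mean) / genome_sd
        for locus, value in ancestry_by_locus.items()
    }
    return {locus: z for locus, z in z_scores.items() if abs(z) > threshold}


# Made-up ancestry proportions: 20 typical loci and one with a clear excess.
toy_ancestry = {f"locus_{i}": 0.5 for i in range(20)}
toy_ancestry["locus_20"] = 0.9
flagged = outlier_loci(toy_ancestry)
```

    As the abstract notes, a flagged locus is only a candidate: the authors used simulations to judge whether a deviation could result from genetic drift alone.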

  6. Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

    Science.gov (United States)

    The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the...

  7. Long-term response to genomic selection: effects of estimation method and reference population structure for different genetic architectures.

    Science.gov (United States)

    Bastiaansen, John W M; Coster, Albart; Calus, Mario P L; van Arendonk, Johan A M; Bovenhuis, Henk

    2012-01-24

    Genomic selection has become an important tool in the genetic improvement of animals and plants. The objective of this study was to investigate the impacts of breeding value estimation method, reference population structure, and trait genetic architecture, on long-term response to genomic selection without updating marker effects. Three methods were used to estimate genomic breeding values: a BLUP method with relationships estimated from genome-wide markers (GBLUP), a Bayesian method (BM), and a partial least squares regression method (PLSR). A shallow (individuals from one generation) or deep reference population (individuals from five generations) was used with each method. The effects of the different selection approaches were compared under four different genetic architectures for the trait under selection. Selection was based on one of the three genomic breeding values, on pedigree BLUP breeding values, or performed at random. Selection continued for ten generations. Differences in long-term selection response were small. For a genetic architecture with a very small number of three to four quantitative trait loci (QTL), the Bayesian method achieved a response that was 0.05 to 0.1 genetic standard deviation higher than other methods in generation 10. For genetic architectures with approximately 30 to 300 QTL, PLSR (shallow reference) or GBLUP (deep reference) had an average advantage of 0.2 genetic standard deviation over the Bayesian method in generation 10. GBLUP resulted in 0.6% and 0.9% less inbreeding than PLSR and BM and on average a one third smaller reduction of genetic variance. Responses in early generations were greater with the shallow reference population while long-term response was not affected by reference population structure. The ranking of estimation methods was different with than without selection. Under selection, applying GBLUP led to lower inbreeding and a smaller reduction of genetic variance while a similar response to selection was

  8. The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Fibrobacter succinogenes is an important member of the rumen microbial community that converts plant biomass into nutrients usable by its host. This bacterium, which is also one of only two cultivated species in its phylum, is an efficient and prolific degrader of cellulose. Specifically, it has a particularly high activity against crystalline cellulose that requires close physical contact with this substrate. However, unlike other known cellulolytic microbes, it does not degrade cellulose using a cellulosome or by producing high extracellular titers of cellulase enzymes. To better understand the biology of F. succinogenes, we sequenced the genome of the type strain S85 to completion. A total of 3,085 open reading frames were predicted from its 3.84 Mbp genome. Analysis of sequences predicted to encode carbohydrate-degrading enzymes revealed an unusually high number of genes that were classified into 49 different families of glycoside hydrolases, carbohydrate binding modules (CBMs), carbohydrate esterases, and polysaccharide lyases. Of the 31 identified cellulases, none contain CBMs in families 1, 2, and 3, typically associated with crystalline cellulose degradation. Polysaccharide hydrolysis and utilization assays showed that F. succinogenes was able to hydrolyze a number of polysaccharides, but could only utilize the hydrolytic products of cellulose. This suggests that F. succinogenes uses its array of hemicellulose-degrading enzymes to remove hemicelluloses to gain access to cellulose. This is reflected in its genome, as F. succinogenes lacks many of the genes necessary to transport and metabolize the hydrolytic products of non-cellulose polysaccharides. The F. succinogenes genome reveals a bacterium that specializes in cellulose as its sole energy source, and provides insight into a novel strategy for cellulose degradation.

  9. Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia.

    Science.gov (United States)

    Schrider, Daniel R; Ayroles, Julien; Matute, Daniel R; Kern, Andrew D

    2018-04-01

    Hybridization and gene flow between species appear to be common. Even though it is clear that hybridization is widespread across all surveyed taxonomic groups, the magnitude and consequences of introgression are still largely unknown. Thus it is crucial to develop the statistical machinery required to uncover which genomic regions have recently acquired haplotypes via introgression from a sister population. We developed a novel machine learning framework, called FILET (Finding Introgressed Loci via Extra-Trees), capable of revealing genomic introgression with far greater power than competing methods. FILET works by combining information from a number of population genetic summary statistics, including several new statistics that we introduce, that capture patterns of variation across two populations. We show that FILET is able to identify loci that have experienced gene flow between related species with high accuracy, and in most situations can correctly infer which population was the donor and which was the recipient. Here we describe a data set of outbred diploid Drosophila sechellia genomes, and combine them with data from D. simulans to examine recent introgression between these species using FILET. Although we find that these populations may have split more recently than previously appreciated, FILET confirms that there has indeed been appreciable recent introgression (some of which might have been adaptive) between these species, and reveals that this gene flow is primarily in the direction of D. simulans to D. sechellia.

  10. Complete genomes reveal signatures of demographic and genetic declines in the woolly mammoth

    Science.gov (United States)

    Palkopoulou, Eleftheria; Mallick, Swapan; Skoglund, Pontus; Enk, Jacob; Rohland, Nadin; Li, Heng; Omrak, Ayça; Vartanyan, Sergey; Poinar, Hendrik; Götherström, Anders; Reich, David; Dalén, Love

    2015-01-01

    Summary The processes leading up to species extinctions are typically characterized by prolonged declines in population size and geographic distribution, followed by a phase in which populations are very small and may be subject to intrinsic threats, including loss of genetic diversity and inbreeding [1]. However, whether such genetic factors have had an impact on species prior to their extinction is unclear [2, 3]; examining this would require a detailed reconstruction of a species’ demographic history as well as changes in genome-wide diversity leading up to its extinction. Here, we present high-quality complete genome sequences from two woolly mammoths (Mammuthus primigenius). The first mammoth was sequenced at 17.1-fold coverage, and dates to ~4,300 years before present, constituting one of the last surviving individuals on Wrangel Island. The second mammoth, sequenced at 11.2-fold coverage, was obtained from a ~44,800 year old specimen from the Late Pleistocene population in northeastern Siberia. The demographic trajectories inferred from the two genomes are qualitatively similar and reveal a population bottleneck during the Middle or Early Pleistocene, and a more recent severe decline in the ancestors of the Wrangel mammoth at the end of the last glaciation. A comparison of the two genomes shows that the Wrangel mammoth has a 20% reduction in heterozygosity as well as a 28-fold increase in the fraction of the genome that is comprised of runs of homozygosity. We conclude that the population on Wrangel Island, which was the last surviving woolly mammoth population, was subject to reduced genetic diversity shortly before it became extinct. PMID:25913407

  11. Evolution and phylogeny of the mud shrimps (Crustacea: Decapoda) revealed from complete mitochondrial genomes.

    Science.gov (United States)

    Lin, Feng-Jiau; Liu, Yuan; Sha, Zhongli; Tsang, Ling Ming; Chu, Ka Hou; Chan, Tin-Yam; Liu, Ruiyu; Cui, Zhaoxia

    2012-11-16

    The evolutionary history and relationships of the mud shrimps (Crustacea: Decapoda: Gebiidea and Axiidea) are contentious, with previous attempts revealing mixed results. The mud shrimps were once classified in the infraorder Thalassinidea. Recent molecular phylogenetic analyses, however, suggest separation of the group into two individual infraorders, Gebiidea and Axiidea. Mitochondrial (mt) genome sequence and structure can be especially powerful in resolving higher systematic relationships that may offer new insights into the phylogeny of the mud shrimps and the other decapod infraorders, and test the hypothesis of dividing the mud shrimps into two infraorders. We present the complete mitochondrial genome sequences of five mud shrimps, Austinogebia edulis, Upogebia major, Thalassina kelanang (Gebiidea), Nihonotrypaea thermophilus and Neaxius glyptocercus (Axiidea). All five genomes encode a standard set of 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and a putative control region. Except for T. kelanang, mud shrimp mitochondrial genomes exhibited rearrangements and novel patterns compared to the pancrustacean ground pattern. Each of the two Gebiidea species (A. edulis and U. major) and two Axiidea species (N. glyptocercus and N. thermophilus) share a unique gene order specific to their infraorders and analyses further suggest these two derived gene orders have evolved independently. Phylogenetic analyses based on the concatenated nucleotide and amino acid sequences of 13 protein-coding genes indicate the possible polyphyly of mud shrimps, supporting the division of the group into two infraorders. However, the infraordinal relationships among the Gebiidea and Axiidea, and other reptants are poorly resolved. The inclusion of mt genomes from more taxa, in particular the reptant infraorders Polychelida and Glypheidea, is required in further analysis. Phylogenetic analyses on the mt genome sequences and the distinct gene orders provide further

  12. Comparative genomic hybridizations reveal absence of large Streptomyces coelicolor genomic islands in Streptomyces lividans

    OpenAIRE

    Jayapal, Karthik P; Lian, Wei; Glod, Frank; Sherman, David H; Hu, Wei-Shou

    2007-01-01

    Abstract Background The genomes of Streptomyces coelicolor and Streptomyces lividans bear a considerable degree of synteny. While S. coelicolor is the model streptomycete for studying antibiotic synthesis and differentiation, S. lividans is almost exclusively considered as the preferred host, among actinomycetes, for cloning and expression of exogenous DNA. We used whole genome microarrays as a comparative genomics tool for identifying the subtle differences between these two chromosomes. Res...

  13. Correction: Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    Science.gov (United States)

    2014-01-01

    Abstract The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from the JGI website. In this correction, we removed these genomes after discussion with editors and data producers, whom we should have contacted before downloading these genomes. Removing these data did not alter the principal results and conclusions of our original work. The relevant Figures 1, 2, 3, 4 and 6, and Table 1, have been revised. Additional files 1, 3, 4, and 5 were also revised. We would like to apologize for any confusion or inconvenience this may have caused. Background Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 94 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed

  14. Genomic and transcriptomic analyses reveal differential regulation of diverse terpenoid and polyketides secondary metabolites in Hericium erinaceus.

    Science.gov (United States)

    Chen, Juan; Zeng, Xu; Yang, Yan Long; Xing, Yong Mei; Zhang, Qi; Li, Jia Mei; Ma, Ke; Liu, Hong Wei; Guo, Shun Xing

    2017-08-31

    The lion's mane mushroom Hericium erinaceus is a famous traditional medicinal fungus credited with anti-dementia activity and a producer of cyathane diterpenoid natural products (erinacines) useful against nervous system diseases. To date, few studies have explored the biosynthesis of these compounds, although their chemical synthesis is known. Here, we report the first genome and transcriptome sequence of the medicinal fungus H. erinaceus. The size of the genome is 39.35 Mb, containing 9895 gene models. The genome of H. erinaceus reveals diverse enzymes and a large family of cytochrome P450 (CYP) proteins involved in the biosynthesis of terpenoid backbones, diterpenoids, sesquiterpenes and polyketides. Three gene clusters related to terpene biosynthesis and one gene cluster for polyketide biosynthesis (PKS) were predicted. Genes involved in terpenoid biosynthesis were generally upregulated in mycelia, while the PKS gene was upregulated in the fruiting body. Comparative genome analysis of 42 fungal species of Basidiomycota revealed that most edible and medicinal mushrooms show many more gene clusters involved in terpenoid and polyketide biosynthesis than the pathogenic fungi. No gene clusters for terpenoid or polyketide biosynthesis were predicted in the poisonous mushroom Amanita muscaria. Our findings may facilitate future discovery and biosynthesis of bioactive secondary metabolites from H. erinaceus and provide fundamental information for exploring the secondary metabolites in other Basidiomycetes.

  15. Twenty years of artificial directional selection have shaped the genome of the Italian Large White pig breed.

    Science.gov (United States)

    Schiavo, G; Galimberti, G; Calò, D G; Samorè, A B; Bertolini, F; Russo, V; Gallo, M; Buttazzoni, L; Fontanesi, L

    2016-04-01

    In this study, we investigated at the genome-wide level if 20 years of artificial directional selection based on boar genetic evaluation obtained with a classical BLUP animal model shaped the genome of the Italian Large White pig breed. The most influential boars of this breed (n = 192), born from 1992 (the beginning of the selection program of this breed) to 2012, with an estimated breeding value reliability of >0.85, were genotyped with the Illumina Porcine SNP60 BeadChip. After grouping the boars in eight classes according to their year of birth, filtered single nucleotide polymorphisms (SNPs) were used to evaluate the effects of time on genotype frequency changes using multinomial logistic regression models. Of these markers, 493 had a PBonferroni  selection program. The obtained results indicated that the genome of the Italian Large White pigs was shaped by a directional selection program derived by the application of methodologies assuming the infinitesimal model that captured a continuous trend of allele frequency changes in the boar population. © 2015 Stichting International Foundation for Animal Genetics.

  16. Probing of RNA structures in a positive sense RNA virus reveals selection pressures for structural elements

    Science.gov (United States)

    Watters, Kyle E; Choudhary, Krishna; Aviran, Sharon; Perry, Keith L

    2018-01-01

    Abstract In single stranded (+)-sense RNA viruses, RNA structural elements (SEs) play essential roles in the infection process from replication to encapsidation. Using selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) and covariation analysis, we explore the structural features of the third genome segment of cucumber mosaic virus (CMV), RNA3 (2216 nt), both in vitro and in plant cell lysates. Comparing SHAPE-Seq and covariation analysis results revealed multiple SEs in the coat protein open reading frame and 3′ untranslated region. Four of these SEs were mutated and serially passaged in Nicotiana tabacum plants to identify biologically selected changes to the original mutated sequences. After passaging, loop mutants showed partial reversion to their wild-type sequence and SEs that were structurally disrupted by mutations were restored to wild-type-like structures via synonymous mutations in planta. These results support the existence and selection of virus open reading frame SEs in the host organism and provide a framework for further studies on the role of RNA structure in viral infection. Additionally, this work demonstrates the applicability of high-throughput chemical probing in plant cell lysates and presents a new method for calculating SHAPE reactivities from overlapping reverse transcriptase priming sites. PMID:29294088

  17. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Directory of Open Access Journals (Sweden)

    Yao Xu

    Full Text Available Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study we sequenced, for the first time, the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold depth, covering on average 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the leptin receptor gene (LEPR), overlapping with CNV_1815, showed remarkably higher copy number in Qinchuan than Nanyang (log2 ratio = -2.34988; P value = 1.53E-102). Further qPCR and association analysis showed that the copy number of the LEPR gene was positively correlated with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs
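
    The log2 ratio reported for the LEPR copy number variant can be turned back into a fold difference. The following is a minimal sketch, assuming the ratio is expressed as Nanyang relative to Qinchuan (the abstract does not state the direction explicitly, although a negative value with higher copy number in Qinchuan is consistent with it). The function name is hypothetical.

```python
def fold_change_from_log2(log2_ratio: float) -> float:
    """Convert a log2 ratio into a plain ratio (2 raised to the log2 value)."""
    return 2.0 ** log2_ratio


# Value quoted in the abstract for LEPR / CNV_1815.
# Assumption: ratio = Nanyang copy number / Qinchuan copy number.
lepr_log2_ratio = -2.34988

nanyang_over_qinchuan = fold_change_from_log2(lepr_log2_ratio)
qinchuan_over_nanyang = 1.0 / nanyang_over_qinchuan

print(f"Nanyang/Qinchuan ratio: {nanyang_over_qinchuan:.3f}")
print(f"Qinchuan/Nanyang ratio: {qinchuan_over_nanyang:.2f}")
```

    Under that assumption, the reported value corresponds to roughly a fivefold higher LEPR copy-number signal in Qinchuan than in Nanyang.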

  18. The Phaeodactylum genome reveals the evolutionary history of diatom genomes

    Czech Academy of Sciences Publication Activity Database

    Bowler, Ch.; Allen, A. E.; Badger, J. H.; Grimwood, J.; Jabbari, K.; Kuo, A.; Maheswari, U.; Martens, C.; Maumus, F.; Otillar, R. P.; Rayko, E.; Salamov, A.; Vandepoele, K.; Beszteri, B.; Gruber, A.; Heijde, M.; Katinka, M.; Mock, T.; Valentin, K.; Verret, F.; Berges, J. A.; Brownlee, C.; Cadoret, J.-P.; Chiovitti, A.; Choi, Ch. J.; Coesel, S.; De Martino, A.; Detter, J. Ch.; Durkin, C.; Falciatore, A.; Fournet, J.; Haruta, M.; Huysman, M. J. J.; Jenkins, B. D.; Jiroutová, Kateřina; Jorgensen, R. E.; Joubert, Y.; Kaplan, A.; Kröger, N.; Kroth, P. G.; La Roche, J.; Lindquist, E.; Lommer, M.; Martin–Jézéquel, V.; Lopez, P. J.; Lucas, S.; Mangogna, M.; McGinnis, K.; Medlin, L. K.; Montsant, A.; Oudot–Le Secq, M.-P.; Napoli, C.; Oborník, Miroslav; Schnitzler Parker, M.; Petit, J.-L.; Porcel, B. M.; Poulsen, N.; Robison, M.; Rychlewski, L.; Rynearson, T. A.; Schmutz, J.; Shapiro, H.; Siaut, M.; Stanley, M.; Sussman, M. R.; Taylor, A. R.; Vardi, A.; von Dassow, P.; Vyverman, W.; Willis, A.; Wyrwicz, L. S.; Rokhsar, D. S.; Weissenbach, J.; Armbrust, E. V.; Green, B. R.; Van de Peer, Y.; Grigoriev, I. V.

    2008-01-01

    Vol. 456, 13-11-2008 (2008), pp. 239-244 ISSN 0028-0836 Institutional research plan: CEZ:AV0Z60220518 Keywords: Phaeodactylum * genome * evolution * diatom Subject RIV: EB - Genetics; Molecular Biology Impact factor: 31.434, year: 2008

  19. The impact of selection, gene flow and demographic history on heterogeneous genomic divergence: three-spine sticklebacks in divergent environments.

    Science.gov (United States)

    Ferchaud, Anne-Laure; Hansen, Michael M

    2016-01-01

    Heterogeneous genomic divergence between populations may reflect selection, but should also be seen in conjunction with gene flow and drift, particularly population bottlenecks. Marine and freshwater three-spine stickleback (Gasterosteus aculeatus) populations often exhibit different lateral armour plate morphs. Moreover, strikingly parallel genomic footprints across different marine-freshwater population pairs are interpreted as parallel evolution and gene reuse. Nevertheless, in some geographic regions like the North Sea and Baltic Sea, different patterns are observed. Freshwater populations in coastal regions are often dominated by marine morphs, suggesting that gene flow overwhelms selection, and genomic parallelism may also be less pronounced. We used RAD sequencing for analysing 28 888 SNPs in two marine and seven freshwater populations in Denmark, Europe. Freshwater populations represented a variety of environments: river populations accessible to gene flow from marine sticklebacks and large and small isolated lakes with and without fish predators. Sticklebacks in an accessible river environment showed minimal morphological and genomewide divergence from marine populations, supporting the hypothesis of gene flow overriding selection. Allele frequency spectra suggested bottlenecks in all freshwater populations, and particularly two small lake populations. However, genomic footprints ascribed to selection could nevertheless be identified. No genomic regions were consistent freshwater-marine outliers, and parallelism was much lower than in other comparable studies. Two genomic regions previously described to be under divergent selection in freshwater and marine populations were outliers between different freshwater populations. We ascribe these patterns to stronger environmental heterogeneity among freshwater populations in our study as compared to most other studies, although the demographic history involving bottlenecks should also be considered in the

  20. Genome-wide analysis suggests high level of microsynteny and purifying selection affect the evolution of EIN3/EIL family in Rosaceae.

    Science.gov (United States)

    Cao, Yunpeng; Han, Yahui; Meng, Dandan; Li, Dahui; Jin, Qing; Lin, Yi; Cai, Yongping

    2017-01-01

    The ethylene-insensitive3/ethylene-insensitive3-like (EIN3/EIL) proteins are a type of nuclear-localized protein with DNA-binding activity in plants. Although the EIN3/EIL gene family has been studied in several plant species, no comprehensive study of the EIN3/EIL gene family in Rosaceae has been reported. In this study, ten, five, four, and five EIN3/EIL genes were identified in the genomes of pear (Pyrus bretschneideri), mei (Prunus mume), peach (Prunus persica) and strawberry (Fragaria vesca), respectively. Twenty-eight chromosomal segments of the EIL/EIN3 gene family were found in the four Rosaceae species, and these segments could form seven orthologous or paralogous groups based on interspecies or intraspecies gene colinearity (microsynteny) analysis. Moreover, highly conserved regions of microsynteny were found in the four Rosaceae species. Subsequently it was found that both whole genome duplication and tandem duplication events significantly contributed to the EIL/EIN3 gene family expansion. Gene expression analysis of the EIL/EIN3 genes in the pear revealed subfunctionalization for several PbEIL genes derived from whole genome duplication. It is noteworthy that, according to environmental selection pressure analysis, strong purifying selection appears to dominate the maintenance of the EIL/EIN3 gene family in the four Rosaceae species. These results provide useful information on Rosaceae EIL/EIN3 genes, as well as insights into the evolution of this gene family in the four Rosaceae species. Furthermore, the high level of microsynteny in the four Rosaceae plants suggested that a large-scale genome duplication event in the EIL/EIN3 gene family predated speciation.
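
    Selection pressure on duplicated genes is commonly summarized with the ratio of nonsynonymous to synonymous substitution rates (Ka/Ks), where values well below 1 are read as purifying selection. The abstract does not say which statistic was used, so the following is only a sketch under that assumption; the function name, the tolerance band, and the example values are hypothetical and are not taken from the study.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.05):
    """Return (Ka/Ks, label) using the conventional reading of the ratio.

    ratio < 1 - tolerance  -> "purifying"
    ratio > 1 + tolerance  -> "positive"
    otherwise              -> "neutral"
    The tolerance band is an illustrative choice, not a published threshold.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1.0 - tolerance:
        label = "purifying"
    elif ratio > 1.0 + tolerance:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical gene pairs (names and values invented for illustration only).
example_pairs = {
    "pair_A": (0.05, 0.50),
    "pair_B": (0.60, 0.30),
    "pair_C": (0.20, 0.20),
}

for name, (ka, ks) in example_pairs.items():
    ratio, label = classify_selection(ka, ks)
    print(f"{name}: Ka/Ks = {ratio:.2f} ({label})")
```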

  1. Genome-wide analysis suggests high level of microsynteny and purifying selection affect the evolution of EIN3/EIL family in Rosaceae

    Directory of Open Access Journals (Sweden)

    Yunpeng Cao

    2017-05-01

    Full Text Available The ethylene-insensitive3/ethylene-insensitive3-like (EIN3/EIL) proteins are a type of nuclear-localized protein with DNA-binding activity in plants. Although the EIN3/EIL gene family has been studied in several plant species, no comprehensive study of the EIN3/EIL gene family in Rosaceae has been reported. In this study, ten, five, four, and five EIN3/EIL genes were identified in the genomes of pear (Pyrus bretschneideri), mei (Prunus mume), peach (Prunus persica) and strawberry (Fragaria vesca), respectively. Twenty-eight chromosomal segments of the EIL/EIN3 gene family were found in the four Rosaceae species, and these segments could form seven orthologous or paralogous groups based on interspecies or intraspecies gene colinearity (microsynteny) analysis. Moreover, highly conserved regions of microsynteny were found in the four Rosaceae species. Subsequently it was found that both whole genome duplication and tandem duplication events significantly contributed to the EIL/EIN3 gene family expansion. Gene expression analysis of the EIL/EIN3 genes in the pear revealed subfunctionalization for several PbEIL genes derived from whole genome duplication. It is noteworthy that, according to environmental selection pressure analysis, strong purifying selection appears to dominate the maintenance of the EIL/EIN3 gene family in the four Rosaceae species. These results provide useful information on Rosaceae EIL/EIN3 genes, as well as insights into the evolution of this gene family in the four Rosaceae species. Furthermore, the high level of microsynteny in the four Rosaceae plants suggested that a large-scale genome duplication event in the EIL/EIN3 gene family predated speciation.

  2. Comprehensive genomic characterization of Campylobacter genus reveals some underlying mechanisms for its genomic diversification.

    Directory of Open Access Journals (Sweden)

    Yizhuang Zhou

    Full Text Available Campylobacter species are phenotypically diverse in many aspects including host habitats and pathogenicities, which demands comprehensive characterization of the entire Campylobacter genus to study their underlying genetic diversification. Up to now, 34 Campylobacter strains have been sequenced and published in public databases, providing a good opportunity to systematically analyze their genomic diversity. In this study, we first conducted genomic characterization, which includes genome-wide alignments, pan-genome analysis, and phylogenetic identification, to depict the genetic diversity of the Campylobacter genus. Afterward, we improved the tetranucleotide usage pattern-based naïve Bayesian classifier to identify the abnormal composition fragments (ACFs; fragments whose tetranucleotide frequency profiles differ significantly from that of the genome as a whole), including horizontal gene transfers (HGTs), to explore the mechanisms for the genetic diversity of this organism. Finally, we analyzed the HGTs transferred via bacteriophage transductions. To our knowledge, this study is the first to use single nucleotide polymorphism information to construct a reliable microevolution phylogeny of 21 Campylobacter jejuni strains. Combined with the phylogeny of all the collected Campylobacter species based on genome-wide core gene information, comprehensive phylogenetic inference of all 34 Campylobacter organisms was determined. It was found that C. jejuni harbors a high fraction of ACFs possibly through intraspecies recombination, whereas other Campylobacter members possess numerous ACFs possibly via intragenus recombination. Furthermore, some Campylobacter strains have undergone significant ancient viral integration during their evolution process. The improved method is a powerful tool for bacterial genomic analysis. Moreover, the findings would provide useful information for future research on the Campylobacter genus.
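
    The idea of comparing a fragment's tetranucleotide frequency profile with that of its genome can be illustrated with a few lines of code. This is a minimal sketch only: it uses a plain L1 distance between profiles as a stand-in, not the improved naïve Bayesian classifier described in the abstract, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides over the DNA alphabet.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_profile(seq: str) -> dict:
    """Relative frequency of each tetranucleotide in seq (overlapping windows).

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return {k: 0.0 for k in KMERS}
    return {k: counts[k] / total for k in KMERS}


def profile_distance(p: dict, q: dict) -> float:
    """L1 distance between two profiles (0 = identical, 2 = disjoint)."""
    return sum(abs(p[k] - q[k]) for k in KMERS)


# Toy sequences, invented for illustration.
genome = "ACGT" * 50
typical_fragment = "ACGT" * 10
atypical_fragment = "A" * 40

genome_profile = tetra_profile(genome)
print(profile_distance(tetra_profile(typical_fragment), genome_profile))
print(profile_distance(tetra_profile(atypical_fragment), genome_profile))
```

    A fragment with a large distance from the genome-wide profile would be a candidate for closer inspection; deciding what counts as "significantly different" needs a statistical model, which this sketch does not attempt.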

  3. Comparison of closely related, uncultivated Coxiella tick endosymbiont population genomes reveals clues about the mechanisms of symbiosis.

    Science.gov (United States)

    Tsementzi, Despina; Castro Gordillo, Juan; Mahagna, Mustafa; Gottlieb, Yuval; Konstantinidis, Konstantinos T

    2018-05-01

    Understanding the symbiotic interaction between Coxiella-like endosymbionts (CLE) and their tick hosts is challenging due to lack of isolates and difficulties in tick functional assays. Here we sequenced the metagenome of a CLE population from wild Rhipicephalus sanguineus ticks (CRs) and compared it to the previously published genome of its close relative, CLE of R. turanicus (CRt). The tick hosts are closely related sympatric species, and their two endosymbiont genomes are highly similar with only minor differences in gene content. Both genomes encode numerous pseudogenes, consistent with an ongoing genome reduction process. In silico flux balance metabolic analysis (FBA) revealed the excess production of L-proline for both genomes, indicating a possible proline transport from Coxiella to the tick. Additionally, both CR genomes encode multiple copies of the proline/betaine transporter gene, proP. Modelling additional Coxiellaceae members, including other tick CLE, did not identify proline as an excreted metabolite. Although both CRs and CRt genomes encode intact B vitamin synthesis pathway genes, which are presumed to underlie the mechanism of CLE-tick symbiosis, the FBA analysis indicated no changes for their products. Therefore, this study provides new testable hypotheses for the symbiosis mechanism and a better understanding of CLE genome evolution and diversity. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.

  4. Systematic differences in the response of genetic variation to pedigree and genome-based selection methods

    NARCIS (Netherlands)

    Heidaritabar, M.; Vereijken, A.; Muir, W.M.; Meuwissen, T.H.E.; Cheng, H.; Megens, H.J.W.C.; Groenen, M.; Bastiaansen, J.W.M.

    2014-01-01

    Genomic selection (GS) is a DNA-based method of selecting for quantitative traits in animal and plant breeding, and offers a potentially superior alternative to traditional breeding methods that rely on pedigree and phenotype information. Using a 60 K SNP chip with markers spaced throughout the

  5. Novel phage group infecting Lactobacillus delbrueckii subsp. lactis, as revealed by genomic and proteomic analysis of bacteriophage Ldl1.

    Science.gov (United States)

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio; van Sinderen, Douwe

    2015-02-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 +/- 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species.

  6. In-depth comparative analysis of malaria parasite genomes reveals protein-coding genes linked to human disease in Plasmodium falciparum genome.

    Science.gov (United States)

    Liu, Xuewu; Wang, Yuanyuan; Liang, Jiao; Wang, Luojun; Qin, Na; Zhao, Ya; Zhao, Gang

    2018-05-02

    Plasmodium falciparum is the most virulent malaria parasite capable of parasitizing human erythrocytes. The identification of genes related to this capability can enhance our understanding of the molecular mechanisms underlying human malaria and lead to the development of new therapeutic strategies for malaria control. With the availability of several malaria parasite genome sequences, performing computational analysis is now a practical strategy to identify genes contributing to this disease. Here, we developed and used a virtual genome method to assign 33,314 genes from three human malaria parasites, namely, P. falciparum, P. knowlesi and P. vivax, and three rodent malaria parasites, namely, P. berghei, P. chabaudi and P. yoelii, to 4605 clusters. Each cluster consisted of genes whose protein sequences were significantly similar and was considered as a virtual gene. Comparing the enriched values of all clusters in human malaria parasites with those in rodent malaria parasites revealed 115 P. falciparum genes putatively responsible for parasitizing human erythrocytes. These genes are mainly located in the chromosome internal regions and participate in many biological processes, including membrane protein trafficking and thiamine biosynthesis. Meanwhile, 289 P. berghei genes were included in the rodent parasite-enriched clusters. Most are located in subtelomeric regions and encode erythrocyte surface proteins. Comparing cluster values in P. falciparum with those in P. vivax and P. knowlesi revealed 493 candidate genes linked to virulence. Some of them encode proteins present on the erythrocyte surface and participate in cytoadhesion, virulence factor trafficking, or erythrocyte invasion, but many genes with unknown function were also identified. Cerebral malaria is characterized by accumulation of infected erythrocytes at the trophozoite stage in the brain microvasculature. To discover cerebral malaria-related genes, fast Fourier transformation (FFT) was introduced to extract

  7. The first whole genome and transcriptome of the cinereous vulture reveals adaptation in the gastric and immune defense systems and possible convergent evolution between the Old and New World vultures.

    Science.gov (United States)

    Chung, Oksung; Jin, Seondeok; Cho, Yun Sung; Lim, Jeongheui; Kim, Hyunho; Jho, Sungwoong; Kim, Hak-Min; Jun, JeHoon; Lee, HyeJin; Chon, Alvin; Ko, Junsu; Edwards, Jeremy; Weber, Jessica A; Han, Kyudong; O'Brien, Stephen J; Manica, Andrea; Bhak, Jong; Paek, Woon Kee

    2015-10-21

    The cinereous vulture, Aegypius monachus, is the largest bird of prey and plays a key role in the ecosystem by removing carcasses, thus preventing the spread of diseases. Its feeding habits force it to cope with constant exposure to pathogens, making this species an interesting target for discovering functionally selected genetic variants. Furthermore, the presence of two independently evolved vulture groups, Old World and New World vultures, provides a natural experiment in which to investigate convergent evolution due to obligate scavenging. We sequenced the genome of a cinereous vulture, and mapped it to the bald eagle reference genome, a close relative with a divergence time of 18 million years. By comparing the cinereous vulture to other avian genomes, we find positively selected genetic variations in this species associated with respiration, likely linked to their ability of immune defense responses and gastric acid secretion, consistent with their ability to digest carcasses. Comparisons between the Old World and New World vulture groups suggest convergent gene evolution. We assemble the cinereous vulture blood transcriptome from a second individual, and annotate genes. Finally, we infer the demographic history of the cinereous vulture which shows marked fluctuations in effective population size during the late Pleistocene. We present the first genome and transcriptome analyses of the cinereous vulture compared to other avian genomes and transcriptomes, revealing genetic signatures of dietary and environmental adaptations accompanied by possible convergent evolution between the Old World and New World vultures.

  8. Within-Host Variations of Human Papillomavirus Reveal APOBEC-Signature Mutagenesis in the Viral Genome.

    Science.gov (United States)

    Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2018-03-28

    Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations in the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here we explored within-host genetic diversity of HPV by performing deep sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52 and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC), and were deep-sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with > 0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with varying numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. Here, by conducting deep sequencing analyses, we show for the first time a comprehensive snapshot of the "within
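
    The trinucleotide-context analysis mentioned in the abstract can be sketched in a few lines: for each C-to-T substitution, look up the base before and after the substituted C in the reference and tally how often the context is TpCpN. This is an illustrative sketch, not the authors' pipeline. The function names, the variant tuple layout, and the toy reference are hypothetical, and only substitutions annotated on the given strand are counted (G-to-A changes, which correspond to C-to-T on the opposite strand, are ignored for simplicity).

```python
from collections import Counter


def context_counts(reference: str, variants) -> Counter:
    """Tally trinucleotide contexts of C-to-T substitutions.

    variants: iterable of (zero-based position, ref_base, alt_base).
    Substitutions at the sequence ends, or whose ref_base disagrees with
    the reference sequence, are skipped.
    """
    reference = reference.upper()
    counts = Counter()
    for pos, ref, alt in variants:
        if ref != "C" or alt != "T":
            continue
        if pos < 1 or pos + 1 >= len(reference):
            continue
        if reference[pos] != "C":
            continue
        counts[reference[pos - 1:pos + 2]] += 1
    return counts


def tcn_fraction(counts: Counter) -> float:
    """Fraction of tallied C-to-T substitutions that fall in a TpCpN context."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for ctx, n in counts.items() if ctx.startswith("TC")) / total


# Toy reference and variant calls, invented for illustration.
toy_reference = "ATCAGCCTCGA"
toy_variants = [
    (2, "C", "T"),  # context TCA
    (5, "C", "T"),  # context GCC
    (8, "C", "T"),  # context TCG
    (6, "C", "A"),  # not a C-to-T change, ignored
]

toy_counts = context_counts(toy_reference, toy_variants)
print(dict(toy_counts))
print(tcn_fraction(toy_counts))
```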

  9. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    NARCIS (Netherlands)

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    GRAbB (Genomic Region Assembly by Baiting) is a new program dedicated to assembling specific genomic regions from NGS data. This approach is especially useful when dealing with multi-copy regions, such as the mitochondrial genome and the rDNA repeat region, parts of the genome that are often

  10. Statistical power of model selection strategies for genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Zheyang Wu

    2009-07-01

    Full Text Available Genome-wide association studies (GWAS) aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it is crucial to devise an effective strategy to identify truly associated variants that have individual and/or interactive effects, while controlling false positives at the desired level. Although a number of model selection methods have been proposed in the literature, including marginal search, exhaustive search, and forward search, their relative performance has only been evaluated through limited simulations due to the lack of an analytical approach to calculating the power of these methods. This article develops a novel statistical approach for power calculation, derives accurate formulas for the power of different model selection strategies, and then uses the formulas to evaluate and compare these strategies in genetic model spaces. In contrast to previous studies, our theoretical framework allows for random genotypes, correlations among test statistics, and a false-positive control based on GWAS practice. After the accuracy of our analytical results is validated through simulations, they are utilized to systematically evaluate and compare the performance of these strategies in a wide class of genetic models. For a specific genetic model, our results clearly reveal how different factors, such as effect size, allele frequency, and interaction, jointly affect the statistical power of each strategy. An example is provided for the application of our approach to empirical research. The statistical approach used in our derivations is general and can be employed to address the model selection problems in other random predictor settings. We have developed an R package markerSearchPower to implement our formulas, which can be downloaded from the

  11. Rapid Cycling Genomic Selection in a Multiparental Tropical Maize Population.

    Science.gov (United States)

    Zhang, Xuecai; Pérez-Rodríguez, Paulino; Burgueño, Juan; Olsen, Michael; Buckler, Edward; Atlin, Gary; Prasanna, Boddupalli M; Vargas, Mateo; San Vicente, Félix; Crossa, José

    2017-07-05

    Genomic selection (GS) increases genetic gain by reducing the length of the selection cycle, as has been exemplified in maize using rapid cycling recombination of biparental populations. However, no results of GS applied to maize multi-parental populations have been reported so far. This study is the first to show realized genetic gains of rapid cycling genomic selection (RCGS) for four recombination cycles in a multi-parental tropical maize population. Eighteen elite tropical maize lines were intercrossed twice, and self-pollinated once, to form the cycle 0 (C0) training population. A total of 1000 ear-to-row C0 families was genotyped with 955,690 genotyping-by-sequencing SNP markers; their testcrosses were phenotyped at four optimal locations in Mexico to form the training population. Individuals from families with the best plant types, maturity, and grain yield were selected and intermated to form RCGS cycle 1 (C1). Predictions of the genotyped individuals forming cycle C1 were made, and the best predicted grain yielders were selected as parents of C2; this was repeated for more cycles (C2, C3, and C4), thereby achieving two cycles per year. Multi-environment trials of individuals from populations C0, C1, C2, C3, and C4, together with four benchmark checks, were evaluated at two locations in Mexico. Results indicated that realized grain yield from C1 to C4 reached 0.225 ton ha⁻¹ per cycle, which is equivalent to 0.100 ton ha⁻¹ yr⁻¹ over a 4.5-yr breeding period from the initial cross to the last cycle. Compared with the original 18 parents used to form cycle 0 (C0), genetic diversity narrowed only slightly during the last GS cycles (C3 and C4). Results indicate that, in tropical maize multi-parental breeding populations, RCGS can be an effective breeding strategy for simultaneously conserving genetic diversity and achieving high genetic gains in a short period of time. Copyright © 2017 Zhang et al.

  12. A genome-wide association study reveals a novel candidate gene for sperm motility in pigs

    NARCIS (Netherlands)

    Diniz, D.B.; Lopes, M.S.; Broekhuijse, M.L.W.J.; Lopes, P.S.; Harlizius, B.; Guimaraes, S.E.F.; Duijvesteijn, N.; Knol, E.F.; Silva, F.F.

    2014-01-01

    Sperm motility is one of the most widely used parameters in order to evaluate boar semen quality. However, this trait can only be measured after puberty. Thus, the use of genomic information appears as an appealing alternative to evaluate and improve selection for boar fertility traits earlier in

  13. Polyploid genome of Camelina sativa revealed by isolation of fatty acid synthesis genes

    Directory of Open Access Journals (Sweden)

    Shewmaker Christine K

    2010-10-01

    Background: Camelina sativa, an oilseed crop in the Brassicaceae family, has inspired renewed interest due to its potential for biofuels applications. Little is understood of the nature of the C. sativa genome, however. A study was undertaken to characterize two genes in the fatty acid biosynthesis pathway, fatty acid desaturase (FAD) 2 and fatty acid elongase (FAE) 1, which revealed unexpected complexity in the C. sativa genome. Results: In C. sativa, Southern analysis indicates the presence of three copies of both FAD2 and FAE1 as well as LFY, a known single copy gene in other species. All three copies of both CsFAD2 and CsFAE1 are expressed in developing seeds, and sequence alignments show that previously described conserved sites are present, suggesting that all three copies of both genes could be functional. The regions downstream of CsFAD2 and upstream of CsFAE1 demonstrate co-linearity with the Arabidopsis genome. In addition, three expressed haplotypes were observed for six predicted single-copy genes in 454 sequencing analysis and results from flow cytometry indicate that the DNA content of C. sativa is approximately three-fold that of diploid Camelina relatives. Phylogenetic analyses further support a history of duplication and indicate that C. sativa and C. microcarpa might share a parental genome. Conclusions: There is compelling evidence for triplication of the C. sativa genome, including a larger chromosome number and three-fold larger measured genome size than other Camelina relatives, three isolated copies of FAD2, FAE1, and the KCS17-FAE1 intergenic region, and three expressed haplotypes observed for six predicted single-copy genes. Based on these results, we propose that C. sativa be considered an allohexaploid. The characterization of fatty acid synthesis pathway genes will allow for the future manipulation of oil composition of this emerging biofuel crop; however, targeted manipulations of oil composition and general

  14. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Background: Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results: We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skews correlation increased for some mammal genomes, but were still clearly lower than in chickens. Conclusion: Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
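
    The record above mentions quantifying AT/CG skew correlation with non-overlapping windows. The following is a minimal sketch of one such calculation, not the BSDT method itself. It assumes the common definitions AT skew = (A - T)/(A + T) and CG skew = (C - G)/(C + G), which the abstract does not spell out; the function names and the toy sequence are hypothetical.

```python
def windowed_skews(seq, window):
    """Return (AT skews, CG skews) over non-overlapping windows of a DNA string."""
    at_skews, cg_skews = [], []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        a, t = chunk.count("A"), chunk.count("T")
        c, g = chunk.count("C"), chunk.count("G")
        if a + t == 0 or c + g == 0:
            continue  # skew is undefined for this window, so skip it
        at_skews.append((a - t) / (a + t))
        cg_skews.append((c - g) / (c + g))
    return at_skews, cg_skews


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Toy sequence of three 8-base windows (assumed data, for illustration only).
toy = "AAATCCCG" + "ATTTCGGG" + "AATTCCGG"
at, cg = windowed_skews(toy, window=8)
r = pearson(at, cg)
print(at, cg, r)
```

    On real chromosomes the window size would be varied to probe different scales, which is the aspect the BSDT picture is described as showing in a single image.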

  15. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants. Copyright © 2015 Jun et al.

  16. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    Science.gov (United States)

    2012-01-01

    Background: Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results: We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. On this dataset, we also show that greedy RLS has a better classification performance on independent
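
    To illustrate what wrapper-based greedy forward selection with regularized least squares (RLS) looks like, here is a deliberately naive sketch. It is not the authors' greedy RLS: it refits the model for every leave-one-out split instead of using the matrix short-cuts the abstract credits for the linear running time. The function names, the regularization value, the intercept-free model, and the toy genotype data are all assumptions made for illustration.

```python
from itertools import product


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ridge_fit(X, y, lam):
    """Weights minimizing ||Xw - y||^2 + lam * ||w||^2 (no intercept term)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(A, b)


def loo_error(X, y, lam):
    """Mean squared leave-one-out error, computed by refitting n times."""
    total = 0.0
    for i in range(len(X)):
        w = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        pred = sum(wj * xj for wj, xj in zip(w, X[i]))
        total += (pred - y[i]) ** 2
    return total / len(X)


def greedy_select(X, y, k, lam=1.0):
    """Add, one at a time, the feature that most lowers the leave-one-out error."""
    selected = []
    for _ in range(k):
        best = None
        for j in range(len(X[0])):
            if j in selected:
                continue
            cols = selected + [j]
            err = loo_error([[row[c] for c in cols] for row in X], y, lam)
            if best is None or err < best[0]:
                best = (err, j)
        selected.append(best[1])
    return selected


# Toy data: every combination of 4 markers coded 0/1/2; only markers 1 and 3 matter.
X = [list(g) for g in product((0, 1, 2), repeat=4)]
y = [2.0 * g[1] - 1.5 * g[3] for g in X]
selected = greedy_select(X, y, k=2)
print(selected)
```

    The cost of this naive version grows much faster than linearly with the number of examples and features, which is the bottleneck the record says greedy RLS removes.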

  17. Accounting for Genotype-by-Environment Interactions and Residual Genetic Variation in Genomic Selection for Water-Soluble Carbohydrate Concentration in Wheat.

    Science.gov (United States)

    Ovenden, Ben; Milgate, Andrew; Wade, Len J; Rebetzke, Greg J; Holland, James B

    2018-05-31

    Abiotic stress tolerance traits are often complex and recalcitrant targets for conventional breeding improvement in many crop species. This study evaluated the potential of genomic selection to predict water-soluble carbohydrate concentration (WSCC), an important drought tolerance trait, in wheat under field conditions. A panel of 358 varieties and breeding lines constrained for maturity was evaluated under rainfed and irrigated treatments across two locations and two years. Whole-genome marker profiles and factor analytic mixed models were used to generate genomic estimated breeding values (GEBVs) for specific environments and environment groups. Additive genetic variance was smaller than residual genetic variance for WSCC, such that genotypic values were dominated by residual genetic effects rather than additive breeding values. As a result, GEBVs were not accurate predictors of genotypic values of the extant lines, but GEBVs should be reliable selection criteria to choose parents for intermating to produce new populations. The accuracy of GEBVs for untested lines was sufficient to increase predicted genetic gain from genomic selection per unit time compared to phenotypic selection if the breeding cycle is reduced by half by the use of GEBVs in off-season generations. Further, genomic prediction accuracy depended on having phenotypic data from environments with strong correlations with target production environments to build prediction models. By combining high-density marker genotypes, stress-managed field evaluations, and mixed models that model simultaneously covariances among genotypes and covariances of complex trait performance between pairs of environments, we were able to train models with good accuracy to facilitate genetic gain from genomic selection. Copyright © 2018 Ovenden et al.

  18. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  19. Genomic and environmental selection patterns in two distinct lettuce crop-wild hybrid crosses

    NARCIS (Netherlands)

    Hartman, Y.; Uwimana, B; Hooftman, D.A.P.; Schranz, M.E.; van de Wiel, C.C.M.; Smulders, M.J.M.; Visser, R.G.F.; van Tienderen, P.H.

    2013-01-01

    Genomic selection patterns and hybrid performance influence the chance that crop (trans)genes can spread to wild relatives. We measured fitness(-related) traits in two different field environments employing two different crop-wild crosses of lettuce. We performed quantitative trait loci (QTL)

  20. Genomic and environmental selection patterns in two distinct lettuce crop-wild hybrid crosses

    NARCIS (Netherlands)

    Hartman, Y.; Uwimana, B.; Hooftman, D.A.P.; Schranz, M.E.; Wiel, van de C.C.M.; Smulders, M.J.M.; Visser, R.G.F.; Tienderen, van P.H.

    2013-01-01

    Genomic selection patterns and hybrid performance influence the chance that crop (trans)genes can spread to wild relatives. We measured fitness(-related) traits in two different field environments employing two different crop–wild crosses of lettuce. We performed quantitative trait loci (QTL)

  1. A map of recent positive selection in the human genome.

    Directory of Open Access Journals (Sweden)

    Benjamin F Voight

    2006-03-01

    The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.

  2. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA; Vos, M. de; Louw, GE; Merwe, RG van der; Dippenaar, A.; Streicher, EM; Abdallah, AM; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; Helden, PD van; Warren, RM; Pain, Arnab

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug

  3. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    Science.gov (United States)

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. The further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  4. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses.

    Science.gov (United States)

    Li, Ci-Xiu; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Kang, Yan-Jun; Chen, Liang-Jun; Qin, Xin-Cheng; Xu, Jianguo; Holmes, Edward C; Zhang, Yong-Zhen

    2015-01-29

    Although arthropods are important viral vectors, the biodiversity of arthropod viruses, as well as the role that arthropods have played in viral origins and evolution, is unclear. Through RNA sequencing of 70 arthropod species we discovered 112 novel viruses that appear to be ancestral to much of the documented genetic diversity of negative-sense RNA viruses, a number of which are also present as endogenous genomic copies. With this greatly enriched diversity we revealed that arthropods contain viruses that fall basal to major virus groups, including the vertebrate-specific arenaviruses, filoviruses, hantaviruses, influenza viruses, lyssaviruses, and paramyxoviruses. We similarly documented a remarkable diversity of genome structures in arthropod viruses, including a putative circular form, that sheds new light on the evolution of genome organization. Hence, arthropods are a major reservoir of viral genetic diversity and have likely been central to viral evolution.

  5. Potential of Genomic Selection in Mass Selection Breeding of an Allogamous Crop: An Empirical Study to Increase Yield of Common Buckwheat.

    Science.gov (United States)

    Yabe, Shiori; Hara, Takashi; Ueno, Mariko; Enoki, Hiroyuki; Kimura, Tatsuro; Nishimura, Satoru; Yasui, Yasuo; Ohsawa, Ryo; Iwata, Hiroyoshi

    2018-01-01

    To evaluate the potential of genomic selection (GS), a selection experiment with GS and phenotypic selection (PS) was performed in an allogamous crop, common buckwheat (Fagopyrum esculentum Moench). To indirectly select for seed yield per unit area, which cannot be measured on a single-plant basis, a selection index was constructed from seven agro-morphological traits measurable on a single-plant basis. Over 3 years, we performed two GS and one PS cycles per year for improvement in the selection index. In GS, a prediction model was updated every year on the basis of genotypes of 14,598-50,000 markers and phenotypes. Plants grown from seeds derived from a series of generations of GS and PS populations were evaluated for the traits in the selection index and other yield-related traits. GS resulted in a 20.9% increase and PS in a 15.0% increase in the selection index in comparison with the initial population. Although the level of linkage disequilibrium in the breeding population was low, the target trait was improved with GS. Traits with higher weights in the selection index were improved more than those with lower weights, especially when prediction accuracy was high. No trait changed in an unintended direction in either GS or PS. The accuracy of genomic prediction models built in the first cycle decreased in the later cycles because the genetic bottleneck through the selection cycles changed linkage disequilibrium patterns in the breeding population. The present study emphasizes the importance of updating models in GS, demonstrates the potential of GS in mass selection of allogamous crop species, and provides a pilot example of successful application of GS to plant breeding.
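
    The selection index described in the record above can be pictured as a weighted sum of standardized single-plant traits. The sketch below shows that general idea only; the trait names, weights, and plant values are hypothetical and are not taken from the study, which used seven traits that the abstract does not list.

```python
from statistics import mean, pstdev


def selection_index(plants, weights):
    """Score each plant as a weighted sum of its standardized trait values."""
    stats = {}
    for trait in weights:
        values = [p[trait] for p in plants]
        stats[trait] = (mean(values), pstdev(values))
    scores = []
    for p in plants:
        score = 0.0
        for trait, w in weights.items():
            mu, sd = stats[trait]
            if sd > 0:  # a trait with no variation cannot rank plants
                score += w * (p[trait] - mu) / sd
        scores.append(score)
    return scores


def select_top(plants, weights, n):
    """Return the indices of the n plants with the highest index score."""
    scores = selection_index(plants, weights)
    ranked = sorted(range(len(plants)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical traits and weights, for illustration only.
plants = [
    {"seeds": 10, "branches": 3},
    {"seeds": 30, "branches": 1},
    {"seeds": 20, "branches": 2},
]
weights = {"seeds": 1.0, "branches": 0.5}
chosen = select_top(plants, weights, n=2)
print(chosen)
```

    In phenotypic selection the trait values would be measured directly, while in genomic selection the same index could be computed from marker-based predictions of each trait.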

  6. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles.

    Science.gov (United States)

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-Yu; Zhang, Xiao-Mei; Song, Da-Feng; Zhang, Chen

    2016-08-01

    In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate.

  7. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

    Science.gov (United States)

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

    2016-01-01

    Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802

  8. Sequence- vs. chip-assisted genomic selection: accurate biological information is advised.

    Science.gov (United States)

    Pérez-Enciso, Miguel; Rincón, Juan C; Legarra, Andrés

    2015-05-09

    The development of next-generation sequencing technologies (NGS) has made the use of whole-genome sequence data for routine genetic evaluations possible, which has triggered a considerable interest in animal and plant breeding fields. Here, we investigated whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies by simulation using a mixed coalescence - gene-dropping approach. We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density (HD, 17 k) and sequence data (335 k). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also dramatically increase by ~40%. However, this criterion was highly sensitive to either misspecification (including wrong genes) or to the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model. Our study shows that, unless an accurate prior estimate on the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. As a result, use of whole-genome sequence data may not result in a highly increased selection response over high

  9. Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

    Science.gov (United States)

    Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

    2014-06-01

    To what extent the genomes of different species belonging to one genus can be diverse and the relationship between genomic differentiation and environmental factor remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycle in the cold marine environments.

  10. Infectious diseases of marine molluscs and host responses as revealed by genomic tools

    Science.gov (United States)

    Ford, Susan E.

    2016-01-01

    More and more infectious diseases affect marine molluscs. Some diseases have impacted commercial species including MSX and Dermo of the eastern oyster, QPX of hard clams, withering syndrome of abalone and ostreid herpesvirus 1 (OsHV-1) infections of many molluscs. Although the exact transmission mechanisms are not well understood, human activities and associated environmental changes often correlate with increased disease prevalence. For instance, hatcheries and large-scale aquaculture create high host densities, which, along with increasing ocean temperature, might have contributed to OsHV-1 epizootics in scallops and oysters. A key to understanding linkages between the environment and disease is to understand how the environment affects the host immune system. Although we might be tempted to downplay the role of immunity in invertebrates, recent advances in genomics have provided insights into host and parasite genomes and revealed surprisingly sophisticated innate immune systems in molluscs. All major innate immune pathways are found in molluscs with many immune receptors, regulators and effectors expanded. The expanded gene families provide great diversity and complexity in innate immune response, which may be key to mollusc's defence against diverse pathogens in the absence of adaptive immunity. Further advances in host and parasite genomics should improve our understanding of genetic variation in parasite virulence and host disease resistance. PMID:26880838

  11. Genome-wide association study, genomic prediction and marker-assisted selection for seed weight in soybean (Glycine max).

    Science.gov (United States)

    Zhang, Jiaoping; Song, Qijian; Cregan, Perry B; Jiang, Guo-Liang

    2016-01-01

    Twenty-two loci for soybean SW and candidate genes conditioning seed development were identified; and prediction accuracies of GS and MAS were estimated through cross-validation and validation with unrelated populations. Soybean (Glycine max) is a major crop for plant protein and oil production, and seed weight (SW) is important for yield and quality in food/vegetable uses of soybean. However, our knowledge of genes controlling SW remains limited. To better understand the molecular mechanism underlying the trait and explore marker-based breeding approaches, we conducted a genome-wide association study in a population of 309 soybean germplasm accessions using 31,045 single nucleotide polymorphisms (SNPs), and estimated the prediction accuracy of genomic selection (GS) and marker-assisted selection (MAS) for SW. Twenty-two loci of minor effect associated with SW were identified, including hotspots on Gm04 and Gm19. The mixed model containing these loci explained 83.4% of phenotypic variation. Candidate genes with Arabidopsis orthologs conditioning SW were also proposed. The prediction accuracies of GS and MAS by cross-validation were 0.75-0.87 and 0.62-0.75, respectively, depending on the number of SNPs used and the size of training population. GS also outperformed MAS when the validation was performed using unrelated panels across a wide range of maturities, with an average prediction accuracy of 0.74 versus 0.53. This study convincingly demonstrated that soybean SW is controlled by numerous minor-effect loci. It greatly enhances our understanding of the genetic basis of SW in soybean and facilitates the identification of genes controlling the trait. It also suggests that GS holds promise for accelerating soybean breeding progress. The results are helpful for genetic improvement and genomic prediction of yield in soybean.

  12. Transcriptome analysis reveals the time of the fourth round of genome duplication in common carp (Cyprinus carpio)

    Science.gov (United States)

    2012-01-01

    Background Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions The assembled contigs helped us to estimate the time of the fourth-round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future. PMID:22424280
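
    The dating step in the abstract above (turning a synonymous substitution estimate into a divergence time) can be illustrated with the textbook molecular-clock relation T = Ks / (2r). This is only a sketch: the abstract does not state which clock rate or estimator the authors used, and the function name and the numbers in the usage note are hypothetical, chosen for easy arithmetic rather than taken from the paper.

```python
def divergence_time_mya(ks, rate_per_site_per_year):
    """Molecular-clock divergence time in millions of years (MYA).

    ks   : synonymous substitutions per synonymous site between two sequences
    rate : assumed substitutions per synonymous site per year, per lineage
           (an assumption; the abstract does not give the rate used)

    Both lineages accumulate substitutions after the split, hence the factor 2.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6
```

    With illustrative inputs, `divergence_time_mya(0.5, 2.5e-9)` gives about 100 MYA; halving Ks halves the estimated time, which is why recent duplications show up as paralog pairs with small Ks.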

  13. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  14. Investigating Drought Tolerance in Chickpea Using Genome-Wide Association Mapping and Genomic Selection Based on Whole-Genome Resequencing Data.

    Science.gov (United States)

    Li, Yongle; Ruperao, Pradeep; Batley, Jacqueline; Edwards, David; Khan, Tanveer; Colmer, Timothy D; Pang, Jiayin; Siddique, Kadambot H M; Sutton, Tim

    2018-01-01

    Drought tolerance is a complex trait that involves numerous genes. Identifying key causal genes or linked molecular markers can facilitate the fast development of drought tolerant varieties. Using a whole-genome resequencing approach, we sequenced 132 chickpea varieties and advanced breeding lines and found more than 144,000 single nucleotide polymorphisms (SNPs). We measured 13 yield and yield-related traits in three drought-prone environments of Western Australia. The genotypic effects were significant for all traits, and many traits showed highly significant correlations, ranging from 0.83 between grain yield and biomass to -0.67 between seed weight and seed emergence rate. To identify candidate genes, the SNP and trait data were incorporated into the SUPER genome-wide association study (GWAS) model, a modified version of the linear mixed model. We found that several SNPs from auxin-related genes, including auxin efflux carrier protein (PIN3), p-glycoprotein, and nodulin MtN21/EamA-like transporter, were significantly associated with yield and yield-related traits under drought-prone environments. We identified four genetic regions containing SNPs significantly associated with several different traits, which was an indication of pleiotropic effects. We also investigated the possibility of incorporating the GWAS results into a genomic selection (GS) model, which is another approach to deal with complex traits. Compared to using all SNPs, application of the GS model using subsets of SNPs significantly associated with the traits under investigation increased the prediction accuracies of three yield and yield-related traits by more than twofold. This has important implication for implementing GS in plant breeding programs.

  15. Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

    Science.gov (United States)

    A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...

  16. Identification and analysis of genome-wide SNPs provide insight into signatures of selection and domestication in channel catfish (Ictalurus punctatus).

    Directory of Open Access Journals (Sweden)

    Luyang Sun

    Full Text Available Domestication and selection for important performance traits can impact the genome, which is most often reflected by reduced heterozygosity in and surrounding genes related to traits affected by selection. In this study, analysis of the genomic impact caused by domestication and artificial selection was conducted by investigating the signatures of selection using single nucleotide polymorphisms (SNPs) in channel catfish (Ictalurus punctatus). A total of 8.4 million candidate SNPs were identified by using next generation sequencing. On average, the channel catfish genome harbors one SNP per 116 bp. Approximately 6.6 million, 5.3 million, 4.9 million, 7.1 million and 6.7 million SNPs were detected in the Marion, Thompson, USDA103, Hatchery strain, and wild population, respectively. The allele frequencies of 407,861 SNPs differed significantly between the domestic and wild populations. With these SNPs, 23 genomic regions with putative selective sweeps were identified that included 11 genes. Although the functions of the majority of the genes remain unknown in catfish, several genes with known function related to aquaculture performance traits were included in the regions with selective sweeps. These included hypoxia-inducible factor 1β (HIF1β) and the transporter gene ATP-binding cassette sub-family B member 5 (ABCB5). HIF1β is important for response to hypoxia, and tolerance to low oxygen levels is a critical aquaculture trait. The large numbers of SNPs identified from this study are valuable for the development of high-density SNP arrays for genetic and genomic studies of performance traits in catfish.

  17. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella.

    Directory of Open Access Journals (Sweden)

    Yaniv Brandvain

    Full Text Available The shift from outcrossing to self-fertilization is among the most common evolutionary transitions in flowering plants. Until recently, however, a genome-wide view of this transition has been obscured by both a dearth of appropriate data and the lack of appropriate population genomic methods to interpret such data. Here, we present a novel population genomic analysis detailing the origin of the selfing species, Capsella rubella, which recently split from its outcrossing sister, Capsella grandiflora. Due to the recency of the split, much of the variation within C. rubella is also found within C. grandiflora. We can therefore identify genomic regions where two C. rubella individuals have inherited the same or different segments of ancestral diversity (i.e. founding haplotypes) present in C. rubella's founder(s). Based on this analysis, we show that C. rubella was founded by multiple individuals drawn from a diverse ancestral population closely related to extant C. grandiflora, that drift and selection have rapidly homogenized most of this ancestral variation since C. rubella's founding, and that little novel variation has accumulated within this time. Despite the extensive loss of ancestral variation, the approximately 25% of the genome for which two C. rubella individuals have inherited different founding haplotypes makes up roughly 90% of the genetic variation between them. To extend these findings, we develop a coalescent model that utilizes the inferred frequency of founding haplotypes and variation within founding haplotypes to estimate that C. rubella was founded by a potentially large number of individuals between 50 and 100 kya, and has subsequently experienced a twenty-fold reduction in its effective population size. As population genomic data from an increasing number of outcrossing/selfing pairs are generated, analyses like the one developed here will facilitate a fine-scaled view of the evolutionary and demographic impact of the

  18. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    Science.gov (United States)

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known

  19. Genome sequencing and analysis reveals possible determinants of Staphylococcus aureus nasal carriage

    Directory of Open Access Journals (Sweden)

    Cole Alexander M

    2008-09-01

    Full Text Available Abstract Background Nasal carriage of Staphylococcus aureus is a major risk factor in clinical and community settings due to the range of etiologies caused by the organism. We have identified unique immunological and ultrastructural properties associated with nasal carriage isolates denoting a role for bacterial factors in nasal carriage. However, despite extensive molecular level characterizations by several groups suggesting factors necessary for colonization on nasal epithelium, genetic determinants of nasal carriage are unknown. Herein, we have set a genomic foundation for unraveling the bacterial determinants of nasal carriage in S. aureus. Results MLST analysis revealed no lineage specific differences between carrier and non-carrier strains suggesting a role for mobile genetic elements. We completely sequenced a model carrier isolate (D30) and a model non-carrier strain (930918-3) to identify differential gene content. Comparison revealed the presence of 84 genes unique to the carrier strain and strongly suggests a role for Type VII secretion systems in nasal carriage. These genes, along with a putative pathogenicity island (SaPIBov) present uniquely in the carrier strains, are likely important in affecting carriage. Further, PCR-based genotyping of other clinical isolates for a specific subset of these 84 genes raises the possibility of nasal carriage being caused by multiple gene sets. Conclusion Our data suggest that carriage is likely a heterogeneic phenotypic trait and implies a role for nucleotide level polymorphism in carriage. Complete genome level analyses of multiple carriage strains of S. aureus will be important in clarifying molecular determinants of S. aureus nasal carriage.

  20. Identification of Promising Mutants Associated with Egg Production Traits Revealed by Genome-Wide Association Study.

    Directory of Open Access Journals (Sweden)

    Jingwei Yuan

    Full Text Available Egg number (EN), egg laying rate (LR) and age at first egg (AFE) are important production traits related to egg production in the poultry industry. To better understand the genetic architecture of dynamic EN during the whole laying cycle and provide the precise positions of associated variants for EN, LR and AFE, laying records from 21 to 72 weeks of age were collected individually for 1,534 F2 hens produced by reciprocal crosses between White Leghorn and Dongxiang Blue-shelled chicken, and their genotypes were assayed by chicken 600 K Affymetrix high density genotyping arrays. Subsequently, pedigree and SNP-based genetic parameters were estimated and a genome-wide association study (GWAS) was conducted on EN, LR and AFE. The heritability estimates were similar between pedigree and SNP-based estimates, varying from 0.17 to 0.36. In the GWA analysis, we identified nine genome-wide significant loci associated with EN of the laying periods from 21 to 26 weeks, 27 to 36 weeks and 37 to 72 weeks. Analysis of GTF2A1 and CLSPN suggested that they influenced the function of ovary and uterus, and may be considered as relevant candidates. The identified SNP rs314448799 for accumulative EN from 21 to 40 weeks on chromosome 5 created phenotypic differences of 6.86 eggs between two homozygous genotypes, which could be potentially applied to the molecular breeding for EN selection. Moreover, our findings showed that LR was a moderately polygenic trait. The suggestive significant region on chromosome 16 for AFE suggested a relationship between sexual maturity and immunity in the current population. The present study comprehensively evaluates the role of genetic variants in the development of egg laying. The findings will be helpful for investigating the function of causative genes and for future marker-assisted selection and genomic selection in chickens.

  1. Population genomics identifies the origin and signatures of selection of Korean weedy rice

    OpenAIRE

    He, Qiang; Kim, Kyu-Won; Park, Yong-Jin

    2016-01-01

    Summary Weedy rice is the same biological species as cultivated rice (Oryza sativa); it is also a noxious weed infesting rice fields worldwide. Its formation and population-selective or -adaptive signatures are poorly understood. In this study, we investigated the phylogenetics, population structure and signatures of selection of Korean weedy rice by determining the whole genomes of 30 weedy rice, 30 landrace rice and ten wild rice samples. The phylogenetic tree and results of ancestry infere...

  2. Comparisons of single-stage and two-stage approaches to genomic selection.

    Science.gov (United States)

    Schulz-Streeck, Torben; Ogutu, Joseph O; Piepho, Hans-Peter

    2013-01-01

    Genomic selection (GS) is a method for predicting breeding values of plants or animals using many molecular markers that is commonly implemented in two stages. In plant breeding the first stage usually involves computation of adjusted means for genotypes which are then used to predict genomic breeding values in the second stage. We compared two classical stage-wise approaches, which either ignore or approximate correlations among the means by a diagonal matrix, and a new method, to a single-stage analysis for GS using ridge regression best linear unbiased prediction (RR-BLUP). The new stage-wise method rotates (orthogonalizes) the adjusted means from the first stage before submitting them to the second stage. This makes the errors approximately independently and identically normally distributed, which is a prerequisite for many procedures that are potentially useful for GS such as machine learning methods (e.g. boosting) and regularized regression methods (e.g. lasso). This is illustrated in this paper using componentwise boosting. The componentwise boosting method minimizes squared error loss using least squares and iteratively and automatically selects markers that are most predictive of genomic breeding values. Results are compared with those of RR-BLUP using fivefold cross-validation. The new stage-wise approach with rotated means was slightly more similar to the single-stage analysis than the classical two-stage approaches based on non-rotated means for two unbalanced datasets. This suggests that rotation is a worthwhile pre-processing step in GS for the two-stage approaches for unbalanced datasets. Moreover, the predictive accuracy of stage-wise RR-BLUP was higher (5.0-6.1%) than that of componentwise boosting.
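
    The evaluation scheme named in the abstract above, RR-BLUP scored by fivefold cross-validation, can be sketched as follows. This is a minimal illustration, not the authors' single-stage or rotated two-stage analysis: it treats RR-BLUP as plain ridge regression on centred marker scores with a fixed shrinkage parameter `lam` (in practice estimated from variance components), and the data simulator and all function names are hypothetical.

```python
import numpy as np

def rr_blup_fit(X, y, lam):
    """Ridge estimate of marker effects: (Xc'Xc + lam*I)^-1 Xc'(y - mean(y))."""
    mu = y.mean()
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centre marker scores
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, x_mean, beta

def rr_blup_predict(model, X):
    """Predicted genomic value = intercept + centred markers times effects."""
    mu, x_mean, beta = model
    return mu + (X - x_mean) @ beta

def kfold_accuracy(X, y, lam, k=5, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = rr_blup_fit(X[train], y[train], lam)
        pred = rr_blup_predict(model, X[test])
        accs.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accs))

def simulate(n=300, p=100, h2=0.5, seed=1):
    """Hypothetical toy data: 0/1/2 marker scores, additive effects, noise."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, p)).astype(float)
    g = X @ rng.normal(0.0, 1.0, p)      # true genetic values
    e = rng.normal(0.0, np.sqrt(g.var() * (1 - h2) / h2), n)
    return X, g + e
```

    A typical call is `X, y = simulate()` followed by `kfold_accuracy(X, y, lam=50.0)`. Here predictive accuracy is the correlation with observed phenotypes in held-out folds; the abstract does not say exactly which quantity was correlated in the study, so that choice is an assumption of this sketch.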

  3. Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

    Science.gov (United States)

    Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

    2015-10-19

    DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origins sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication from a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

  4. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing

    Science.gov (United States)

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-01-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat’s selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome. PMID:27989103

  5. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing.

    Science.gov (United States)

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-12-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat's selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome.
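
    The nucleotide diversity comparison mentioned in the two records above rests on a standard statistic: the average number of differences per site between all pairs of sampled sequences. The sketch below computes it from a small matrix of biallelic haplotypes. It is the textbook definition only; the abstract does not describe the software or window settings the authors used, and the function name and input layout are hypothetical.

```python
def nucleotide_diversity(haplotypes, seq_len):
    """Average pairwise differences per site (pi).

    haplotypes : list of equal-length lists of 0/1 alleles, one row per
                 sampled sequence, one column per variable site
    seq_len    : total number of sites surveyed, including invariant ones
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    n_pairs = n * (n - 1) / 2
    total_diffs = 0
    for site in zip(*haplotypes):        # iterate over columns (sites)
        k = sum(site)                    # copies of allele 1 at this site
        total_diffs += k * (n - k)       # pairs that differ at this site
    return total_diffs / n_pairs / seq_len
```

    For example, `nucleotide_diversity([[0, 0, 1], [0, 1, 1], [1, 1, 1]], seq_len=3)` counts four pairwise differences over three pairs and three sites. A population that has lost variation, for instance after a bottleneck or a selective sweep, yields a lower value over the same region.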

  6. Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures.

    Directory of Open Access Journals (Sweden)

    Moon Young Lee

    Full Text Available Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies.

  7. Integrative Genomics Reveals Mechanisms of Copy Number Alterations Responsible for Transcriptional Deregulation in Colorectal Cancer

    Science.gov (United States)

    Camps, Jordi; Nguyen, Quang Tri; Padilla-Nash, Hesed M.; Knutsen, Turid; McNeil, Nicole E.; Wangsa, Danny; Hummon, Amanda B.; Grade, Marian; Ried, Thomas; Difilippantonio, Michael J.

    2016-01-01

    To evaluate the mechanisms and consequences of chromosomal aberrations in colorectal cancer (CRC), we used a combination of spectral karyotyping, array comparative genomic hybridization (aCGH), and array-based global gene expression profiling on 31 primary carcinomas and 15 established cell lines. Importantly, aCGH showed that the genomic profiles of primary tumors are recapitulated in the cell lines. We revealed a preponderance of chromosome breakpoints at sites of copy number variants (CNVs) in the CRC cell lines, a novel mechanism of DNA breakage in cancer. The integration of gene expression and aCGH led to the identification of 157 genes localized within high-level copy number changes whose transcriptional deregulation was significantly affected across all of the samples, thereby suggesting that these genes play a functional role in CRC. Genomic amplification at 8q24 was the most recurrent event and led to the overexpression of MYC and FAM84B. Copy number dependent gene expression resulted in deregulation of known cancer genes such as APC, FGFR2, and ERBB2. The identification of only 36 genes whose localization near a breakpoint could account for their observed deregulated expression demonstrates that the major mechanism for transcriptional deregulation in CRC is genomic copy number changes resulting from chromosomal aberrations. PMID:19691111

  8. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar.

  9. Resource allocation for maximizing prediction accuracy and genetic gain of genomic selection in plant breeding: a simulation experiment.

    Science.gov (United States)

    Lorenz, Aaron J

    2013-03-01

    Allocating resources between population size and replication affects both genetic gain through phenotypic selection and quantitative trait loci detection power and effect estimation accuracy for marker-assisted selection (MAS). It is well known that because alleles are replicated across individuals in quantitative trait loci mapping and MAS, more resources should be allocated to increasing population size compared with phenotypic selection. Genomic selection is a form of MAS using all marker information simultaneously to predict individual genetic values for complex traits and has widely been found superior to MAS. No studies have explicitly investigated how resource allocation decisions affect success of genomic selection. My objective was to study the effect of resource allocation on response to MAS and genomic selection in a single biparental population of doubled haploid lines by using computer simulation. Simulation results were compared with previously derived formulas for the calculation of prediction accuracy under different levels of heritability and population size. Response of prediction accuracy to resource allocation strategies differed between genomic selection models (ridge regression best linear unbiased prediction [RR-BLUP], BayesCπ) and multiple linear regression using ordinary least-squares estimation (OLS), leading to different optimal resource allocation choices between OLS and RR-BLUP. For OLS, it was always advantageous to maximize population size at the expense of replication, but a high degree of flexibility was observed for RR-BLUP. Prediction accuracy of doubled haploid lines included in the training set was much greater than of those excluded from the training set, so there was little benefit to phenotyping only a subset of the lines genotyped. Finally, observed prediction accuracies in the simulation compared well to calculated prediction accuracies, indicating these theoretical formulas are useful for making resource allocation

  10. Potential of Genomic Selection in Mass Selection Breeding of an Allogamous Crop: An Empirical Study to Increase Yield of Common Buckwheat

    Directory of Open Access Journals (Sweden)

    Shiori Yabe

    2018-03-01

    To evaluate the potential of genomic selection (GS), a selection experiment with GS and phenotypic selection (PS) was performed in an allogamous crop, common buckwheat (Fagopyrum esculentum Moench). To indirectly select for seed yield per unit area, which cannot be measured on a single-plant basis, a selection index was constructed from seven agro-morphological traits measurable on a single-plant basis. Over 3 years, we performed two GS and one PS cycles per year for improvement in the selection index. In GS, a prediction model was updated every year on the basis of genotypes of 14,598–50,000 markers and phenotypes. Plants grown from seeds derived from a series of generations of GS and PS populations were evaluated for the traits in the selection index and other yield-related traits. GS resulted in a 20.9% increase and PS in a 15.0% increase in the selection index in comparison with the initial population. Although the level of linkage disequilibrium in the breeding population was low, the target trait was improved with GS. Traits with higher weights in the selection index were improved more than those with lower weights, especially when prediction accuracy was high. No trait changed in an unintended direction in either GS or PS. The accuracy of genomic prediction models built in the first cycle decreased in the later cycles because the genetic bottleneck through the selection cycles changed linkage disequilibrium patterns in the breeding population. The present study emphasizes the importance of updating models in GS, demonstrates the potential of GS in mass selection of allogamous crop species, and provides a pilot example of successful application of GS to plant breeding.

  11. Draft genome of an Aerophobetes bacterium reveals a facultative lifestyle in deep-sea anaerobic sediments

    KAUST Repository

    Wang, Yong

    2016-07-01

    Aerophobetes (or CD12) is a recently defined bacterial phylum, of which the metabolic processes and ecological importance remain unclear. In the present study, we obtained the draft genome of an Aerophobetes bacterium TCS1 from saline sediment near the Thuwal cold seep in the Red Sea using a genome binning method. Analysis of 16S rRNA genes of TCS1 and close relatives revealed wide distribution of Aerophobetes in deep-sea sediments. Phylogenetic relationships showed affinity between Aerophobetes TCS1 and some thermophilic bacterial phyla. The genome of TCS1 (at least 1.27 Mbp) contains a full set of genes encoding core metabolic pathways, including glycolysis and pyruvate fermentation to produce acetyl-CoA and acetate. The identification of cross-membrane sugar transporter genes further indicates its potential ability to consume carbohydrates preserved in the sediment under the microbial mat. Aerophobetes bacterium TCS1 therefore probably carried out saccharolytic and fermentative metabolism. The genes responsible for autotrophic synthesis of acetyl-CoA via the Wood–Ljungdahl pathway were also found in the genome. Phylogenetic study of the essential genes for the Wood–Ljungdahl pathway implied relative independence of Aerophobetes bacterium from the known acetogens and methanogens. Compared with genomes of acetogenic bacteria, Aerophobetes bacterium TCS1 genome lacks the genes involved in nitrogen metabolism, sulfur metabolism, signal transduction and cell motility. The metabolic activities of TCS1 might depend on geochemical conditions such as supplies of CO2, hydrogen and sugars, and therefore the TCS1 might be a facultative bacterium in anaerobic saline sediments near cold seeps. © 2016, Science China Press and Springer-Verlag Berlin Heidelberg.

  12. Genomic Analysis Reveals Hypoxia Adaptation in the Tibetan Mastiff by Introgression of the Gray Wolf from the Tibetan Plateau.

    Science.gov (United States)

    Miao, Benpeng; Wang, Zhen; Li, Yixue

    2017-03-01

    The Tibetan Mastiff (TM), a native of the Tibetan Plateau, has quickly adapted to the extreme highland environment. Recently, the impact of positive selection on the TM genome was studied and potential hypoxia-adaptive genes were identified. However, the origin of the adaptive variants remains unknown. In this study, we investigated the signature of genetic introgression in the adaptation of TMs with dog and wolf genomic data from different altitudes in close geographic proximity. On a genome-wide scale, the TM was much more closely related to other dogs than wolves. However, using the 'ABBA/BABA' test, we identified genomic regions from the TM that possibly introgressed from Tibetan gray wolf. Several of the regions, including the EPAS1 and HBB loci, also showed the dominant signature of selective sweeps in the TM genome. We validated the introgression of the two loci by excluding the possibility of convergent evolution and ancestral polymorphisms and examined the haplotypes of all available canid genomes. The estimated time of introgression based on a non-coding region of the EPAS1 locus mostly overlapped with the Paleolithic era. Our results demonstrated that the introgression of hypoxia adaptive genes in wolves from the highland played an important role for dogs living in hypoxic environments, which indicated that domestic animals could acquire local adaptation quickly by secondary contact with their wild relatives. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
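
    The 'ABBA/BABA' test mentioned in this abstract is usually summarized by Patterson's D statistic, D = (nABBA - nBABA) / (nABBA + nBABA). The following is a minimal sketch of that count-based form, not the authors' pipeline. The function name `d_statistic`, the input layout, and the population roles (P1 = another dog, P2 = Tibetan Mastiff, P3 = Tibetan gray wolf, O = outgroup) are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Site = Tuple[str, str, str, str]  # alleles observed in (P1, P2, P3, outgroup)


def d_statistic(sites: Iterable[Site]) -> Tuple[float, int, int]:
    """Return (D, n_abba, n_baba) from one allele per population per site.

    The outgroup allele is treated as ancestral ("A"); any other allele
    is treated as derived ("B"). Sites that match neither pattern are ignored.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba), abba, baba


# Toy data only: three ABBA sites, one BABA site, one uninformative site.
example_sites = [("A", "G", "G", "A")] * 3 + [("G", "A", "G", "A"), ("A", "A", "G", "A")]
d_value, n_abba, n_baba = d_statistic(example_sites)
```

    Under these assumed roles, an excess of ABBA sites (D > 0) would be consistent with gene flow between P2 and P3. A real analysis would also need a significance test, for example a block jackknife, which this sketch omits.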

  13. Comparative Genomic Analysis Reveals Ecological Differentiation in the Genus Carnobacterium.

    Science.gov (United States)

    Iskandar, Christelle F; Borges, Frédéric; Taminiau, Bernard; Daube, Georges; Zagorec, Monique; Remenant, Benoît; Leisner, Jørgen J; Hansen, Martin A; Sørensen, Søren J; Mangavel, Cécile; Cailliez-Grimal, Catherine; Revol-Junelles, Anne-Marie

    2017-01-01

    Lactic acid bacteria (LAB) differ in their ability to colonize food and animal-associated habitats: while some species are specialized and colonize a limited number of habitats, others are generalists and are able to colonize multiple animal-linked habitats. In the current study, Carnobacterium was used as a model genus to elucidate the genetic basis of these colonization differences. Analyses of 16S rRNA gene meta-barcoding data showed that C. maltaromaticum, followed by C. divergens, are the most prevalent species in foods derived from animals (meat, fish, dairy products) and in the gut. According to phylogenetic analyses, these two animal-adapted species belong to one of two deeply branched lineages. The second lineage contains species isolated from habitats where contact with animals is rare. Genome analyses revealed that members of the animal-adapted lineage harbor a larger secretome than members of the other lineage. The predicted cell-surface proteome is highly diversified in C. maltaromaticum and C. divergens, with genes involved in adaptation to the animal milieu such as those encoding biopolymer hydrolytic enzymes, a heme uptake system, and biopolymer-binding adhesins. These species also exhibit genes for gut adaptation and respiration. In contrast, Carnobacterium species belonging to the second lineage encode a poorly diversified cell-surface proteome, lack genes for gut adaptation and are unable to respire. These results shed light on the important genomic traits required for adaptation to animal-linked habitats in generalist Carnobacterium.

  14. Virus Genomes Reveal the Factors that Spread and Sustained the West African Ebola Epidemic

    Science.gov (United States)

    2016-08-09

    Ladner, J. T. et al. Evolution and Spread of Ebola Virus in Liberia, 2014–2015. Cell Host Microbe 18, 659–669 (2015). 15. Lemey, P. et al. Unifying...Virus genomes reveal the factors that spread and sustained the West African Ebola epidemic. Gytis Dudas1,2, Luiz Max Carvalho1, Trevor Bedford2...Charlesville, Liberia, 19University of Sierra Leone, Freetown, Sierra Leone, 20Center for Systems Biology, Department of Organismic and Evolutionary

  15. Distinct Biological Potential of Streptococcus gordonii and Streptococcus sanguinis Revealed by Comparative Genome Analysis

    OpenAIRE

    Zheng, Wenning; Tan, Mui Fern; Old, Lesley A.; Paterson, Ian C.; Jakubovics, Nicholas S.; Choo, Siew Woh

    2017-01-01

    Streptococcus gordonii and Streptococcus sanguinis are pioneer colonizers of dental plaque and important agents of bacterial infective endocarditis (IE). To gain a greater understanding of these two closely related species, we performed comparative analyses on 14 new S. gordonii and 5 S. sanguinis strains using various bioinformatics approaches. We revealed S. gordonii and S. sanguinis harbor open pan-genomes and share generally high sequence homology and number of core genes including virule...

  16. Parasitism drives host genome evolution: Insights from the Pasteuria ramosa-Daphnia magna system.

    Science.gov (United States)

    Bourgeois, Yann; Roulin, Anne C; Müller, Kristina; Ebert, Dieter

    2017-04-01

    Because parasitism is thought to play a major role in shaping host genomes, it has been predicted that genomic regions associated with resistance to parasites should stand out in genome scans, revealing signals of selection above the genomic background. To test whether parasitism is indeed such a major factor in host evolution and to better understand host-parasite interaction at the molecular level, we studied genome-wide polymorphisms in 97 genotypes of the planktonic crustacean Daphnia magna originating from three localities across Europe. Daphnia magna is known to coevolve with the bacterial pathogen Pasteuria ramosa for which host genotypes (clonal lines) are either resistant or susceptible. Using association mapping, we identified two genomic regions involved in resistance to P. ramosa, one of which was already known from a previous QTL analysis. We then performed a naïve genome scan to test for signatures of positive selection and found that the two regions identified with the association mapping further stood out as outliers. Several other regions with evidence for selection were also found, but no link between these regions and phenotypic variation could be established. Our results are consistent with the hypothesis that parasitism is driving host genome evolution. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.

  17. Whole genome analysis of linezolid resistance in Streptococcus pneumoniae reveals resistance and compensatory mutations

    Directory of Open Access Journals (Sweden)

    Légaré Danielle

    2011-10-01

    Background: Several mutations were present in the genome of Streptococcus pneumoniae linezolid-resistant strains but the role of several of these mutations had not been experimentally tested. To analyze the role of these mutations, we reconstituted resistance by serial whole genome transformation of a novel resistant isolate into two strains with sensitive background. We sequenced the parent mutant and two independent transformants exhibiting similar minimum inhibitory concentration to linezolid. Results: Comparative genomic analyses revealed that transformants acquired G2576T transversions in every gene copy of 23S rRNA and that the number of altered copies correlated with the level of linezolid resistance and cross-resistance to florfenicol and chloramphenicol. One of the transformants also acquired a mutation present in the parent mutant leading to the overexpression of an ABC transporter (spr1021). The acquisition of these mutations, however, conferred a fitness cost, which was further enhanced by the acquisition of a mutation in an RNA methyltransferase implicated in resistance. Interestingly, the fitness of the transformants could be restored in part by the acquisition of altered copies of the L3 and L16 ribosomal proteins and by mutations leading to the overexpression of the spr1887 ABC transporter that were present in the original linezolid-resistant mutant. Conclusions: Our results demonstrate the usefulness of whole genome approaches at detecting major determinants of resistance as well as compensatory mutations that alleviate the fitness cost associated with resistance.

  18. Genome-wide analysis of codon usage bias in four sequenced cotton species.

    Science.gov (United States)

    Wang, Liyuan; Xing, Huixian; Yuan, Yanchao; Wang, Xianlin; Saeed, Muhammad; Tao, Jincai; Feng, Wei; Zhang, Guihua; Song, Xianliang; Sun, Xuezhen

    2018-01-01

    Codon usage bias (CUB) is an important evolutionary feature in a genome which provides important information for studying organism evolution, gene function and exogenous gene expression. The CUB and its shaping factors in the nuclear genomes of four sequenced cotton species, G. arboreum (A2), G. raimondii (D5), G. hirsutum (AD1) and G. barbadense (AD2), were analyzed in the present study. The effective number of codons (ENC) analysis showed the CUB was weak in these four species and the four subgenomes of the two tetraploids. Codon composition analysis revealed these four species preferred to use pyrimidine-rich codons more frequently than purine-rich codons. Correlation analysis indicated that the base content at the third position of codons affects the degree of codon preference. PR2-bias plot and ENC-plot analyses revealed that the CUB patterns in these genomes and subgenomes were influenced by combined effects of translational selection, directional mutation and other factors. The translational selection (P2) analysis results, together with the non-significant correlation between GC12 and GC3, further revealed that translational selection played the dominant role over mutation pressure in the codon usage bias. Through relative synonymous codon usage (RSCU) analysis, we detected 25 high frequency codons preferred to end with T or A, and 31 low frequency codons inclined to end with C or G in these four species and four subgenomes. Finally, 19 to 26 optimal codons with 19 common ones were determined for each species and subgenome, which preferred to end with A or T. We concluded that the codon usage bias was weak and that translational selection was the main shaping factor in nuclear genes of these four cotton genomes and four subgenomes.
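
    To illustrate the relative synonymous codon usage (RSCU) measure named in this abstract: RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below uses the standard genetic code and a toy sequence. The function name `rscu` and the input format are assumptions for illustration and do not reproduce the authors' workflow.

```python
from collections import Counter
from typing import Dict

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> Dict[str, float]:
    """Return RSCU values for every codon of each amino acid seen in `cds`.

    Stop codons and single-codon amino acids (Met, Trp) are skipped,
    as are amino acids that do not occur in the sequence at all.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families keyed by amino acid.
    families: Dict[str, list] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: Dict[str, float] = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            # observed count / (family total / number of synonymous codons)
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy coding sequence: four alanine codons (GCT x3, GCC x1).
toy_values = rscu("GCTGCTGCTGCC")
```

    Values above 1 mark codons used more often than expected under equal usage, and values below 1 mark codons used less often. The study's thresholds for "high frequency" and "low frequency" codons are not stated in the abstract and are not assumed here.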

  19. Comparative analysis of pepper and tomato reveals euchromatin expansion of pepper genome caused by differential accumulation of Ty3/Gypsy-like elements

    Directory of Open Access Journals (Sweden)

    Ahn Jong Hwa

    2011-01-01

    Background: Among the Solanaceae plants, the pepper genome is three times larger than that of tomato. Although the gene repertoire and gene order of both species are well conserved, the cause of the genome-size difference is not known. To determine the causes for the expansion of pepper euchromatic regions, we compared the pepper genome to that of tomato. Results: For sequence-level analysis, we generated 35.6 Mb of pepper genomic sequences from 1,245 euchromatin-enriched pepper BAC clones. The comparative analysis of orthologous gene-rich regions between both species revealed insertion of transposons exclusively in the pepper sequences, maintaining the gene order and content. The most common type of transposon found was the LTR retrotransposon. Phylogenetic comparison of the LTR retrotransposons revealed that two groups of Ty3/Gypsy-like elements (Tat and Athila) were overly accumulated in the pepper genome. The FISH analysis of the pepper Tat elements showed a random distribution in heterochromatic and euchromatic regions, whereas the tomato Tat elements showed heterochromatin-preferential accumulation. Conclusions: Compared to tomato, pepper euchromatin doubled its size by differential accumulation of a specific group of Ty3/Gypsy-like elements. Our results could provide an insight on the mechanism of genome evolution in the Solanaceae family.

  20. Investigating Drought Tolerance in Chickpea Using Genome-Wide Association Mapping and Genomic Selection Based on Whole-Genome Resequencing Data

    Directory of Open Access Journals (Sweden)

    Yongle Li

    2018-02-01

    Drought tolerance is a complex trait that involves numerous genes. Identifying key causal genes or linked molecular markers can facilitate the fast development of drought tolerant varieties. Using a whole-genome resequencing approach, we sequenced 132 chickpea varieties and advanced breeding lines and found more than 144,000 single nucleotide polymorphisms (SNPs). We measured 13 yield and yield-related traits in three drought-prone environments of Western Australia. The genotypic effects were significant for all traits, and many traits showed highly significant correlations, ranging from 0.83 between grain yield and biomass to -0.67 between seed weight and seed emergence rate. To identify candidate genes, the SNP and trait data were incorporated into the SUPER genome-wide association study (GWAS) model, a modified version of the linear mixed model. We found that several SNPs from auxin-related genes, including auxin efflux carrier protein (PIN3), p-glycoprotein, and nodulin MtN21/EamA-like transporter, were significantly associated with yield and yield-related traits under drought-prone environments. We identified four genetic regions containing SNPs significantly associated with several different traits, which was an indication of pleiotropic effects. We also investigated the possibility of incorporating the GWAS results into a genomic selection (GS) model, which is another approach to deal with complex traits. Compared to using all SNPs, application of the GS model using subsets of SNPs significantly associated with the traits under investigation increased the prediction accuracies of three yield and yield-related traits by more than twofold. This has important implications for implementing GS in plant breeding programs.

  1. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    Science.gov (United States)

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of the two treatment-naïve patients of the same molecular subtype are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  2. Genomic selection prediction accuracy in a perennial crop: case study of oil palm (Elaeis guineensis Jacq.).

    Science.gov (United States)

    Cros, David; Denis, Marie; Sánchez, Leopoldo; Cochard, Benoit; Flori, Albert; Durand-Gasselin, Tristan; Nouy, Bruno; Omoré, Alphonse; Pomiès, Virginie; Riou, Virginie; Suryana, Edyana; Bouvet, Jean-Marc

    2015-03-01

    Genomic selection empirically appeared valuable for reciprocal recurrent selection in oil palm as it could account for family effects and Mendelian sampling terms, despite small populations and low marker density. Genomic selection (GS) can increase the genetic gain in plants. In perennial crops, this is expected mainly through shortened breeding cycles and increased selection intensity, which requires sufficient GS accuracy in selection candidates, despite often small training populations. Our objective was to obtain the first empirical estimate of GS accuracy in oil palm (Elaeis guineensis), the major world oil crop. We used two parental populations involved in conventional reciprocal recurrent selection (Deli and Group B) with 131 individuals each, genotyped with 265 SSR. We estimated within-population GS accuracies when predicting breeding values of non-progeny-tested individuals for eight yield traits. We used three methods to sample training sets and five statistical methods to estimate genomic breeding values. The results showed that GS could account for family effects and Mendelian sampling terms in Group B but only for family effects in Deli. Presumably, this difference between populations originated from their contrasting breeding history. The GS accuracy ranged from -0.41 to 0.94 and was positively correlated with the relationship between training and test sets. Training sets optimized with the so-called CDmean criterion gave the highest accuracies, ranging from 0.49 (pulp to fruit ratio in Group B) to 0.94 (fruit weight in Group B). The statistical methods did not affect the accuracy. Finally, Group B could be preselected for progeny tests by applying GS to key yield traits, therefore increasing the selection intensity. Our results should be valuable for breeding programs with small populations, long breeding cycles, or reduced effective size.

  3. Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs

    Science.gov (United States)

    Green, Richard E; Braun, Edward L; Armstrong, Joel; Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Vandewege, Michael W; St John, John A; Capella-Gutiérrez, Salvador; Castoe, Todd A; Kern, Colin; Fujita, Matthew K; Opazo, Juan C; Jurka, Jerzy; Kojima, Kenji K; Caballero, Juan; Hubley, Robert M; Smit, Arian F; Platt, Roy N; Lavoie, Christine A; Ramakodi, Meganathan P; Finger, John W; Suh, Alexander; Isberg, Sally R; Miles, Lee; Chong, Amanda Y; Jaratlerdsiri, Weerachai; Gongora, Jaime; Moran, Christopher; Iriarte, Andrés; McCormack, John; Burgess, Shane C; Edwards, Scott V; Lyons, Eric; Williams, Christina; Breen, Matthew; Howard, Jason T; Gresham, Cathy R; Peterson, Daniel G; Schmitz, Jürgen; Pollock, David D; Haussler, David; Triplett, Eric W; Zhang, Guojie; Irie, Naoki; Jarvis, Erich D; Brochu, Christopher A; Schmidt, Carl J; McCarthy, Fiona M; Faircloth, Brant C; Hoffmann, Federico G; Glenn, Travis C; Gabaldón, Toni; Paten, Benedict; Ray, David A

    2015-01-01

    To provide context for the diversifications of archosaurs, the group that includes crocodilians, dinosaurs and birds, we generated draft genomes of three crocodilians, Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the relatively rapid evolution of bird genomes represents an autapomorphy within that clade. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these new data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs. PMID:25504731

  4. The genome of Tetranychus urticae reveals herbivorous pest adaptations

    NARCIS (Netherlands)

    Grbić, M.; Van Leeuwen, T.; Clark, R.M.; Rombauts, S.; Grbić, V.; Osborne, E.J.; Dermauw, W.; Phuong, C.T.N.; Ortego, F.; Hernández-Crespo, P.; Diaz, I.; Martinez, M.; Navajas, M.; Sucena, E.; Magalhães, S.; Nagy, L.; Pace, R.M.; Djuranović, S.; Smagghe, G.; Iga, M.; Christiaens, O.; Veenstra, J.A.; Ewer, J.; Villalobos, R.M.; Hutter, J.L.; Hudson, S.D.; Velez, M.; Yi, S.V.; Zeng, J.; Pires-dasilva, A.; Roch, F.; Cazaux, M.; Navarro, M.; Zhurov, V.; Acevedo, G.; Bjelica, A.; Fawcett, J.A.; Bonnet, E.; Martens, C.; Baele, G.; Wissler, L.; Sanchez-Rodriguez, A.; Tirry, L.; Blais, C.; Demeestere, K.; Henz, S.R.; Gregory, T.R.; Mathieu, J.; Verdon, L.; Farinelli, L.; Schmutz, J.; Lindquist, E.; Feyereisen, R.; Van de Peer, Y.

    2011-01-01

    The spider mite Tetranychus urticae is a cosmopolitan agricultural pest with an extensive host plant range and an extreme record of pesticide resistance. Here we present the completely sequenced and annotated spider mite genome, representing the first complete chelicerate genome. At 90 megabases T.

  5. Adaptive selection on bracovirus genomes drives the specialization of Cotesia parasitoid wasps.

    Directory of Open Access Journals (Sweden)

    Séverine Jancek

    The geographic mosaic of coevolution predicts parasite virulence should be locally adapted to the host community. Cotesia parasitoid wasps adapt to local lepidopteran species possibly through their symbiotic bracovirus. The virus, essential for the parasitism success, is at the heart of the complex coevolutionary relationship linking the wasps and their hosts. The large segmented genome contained in the virus particles encodes virulence genes involved in host immune and developmental suppression. A coevolutionary arms race should result in the positive selection of particular beneficial alleles. To understand the global role of bracoviruses in the local adaptation or specialization of parasitoid wasps to their hosts, we studied the molecular evolution of four bracoviruses associated with wasps of the genus Cotesia, including C. congregata, C. vestalis, and new data and annotation on two ecologically differentiated populations of C. sesamiae, Kitale and Mombasa. Paired ortholog analyses revealed more genes under positive selection when comparing the two C. sesamiae bracoviruses belonging to the same species, and more genes under strong evolutionary constraint between species. Furthermore, branch-site evolutionary models showed that 17 genes, out of the 54 currently available shared by the four bracoviruses, harboured sites under positive selection, including: the histone H4-like, a C-type lectin, two ep1-like, ep2, a viral ankyrin, CrV1, a ben-domain, a Serine-rich, and eight unknown genes. Lastly, the phylogenetic analyses of the histone, ep2 and CrV1 genes in different African C. sesamiae populations showed that each gene described the individual relationships differently. In particular, we found recombination had happened between the ep2 and CrV1 genes, which are localized 37.5 kb apart on the wasp chromosomes. Involved in multidirectional coevolutionary interactions, C. sesamiae wasps rely on different bracovirus-mediated molecular pathways to overcome

  6. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity.

    Directory of Open Access Journals (Sweden)

    Nicolas Heslot

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.

  7. Impact of Marker Ascertainment Bias on Genomic Selection Accuracy and Estimates of Genetic Diversity

    Science.gov (United States)

    Heslot, Nicolas; Rutkoski, Jessica; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

    2013-01-01

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis. PMID:24040295
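
As a rough illustration of the minor allele frequency (MAF) comparison mentioned in this abstract, the sketch below computes a per-marker MAF from a genotype matrix that contains missing calls. The 0/1/2 allele-dosage coding, the use of `None` for a missing call, and the function names are assumptions made for this example; none of them comes from the study itself.

```python
import math

def minor_allele_frequency(dosages):
    """MAF for one marker from 0/1/2 allele dosages; None marks a missing call."""
    called = [d for d in dosages if d is not None]
    if not called:
        return math.nan  # no usable genotype calls for this marker
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1 - freq)              # fold to the minor allele

def maf_spectrum(matrix):
    """MAF per marker for a lines-by-markers matrix (one row per breeding line)."""
    return [minor_allele_frequency(column) for column in zip(*matrix)]

# Hypothetical toy data: 4 lines, 3 markers, two missing calls.
toy = [
    [0, 2, None],
    [2, 2, 0],
    [0, None, 0],
    [0, 0, 2],
]
spectrum = maf_spectrum(toy)
```

Missing calls are dropped marker by marker rather than imputed, which is only one of several possible ways to handle the high missing-data rate typical of GBS.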

  8. Collective Dynamics of Specific Gene Ensembles Crucial for Neutrophil Differentiation: The Existence of Genome Vehicles Revealed

    Science.gov (United States)

    Giuliani, Alessandro; Tomita, Masaru

    2010-01-01

    Cell fate decision remarkably generates specific cell differentiation path among the multiple possibilities that can arise through the complex interplay of high-dimensional genome activities. The coordinated action of thousands of genes to switch cell fate decision has indicated the existence of stable attractors guiding the process. However, origins of the intracellular mechanisms that create “cellular attractor” still remain unknown. Here, we examined the collective behavior of genome-wide expressions for neutrophil differentiation through two different stimuli, dimethyl sulfoxide (DMSO) and all-trans-retinoic acid (atRA). To overcome the difficulties of dealing with single gene expression noises, we grouped genes into ensembles and analyzed their expression dynamics in correlation space defined by Pearson correlation and mutual information. The standard deviation of correlation distributions of gene ensembles reduces when the ensemble size is increased following the inverse square root law, for both ensembles chosen randomly from whole genome and ranked according to expression variances across time. Choosing the ensemble size of 200 genes, we show the two probability distributions of correlations of randomly selected genes for atRA and DMSO responses overlapped after 48 hours, defining the neutrophil attractor. Next, tracking the ranked ensembles' trajectories, we noticed that only certain, not all, fall into the attractor in a fractal-like manner. The removal of these genome elements from the whole genomes, for both atRA and DMSO responses, destroys the attractor providing evidence for the existence of specific genome elements (named “genome vehicle”) responsible for the neutrophil attractor. Notably, within the genome vehicles, genes with low or moderate expression changes, which are often considered noisy and insignificant, are essential components for the creation of the neutrophil attractor. Further investigations along with our findings might

  9. Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

    Directory of Open Access Journals (Sweden)

    Ueki Masao

    2012-05-01

    Background: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. Individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), however, involves difficulty in setting an overall p-value due to the complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. A number of SNP-SNP pairs larger than the sample size, the so-called large p small n problem, precludes simultaneous analysis using multiple regression. A method that overcomes the above issues is thus needed. Results: We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) for appropriate handling of the large number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods and a subsequent variable selection procedure in the SIS method, suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (General-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium) data. Conclusions: Based on the machine-learning principle, the proposed method gives a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
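
The screening idea in this abstract can be sketched in a few lines: score every SNP-SNP pair by a marginal association with case/control status, rank the pairs, and keep only the top few for later model fitting. The sketch below is not EPISIS. It assumes a simple product coding of the two dosages for the interaction term and uses the absolute Pearson correlation as a stand-in for the marginal logistic-regression utility; the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def screen_pairs(genotypes, status, keep):
    """Rank all SNP pairs by |corr(interaction, status)| and return the top `keep`.

    genotypes: dict mapping SNP name -> list of allele dosages (assumed coding).
    status:    list of 0/1 case-control labels, same length as each dosage list.
    """
    scores = []
    for a, b in combinations(sorted(genotypes), 2):
        # Assumed interaction coding: product of the two dosages.
        interaction = [u * v for u, v in zip(genotypes[a], genotypes[b])]
        scores.append((abs(pearson(interaction, status)), (a, b)))
    scores.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scores[:keep]]

# Hypothetical toy data in which status equals the product of s1 and s2.
status = [1, 0, 0, 1, 0, 1, 0, 0]
genotypes = {
    "s1": [1, 1, 0, 1, 0, 1, 1, 0],
    "s2": [1, 0, 1, 1, 0, 1, 0, 1],
    "s3": [0, 1, 0, 1, 1, 0, 0, 1],
}
top = screen_pairs(genotypes, status, keep=1)
```

The number of pairs grows quadratically with the number of SNPs, which is consistent with the abstract's reliance on GPGPU hardware for an exhaustive search.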

  10. Analysis of an RNA-seq Strand-Specific Library from an East Timorese Cucumber Sample Reveals a Complete Cucurbit aphid-borne yellows virus Genome.

    Science.gov (United States)

    Maina, Solomon; Edwards, Owain R; de Almeida, Luis; Ximenes, Abel; Jones, Roger A C

    2017-05-11

    Analysis of an RNA-seq library from cucumber leaf RNA extracted from a fast technology for analysis of nucleic acids (FTA) card revealed the first complete genome of Cucurbit aphid-borne yellows virus (CABYV) from East Timor. We compare it with 35 complete CABYV genomes from other world regions. It most resembled the genome of the South Korean isolate HD118. Copyright © 2017 Maina et al.

  11. Trait-specific long-term consequences of genomic selection in beef cattle.

    Science.gov (United States)

    de Rezende Neves, Haroldo Henrique; Carvalheiro, Roberto; de Queiroz, Sandra Aidar

    2018-02-01

    Simulation studies allow addressing consequences of selection schemes, helping to identify effective strategies to enable genetic gain and maintain genetic diversity. The aim of this study was to evaluate the long-term impact of genomic selection (GS) in genetic progress and genetic diversity of beef cattle. Forward-in-time simulation generated a population with pattern of linkage disequilibrium close to that previously reported for real beef cattle populations. Different scenarios of GS and traditional pedigree-based BLUP (PBLUP) selection were simulated for 15 generations, mimicking selection for female reproduction and meat quality. For GS scenarios, an alternative selection criterion was simulated (wGBLUP), intended to enhance long-term gains by attributing more weight to favorable alleles with low frequency. GS allowed genetic progress up to 40% greater than PBLUP, for female reproduction and meat quality. The alternative criterion wGBLUP did not increase long-term response, although allowed reducing inbreeding rates and loss of favorable alleles. The results suggest that GS outperforms PBLUP when the selected trait is under less polygenic background and that attributing more weight to low-frequency favorable alleles can reduce inbreeding rates and loss of favorable alleles in GS.

  12. Genomic landscapes of Chinese hamster ovary cell lines as revealed by the Cricetulus griseus draft genome

    DEFF Research Database (Denmark)

    Lewis, Nathan E; Liu, Xin; Li, Yuxiang

    2013-01-01

    stymied by the lack of a unifying genomic resource for CHO cells. Here we report a 2.4-Gb draft genome sequence of a female Chinese hamster, Cricetulus griseus, harboring 24,044 genes. We also resequenced and analyzed the genomes of six CHO cell lines from the CHO-K1, DG44 and CHO-S lineages...

  13. Infidelity of SARS-CoV Nsp14-exonuclease mutant virus replication is revealed by complete genome sequencing.

    Directory of Open Access Journals (Sweden)

    Lance D Eckerle

    2010-05-01

    Most RNA viruses lack the mechanisms to recognize and correct mutations that arise during genome replication, resulting in quasispecies diversity that is required for pathogenesis and adaptation. However, it is not known how viruses encoding large viral RNA genomes such as the Coronaviridae (26 to 32 kb) balance the requirements for genome stability and quasispecies diversity. Further, the limits of replication infidelity during replication of large RNA genomes and how decreased fidelity impacts virus fitness over time are not known. Our previous work demonstrated that genetic inactivation of the coronavirus exoribonuclease (ExoN) in nonstructural protein 14 (nsp14) of murine hepatitis virus results in a 15-fold decrease in replication fidelity. However, it is not known whether nsp14-ExoN is required for replication fidelity of all coronaviruses, nor the impact of decreased fidelity on genome diversity and fitness during replication and passage. We report here the engineering and recovery of nsp14-ExoN mutant viruses of severe acute respiratory syndrome coronavirus (SARS-CoV) that have stable growth defects and demonstrate a 21-fold increase in mutation frequency during replication in culture. Analysis of complete genome sequences from SARS-ExoN mutant viral clones revealed unique mutation sets in every genome examined from the same round of replication and a total of 100 unique mutations across the genome. Using novel bioinformatic tools and deep sequencing across the full-length genome following 10 population passages in vitro, we demonstrate retention of ExoN mutations and continued increased diversity and mutational load compared to wild-type SARS-CoV. The results define a novel genetic and bioinformatics model for introduction and identification of multi-allelic mutations in replication competent viruses that will be powerful tools for testing the effects of decreased fidelity and increased quasispecies diversity on viral replication

  14. Comparative Genomics Revealed Genetic Diversity and Species/Strain-Level Differences in Carbohydrate Metabolism of Three Probiotic Bifidobacterial Species

    Directory of Open Access Journals (Sweden)

    Toshitaka Odamaki

    2015-01-01

    Strains of Bifidobacterium longum, Bifidobacterium breve, and Bifidobacterium animalis are widely used as probiotics in the food industry. Although numerous studies have revealed the properties and functionality of these strains, it is uncertain whether these characteristics are species common or strain specific. To address this issue, we performed a comparative genomic analysis of 49 strains belonging to these three bifidobacterial species to describe their genetic diversity and to evaluate species-level differences. There were 166 common clusters between strains of B. breve and B. longum, whereas there were nine common clusters between strains of B. animalis and B. longum and four common clusters between strains of B. animalis and B. breve. Further analysis focused on carbohydrate metabolism revealed the existence of certain strain-dependent genes, such as those encoding enzymes for host glycan utilisation or certain membrane transporters, and many genes commonly distributed at the species level, as was previously reported in studies with limited strains. As B. longum and B. breve are human-residential bifidobacteria (HRB), whereas B. animalis is a non-HRB species, several of the differences in these species’ gene distributions might be the result of their adaptations to the nutrient environment. This information may aid both in selecting probiotic candidates and in understanding their potential function as probiotics.

  15. Genetic Gain and Inbreeding from Genomic Selection in a Simulated Commercial Breeding Program for Perennial Ryegrass

    Directory of Open Access Journals (Sweden)

    Zibei Lin

    2016-03-01

    Genomic selection (GS) provides an attractive option for accelerating genetic gain in perennial ryegrass (Lolium perenne) improvement given the long cycle times of most current breeding programs. The present study used simulation to investigate the level of genetic gain and inbreeding obtained from GS breeding strategies compared with traditional breeding strategies for key traits (persistency, yield, and flowering time). Base population genomes were simulated through random mating for 60,000 generations at an effective population size of 10,000. The degree of linkage disequilibrium (LD) in the resulting population was compared with that obtained from empirical studies. Initial parental varieties were simulated to match diversity of current commercial cultivars. Genomic selection was designed to fit into a company breeding program at two selection points in the breeding cycle (spaced plants and miniplot). Genomic estimated breeding values (GEBVs) for productivity traits were trained with phenotypes and genotypes from plots. Accuracy of GEBVs was 0.24 for persistency and 0.36 for yield for single plants, while for plots it was lower (0.17 and 0.19, respectively). Higher accuracy of GEBVs was obtained for flowering time (up to 0.7), partially as a result of the larger reference population size that was available from the clonal row stage. The availability of GEBVs permits a 4-yr reduction in cycle time, which led to at least a doubling and trebling of genetic gain for persistency and yield, respectively, relative to the traditional program. However, a higher rate of inbreeding per cycle among varieties was also observed for the GS strategy.

  16. Genetic Gain and Inbreeding from Genomic Selection in a Simulated Commercial Breeding Program for Perennial Ryegrass.

    Science.gov (United States)

    Lin, Zibei; Cogan, Noel O I; Pembleton, Luke W; Spangenberg, German C; Forster, John W; Hayes, Ben J; Daetwyler, Hans D

    2016-03-01

    Genomic selection (GS) provides an attractive option for accelerating genetic gain in perennial ryegrass (Lolium perenne) improvement given the long cycle times of most current breeding programs. The present study used simulation to investigate the level of genetic gain and inbreeding obtained from GS breeding strategies compared with traditional breeding strategies for key traits (persistency, yield, and flowering time). Base population genomes were simulated through random mating for 60,000 generations at an effective population size of 10,000. The degree of linkage disequilibrium (LD) in the resulting population was compared with that obtained from empirical studies. Initial parental varieties were simulated to match diversity of current commercial cultivars. Genomic selection was designed to fit into a company breeding program at two selection points in the breeding cycle (spaced plants and miniplot). Genomic estimated breeding values (GEBVs) for productivity traits were trained with phenotypes and genotypes from plots. Accuracy of GEBVs was 0.24 for persistency and 0.36 for yield for single plants, while for plots it was lower (0.17 and 0.19, respectively). Higher accuracy of GEBVs was obtained for flowering time (up to 0.7), partially as a result of the larger reference population size that was available from the clonal row stage. The availability of GEBVs permits a 4-yr reduction in cycle time, which led to at least a doubling and trebling of genetic gain for persistency and yield, respectively, relative to the traditional program. However, a higher rate of inbreeding per cycle among varieties was also observed for the GS strategy. Copyright © 2016 Crop Science Society of America.
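
The link this abstract draws between shorter cycles and higher gain can be read through the standard breeder's equation for genetic gain per unit time. The symbols below are textbook notation rather than the authors' own: $i$ is the selection intensity, $r$ the accuracy of the selection criterion (here the GEBV accuracy), $\sigma_A$ the additive genetic standard deviation, and $L$ the cycle length in years.

```latex
\[
  \frac{\Delta G}{\text{year}} = \frac{i \, r \, \sigma_A}{L},
  \qquad
  \frac{\Delta G_{\mathrm{GS}}}{\Delta G_{\mathrm{trad}}}
    = \frac{r_{\mathrm{GS}}}{r_{\mathrm{trad}}} \cdot \frac{L_{\mathrm{trad}}}{L_{\mathrm{GS}}}
  \quad \text{(assuming equal } i \text{ and } \sigma_A \text{)}.
\]
```

Under these simplifying assumptions, a modest accuracy $r_{\mathrm{GS}}$ can still give a higher gain per year when $L_{\mathrm{GS}}$ is much shorter than $L_{\mathrm{trad}}$.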

  17. Genome-wide identification of SAUR genes in watermelon (Citrullus lanatus).

    Science.gov (United States)

    Zhang, Na; Huang, Xing; Bao, Yaning; Wang, Bo; Zeng, Hongxia; Cheng, Weishun; Tang, Mi; Li, Yuhua; Ren, Jian; Sun, Yuhong

    2017-07-01

    The early auxin responsive SAUR family is an important gene family in auxin signal transduction. We here present the first report of a genome-wide identification of SAUR genes in the watermelon genome. We successfully identified 65 ClaSAURs and provide a genomic framework for future study of these genes. Phylogenetic results revealed a Cucurbitaceae-specific SAUR subfamily and contribute to the understanding of the evolutionary pattern of SAUR genes in plants. Quantitative RT-PCR analysis confirmed the expression of 11 randomly selected SAUR genes in watermelon tissues. ClaSAUR36 was highly expressed in fruit, and further study of it might bring a new perspective on watermelon fruit development. Moreover, correlation analysis revealed similar expression profiles of SAUR genes between watermelon and Arabidopsis during shoot organogenesis. This work gives new support for the conserved auxin machinery in plants.

  18. Orchestrating the Selection and Packaging of Genomic RNA by Retroviruses: An Ensemble of Viral and Host Factors

    Science.gov (United States)

    Kaddis Maldonado, Rebecca J.; Parent, Leslie J.

    2016-01-01

    Infectious retrovirus particles contain two copies of unspliced viral RNA that serve as the viral genome. Unspliced retroviral RNA is transcribed in the nucleus by the host RNA polymerase II and has three potential fates: (1) it can be spliced into subgenomic messenger RNAs (mRNAs) for the translation of viral proteins; or it can remain unspliced to serve as either (2) the mRNA for the translation of Gag and Gag–Pol; or (3) the genomic RNA (gRNA) that is packaged into virions. The Gag structural protein recognizes and binds the unspliced viral RNA to select it as a genome, which is selected in preference to spliced viral RNAs and cellular RNAs. In this review, we summarize the current state of understanding about how retroviral packaging is orchestrated within the cell and explore potential new mechanisms based on recent discoveries in the field. We discuss the cis-acting elements in the unspliced viral RNA and the properties of the Gag protein that are required for their interaction. In addition, we discuss the role of host factors in influencing the fate of the newly transcribed viral RNA, current models for how retroviruses distinguish unspliced viral mRNA from viral genomic RNA, and the possible subcellular sites of genomic RNA dimerization and selection by Gag. Although this review centers primarily on the wealth of data available for the alpharetrovirus Rous sarcoma virus, in which a discrete RNA packaging sequence has been identified, we have also summarized the cis- and trans-acting factors as well as the mechanisms governing gRNA packaging of other retroviruses for comparison. PMID:27657110

  19. Culture independent genomic comparisons reveal environmental adaptations for Altiarchaeales

    Directory of Open Access Journals (Sweden)

    Jordan T Bird

    2016-08-01

    The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus (Ca.) Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, we sequenced a single cell amplified genome (SAG), WOR_SCG_SM1, and used it to identify and refine two high-quality genomes from metagenomes, WOR_79 and WOR_86-2, from the same site in a different year. These three genomic reconstructions form a monophyletic group which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, causes the protein to be encoded as two subunits at distant loci. Consistent with the terrestrial spring clades, our estuarine genomes contain a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identify two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which is more widespread, diverse, and not associated with visible mats. The core Alti-1 genome supports Alti-1 as adapted for the stream environment, with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggests that members of this clade are free-living, with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. These

  20. Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

    Science.gov (United States)

    Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

    2017-10-18

    Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and humans. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance: In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. The experimental methods developed to carry out this work are novel, simple and transferable to the