Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...
Gao, Yangchun; Li, Shiguo; Zhan, Aibin
Invasive species cause huge damages to ecology, environment and economy globally. The comprehensive understanding of invasion mechanisms, particularly genetic bases of micro-evolutionary processes responsible for invasion success, is essential for reducing potential damages caused by invasive species. The golden star tunicate, Botryllus schlosseri, has become a model species in invasion biology, mainly owing to its high invasiveness nature and small well-sequenced genome. However, the genome-wide genetic markers have not been well developed in this highly invasive species, thus limiting the comprehensive understanding of genetic mechanisms of invasion success. Using restriction site-associated DNA (RAD) tag sequencing, here we developed a high-quality resource of 14,119 out of 158,821 SNPs for B. schlosseri. These SNPs were relatively evenly distributed at each chromosome. SNP annotations showed that the majority of SNPs (63.20%) were located at intergenic regions, and 21.51% and 14.58% were located at introns and exons, respectively. In addition, the potential use of the developed SNPs for population genomics studies was primarily assessed, such as the estimate of observed heterozygosity (H O ), expected heterozygosity (H E ), nucleotide diversity (π), Wright's inbreeding coefficient (F IS ) and effective population size (Ne). Our developed SNP resource would provide future studies the genome-wide genetic markers for genetic and genomic investigations, such as genetic bases of micro-evolutionary processes responsible for invasion success.
Full Text Available Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with the 4bp-restriction endonuclease This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the digestion with serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0%, in rice ( and soybean (, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.
Levinson, Douglas F; Shi, Jianxin; Wang, Kai
The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs).......The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs)....
Reddy, Umesh K; Nimmakayala, Padma; Levi, Amnon; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Tomason, Yan R; Vajja, Gopinath; Reddy, Rishi; Abburi, Lavanya; Wehner, Todd C; Ronin, Yefim; Karol, Abraham
We used genotyping by sequencing to identify a set of 10,480 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1096 cM for watermelon. We assessed the genome-wide variation in recombination rate (GWRR) across the map and found an association between GWRR and genome-wide nucleotide diversity. Collinearity between the map and the genome-wide reference sequence for watermelon was studied to identify inconsistency and chromosome rearrangements. We assessed genome-wide nucleotide diversity, linkage disequilibrium (LD), and selective sweep for wild, semi-wild, and domesticated accessions of Citrullus lanatus var. lanatus to track signals of domestication. Principal component analysis combined with chromosome-wide phylogenetic study based on 1563 SNPs obtained after LD pruning with minor allele frequency of 0.05 resolved the differences between semi-wild and wild accessions as well as relationships among worldwide sweet watermelon. Population structure analysis revealed predominant ancestries for wild, semi-wild, and domesticated watermelons as well as admixture of various ancestries that were important for domestication. Sliding window analysis of Tajima's D across various chromosomes was used to resolve selective sweep. LD decay was estimated for various chromosomes. We identified a strong selective sweep on chromosome 3 consisting of important genes that might have had a role in sweet watermelon domestication. Copyright © 2014 Reddy et al.
Li, M-H; Tiirikka, T; Kantanen, J
In sheep, coat colour (and pattern) is one of the important traits of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three kno...
Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J
In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (F ws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide F ws fixation index (r = -0.88, P 10 % had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (F ws).
Caicedo, Ana L; Williamson, Scott H; Hernandez, Ryan D
Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments......, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models...... to explain contemporary patterns of polymorphisms in rice, including a (i) selectively neutral population bottleneck model, (ii) bottleneck plus migration model, (iii) multiple selective sweeps model, and (iv) bottleneck plus selective sweeps model. We find that a simple bottleneck model, which has been...
This is because the SNPs on BovineSNP50 and GGP-80K assays were ascertained as being common in European taurine breeds. Lower MAF and SNP informativeness observed in this study limits the application of these assays in breed assignment, and could have other implications for genome-wide studies in South ...
Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.
Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the Eu...... 425 loci and 376 918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome...
Manivannan, Abinaya; Kim, Jin-Hee; Yang, Eun-Young; Ahn, Yul-Kyun; Lee, Eun-Su; Choi, Sena; Kim, Do-Sun
Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. Owing to meet the growing demand for pepper with high quality, organoleptic property, nutraceutical contents, and disease tolerance, genomics assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has reformed the plant breeding technology especially in the area of molecular marker assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, the NGS methods facilitate the genome-wide discovery of DNA based markers linked to key genes involved in important biological phenomenon. Among the molecular markers, single nucleotide polymorphism (SNP) indulges various benefits in comparison with other existing DNA based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in future.
Full Text Available Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. Owing to meet the growing demand for pepper with high quality, organoleptic property, nutraceutical contents, and disease tolerance, genomics assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS approaches has reformed the plant breeding technology especially in the area of molecular marker assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, the NGS methods facilitate the genome-wide discovery of DNA based markers linked to key genes involved in important biological phenomenon. Among the molecular markers, single nucleotide polymorphism (SNP indulges various benefits in comparison with other existing DNA based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in future.
Grønlund, Hugo Ahlm; Moen, Birgitte; Hoorfar, Jeffrey
A major challenge with single-nucleotide polymorphism (SNP) fingerprinting of bacteria and higher organisms is the combination of genome-wide screenings with the potential of multiplexing and accurate SNP detection. Single-nucleotide extension by the minisequencing principle represents a technolo...
Full Text Available Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in human, wild life, model organisms or farm animals. However, non-additive genetic effects may have an important contribution to total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relation matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP markers. In addition, this study for the first time proposed a method to construct dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was implemented to investigate the amounts of additive genetic, dominance and epistatic variations, and assessed the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1 a simple additive genetic model (MA, 2 a model including both additive and additive by additive epistatic genetic effects (MAE, 3 a model including both additive and dominance genetic effects (MAD, and 4 a full model including all three genetic components (MAED. Estimates of narrow-sense heritability were 0.397, 0.373, 0.379 and 0.357 for models MA, MAE, MAD and MAED, respectively. Estimated dominance variance and additive by additive epistatic variance accounted for 5.6% and 9.5% of the total phenotypic variance, respectively. Based on model MAED, the estimate of broad-sense heritability was 0.506. Reliabilities of genomic predicted breeding values for the animals without performance records were 28.5%, 28.8%, 29.2% and 29.5% for models MA, MAE, MAD and MAED, respectively. In addition, models including non-additive genetic effects improved unbiasedness of genomic predictions.
Gómez-Romero, Laura; Palacios-Flores, Kim; Reyes, José; García, Delfino; Boege, Margareta; Dávila, Guillermo; Flores, Margarita; Schatz, Michael C; Palacios, Rafael
The precise determination of de novo genetic variants has enormous implications across different fields of biology and medicine, particularly personalized medicine. Currently, de novo variations are identified by mapping sample reads from a parent-offspring trio to a reference genome, allowing for a certain degree of differences. While widely used, this approach often introduces false-positive (FP) results due to misaligned reads and mischaracterized sequencing errors. In a previous study, we developed an alternative approach to accurately identify single nucleotide variants (SNVs) using only perfect matches. However, this approach could be applied only to haploid regions of the genome and was computationally intensive. In this study, we present a unique approach, coverage-based single nucleotide variant identification (COBASI), which allows the exploration of the entire genome using second-generation short sequence reads without extensive computing requirements. COBASI identifies SNVs using changes in coverage of exactly matching unique substrings, and is particularly suited for pinpointing de novo SNVs. Unlike other approaches that require population frequencies across hundreds of samples to filter out any methodological biases, COBASI can be applied to detect de novo SNVs within isolated families. We demonstrate this capability through extensive simulation studies and by studying a parent-offspring trio we sequenced using short reads. Experimental validation of all 58 candidate de novo SNVs and a selection of non-de novo SNVs found in the trio confirmed zero FP calls. COBASI is available as open source at https://github.com/Laura-Gomez/COBASI for any researcher to use. Copyright © 2018 the Author(s). Published by PNAS.
Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro
To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7 x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions decreasing genetic diversity which might be
Full Text Available Hereditary 1,25-dihydroxyvitamin D-resistant rickets (HVDRR is an autosomal recessive disease caused by biallelic mutations in the vitamin D receptor (VDR gene. No patients have been reported with uniparental disomy (UPD.Using genome-wide single nucleotide polymorphism (SNP array to confirm whether HVDRR was caused by UPD of chromosome 12.A 2-year-old girl with alopecia and short stature and without any family history of consanguinity was diagnosed with HVDRR by typical laboratory data findings and clinical features of rickets. Sequence analysis of VDR was performed, and the origin of the homozygous mutation was investigated by target SNP sequencing, short tandem repeat analysis, and genome-wide SNP array.The patient had a homozygous p.Arg73Ter nonsense mutation. Her mother was heterozygous for the mutation, but her father was negative. We excluded gross deletion of the father's allele or paternal discordance. Genome-wide SNP array of the family (the patient and her parents showed complete maternal isodisomy of chromosome 12. She was successfully treated with high-dose oral calcium.This is the first report of HVDRR caused by UPD, and the third case of complete UPD of chromosome 12, in the published literature. Genome-wide SNP array was useful for detecting isodisomy and the parental origin of the allele. Comprehensive examination of the homozygous state is essential for accurate genetic counseling of recurrence risk and appropriate monitoring for other chromosome 12 related disorders. Furthermore, oral calcium therapy was effective as an initial treatment for rickets in this instance.
Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E
Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms including HLA-DQA1 and C6orf10, were clustered within the Major Histo-compatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10 -8 ). Technical validation of top ranking non-Major Histo-compatablity complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10 -4 . Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism rs9644119 downstream of DPYSL2 showed some evidence of association supported by finding in the replication cohort that warrants further study. Pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histo-compatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.
Apr 1, 2010 ... Genome-wide association studies (GWAS) examine the entire human genome with the goal of identifying genetic variants. (usually single nucleotide polymorphisms (SNPs)) that are associated with phenotypic traits such as disease status and drug response. The discordance of significantly associated ...
Full Text Available Cardiovascular diseases are a large contributor to causes of early death in developed countries. Some of these conditions, such as sudden cardiac death and atrial fibrillation, stem from arrhythmias—a spectrum of conditions with abnormal electrical activity in the heart. Genome-wide association studies can identify single nucleotide variations (SNVs that may predispose individuals to developing acquired forms of arrhythmias. Through manual curation of published genome-wide association studies, we have collected a comprehensive list of 75 SNVs associated with cardiac arrhythmias. Ten of the SNVs result in amino acid changes and can be used in proteomic-based detection methods. In an effort to identify additional non-synonymous mutations that affect the proteome, we analyzed the post-translational modification S-nitrosylation, which is known to affect cardiac arrhythmias. We identified loss of seven known S-nitrosylation sites due to non-synonymous single nucleotide variations (nsSNVs. For predicted nitrosylation sites we found 1429 proteins where the sites are modified due to nsSNV. Analysis of the predicted S-nitrosylation dataset for over- or under-representation (compared to the complete human proteome of pathways and functional elements shows significant statistical over-representation of the blood coagulation pathway. Gene Ontology (GO analysis displays statistically over-represented terms related to muscle contraction, receptor activity, motor activity, cystoskeleton components, and microtubule activity. Through the genomic and proteomic context of SNVs and S-nitrosylation sites presented in this study, researchers can look for variation that can predispose individuals to cardiac arrhythmias. Such attempts to elucidate mechanisms of arrhythmia thereby add yet another useful parameter in predicting susceptibility for cardiac diseases.
Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
Bernatsky, Sasha; Velásquez García, Héctor A; Spinelli, John; Gaffney, Patrick; Smedby, Karin E; Ramsey-Goldman, Rosalind; Wang, Sophia S.; Adami, Hans-Olov; Albanes, Demetrius; Angelucci, Emanuele; Ansell, Stephen M.; Asmann, Yan W.; Becker, Nikolaus; Benavente, Yolanda; Berndt, Sonja I.; Bertrand, Kimberly A.; Birmann, Brenda M.; Boeing, Heiner; Boffetta, Paolo; Bracci, Paige M.; Brennan, Paul; Brooks-Wilson, Angela R.; Cerhan, James R.; Chanock, Stephen J.; Clavel, Jacqueline; Conde, Lucia; Cotenbader, Karen H; Cox, David G; Cozen, Wendy; Crouch, Simon; De Roos, Anneclaire J.; De Sanjose, Silvia; Di Lollo, Simonetta; Diver, W. Ryan; Dogan, Ahmet; Foretova, Lenka; Ghesquières, Hervé; Giles, Graham G.; Glimelius, Bengt; Habermann, Thomas M.; Haioun, Corinne; Hartge, Patricia; Hjalgrim, Henrik; Holford, Theodore R.; Holly, Elizabeth A.; Jackson, Rebecca D.; Kaaks, Rudolph; Kane, Eleanor; Kelly, Rachel S.; Klein, Robert J.; Kraft, Peter; Kricker, Anne; Lan, Qing; Lawrence, Charles; Liebow, Mark; Lightfoot, Tracy; Link, Brian K.; Maynadie, Marc; McKay, James; Melbye, Mads; Molina, Thierry Jo; Monnereau, Alain; Morton, Lindsay M.; Nieters, Alexandra; North, Kari E.; Novak, Anne J.; Offit, Kenneth; Purdue, Mark P.; Rais, Marco; Riby, Jacques; Roman, Eve; Rothman, Nathaniel; Salles, Gilles; Severi, Gianluca; Severson, Richard K.; Skibola, Christine F.; Slager, Susan L.; Smith, Alex; Smith, Martyn T.; Southey, Melissa C.; Staines, Anthony; Teras, Lauren R.; Thompson, Carrie A.; Tilly, Hervé; Tinker, Lesley F.; Tjonneland, Anne; Turner, Jenny; Vajdic, Claire M.; Vermeulen, Roel C H; Vijai, Joseph; Vineis, Paolo; Virtamo, Jarmo; Wang, Zhaoming; Weinstein, Stephanie; Witzig, Thomas E.; Zelenetz, Andrew; Zeleniuch-Jacquotte, Anne; Zhang, Yawei; Zheng, Tongzhang; Zucca, Mariagrazia; Clarke, Ann E
Objective: Determinants of the increased risk of diffuse large B-cell lymphoma (DLBCL) in SLE are unclear. Using data from a recent lymphoma genome-wide association study (GWAS), we assessed whether certain lupus-related single nucleotide polymorphisms (SNPs) were also associated with DLBCL.
Full Text Available Age at first calving is an important trait for achieving earlier reproductive performance. To detect quantitative trait loci (QTL for reproductive traits, a genome wide association study was conducted on the 96 Hanwoo cows that were born between 2008 and 2010 from 13 sires in a local farm (Juk-Am Hanwoo farm, Suncheon, Korea and genotyped with the Illumina 50K bovine single nucleotide polymorphism (SNP chips. Phenotypes were regressed on additive and dominance effects for each SNP using a simple linear regression model after the effects of birth-year-month and polygenes were considered. A forward regression procedure was applied to determine the best set of SNPs for age at first calving. A total of 15 QTL were detected at the comparison-wise 0.001 level. Two QTL with strong statistical evidence were found at 128.9 Mb and 111.1 Mb on bovine chromosomes (BTA 2 and 7, respectively, each of which accounted for 22% of the phenotypic variance. Also, five significant SNPs were detected on BTAs 10, 16, 20, 26, and 29. Multiple QTL were found on BTAs 1, 2, 7, and 14. The significant QTLs may be applied via marker assisted selection to increase rate of genetic gain for the trait, after validation tests in other Hanwoo cow populations.
Meghann K. Devlin-Durante
Full Text Available The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.
Devlin-Durante, Meghann K; Baums, Iliana B
The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata , to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.
Full Text Available Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs. Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.
Lu, Y.; Chen, X.; Beesley, J.; Johnatty, S.E.; Defazio, A.; Lambrechts, S.; Lambrechts, D.; Despierre, E.; Vergotes, I.; Chang-Claude, J.; Hein, R.; Nickels, S.; Wang-Gohrke, S.; Dork, T.; Durst, M.; Antonenkova, N.; Bogdanova, N.; Goodman, M.T.; Lurie, G.; Wilkens, L.R.; Carney, M.E.; Butzow, R.; Nevanlinna, H.; Heikkinen, T.; Leminen, A.; Kiemeney, L.A.L.M.; Massuger, L.F.A.G.; Altena, A.M. van; Aben, K.K.H.; Kjaer, S.K.; Hogdall, E.; Jensen, A.; Brooks-Wilson, A.; Le, N.; Cook, L.; Earp, M.; Kelemen, L.; Easton, D.; Pharoah, P.; Song, H.; Tyrer, J.; Ramus, S.; Menon, U.; Gentry-Maharaj, A.; Gayther, S.A.; Bandera, E.V.; Olson, S.H.; Orlow, I.; Rodriguez-Rodriguez, L.; MacGregor, S.; Chenevix-Trench, G.
Recent Genome-Wide Association Studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate- or low-penetrance variants exist among the subset of single-nucleotide polymorphisms (SNPs) not well tagged by the genotyping arrays used in
van den Berg, Stéphanie M; de Moor, Marleen H M; Verweij, K. J. H.
small sample sizes of those studies. Here, we report on a large meta-analysis of GWA studies for extraversion in 63,030 subjects in 29 cohorts. Extraversion item data from multiple personality inventories were harmonized across inventories and cohorts. No genome-wide significant associations were found...... at the single nucleotide polymorphism (SNP) level but there was one significant hit at the gene level for a long non-coding RNA site (LOC101928162). Genome-wide complex trait analysis in two large cohorts showed that the additive variance explained by common SNPs was not significantly different from zero...
Vilella Albert J
Full Text Available Abstract Background DNA sequence polymorphisms analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The recent ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Current available methods and software, however, are unsatisfactory for such genome-wide analysis. Results We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on a coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have also incorporated a graphical-user interface. The main features of this software are: i exhaustive population-genetic analyses including those based on the coalescent theory; ii analysis adapted to the shallow data generated by the high-throughput genome projects; iii use of genome annotations to conduct a comprehensive analyses separately for different functional regions; iv identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg
Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification...... of SNPs. This will allow acquisition of more information from the sample materials and open up for new possibilities as well as new challenges....
Single nucleotide polymorphisms (SNPs) may be considered the ultimate genetic markers as they represent the finest resolution of a DNA sequence (a single nucleotide), and are generally abundant in populations with a low mutation rate. SNPs are important tools in studying complex genetic traits and genome evolution.
Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang; Yerges-Armstrong, Laura M; Chou, Wen-Chi; Stolk, Lisette; Livshits, Gregory; Broer, Linda; Johnson, Toby; Koller, Daniel L; Kutalik, Zoltán; Luan, Jian'an; Malkin, Ida; Ried, Janina S; Smith, Albert V; Thorleifsson, Gudmar; Vandenput, Liesbeth; Hua Zhao, Jing; Zhang, Weihua; Aghdassi, Ali; Åkesson, Kristina; Amin, Najaf; Baier, Leslie J; Barroso, Inês; Bennett, David A; Bertram, Lars; Biffar, Rainer; Bochud, Murielle; Boehnke, Michael; Borecki, Ingrid B; Buchman, Aron S; Byberg, Liisa; Campbell, Harry; Campos Obanda, Natalia; Cauley, Jane A; Cawthon, Peggy M; Cederberg, Henna; Chen, Zhao; Cho, Nam H; Jin Choi, Hyung; Claussnitzer, Melina; Collins, Francis; Cummings, Steven R; De Jager, Philip L; Demuth, Ilja; Dhonukshe-Rutten, Rosalie A M; Diatchenko, Luda; Eiriksdottir, Gudny; Enneman, Anke W; Erdos, Mike; Eriksson, Johan G; Eriksson, Joel; Estrada, Karol; Evans, Daniel S; Feitosa, Mary F; Fu, Mao; Garcia, Melissa; Gieger, Christian; Girke, Thomas; Glazer, Nicole L; Grallert, Harald; Grewal, Jagvir; Han, Bok-Ghee; Hanson, Robert L; Hayward, Caroline; Hofman, Albert; Hoffman, Eric P; Homuth, Georg; Hsueh, Wen-Chi; Hubal, Monica J; Hubbard, Alan; Huffman, Kim M; Husted, Lise B; Illig, Thomas; Ingelsson, Erik; Ittermann, Till; Jansson, John-Olov; Jordan, Joanne M; Jula, Antti; Karlsson, Magnus; Khaw, Kay-Tee; Kilpeläinen, Tuomas O; Klopp, Norman; Kloth, Jacqueline S L; Koistinen, Heikki A; Kraus, William E; Kritchevsky, Stephen; Kuulasmaa, Teemu; Kuusisto, Johanna; Laakso, Markku; Lahti, Jari; Lang, Thomas; Langdahl, Bente L; Launer, Lenore J; Lee, Jong-Young; Lerch, Markus M; Lewis, Joshua R; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Liu, Tian; Liu, Youfang; Ljunggren, Östen; Lorentzon, Mattias; Luben, Robert N; Maixner, William; McGuigan, Fiona E; Medina-Gomez, Carolina; Meitinger, Thomas; Melhus, Håkan; Mellström, Dan; Melov, Simon; Michaëlsson, Karl; Mitchell, Braxton D; Morris, Andrew P; Mosekilde, Leif; Newman, Anne; Nielson, Carrie M; O'Connell, Jeffrey R; Oostra, Ben A; Orwoll, Eric S; Palotie, Aarno; Parker, Stephen C J; Peacock, Munro; Perola, Markus; Peters, Annette; Polasek, Ozren; Prince, Richard L; Räikkönen, Katri; Ralston, Stuart H; Ripatti, Samuli; Robbins, John A; Rotter, Jerome I; Rudan, Igor; Salomaa, Veikko; Satterfield, Suzanne; Schadt, Eric E; Schipf, Sabine; Scott, Laura; Sehmi, Joban; Shen, Jian; Soo Shin, Chan; Sigurdsson, Gunnar; Smith, Shad; Soranzo, Nicole; Stančáková, Alena; Steinhagen-Thiessen, Elisabeth; Streeten, Elizabeth A; Styrkarsdottir, Unnur; Swart, Karin M A; Tan, Sian-Tsung; Tarnopolsky, Mark A; Thompson, Patricia; Thomson, Cynthia A; Thorsteinsdottir, Unnur; Tikkanen, Emmi; Tranah, Gregory J; Tuomilehto, Jaakko; van Schoor, Natasja M; Verma, Arjun; Vollenweider, Peter; Völzke, Henry; Wactawski-Wende, Jean; Walker, Mark; Weedon, Michael N; Welch, Ryan; Wichmann, H-Erich; Widen, Elisabeth; Williams, Frances M K; Wilson, James F; Wright, Nicole C; Xie, Weijia; Yu, Lei; Zhou, Yanhua; Chambers, John C; Döring, Angela; van Duijn, Cornelia M; Econs, Michael J; Gudnason, Vilmundur; Kooner, Jaspal S; Psaty, Bruce M; Spector, Timothy D; Stefansson, Kari; Rivadeneira, Fernando; Uitterlinden, André G; Wareham, Nicholas J; Ossowski, Vicky; Waterworth, Dawn; Loos, Ruth J F; Karasik, David; Harris, Tamara B; Ohlsson, Claes; Kiel, Douglas P
Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass.Lean body mass is a highly heritable trait and is associated with various health conditions. Here, Kiel and colleagues perform a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.
Nivard, Michel G.; Middeldorp, Christel M.; Lubke, Gitta; Hottenga, Jouke-Jan; Abdellaoui, Abdel; Boomsma, Dorret I.; Dolan, Conor V.
Heritability may be estimated using phenotypic data collected in relatives or in distantly related individuals using genome-wide single nucleotide polymorphism (SNP) data. We combined these approaches by re-parameterizing the model proposed by Zaitlen et al and extended this model to include
Kasperaviciūte, Dalia; Catarino, Claudia B; Heinzen, Erin L; Depondt, Chantal; Cavalleri, Gianpiero L; Caboclo, Luis O; Tate, Sarah K; Jamnadas-Khoda, Jenny; Chinthapalli, Krishna; Clayton, Lisa M S; Shianna, Kevin V; Radtke, Rodney A; Mikati, Mohamad A; Gallentine, William B; Husain, Aatif M; Alhusaini, Saud; Leppert, David; Middleton, Lefkos T; Gibson, Rachel A; Johnson, Michael R; Matthews, Paul M; Hosford, David; Heuser, Kjell; Amos, Leslie; Ortega, Marcos; Zumsteg, Dominik; Wieser, Heinz-Gregor; Steinhoff, Bernhard J; Krämer, Günter; Hansen, Jörg; Dorn, Thomas; Kantanen, Anne-Mari; Gjerstad, Leif; Peuralinna, Terhi; Hernandez, Dena G; Eriksson, Kai J; Kälviäinen, Reetta K; Doherty, Colin P; Wood, Nicholas W; Pandolfo, Massimo; Duncan, John S; Sander, Josemir W; Delanty, Norman; Goldstein, David B; Sisodiya, Sanjay M
Partial epilepsies have a substantial heritability. However, the actual genetic causes are largely unknown. In contrast to many other common diseases for which genetic association-studies have successfully revealed common variants associated with disease risk, the role of common variation in partial epilepsies has not yet been explored in a well-powered study. We undertook a genome-wide association-study to identify common variants which influence risk for epilepsy shared amongst partial epilepsy syndromes, in 3445 patients and 6935 controls of European ancestry. We did not identify any genome-wide significant association. A few single nucleotide polymorphisms may warrant further investigation. We exclude common genetic variants with effect sizes above a modest 1.3 odds ratio for a single variant as contributors to genetic susceptibility shared across the partial epilepsies. We show that, at best, common genetic variation can only have a modest role in predisposition to the partial epilepsies when considered across syndromes in Europeans. The genetic architecture of the partial epilepsies is likely to be very complex, reflecting genotypic and phenotypic heterogeneity. Larger meta-analyses are required to identify variants of smaller effect sizes (odds ratio<1.3) or syndrome-specific variants. Further, our results suggest research efforts should also be directed towards identifying the multiple rare variants likely to account for at least part of the heritability of the partial epilepsies. Data emerging from genome-wide association-studies will be valuable during the next serious challenge of interpreting all the genetic variation emerging from whole-genome sequencing studies.
Curran, S.; Bolton, P.; Rozsnyai, K.; Chiocchetti, A.; Klauck, S.M.; Duketis, E.; Poustka, F.; Schlitt, S.; Freitag, C.M.; Lee, I. van der; Muglia, P.; Poot, M.; Staal, W.G.; Jonge, M.V. de; Ophoff, R.A.; Lewis, C.; Skuse, D.; Mandy, W.; Vassos, E.; Fossdal, R.; Magnusson, P.; Hreidarsson, S.; Saemundsen, E.; Stefansson, H.; Stefansson, K.; Collier, D.
The Autism Genome Project (AGP) Consortium recently reported genome-wide significant association between autism and an intronic single nucleotide polymorphism marker, rs4141463, within the MACROD2 gene. In the present study we attempted to replicate this finding using an independent case-control
Tincher, Clayton; Long, Hongan; Behringer, Megan; Walker, Noah; Lynch, Michael
Mutations induced by pollutants may promote pathogen evolution, for example by accelerating mutations conferring antibiotic resistance. Generally, evaluating the genome-wide mutagenic effects of long-term sublethal pollutant exposure at single-nucleotide resolution is extremely difficult. To overcome this technical barrier, we use the mutation accumulation/whole-genome sequencing (MA/WGS) method as a mutagenicity test, to quantitatively evaluate genome-wide mutagenesis of Escherichia coli after long-term exposure to a wide gradient of the glyphosate-based herbicide (GBH) Roundup Concentrate Plus. The genome-wide mutation rate decreases as GBH concentration increases, suggesting that even long-term GBH exposure does not compromise the genome stability of bacteria. Copyright © 2017 Tincher et al.
Liu, Siyang; Huang, Shujia; Rao, Junhua
present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome......) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We...... assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction...
Full Text Available In this study, 796 male Duroc pigs were used to identify genomic regions controlling growth traits. Three production traits were studied: food conversion ratio, days to 100 KG, and average daily gain, using a panel of 39,436 single nucleotide polymorphisms. In total, we detected 11 genome-wide and 162 chromosome-wide single nucleotide polymorphism trait associations. The Gene ontology analysis identified 14 candidate genes close to significant single nucleotide polymorphisms, with growth-related functions: six for days to 100 KG (WT1, FBXO3, DOCK7, PPP3CA, AGPAT9, and NKX6-1, seven for food conversion ratio (MAP2, TBX15, IVL, ARL15, CPS1, VWC2L, and VAV3, and one for average daily gain (COL27A1. Gene ontology analysis indicated that most of the candidate genes are involved in muscle, fat, bone or nervous system development, nutrient absorption, and metabolism, which are all either directly or indirectly related to growth traits in pigs. Additionally, we found four haplotype blocks composed of suggestive single nucleotide polymorphisms located in the growth trait-related quantitative trait loci and further narrowed down the ranges, the largest of which decreased by ~60 Mb. Hence, our results could be used to improve pig production traits by increasing the frequency of favorable alleles via artificial selection.
Chavez-Galarza, Julio; Johnston, J. Spencer; Azevedo, João; Muñoz, Irene; De la Rúa, Pilar; Patton, John C.; Pinto, M. Alice
Dissecting genome-wide (expansions, contractions, admixture) from genome-specific effects (selection) is a goal of central importance in evolutionary biology because it leads to more robust inferences of demographic history and to identification of adaptive divergence. The publication of the honey bee genome and the development of high-density SNPs genotyping, provide us with powerful tools, allowing us to identify signatures of selection in the honey bee genome. These signatur...
Brynildsrud, Ola; Bohlin, Jon; Scheffer, Lonneke; Eldholm, Vegard
Genome-wide association studies (GWAS) have become indispensable in human medicine and genomics, but very few have been carried out on bacteria. Here we introduce Scoary, an ultra-fast, easy-to-use, and widely applicable software tool that scores the components of the pan-genome for associations to observed phenotypic traits while accounting for population stratification, with minimal assumptions about evolutionary processes. We call our approach pan-GWAS to distinguish it from traditional, single nucleotide polymorphism (SNP)-based GWAS. Scoary is implemented in Python and is available under an open source GPLv3 license at https://github.com/AdmiralenOla/Scoary .
Full Text Available Single-nucleotide polymorphisms (SNPs are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss, SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD libraries, reduced representation libraries (RRL and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway that we previously used for SNP discovery. Of the 49 new samples, 11 were double-haploid lines from Washington State University (WSU and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1 which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the double haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs and multi-sequence variants (MSVs. Overall, 30,302,087 SNPs were identified on the rainbow trout genome 29 chromosomes and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25. The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-Population polymorphism assessment revealed high level of polymorphism and
Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan
Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
Hara, Kazuo; Fujita, Hayato; Johnson, Todd A
Although over 60 loci for type 2 diabetes (T2D) have been identified, there still remains a large genetic component to be clarified. To explore unidentified loci for T2D, we performed a genome-wide association study (GWAS) of 6 209 637 single-nucleotide polymorphisms (SNPs), which were directly g...
Full Text Available Abstract Background A recent genome wide association study in 1017 African Americans identified several single nucleotide polymorphisms that reached genome-wide significance for systolic blood pressure. We attempted to replicate these findings in an independent sample of 2474 unrelated African Americans in the Milwaukee metropolitan area; 53% were women and 47% were hypertensives. Methods We evaluated sixteen top associated SNPs from the above genome wide association study for hypertension as a binary trait or blood pressure as a continuous trait. In addition, we evaluated eight single nucleotide polymorphisms located in two genes (STK-39 and CDH-13 found to be associated with systolic and diastolic blood pressures by other genome wide association studies in European and Amish populations. TaqMan MGB-based chemistry with fluorescent probes was used for genotyping. We had an adequate sample size (80% power to detect an effect size of 1.2-2.0 for all the single nucleotide polymorphisms for hypertension as a binary trait, and 1% variance in blood pressure as a continuous trait. Quantitative trait analyses were performed both by excluding and also by including subjects on anti-hypertensive therapy (after adjustments were made for anti-hypertensive medications. Results For all 24 SNPs, no statistically significant differences were noted in the minor allele frequencies between cases and controls. One SNP (rs2146204 showed borderline association (p = 0.006 with hypertension status using recessive model and systolic blood pressure (p = 0.02, but was not significant after adjusting for multiple comparisons. In quantitative trait analyses, among normotensives only, rs12748299 was associated with SBP (p = 0.002. In addition, several nominally significant associations were noted with SBP and DBP among normotensives but none were statistically significant. Conclusions This study highlights the importance of replication to confirm the validity of genome wide
Kerns, Sarah L.; Ostrer, Harry; Stock, Richard; Li, William; Moore, Julian; Pearlman, Alexander; Campbell, Christopher; Shao Yongzhao; Stone, Nelson; Kusnetz, Lynda; Rosenstein, Barry S.
Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 x 10 -8 , Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value -6 . Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates
Loo, Sandra K.; Shtir, Corina; Doyle, Alysa E.; Mick, Eric; McGough, James J.; McCracken, James; Biederman, Joseph; Smalley, Susan L.; Cantor, Rita M.; Faraone, Stephen V.; Nelson, Stanley F.
Objective: The purpose of the present study was to identify common genetic variants that are associated with human intelligence or general cognitive ability. Method: We performed a genome-wide association analysis with a dense set of 1 million single-nucleotide polymorphisms (SNPs) and quantitative intelligence scores within an ancestrally…
Vaez, Ahmad; Jansen, Rick; Prins, Bram P.; Hottenga, Jouke-Jan; de Geus, Eco J. C.; Boomsma, Dorret I.; Penninx, Brenda W. J. H.; Nolte, Ilja M.; Snieder, Harold; Alizadeh, Behrooz Z.
Background Genome-wide association studies (GWASs) have successfully identified several single nucleotide polymorphisms (SNPs) associated with serum levels of C-reactive protein (CRP). An important limitation of GWASs is that the identified variants merely flag the nearby genomic region and do not
Vaez, A.; Jansen, R.; Prins, B.P.; Hottenga, J.J.; de Geus, E.J.C.; Boomsma, D.I.; Penninx, B.W.J.H.; Nolte, I.M.; Snieder, H.; Alizadeh, BZ
Background - Genome-wide association studies (GWASs) have successfully identified several single nucleotide polymorphisms (SNPs) associated with serum levels of C-reactive protein (CRP). An important limitation of GWASs is that the identified variants merely flag the nearby genomic region and do not
Li, M-H; Tiirikka, T; Kantanen, J
In sheep, coat colour (and pattern) is one of the important traits of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three known pigmentation genes (TYRP1, ASIP and MITF) in sheep. Eighteen of these associations were confirmed in further tests between white versus non-white individuals, but none of the 35 associations were significant in the analysis of only non-white colours. Across the tests, the s66432.1 in ASIP showed significant association (P=4.2 × 10(-11) for all the colours; P=2.3 × 10(-11) for white versus non-white colours) with the variation in coat colours and strong linkage disequilibrium with other significant variants surrounding the ASIP gene. The signals detected around the ASIP gene were explained by differences in white versus non-white alleles. Further, a genome scan for selection for white coat pigmentation identified a strong and striking selection signal spanning ASIP. Our study identified the main candidate gene for the coat colour variation between white and non-white as ASIP, an autosomal gene that has been directly implicated in the pathway regulating melanogenesis. Together with ASIP, the two other newly identified genes (TYRP1 and MITF) in the Finnsheep, bordering associated SNPs, represent a new resource for enriching sheep coat-colour genetics and breeding.
McGuire Patrick E
Full Text Available Abstract Background A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large
Full Text Available Genetics is important for breeding and selection of horses but there is a lack of well-established horse-related browsers or databases. In order to better understand horses, more variants and other integrated information are needed. Thus, we construct a horse genomic variants database including expression and other information. Horse Single Nucleotide Polymorphism and Expression Database (HSDB (http://snugenome2.snu.ac.kr/HSDB provides the number of unexplored genomic variants still remaining to be identified in the horse genome including rare variants by using population genome sequences of eighteen horses and RNA-seq of four horses. The identified single nucleotide polymorphisms (SNPs were confirmed by comparing them with SNP chip data and variants of RNA-seq, which showed a concordance level of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies of Thoroughbreds.
Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang
Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorpt...... a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.......-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p
Krintel, Sophine B; Palermo, Giuseppe; Johansen, Julia S
Recently, two genome-wide association studies identified single nucleotide polymorphisms (SNPs) significantly associated with the treatment response to tumor necrosis factor α (TNFα) inhibitors in patients with rheumatoid arthritis (RA). We aimed to replicate these results and identify SNPs and t...
Milne, Roger L; Benítez, Javier; Nevanlinna, Heli
BACKGROUND: A recent genome-wide association study identified single-nucleotide polymorphism (SNP) 2q35-rs13387042 as a marker of susceptibility to estrogen receptor (ER)-positive breast cancer. We attempted to confirm this association using the Breast Cancer Association Consortium. METHODS: 2q35...
Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris
With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
Full Text Available Plants have evolved an elaborate innate immune system against invading pathogens. Within this system, intracellular nucleotide-binding leucine-rich repeat (NLR immune receptors are known play critical roles in effector-triggered immunity (ETI plant defense. We performed genome-wide identification and classification of NLR-coding sequences from the genomes of pepper, tomato, and potato using fixed criteria. We then compared genomic duplication and evolution features. We identified intact 267, 443, and 755 NLR-encoding genes in tomato, potato, and pepper genomes, respectively. Phylogenetic analyses and classification of Solanaceae NLRs revealed that the majority of NLR super family members fell into 14 subgroups, including a TIR-NLR (TNL subgroup and 13 non-TNL subgroups. Specific subgroups have expanded in each genome, with the expansion in pepper showing subgroup-specific physical clusters. Comparative analysis of duplications showed distinct duplication patterns within pepper and among Solanaceae plants suggesting subgroup- or species-specific gene duplication events after speciation, resulting in divergent evolution. Taken together, genome-wide analyses of NLR family members provide insights into their evolutionary history in Solanaceae. These findings also provide important foundational knowledge for understanding NLR evolution and will empower broader characterization of disease resistance genes to be used for crop breeding.
Jensen, Majken Karoline; Pers, Tune Hannes; Dworzynski, Piotr
in genes associated with risk of coronary heart disease (CHD). Methods and Results-Genome-wide association analyses of approximately approximate to 700 000 single-nucleotide polymorphisms in 899 incident CHD cases and 1823 age-and sex-matched controls within the Nurses' Health and the Health Professionals...... complex. Conclusions-The integration of a GWA study with PPI data successfully identifies a set of candidate susceptibility genes for incident CHD that would have been missed in single-marker GWA analysis. (Circ Cardiovasc Genet. 2011; 4:549-556.)...
Full Text Available Determination of cellular DNA damage has so far been limited to global assessment of genome integrity whereas nucleotide-level mapping has been restricted to specific loci by the use of specific primers. Therefore, only limited DNA sequences can be studied and novel regions of genomic instability can hardly be discovered. Using a well-characterized yeast model, we describe a straightforward strategy to map genome-wide DNA strand breaks without compromising nucleotide-level resolution. This technique, termed "damaged DNA immunoprecipitation" (dDIP, uses immunoprecipitation and the terminal deoxynucleotidyl transferase-mediated dUTP-biotin end-labeling (TUNEL to capture DNA at break sites. When used in combination with microarray or next-generation sequencing technologies, dDIP will allow researchers to map genome-wide DNA strand breaks as well as other types of DNA damage and to establish a clear profiling of altered genes and/or intergenic sequences in various experimental conditions. This mapping technique could find several applications for instance in the study of aging, genotoxic drug screening, cancer, meiosis, radiation and oxidative DNA damage.
Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.
Selection in plant breeding could be more effective and more efficient if it is based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most of GP models used linear methods that ignore effects of interaction among genes and effects of higher order nonlinearities. Deep belief network (DBN), one of the architectural in deep learning methods, is able to model data in high level of abstraction that involves nonlinearities effects of the data. This study implemented DBN for developing a GP model utilizing whole-genome Single Nucleotide Polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquisitioned from CIMMYT’s (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, DBN is outperformed than other methods, kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), best linear unbiased predictor (BLUP), in case allegedly non-additive traits. DBN achieves correlation of 0.579 within -1 to 1 range.
Ojeda, Dario I; Dhillon, Braham; Tsui, Clement K M; Hamelin, Richard C
Single-nucleotide polymorphisms (SNPs) are rapidly becoming the standard markers in population genomics studies; however, their use in nonmodel organisms is limited due to the lack of cost-effective approaches to uncover genome-wide variation, and the large number of individuals needed in the screening process to reduce ascertainment bias. To discover SNPs for population genomics studies in the fungal symbionts of the mountain pine beetle (MPB), we developed a road map to discover SNPs and to produce a genotyping platform. We undertook a whole-genome sequencing approach of Leptographium longiclavatum in combination with available genomics resources of another MPB symbiont, Grosmannia clavigera. We sequenced 71 individuals pooled into four groups using the Illumina sequencing technology. We generated between 27 and 30 million reads of 75 bp that resulted in a total of 1, 181 contigs longer than 2 kb and an assembled genome size of 28.9 Mb (N50 = 48 kb, average depth = 125x). A total of 9052 proteins were annotated, and between 9531 and 17,266 SNPs were identified in the four pools. A subset of 206 genes (containing 574 SNPs, 11% false positives) was used to develop a genotyping platform for this species. Using this roadmap, we developed a genotyping assay with a total of 147 SNPs located in 121 genes using the Illumina(®) Sequenom iPLEX Gold. Our preliminary genotyping (success rate = 85%) of 304 individuals from 36 populations supports the utility of this approach for population genomics studies in other MPB fungal symbionts and other fungal nonmodel species. © 2013 John Wiley & Sons Ltd.
Full Text Available Abstract Background With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes is not feasible without new bioinformatics tools. Results A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1 rapidly identify and correct alignment errors in large, multiple genome alignments; and 2 generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs to retrieve detailed annotation information about the aligned genomes or use information from text files. Conclusion Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
Gong, Jian; Hsu, Li; Harrison, Tabitha
Selenium is an essential trace element and circulating selenium concentrations have been associated with a wide range of diseases. Candidate gene studies suggest that circulating selenium concentrations may be impacted by genetic variation; however, no study has comprehensively investigated...... this hypothesis. Therefore, we conducted a two-stage genome-wide association study to identify genetic variants associated with serum selenium concentrations in 1203 European descents from two cohorts: the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening and the Women’s Health Initiative (WHI). We...... tested association between 2,474,333 single nucleotide polymorphisms (SNPs) and serum selenium concentrations using linear regression models. In the first stage (PLCO) 41 SNPs clustered in 15 regions had p
Walter, Stefan; Atzmon, Gil; Demerath, Ellen W; Garcia, Melissa E; Kaplan, Robert C; Kumari, Meena; Lunetta, Kathryn L; Milaneschi, Yuri; Tanaka, Toshiko; Tranah, Gregory J; Völker, Uwe; Yu, Lei; Arnold, Alice; Benjamin, Emelia J; Biffar, Reiner; Buchman, Aron S; Boerwinkle, Eric; Couper, David; De Jager, Philip L; Evans, Denis A; Harris, Tamara B; Hoffmann, Wolfgang; Hofman, Albert; Karasik, David; Kiel, Douglas P; Kocher, Thomas; Kuningas, Maris; Launer, Lenore J; Lohman, Kurt K; Lutsey, Pamela L; Mackenbach, Johan; Marciante, Kristin; Psaty, Bruce M; Reiman, Eric M; Rotter, Jerome I; Seshadri, Sudha; Shardell, Michelle D; Smith, Albert V; van Duijn, Cornelia; Walston, Jeremy; Zillikens, M Carola; Bandinelli, Stefania; Baumeister, Sebastian E; Bennett, David A; Ferrucci, Luigi; Gudnason, Vilmundur; Kivimaki, Mika; Liu, Yongmei; Murabito, Joanne M; Newman, Anne B; Tiemeier, Henning; Franceschini, Nora
Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 × 10(-8)). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10(-5)). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity. Copyright © 2011 Elsevier Inc. All rights reserved.
Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S
Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Rembeck, Karolina; Alsiö, Asa; Christensen, Peer Brehm
Recently, several genome-wide association studies have revealed that single nucleotide polymorphisms (SNPs) in proximity to IL28B predict spontaneous clearance of HCV infection as well as outcome following peginterferon and ribavirin therapy among HCV genotype 1 infected patients. The present stu...
Jiao, Hong; Arner, Peter; Hoffstedt, Johan
Recent genome-wide association (GWA) analyses have identified common single nucleotide polymorphisms (SNPs) that are associated with obesity. However, the reported genetic variation in obesity explains only a minor fraction of the total genetic variation expected to be present in the population....... Thus many genetic variants controlling obesity remain to be identified. The aim of this study was to use GWA followed by multiple stepwise validations to identify additional genes associated with obesity....
Chen, Gary K; Zheng, Tian; Witte, John S; Goode, Ellen L; Gao, Lei; Hu, Pingzhao; Suh, Young Ju; Suktitipat, Bhoom; Szymczak, Silke; Woo, Jung Hoon; Zhang, Wei
A number of issues arise when analyzing the large amount of data from high-throughput genotype and expression microarray experiments, including design and interpretation of genome-wide association studies of expression phenotypes. These issues were considered by contributions submitted to Group 1 of the Genetic Analysis Workshop 15 (GAW15), which focused on the association of quantitative expression data. These contributions evaluated diverse hypotheses, including those relevant to cancer and obesity research, and used various analytic techniques, many of which were derived from information theory. Several observations from these reports stand out. First, one needs to consider the genetic model of the trait of interest and carefully select which single nucleotide polymorphisms and individuals are included early in the design stage of a study. Second, by targeting specific pathways when analyzing genome-wide data, one can generate more interpretable results than agnostic approaches. Finally, for datasets with small sample sizes but a large number of features like the Genetic Analysis Workshop 15 dataset, machine learning approaches may be more practical than traditional parametric approaches. (c) 2007 Wiley-Liss, Inc.
Full Text Available Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated to each trait (pcorr < 0.05. Pairwise comparison of the traits in terms of the semantic similarity in their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.
van Binsbergen, R; Veerkamp, R F; Calus, M P L
The correlated responses between traits may differ depending on the makeup of genetic covariances, and may differ from the predictions of polygenic covariances. Therefore, the objective of the present study was to investigate the makeup of the genetic covariances between the well-studied traits: milk yield, fat yield, protein yield, and their percentages in more detail. Phenotypic records of 1,737 heifers of research farms in 4 different countries were used after homogenizing and adjusting for management effects. All cows had a genotype for 37,590 single nucleotide polymorphisms (SNP). A bayesian stochastic search variable selection model was used to estimate the SNP effects for each trait. About 0.5 to 1.0% of the SNP had a significant effect on 1 or more traits; however, the SNP without a significant effect explained most of the genetic variances and covariances of the traits. Single nucleotide polymorphism correlations differed from the polygenic correlations, but only 10 regions were found with an effect on multiple traits; in 1 of these regions the DGAT1 gene was previously reported with an effect on multiple traits. This region explained up to 41% of the variances of 4 traits and explained a major part of the correlation between fat yield and fat percentage and contributes to asymmetry in correlated response between fat yield and fat percentage. Overall, for the traits in this study, the infinitesimal model is expected to be sufficient for the estimation of the variances and covariances. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Sendorek, Dorota H; Caloian, Cristian; Ellrott, Kyle; Bare, J Christopher; Yamaguchi, Takafumi N; Ewing, Adam D; Houlahan, Kathleen E; Norman, Thea C; Margolin, Adam A; Stuart, Joshua M; Boutros, Paul C
The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well-understood, and it is uncertain whether or not somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNVs) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.
Taillon-Miller, P; Gu, Z; Li, Q; Hillier, L; Kwok, P Y
An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21-7q22, and 13q12-13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.
Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M
RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
Yin, Chang Shik; Park, Hi Joon; Chung, Joo-Ho; Lee, Hye-Jung; Lee, Byung-Cheol
Four-constitution medicine (FCM), also known as Sasang constitutional medicine, and the heritage of the long history of individualized acupuncture medicine tradition, is one of the holistic and traditional systems of constitution to appraise and categorize individual differences into four major types. This study first reports a genome-wide association study on FCM, to explore the genetic basis of FCM and facilitate the integration of FCM with conventional individual differences research. Healthy individuals of the Korean population were classified into the four constitutional types (FCTs). A total of 353,202 single nucleotide polymorphisms (SNPs) were typed using whole genome amplified samples, and six-way comparison of FCM types provided lists of significantly differential SNPs. In one-to-one FCT comparisons, 15,944 SNPs were significantly differential, and 5 SNPs were commonly significant in all of the three comparisons. In one-to-two FCT comparisons, 22,616 SNPs were significantly differential, and 20 SNPs were commonly significant in all of the three comparison groups. This study presents the association between genome-wide SNP profiles and the categorization of the FCM, and it could further provide a starting point of genome-based identification and research of the constitutions of FCM.
Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca
Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
Full Text Available Few studies investigated the donkey (Equus asinus at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca. The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing and Ion Torrent (RRL runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.
Fall, Tove; Ingelsson, Erik
Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was in 2007 mapped to a gene with for the time unknown function. This gene, now known as fat mass and obesity associated (FTO) has been repeatedly replicated in several ethnicities and is affecting obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution and metabolic syndrome. This systematic review of obesity GWAS will summarize genome-wide significant findings for obesity and metabolic syndrome and briefly give a few suggestions of what is to be expected in the next few years. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Krapohl, E; Plomin, R
One of the best predictors of children's educational achievement is their family's socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children's educational achievement and its association with family SES.
Full Text Available Here we review our development of, and results with, high resolution studies on global genome nucleotide excision repair (GGNER in Saccharomyces cerevisiae. We have focused on how GGNER relates to histone acetylation for its functioning and we have identified the histone acetyl tranferase Gcn5 and acetylation at lysines 9/14 of histone H3 as a major factor in enabling efficient repair. We consider results employing primarily MFA2 as a model gene, but also those with URA3 located at subtelomeric sequences. In the latter case we also see a role for acetylation at histone H4. We then go on to outline the development of a high resolution genome-wide approach that enables one to examine correlations between histone modifications and the nucleotide excision repair (NER of UV-induced cyclobutane pyrimidine dimers throughout entire genomes. This is an approach that will enable rapid advances in understanding the complexities of how compacted chromatin in chromosomes is processed to access DNA damage and then returned to its pre-damaged status to maintain epigenetic codes.
Full Text Available Nucleotide-binding site (NBS disease resistance genes play an important role in defending plants from a variety of pathogens and insect pests. Many R-genes have been identified in various plant species. However, little is known about the NBS-encoding genes in Brachypodium distachyon. In this study, using computational analysis of the B. distachyon genome, we identified 126 regular NBS-encoding genes and characterized them on the bases of structural diversity, conserved protein motifs, chromosomal locations, gene duplications, promoter region, and phylogenetic relationships. EST hits and full-length cDNA sequences (from Brachypodium database of 126 R-like candidates supported their existence. Based on the occurrence of conserved protein motifs such as coiled-coil (CC, NBS, leucine-rich repeat (LRR, these regular NBS-LRR genes were classified into four subgroups: CC-NBS-LRR, NBS-LRR, CC-NBS, and X-NBS. Further expression analysis of the regular NBS-encoding genes in Brachypodium database revealed that these genes are expressed in a wide range of libraries, including those constructed from various developmental stages, tissue types, and drought challenged or nonchallenged tissue.
Fragomeni, Breno O; Lourenco, Daniela A L; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy
Much effort is put into identifying causative quantitative trait nucleotides (QTN) in animal breeding, empowered by the availability of dense single nucleotide polymorphism (SNP) information. Genomic selection using traditional SNP information is easily implemented for any number of genotyped individuals using single-step genomic best linear unbiased predictor (ssGBLUP) with the algorithm for proven and young (APY). Our aim was to investigate whether ssGBLUP is useful for genomic prediction when some or all QTN are known. Simulations included 180,000 animals across 11 generations. Phenotypes were available for all animals in generations 6 to 10. Genotypes for 60,000 SNPs across 10 chromosomes were available for 29,000 individuals. The genetic variance was fully accounted for by 100 or 1000 biallelic QTN. Raw genomic relationship matrices (GRM) were computed from (a) unweighted SNPs, (b) unweighted SNPs and causative QTN, (c) SNPs and causative QTN weighted with results obtained with genome-wide association studies, (d) unweighted SNPs and causative QTN with simulated weights, (e) only unweighted causative QTN, (f-h) as in (b-d) but using only the top 10% causative QTN, and (i) using only causative QTN with simulated weight. Predictions were computed by pedigree-based BLUP (PBLUP) and ssGBLUP. Raw GRM were blended with 1 or 5% of the numerator relationship matrix, or 1% of the identity matrix. Inverses of GRM were obtained directly or with APY. Accuracy of breeding values for 5000 genotyped animals in the last generation with PBLUP was 0.32, and for ssGBLUP it increased to 0.49 with an unweighted GRM, 0.53 after adding unweighted QTN, 0.63 when QTN weights were estimated, and 0.89 when QTN weights were based on true effects known from the simulation. When the GRM was constructed from causative QTN only, accuracy was 0.95 and 0.99 with blending at 5 and 1%, respectively. Accuracies simulating 1000 QTN were generally lower, with a similar trend. Accuracies using the
Full Text Available Since the first report of a genome-wide association study (GWAS on human age-related macular degeneration, GWAS has successfully been used to discover genetic variants for a variety of complex human diseases and/or traits, and thousands of associated loci have been identified. However, the underlying mechanisms for these loci remain largely unknown. To make these GWAS findings more useful, it is necessary to perform in-depth data mining. The data analysis in the post-GWAS era will include the following aspects: fine-mapping of susceptibility regions to identify susceptibility genes for elucidating the biological mechanism of action; joint analysis of susceptibility genes in different diseases; integration of GWAS, transcriptome, and epigenetic data to analyze expression and methylation quantitative trait loci at the whole-genome level, and find single-nucleotide polymorphisms that influence gene expression and DNA methylation; genome-wide association analysis of disease-related DNA copy number variations. Applying these strategies and methods will serve to strengthen GWAS data to enhance the utility and significance of GWAS in improving understanding of the genetics of complex diseases or traits and translate these findings for clinical applications. Keywords: Genome-wide association study, Data mining, Integrative data analysis, Polymorphism, Copy number variation
Schork, Andrew J.; Thompson, Wesley K.; Pham, Phillip; Torkamani, Ali; Roddey, J. Cooper; Sullivan, Patrick F.; Kelsoe, John R.; O'Donovan, Michael C.; Furberg, Helena; Schork, Nicholas J.; Andreassen, Ole A.; Dale, Anders M.; Absher, Devin; Agudo, Antonio; Almgren, Peter; Ardissino, Diego; Assimes, Themistocles L.; Bandinelli, Stephania; Barzan, Luigi; Bencko, Vladimir; Benhamou, Simone; Benjamin, Emelia J.; Bernardinelli, Luisa; Bis, Joshua; Boehnke, Michael; Boerwinkle, Eric; Boomsma, Dorret I.; Brennan, Paul; Canova, Cristina; Castellsagué, Xavier; Chanock, Stephen; Chasman, Daniel; Conway, David I.; Dackor, Jennifer; de Geus, Eco J. C.; Duan, Jubao; Elosua, Roberto; Everett, Brendan; Fabianova, Eleonora; Ferrucci, Luigi; Foretova, Lenka; Fortmann, Stephen P.; Franceschini, Nora; Frayling, Timothy; Furberg, Curt; Gejman, Pablo V.; Groop, Leif; Gu, Fangyi; de Haan, Lieuwe; Linszen, Don H.
Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery
Schork, A.J.; Thompson, W.K.; Pham, P.; Torkamani, A.; Roddey, J.C.; Sullivan, P.F.; Kelsoe, J.; O'Donovan, M.C.; Furberg, H.; Absher, D.; Agudo, A.; Almgren, P.; Ardissino, D.; Assimes, T.L.; Bandinelli, S.; Barzan, L.; Bencko, V.; Benhamou, S.; Benjamin, E.J.; Bernardinelli, L.; Bis, J.; Boehnke, M.; Boerwinkle, E.; Boomsma, D.I.; Brennan, P.; Canova, C.; Castellsagué, X.; Chanock, S.; Chasman, D.I.; Conway, D.I.; Dackor, J.; de Geus, E.J.C.; Duan, J.; Elosua, R.; Everett, B.; Fabianova, E.; Ferrucci, L.; Foretova, L.; Fortmann, S.P.; Franceschini, N.; Frayling, T.M.; Furberg, C.; Gejman, P.V.; Groop, L.; Gu, F.; Guralnik, J.; Hankinson, S.E.; Haritunians, T.; Healy, C.; Hofman, A.; Holcátová, I.; Hunter, D.J.; Hwang, S.J.; Ioannidis, J.P.A.; Iribarren, C.; Jackson, A.U.; Janout, V.; Kaprio, J.; Kim, Y.; Kjaerheim, K.; Knowles, J.W.; Kraft, P.; Ladenvall, C.; Lagiou, P.; Lanthrop, M.; Lerman, C.; Levinson, D.F.; Levy, D.; Li, M.D.; Lin, D.Y.; Lips, E.H.; Lissowska, J.; Lowry, R.B.; Lucas, G.; Macfarlane, T.V.; Maes, H.H.M.; Mannucci, P.M.; Mates, D.; Mauri, F.; McGovern, J.A.; McKay, J.D.; McKnight, B.; Melander, O.; Merlini, P.A.; Milaneschi, Y.; Mohlke, K.L.; O'Donnell, C.J.; Pare, G.; Penninx, B.W.J.H.; Perry, J.R.B.; Posthuma, D.; Preis, S.R.; Psaty, B.; Quertermous, T.; Ramachandran, V.S.; Richiardi, L.; Ridker, P.M.; Rose, J.; Rudnai, P.; Salomaa, V.; Sanders, A.R.; Schwartz, S.M.; Shi, J.; Smit, J.H.; Stringham, H.M.; Szeszenia-Dabrowska, N.; Tanaka, T.; Taylor, K.; Thacker, E.E.; Thornton, L.; Tiemeier, H.; Tuomilehto, J.; Uitterlinden, A.G.; van Duijn, C.M.; Vink, J.M.; Vogelzangs, N.; Voight, B.F.; Walter, S.; Willemsen, G.; Zaridze, D.; Znaor, A.; Akil, H.; Anjorin, A.; Backlund, L.; Badner, J.A.; Barchas, J.D.; Barrett, T.; Bass, N.; Bauer, M.; Bellivier, F.; Bergen, S.E.; Berrettini, W.; Blackwood, D.; Bloss, C.S.; Breen, G.; Breuer, R.; Bunner, W.E.; Burmeister, M.; Byerley, W. F.; Caesar, S.; Chambert, K.; Cichon, S.; St Clair, D.; Collier, D.A.; Corvin, A.; Coryell, W.H.; Craddock, N.; Craig, D.W.; Daly, M.; Day, R.; Degenhardt, F.; Djurovic, S.; Dudbridge, F.; Edenberg, H.J.; Elkin, A.; Etain, B.; Farmer, A.E.; Ferreira, M.A.; Ferrier, I.; Flickinger, M.; Foroud, T.; Frank, J.; Fraser, C.; Frisén, L.; Gershon, E.S.; Gill, M.; Gordon-Smith, K.; Green, E.K.; Greenwood, T.A.; Grozeva, D.; Guan, W.; Gurling, H.; Gustafsson, O.; Hamshere, M.L.; Hautzinger, M.; Herms, S.; Hipolito, M.; Holmans, P.A.; Hultman, C. M.; Jamain, S.; Jones, E.G.; Jones, I.; Jones, L.; Kandaswamy, R.; Kennedy, J.L.; Kirov, G. K.; Koller, D.L.; Kwan, P.; Landén, M.; Langstrom, N.; Lathrop, M.; Lawrence, J.; Lawson, W.B.; Leboyer, M.; Lee, P.H.; Li, J.; Lichtenstein, P.; Lin, D.; Liu, C.; Lohoff, F.W.; Lucae, S.; Mahon, P.B.; Maier, W.; Martin, N.G.; Mattheisen, M.; Matthews, K.; Mattingsdal, M.; McGhee, K.A.; McGuffin, P.; McInnis, M.G.; McIntosh, A.; McKinney, R.; McLean, A.W.; McMahon, F.J.; McQuillin, A.; Meier, S.; Melle, I.; Meng, F.; Mitchell, P.B.; Montgomery, G.W.; Moran, J.; Morken, G.; Morris, D.W.; Moskvina, V.; Muglia, P.; Mühleisen, T.W.; Muir, W.J.; Müller-Myhsok, B.; Myers, R.M.; Nievergelt, C.M.; Nikolov, I.; Nimgaonkar, V.L.; Nöthen, M.M.; Nurnberger, J.I.; Nwulia, E.A.; O'Dushlaine, C.; Osby, U.; Óskarsson, H.; Owen, M.J.; Petursson, H.; Pickard, B.S.; Porgeirsson, P.; Potash, J.B.; Propping, P.; Purcell, S.M.; Quinn, E.; Raychaudhuri, S.; Rice, J.; Rietschel, M.; Ruderfer, D.; Schalling, M.; Schatzberg, A.F.; Scheftner, W.A.; Schofield, P.R.; Schulze, T.G.; Schumacher, J.; Schwarz, M.M.; Scolnick, E.; Scott, L.J.; Shilling, P.D.; Sigurdsson, E.; Sklar, P.; Smith, E.N.; Stefansson, H.; Stefansson, K.; Steffens, M; Steinberg, S.; Strauss, J.; Strohmaier, J.; Szelinger, S.; Thompson, R.C.; Tozzi, F.; Treutlein, J.; Vincent, J.B.; Watson, S.J.; Wienker, T.F.; Williamson, R.; Witt, S.H.; Wright, A.; Xu, W.; Young, A.H.; Zandi, P.P.; Zhang, P.; Zöllner, S.; Agartz, I.; Albus, M.; Alexander, M.; Amdur, R. L.; Amin, F.; Bitter, I.; Black, D.W.; Børglum, A.D.; Brown, M.A.; Bruggeman, R.; Buccola, N.G.; Cahn, W.; Cantor, R.M.; Carr, V.J.; Catts, S. V.; Choudhury, K.; Cloninger, C. R.; Cormican, P.; Danoy, P. A.; Datta, S.; DeHert, M.; Demontis, D.; Dikeos, D.; Donnelly, P.; Donohoe, G.; Duong, L.; Dwyer, S.; Fanous, A.; Fink-Jensen, A.; Freedman, R.; Freimer, N.B.; Friedl, M.; Georgieva, L.; Giegling, I.; Glenthoj, B.; Godard, S.; Golimbet, V.; de Haan, L.; Hansen, M.; Hansen, T.; Hartmann, A.M.; Henskens, F. A.; Hougaard, D. M.; Ingason, A.; Jablensky, A. V.; Jakobsen, K.D.; Jay, M.; Jönsson, E.G.; Jürgens, G.; Kahn, R.S.; Keller, M.C.; Kendler, K.S.; Kenis, G.; Kenny, E.; Konnerth, H.; Konte, B.; Krabbendam, L.; Krasucki, R.; Lasseter, V. K.; Laurent, C.; Lencz, T.; Lerer, F. B.; Liang, K. Y.; Lieberman, J. A.; Linszen, D.H.; Lönnqvist, J.; Loughland, C. M.; Maclean, A. W.; Maher, B.S.; Malhotra, A.K.; Mallet, J.; Malloy, P.; McGrath, J. J.; McLean, D. E.; Michie, P. T.; Milanova, V.; Mors, O.; Mortensen, P.B.; Mowry, B. J.; Myin-Germeys, I.; Neale, B.; Nertney, D. A.; Nestadt, G.; Nielsen, J.; Nordentoft, M.; Norton, N.; O'Neill, F.; Olincy, A.; Olsen, L.; Ophoff, R.A.; Orntoft, T. F.; van Os, J.; Pantelis, C.; Papadimitriou, G.; Pato, C.N.; Peltonen, L.; Pickard, B.; Pietilainen, O.P.; Pimm, J.; Pulver, A. E.; Puri, V.; Quested, D.; Rasmussen, H.B.; Rethelyi, J.M.; Ribble, R.; Riley, B.P.; Rossin, L.; Ruggeri, M.; Rujescu, D.; Schall, U.; Schwab, S. G.; Scott, R.J.; Silverman, J.M.; Spencer, C. C.; Strange, A.; Strengman, E.; Stroup, T.S.; Suvisaari, J.; Terenius, L.; Thirumalai, S.; Timm, S.; Toncheva, D.; Tosato, S.; van den Oord, E.J.; Veldink, J.; Visscher, P.M.; Walsh, D.; Wang, A. G.; Werge, T.; Wiersma, D.; Wildenauer, D. B.; Williams, H.J.; Williams, N.M.; van Winkel, R.; Wormley, B.; Zammit, S.; Schork, N.J.; Andreassen, O.A.; Dale, A.M.
Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery
Full Text Available Abstract Background Insect bite hypersensitivity is a common allergic disease in horse populations worldwide. Insect bite hypersensitivity is affected by both environmental and genetic factors. However, little is known about genes contributing to the genetic variance associated with insect bite hypersensitivity. Therefore, the aim of our study was to identify and quantify genomic associations with insect bite hypersensitivity in Shetland pony mares and Icelandic horses in the Netherlands. Methods Data on 200 Shetland pony mares and 146 Icelandic horses were collected according to a matched case–control design. Cases and controls were matched on various factors (e.g. region, sire to minimize effects of population stratification. Breed-specific genome-wide association studies were performed using 70 k single nucleotide polymorphisms genotypes. Bayesian variable selection method Bayes-C with a threshold model implemented in GenSel software was applied. A 1 Mb non-overlapping window approach that accumulated contributions of adjacent single nucleotide polymorphisms was used to identify associated genomic regions. Results The percentage of variance explained by all single nucleotide polymorphisms was 13% in Shetland pony mares and 28% in Icelandic horses. The 20 non-overlapping windows explaining the largest percentages of genetic variance were found on nine chromosomes in Shetland pony mares and on 14 chromosomes in Icelandic horses. Overlap in identified associated genomic regions between breeds would suggest interesting candidate regions to follow-up on. Such regions common to both breeds (within 15 Mb were found on chromosomes 3, 7, 11, 20 and 23. Positional candidate genes within 2 Mb from the associated windows were identified on chromosome 20 in both breeds. Candidate genes are within the equine lymphocyte antigen class II region, which evokes an immune response by recognizing many foreign molecules. Conclusions The genome-wide association
Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant
Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L
Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.
Yuan, Han; Dougherty, Joseph D.
Lay Abstract Autism spectrum disorders (ASDs) are pervasive developmental disorders which have both a genetic and environmental component. One source of the environmental component is the in utero (prenatal) environment. The maternal genome can potentially contribute to the risk of autism in children by altering this prenatal environment. In this study, the possibility of maternal genotype effects was explored by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. We performed a case/control genome-wide association study (GWAS) using mothers of probands as cases and either fathers of probands or normal females as controls, using two collections of families with autism. We did not identify any SNP that reached significance and thus a common variant of large effect is unlikely. However, there was evidence for the possibility of a large number of alleles each carrying a small effect. This suggested that if there is a contribution to autism risk through common-variant maternal genetic effects, it may be the result of multiple loci of small effects. We did not investigate rare variants in this study. Scientific Abstract Like most psychiatric disorders, autism spectrum disorders have both a genetic and an environmental component. While previous studies have clearly demonstrated the contribution of in utero (prenatal) environment on autism risk, most of them focused on transient environmental factors. Based on a recent sibling study, we hypothesized that environmental factors could also come from the maternal genome, which would result in persistent effects across siblings. In this study, the possibility of maternal genotype effects was examined by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. A case/control genome-wide association study (GWAS) was performed using mothers of
de Boer, Ynto S; van Gerven, Nicole M F; Zwiers, Antonie; Verwer, Bart J; van Hoek, Bart; van Erpecum, Karel J; Beuers, Ulrich; van Buuren, Henk R; Drenth, Joost P H; den Ouden, Jannie W; Verdonk, Robert C; Koek, Ger H; Brouwer, Johannes T; Guichelaar, Maureen M J; Vrolijk, Jan M; Kraal, Georg; Mulder, Chris J J; van Nieuwkerk, Carin M J; Fischer, Janett; Berg, Thomas; Stickel, Felix; Sarrazin, Christoph; Schramm, Christoph; Lohse, Ansgar W; Weiler-Normann, Christina; Lerch, Markus M; Nauck, Matthias; Völzke, Henry; Homuth, Georg; Bloemena, Elisabeth; Verspaget, Hein W; Kumar, Vinod; Zhernakova, Alexandra; Wijmenga, Cisca; Franke, Lude; Bouma, Gerd
Autoimmune hepatitis (AIH) is an uncommon autoimmune liver disease of unknown etiology. We used a genome-wide approach to identify genetic variants that predispose individuals to AIH. We performed a genome-wide association study of 649 adults in The Netherlands with AIH type 1 and 13,436 controls. Initial associations were further analyzed in an independent replication panel comprising 451 patients with AIH type 1 in Germany and 4103 controls. We also performed an association analysis in the discovery cohort using imputed genotypes of the major histocompatibility complex region. We associated AIH with a variant in the major histocompatibility complex region at rs2187668 (P = 1.5 × 10(-78)). Analysis of this variant in the discovery cohort identified HLA-DRB1*0301 (P = 5.3 × 10(-49)) as a primary susceptibility genotype and HLA-DRB1*0401 (P = 2.8 × 10(-18)) as a secondary susceptibility genotype. We also associated AIH with variants of SH2B3 (rs3184504, 12q24; P = 7.7 × 10(-8)) and CARD10 (rs6000782, 22q13.1; P = 3.0 × 10(-6)). In addition, strong inflation of association signal was found with single-nucleotide polymorphisms associated with other immune-mediated diseases, including primary sclerosing cholangitis and primary biliary cirrhosis, but not with single-nucleotide polymorphisms associated with other genetic traits. In a genome-wide association study, we associated AIH type 1 with variants in the major histocompatibility complex region, and identified variants of SH2B3and CARD10 as likely risk factors. These findings support a complex genetic basis for AIH pathogenesis and indicate that part of the genetic susceptibility overlaps with that for other immune-mediated liver diseases. Copyright © 2014 AGA Institute. Published by Elsevier Inc. All rights reserved.
Marete, Andrew Gitahi; Sahana, Goutam; Fritz, Sebastian
Using a combination of data from the BovineSNP50 BeadChip SNP array (Illumina, San Diego, CA) and a EuroGenomics (Amsterdam, the Netherlands) custom single nucleotide polymorphism (SNP) chip with SNP pre-selected from whole genome sequence data, we carried out an association study of milking speed...... associated with milking speed. As clinical mastitis and somatic cell score have an unfavorable genetic correlation with milking speed, we tested whether the most significant SNP on these 22 chromosomes associated with milking speed were also associated with clinical mastitis or somatic cell score. Nine...... hundred seventy-one genome-wide significant SNP were associated with milking speed. Of these, 86 were associated with clinical mastitis and 198 with somatic cell score. The most significant association signals for milking speed were observed on chromosomes 7, 8, 10, 14, and 18. The most significant signal...
Børglum, A D; Demontis, D; Grove, J
Genetic and environmental components as well as their interaction contribute to the risk of schizophrenia, making it highly relevant to include environmental factors in genetic studies of schizophrenia. This study comprises genome-wide association (GWA) and follow-up analyses of all individuals...... born in Denmark since 1981 and diagnosed with schizophrenia as well as controls from the same birth cohort. Furthermore, we present the first genome-wide interaction survey of single nucleotide polymorphisms (SNPs) and maternal cytomegalovirus (CMV) infection. The GWA analysis included 888 cases...... was found for rs7902091 (P(SNP × CMV)=7.3 × 10(-7)) in CTNNA3, a gene not previously implicated in schizophrenia, stressing the importance of including environmental factors in genetic studies....
M. Ilyas Kamboh
Full Text Available Background. The persistent presence of antiphospholipid antibodies (APA may lead to the development of primary or secondary antiphospholipid syndrome. Although the genetic basis of APA has been suggested, the identity of the underlying genes is largely unknown. In this study, we have performed a genome-wide association study (GWAS in an effort to identify susceptibility loci/genes for three main APA: anticardiolipin antibodies (ACL, lupus anticoagulant (LAC, and anti-β2 glycoprotein I antibodies (anti-β2GPI. Methods. DNA samples were genotyped using the Affymetrix 6.0 array containing 906,600 single-nucleotide polymorphisms (SNPs. Association of SNPs with the antibody status (positive/negative was tested using logistic regression under the additive model. Results. We have identified a number of suggestive novel loci with P
Edsgard, Stefan Daniel; Dalgaard, Marlene Danner; Weinhold, Nils
Testicular germ cell cancer (TGCC) is one of the most heritable forms of cancer. Previous genome-wide association studies have focused on single nucleotide polymorphisms, largely ignoring the influence of copy number variants (CNVs). Here we present a genome-wide study of CNV on a cohort of 212...... of rare CNVs related to cell migration (false-discovery rate = 0.021, 1.8% of cases and 1.1% of controls). Dysregulation during migration of primordial germ cells has previously been suspected to be a part of TGCC development and this set of multiple rare variants may thereby have a minor contribution...
Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K
A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.
Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V
Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Pujolar, J. M.; Jacobsen, M. W.; Als, Thomas Damm
Next-generation sequencing and the collection of genome-wide data allow identifying adaptive variation and footprints of directional selection. Using a large SNP data set from 259 RAD-sequenced European eel individuals (glass eels) from eight locations between 34 and 64oN, we examined the patterns...... of genome-wide genetic diversity across locations. We tested for local selection by searching for increased population differentiation using FST-based outlier tests and by testing for significant associations between allele frequencies and environmental variables. The overall low genetic differentiation...... with single-generation signatures of spatially varying selection acting on glass eels. After screening 50 354 SNPs, a total of 754 potentially locally selected SNPs were identified. Candidate genes for local selection constituted a wide array of functions, including calcium signalling, neuroactive ligand...
Li, Yonghong; Shiffman, Dov; Oberbauer, Rainer
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variants in the human genome. SNPs are known to modify susceptibility to complex diseases. We describe and discuss methods used to identify SNPs associated with disease in case-control studies. An outline on study population selection, sample collection and genotyping platforms is presented, complemented by SNP selection, data preprocessing and analysis.
Lundby, Alicia; Rossin, Elizabeth J.; Steffensen, Annette B.
Genome-wide association studies (GWAS) have identified thousands of loci associated with complex traits, but it is challenging to pinpoint causal genes in these loci and to exploit subtle association signals. We used tissue-specific quantitative interaction proteomics to map a network of five genes...... involved in the Mendelian disorder long QT syndrome (LOTS). We integrated the LOTS network with GWAS loci from the corresponding common complex trait, QT-interval variation, to identify candidate genes that were subsequently confirmed in Xenopus laevis oocytes and zebrafish. We used the LOTS protein...... network to filter weak GWAS signals by identifying single-nucleotide polymorphisms (SNPs) in proximity to genes in the network supported by strong proteomic evidence. Three SNPs passing this filter reached genome-wide significance after replication genotyping. Overall, we present a general strategy...
Psychosis Endophenotypes International Consortium; Wellcome Trust Case-Control Consortium; Bramon, E.; Pirinen, M.; Strange, A.; Lin, K.; Freeman, C.; Bellenguez, C.; Su, Z.; Band, G.; Pearson, R.; Vukcevic, D.; Langford, C.; Deloukas, P.; Hunt, S.
BACKGROUND: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. METHODS: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 69...
Tosato, Sarah; Myin-germeys, Inez; Barroso, Ines; Bender, Stephan; Giegling, Ina; Arranz, Maria J.; Donnelly, Peter; Bellenguez, Celine; Brown, Matthew A.; Lawrie, Stephen; Kalaydjieva, Luba; Vukcevic, Damjan; Kahn, Rene S.; Dronov, Serge; Walshe, Muriel
Background: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories.Methods: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,19...
Full Text Available Soybean is the most important legume crop in the world. Several diseases in soybean lead to serious yield losses in major soybean-producing countries. Moreover, soybean can be infected by diverse viruses. Recently, we carried out a large-scale screening to identify viruses infecting soybean using available soybean transcriptome data. Of the screened transcriptomes, a soybean transcriptome for soybean seed development analysis contains several virus-associated sequences. In this study, we identified five viruses, including soybean mosaic virus (SMV, infecting soybean by de novo transcriptome assembly followed by blast search. We assembled a nearly complete consensus genome sequence of SMV China using transcriptome data. Based on phylogenetic analysis, the consensus genome sequence of SMV China was closely related to SMV isolates from South Korea. We examined single nucleotide variations (SNVs for SMVs in the soybean seed transcriptome revealing 780 SNVs, which were evenly distributed on the SMV genome. Four SNVs, C-U, U-C, A-G, and G-A, were frequently identified. This result demonstrated the quasispecies variation of the SMV genome. Taken together, this study carried out bioinformatics analyses to identify viruses using soybean transcriptome data. In addition, we demonstrated the application of soybean transcriptome data for virus genome assembly and SNV analysis.
Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang
N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes, however, methods for high resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single nucleotide and single molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.
J. H. Ha
Full Text Available The purpose of this study was to characterize genetic architecture of behavior patterns in Sapsaree dogs. The breed population (n = 8,256 has been constructed since 1990 over 12 generations and managed at the Sapsaree Breeding Research Institute, Gyeongsan, Korea. Seven behavioral traits were investigated for 882 individuals. The traits were classified as a quantitative or a categorical group, and heritabilities (h2 and variance components were estimated under the Animal model using ASREML 2.0 software program. In general, the h2 estimates of the traits ranged between 0.00 and 0.16. Strong genetic (rG and phenotypic (rP correlations were observed between nerve stability, affability and adaptability, i.e. 0.9 to 0.94 and 0.46 to 0.68, respectively. To detect significant single nucleotide polymorphism (SNP for the behavioral traits, a total of 134 and 60 samples were genotyped using the Illumina 22K CanineSNP20 and 170K CanineHD bead chips, respectively. Two datasets comprising 60 (Sap60 and 183 (Sap183 samples were analyzed, respectively, of which the latter was based on the SNPs that were embedded on both the 22K and 170K chips. To perform genome-wide association analysis, each SNP was considered with the residuals of each phenotype that were adjusted for sex and year of birth as fixed effects. A least squares based single marker regression analysis was followed by a stepwise regression procedure for the significant SNPs (p<0.01, to determine a best set of SNPs for each trait. A total of 41 SNPs were detected with the Sap183 samples for the behavior traits. The significant SNPs need to be verified using other samples, so as to be utilized to improve behavior traits via marker-assisted selection in the Sapsaree population.
Full Text Available Aleksandra Szczepankiewicz1,21Laboratory of Molecular and Cell Biology, 2Department of Psychiatric Genetics, Poznan University of Medical Sciences, Poznan, PolandAbstract: Bipolar disorder (BD is a complex disorder with a number of susceptibility genes and environmental risk factors involved in its pathogenesis. In recent years, huge progress has been made in molecular techniques for genetic studies, which have enabled identification of numerous genomic regions and genetic variants implicated in BD across populations. Despite the abundance of genetic findings, the results have often been inconsistent and not replicated for many candidate genes/single nucleotide polymorphisms (SNPs. Therefore, the aim of the review presented here is to summarize the most important data reported so far in candidate gene and genome-wide association studies. Taking into account the abundance of association data, this review focuses on the most extensively studied genes and polymorphisms reported so far for BD to present the most promising genomic regions/SNPs involved in BD. The review of association data reveals evidence for several genes (SLC6A4/5-HTT [serotonin transporter gene], BDNF [brain-derived neurotrophic factor], DAOA [D-amino acid oxidase activator], DTNBP1 [dysbindin], NRG1 [neuregulin 1], DISC1 [disrupted in schizophrenia 1] to be crucial candidates in BD, whereas numerous genome-wide association studies conducted in BD indicate polymorphisms in two genes (CACNA1C [calcium channel, voltage-dependent, L type, alpha 1C subunit], ANK3 [ankyrin 3] replicated for association with BD in most of these studies. Nevertheless, further studies focusing on interactions between multiple candidate genes/SNPs, as well as systems biology and pathway analyses are necessary to integrate and improve the way we analyze the currently available association data.Keywords: candidate gene, genome-wide association study, SLC6A4, BDNF, DAOA, DTNBP1, NRG1, DISC1
We conducted a genome-wide scan for visceral leishmaniasis in mixed-breed dogs from a highly endemic area in Brazil using 149,648 single nucleotide polymorphism (SNP) markers genotyped in 20 cases and 28 controls. Using a mixed model approach, we found two candidate loci on canine autosomes 1 and 2....
Zhao, Huiying; Nyholt, Dale R.; Yang, Yuanhao; Wang, Jihua; Yang, Yuedong
Genome-wide association studies (GWAS) have successfully identified single variants associated with diseases. To increase the power of GWAS, gene-based and pathway-based tests are commonly employed to detect more risk factors. However, the gene- and pathway-based association tests may be biased towards genes or pathways containing a large number of single-nucleotide polymorphisms (SNPs) with small P-values caused by high linkage disequilibrium (LD) correlations. To address such bias, numerous...
Okbay, Aysu; Beauchamp, Jonathan P.; Fontana, Mark A.; Lee, James J.; Pers, Tune H.; Rietveld, Cornelius A.; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S. Fleur W.; Oskarsson, Sven; Pickrell, Joseph K.; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S.; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H.; Concas, Maria Pina; Derringer, Jaime; Furlotte, Nicholas A.; Galesloot, Tessel E.; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M.; Harris, Sarah E.; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E.; Kaasik, Kadri; Kalafati, Ioanna P.; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J.; de Leeuw, Christiaan; Lind, Penelope A.; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B.; van der Most, Peter J.; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J.; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E.; Shi, Jianxin; Smith, Albert V.; Poot, Raymond A.; Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z.; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E.; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A.; Campbell, Harry; Cappuccio, Francesco P.; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans68, David M.; Faul, Jessica D.; Feitosa, Mary F.; Forstner, Andreas J.; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V.; Harris, Tamara B.; Heath, Andrew C.; Hocking, Lynne J.; Holliday, Elizabeth G.; Homuth, Georg; Horan, Michael A.; Hottenga, Jouke-Jan; de Jager, Philip L.; Joshi, Peter K.; Jugessur, Astanand; Kaakinen, Marika A.; Kähönen, Mika; Kanoni, Stavroula; Keltigangas-Järvinen, Liisa; Kiemeney, Lambertus A.L.M.; Kolcic, Ivana; Koskinen, Seppo; Kraja, Aldi T.; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J.; Lebreton, Maël P.; Levinson, Douglas F.; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C.M.; Loukola, Anu; Madden, Pamela A.; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E.; Marques-Vidal, Pedro; Meddens, Gerardus A.; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W.; Myhre, Ronny; Nelson, Christopher P.; Nyholt, Dale R.; Ollier, William E.R.; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L.; Petrovic, Katja E.; Porteous, David J.; Räikkönen, Katri; Ring, Susan M.; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R.; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J.; Smith, Blair H.; Smith, Jennifer A.; Staessen, Jan A.; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D.; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J.A.; Venturini, Cristina; Vinkhuyzen, Anna A.E.; Völker, Uwe; Völzke, Henry; Vonk, Judith M.; Vozzi, Diego; Waage, Johannes; Ware, Erin B.; Willemsen, Gonneke; Attia, John R.; Bennett, David A.; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I.; Borecki, Ingrid B.; Bultmann, Ute; Chabris, Christopher F.; Cucca, Francesco; Cusi, Daniele; Deary, Ian J.; Dedoussis, George V.; van Duijn, Cornelia M.; Eriksson, Johan G.; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V.; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J.F.; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A.; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G.; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L.R.; Lehtimäki, Terho; Lehrer, Steven F.; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; Metspalu, Andres; Pendleton, Neil; Penninx, Brenda W.J.H.; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A.; Samani, Nilesh J.; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I.A.; Spector, Tim D.; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tiemeier, Henning; Tung, Joyce Y.; Uitterlinden, André G.; Vitart, Veronique; Vollenweider, Peter; Weir, David R.; Wilson, James F.; Wright, Alan F.; Conley, Dalton C.; Krueger, Robert F.; Smith, George Davey; Hofman, Albert; Laibson, David I.; Medland, Sarah E.; Meyer, Michelle N.; Yang, Jian; Johannesson, Magnus; Visscher, Peter M.; Esko, Tõnu; Koellinger, Philipp D.; Cesarini, David; Benjamin, Daniel J.
Summary Educational attainment (EA) is strongly influenced by social and other environmental factors, but genetic factors are also estimated to account for at least 20% of the variation across individuals1. We report the results of a genome-wide association study (GWAS) for EA that extends our earlier discovery sample1,2 of 101,069 individuals to 293,723 individuals, and a replication in an independent sample of 111,349 individuals from the UK Biobank. We now identify 74 genome-wide significant loci associated with number of years of schooling completed. Single-nucleotide polymorphisms (SNPs) associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioral phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because EA is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric disease. PMID:27225129
Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori
. Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all
Full Text Available Abstract Background Clostridium beijerinckii is a prominent solvent-producing microbe that has great potential for biofuel and chemical industries. Although transcriptional analysis is essential to understand gene functions and regulation and thus elucidate proper strategies for further strain improvement, limited information is available on the genome-wide transcriptional analysis for C. beijerinckii. Results The genome-wide transcriptional dynamics of C. beijerinckii NCIMB 8052 over a batch fermentation process was investigated using high-throughput RNA-Seq technology. The gene expression profiles indicated that the glycolysis genes were highly expressed throughout the fermentation, with comparatively more active expression during acidogenesis phase. The expression of acid formation genes was down-regulated at the onset of solvent formation, in accordance with the metabolic pathway shift from acidogenesis to solventogenesis. The acetone formation gene (adc, as a part of the sol operon, exhibited highly-coordinated expression with the other sol genes. Out of the > 20 genes encoding alcohol dehydrogenase in C. beijerinckii, Cbei_1722 and Cbei_2181 were highly up-regulated at the onset of solventogenesis, corresponding to their key roles in primary alcohol production. Most sporulation genes in C. beijerinckii 8052 demonstrated similar temporal expression patterns to those observed in B. subtilis and C. acetobutylicum, while sporulation sigma factor genes sigE and sigG exhibited accelerated and stronger expression in C. beijerinckii 8052, which is consistent with the more rapid forespore and endspore development in this strain. Global expression patterns for specific gene functional classes were examined using self-organizing map analysis. The genes associated with specific functional classes demonstrated global expression profiles corresponding to the cell physiological variation and metabolic pathway switch. Conclusions The results from this
Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis
SNP Array v2. A ‘proof-of-concept’ advanced data mining algorithm for unsupervised analysis of genome-wide association study (GWAS) dataset was... Opal F AUS Yes U141 Peggs F AUS Yes U142 Taxi F AUS Yes U143 Riso MI MAL Yes U144 Szarik MI GSD Yes U145 Astor MI MAL Yes U146 Roy MC MAL Yes... mining of genetic studies in general, and especially GWAS. As a proof-of-concept, a classification analysis of the WG SNP typing dataset of a
Middeldorp, Christel M.; Hammerschlag, Anke R.; Ouwens, Klaasjan G.; Groen-Blokhuis, Maria M.; St. Pourcain, Beate; Greven, Corina U.; Pappa, Irene; Tiesler, Carla M.T.; Ang, Wei; Nolte, Ilja M.; Vilor-Tejedor, Natalia; Bacelis, Jonas; Ebejer, Jane L.; Zhao, Huiying; Davies, Gareth E.
ObjectiveTo elucidate the influence of common genetic variants on childhood attention-deficit/hyperactivity disorder (ADHD) symptoms, to identify genetic variants that explain its high heritability, and to investigate the genetic overlap of ADHD symptom scores with ADHD diagnosis.MethodWithin the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium, genome-wide single nucleotide polymorphisms (SNPs) and ADHD symptom scores were available for 17,666 children (< 13 years) from nine ...
Wu, Xiaoping; Fang, Ming; Liu, Lin
.Results: The Illumina BovineSNP50 BeadChip was used to identify single nucleotide polymorphisms (SNPs) that are associated with body conformation traits. A least absolute shrinkage and selection operator (LASSO) was applied to detect multiple SNPs simultaneously for 29 body conformation traits with 1,314 Chinese...... Holstein cattle and 52,166 SNPs. Totally, 59 genome-wide significant SNPs associated with 26 conformation traits were detected by genome-wide association analysis; five SNPs were within previously reported QTL regions (Animal Quantitative Trait Loci (QTL) database) and 11 were very close to the reported...... SNPs. Twenty-two SNPs were located within annotated gene regions, while the remainder were 0.6-826 kb away from known genes. Some of the genes had clear biological functions related to conformation traits. By combining information about the previously reported QTL regions and the biological functions...
Li, Mulin Jun; Wang, Junwen
As high throughput methods, such as whole genome genotyping arrays, whole exome sequencing (WES) and whole genome sequencing (WGS), have detected huge amounts of genetic variants associated with human diseases, function annotation of these variants is an indispensable step in understanding disease etiology. Large-scale functional genomics projects, such as The ENCODE Project and Roadmap Epigenomics Project, provide genome-wide profiling of functional elements across different human cell types and tissues. With the urgent demands for identification of disease-causal variants, comprehensive and easy-to-use annotation tool is highly in demand. Here we review and discuss current progress and trend of the variant annotation field. Furthermore, we introduce a comprehensive web portal for annotating human genetic variants. We use gene-based features and the latest functional genomics datasets to annotate single nucleotide variation (SNVs) in human, at whole genome scale. We further apply several function prediction algorithms to annotate SNVs that might affect different biological processes, including transcriptional gene regulation, alternative splicing, post-transcriptional regulation, translation and post-translational modifications. The SNVrap web portal is freely available at http://jjwanglab.org/snvrap. Copyright © 2014 Elsevier Inc. All rights reserved.
Singh, P; Benjak, A; Carat, S; Kai, M; Busso, P; Avanzi, C; Paniz-Mondolfi, A; Peter, C; Harshman, K; Rougemont, J; Matsuoka, M; Cole, S T
Genotyping and molecular characterization of drug resistance mechanisms in Mycobacterium leprae enables disease transmission and drug resistance trends to be monitored. In the present study, we performed genome-wide analysis of Airaku-3, a multidrug-resistant strain with an unknown mechanism of resistance to rifampicin. We identified 12 unique non-synonymous single-nucleotide polymorphisms (SNPs) including two in the transporter-encoding ctpC and ctpI genes. In addition, two SNPs were found that improve the resolution of SNP-based genotyping, particularly for Venezuelan and South East Asian strains of M. leprae. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society of Clinical Microbiology and Infectious Diseases.
Mao, Peng; Brown, Alexander J; Malc, Ewa P; Mieczkowski, Piotr A; Smerdon, Michael J; Roberts, Steven A; Wyrick, John J
DNA base damage is an important contributor to genome instability, but how the formation and repair of these lesions is affected by the genomic landscape and contributes to mutagenesis is unknown. Here, we describe genome-wide maps of DNA base damage, repair, and mutagenesis at single nucleotide resolution in yeast treated with the alkylating agent methyl methanesulfonate (MMS). Analysis of these maps revealed that base excision repair (BER) of alkylation damage is significantly modulated by chromatin, with faster repair in nucleosome-depleted regions, and slower repair and higher mutation density within strongly positioned nucleosomes. Both the translational and rotational settings of lesions within nucleosomes significantly influence BER efficiency; moreover, this effect is asymmetric relative to the nucleosome dyad axis and is regulated by histone modifications. Our data also indicate that MMS-induced mutations at adenine nucleotides are significantly enriched on the nontranscribed strand (NTS) of yeast genes, particularly in BER-deficient strains, due to higher damage formation on the NTS and transcription-coupled repair of the transcribed strand (TS). These findings reveal the influence of chromatin on repair and mutagenesis of base lesions on a genome-wide scale and suggest a novel mechanism for transcription-associated mutation asymmetry, which is frequently observed in human cancers. © 2017 Mao et al.; Published by Cold Spring Harbor Laboratory Press.
Brown, Allan F; Yousef, Gad G; Reid, Robert W; Chebrolu, Kranthi K; Thomas, Aswathy; Krueger, Christopher; Jeffery, Elizabeth; Jackson, Eric; Juvik, John A
The identification of genetic factors influencing the accumulation of individual glucosinolates in broccoli florets provides novel insight into the regulation of glucosinolate levels in Brassica vegetables and will accelerate the development of vegetables with glucosinolate profiles tailored to promote human health. Quantitative trait loci analysis of glucosinolate (GSL) variability was conducted with a B. oleracea (broccoli) mapping population, saturated with single nucleotide polymorphism markers from a high-density array designed for rapeseed (Brassica napus). In 4 years of analysis, 14 QTLs were associated with the accumulation of aliphatic, indolic, or aromatic GSLs in floret tissue. The accumulation of 3-carbon aliphatic GSLs (2-propenyl and 3-methylsulfinylpropyl) was primarily associated with a single QTL on C05, but common regulation of 4-carbon aliphatic GSLs was not observed. A single locus on C09, associated with up to 40 % of the phenotypic variability of 2-hydroxy-3-butenyl GSL over multiple years, was not associated with the variability of precursor compounds. Similarly, QTLs on C02, C04, and C09 were associated with 4-methylsulfinylbutyl GSL concentration over multiple years but were not significantly associated with downstream compounds. Genome-specific SNP markers were used to identify candidate genes that co-localized to marker intervals and previously sequenced Brassica oleracea BAC clones containing known GSL genes (GSL-ALK, GSL-PRO, and GSL-ELONG) were aligned to the genomic sequence, providing support that at least three of our 14 QTLs likely correspond to previously identified GSL loci. The results demonstrate that previously identified loci do not fully explain GSL variation in broccoli. The identification of additional genetic factors influencing the accumulation of GSL in broccoli florets provides novel insight into the regulation of GSL levels in Brassicaceae and will accelerate development of vegetables with modified or enhanced GSL
Wang, Xianshu; Pankratz, V. Shane; Fredericksen, Zachary; Tarrell, Robert; Karaus, Mary; McGuffog, Lesley; Pharaoh, Paul D. P.; Ponder, Bruce A. J.; Dunning, Alison M.; Peock, Susan; Cook, Margaret; Oliver, Clare; Frost, Debra; Sinilnikova, Olga M.; Stoppa-Lyonnet, Dominique; Mazoyer, Sylvie; Houdayer, Claude; Hogervorst, Frans B. L.; Hooning, Maartje J.; Ligtenberg, Marjolijn J.; Spurdle, Amanda; Chenevix-Trench, Georgia; Schmutzler, Rita K.; Wappenschmidt, Barbara; Engel, Christoph; Meindl, Alfons; Domchek, Susan M.; Nathanson, Katherine L.; Rebbeck, Timothy R.; Singer, Christian F.; Gschwantler-Kaulich, Daphne; Dressler, Catherina; Fink, Anneliese; Szabo, Csilla I.; Zikan, Michal; Foretova, Lenka; Claes, Kathleen; Thomas, Gilles; Hoover, Robert N.; Hunter, David J.; Chanock, Stephen J.; Easton, Douglas F.; Antoniou, Antonis C.; Couch, Fergus J.; Gregory, Helen; Miedzybrodzka, Zosia; Morrison, Patrick; Cole, Trevor; McKeown, Carole; Taylor, Amy; Donaldson, Alan; Paterson, Joan; Murray, Alexandra; Rogers, Mark; McCann, Emma; Kennedy, John; Barton, David; Porteous, Mary; Brewer, Carole; Kivuva, Emma; Searle, Anne; Goodman, Selina; Davidson, Rosemarie; Murday, Victoria; Bradshaw, Nicola; Snadden, Lesley; Longmuir, Mark; Watt, Catherine; Izatt, Louise; Pichert, Gabriella; Langman, Caroline; Dorkins, Huw; Barwell, Julian; Chu, Carol; Bishop, Tim; Miller, Julie; Ellis, Ian; Evans, D. Gareth; Lalloo, Fiona; Holt, Felicity; Male, Alison; Robinson, Anne; Gardiner, Carol; Douglas, Fiona; Claber, Oonagh; Walker, Lisa; McLeod, Diane; Eeles, Ros; Shanley, Susan; Rahman, Nazneen; Houlston, Richard; Bancroft, Elizabeth; D'Mello, Lucia; Page, Elizabeth; Ardern-Jones, Audrey; Mitra, Anita; Cook, Jackie; Quarrell, Oliver; Bardsley, Cathryn; Hodgson, Shirley; Goff, Sheila; Brice, Glen; Winchester, Lizzie; Eccles, Diana; Lucassen, Anneke; Crawford, Gillian; Tyler, Emma; McBride, Donna; Bérard, Léon; Sinilnikova, Olga; Barjhoux, Laure; Giraud, Sophie; Léone, Mélanie; Gauthier-Villars, Marion; Moncoutier, Virginie; Belotti, Muriel; de Pauw, Antoine; Bressac-de-Paillerets, Brigitte; Remenieras, Audrey; Byrde, Véronique; Caron, Olivier; Lenoir, Gilbert; Bignon, Yves-Jean; Uhrhammer, Nancy; Lasset, Christine; Bonadona, Valérie; Hardouin, Agnès; Berthet, Pascaline; Sobol, Hagay; Bourdon, Violaine; Eisinger, Françoise; Coulet, Florence; Colas, Chrystelle; Soubrier, Florent; Coupier, Isabelle; Payrat, Jean-Philippe; Fournier, Joëlle; Révillion, Françoise; Vennin, Philippe; Adenis, Claude; Rouleau, Etienne; Lidereau, Rosette; Demange, Liliane; Nogues, Catherine; Muller, Danièle; Fricker, Jean-Pierre; Longy, Michel; Sevenet, Nicolas; Toulas, Christine; Guimbaud, Rosine; Gladieff, Laurence; Feillel, Viviane; Leroux, Dominique; Dreyfus, Hélèn; Rebischung, Christine; Cassini, Cécile; Olivier-Faivre, Laurence; Prieur, Fabienne; Ferrer, Sandra Fert; Frénay, Marc; Vénat-Bouvet, Laurence; Lynch, Henry T.; Hogervorst, Frans; Vernhoef, Senno; Pijpe, Anouk; van 't Veer, Laura; van Leeuwen, Flora; Rookus, Matti; Collée, Margriet; van den Ouweland, Ans; Kriege, Mieke; Schutte, Mieke; Hooning, Maartje; Seynaeve, Caroline; van Asperen, Christi; Wijnen, Juul; Vreeswijk, Maaike; Tollenaar, Rob; Devilee, Peter; Ligtenberg, Marjolijn; Hoogerbrugge, Nicoline; Ausems, Margreet; van der Luijt, Rob; Aalfs, Cora; van Os, Theo; Gille, Hans; Waisfisz, Quinten; Meijers-Heijboer, Hanne; Gomez-Garcia, Encarna; van Roozendaal, Kees; Blok, Marinus; Oosterwijk, Jan; van der Hout, Annemieke; Mourits, Marian; Vasen, Hans; Szabo, Csilla; Pohlreich, Petr; Kleibl, Zdenek; Machackova, Eva; Lukesova, Miroslava; de Leeneer, Kim; Poppe, Bruce; de Paepe, Anne
Recent studies have identified single nucleotide polymorphisms (SNPs) that significantly modify breast cancer risk in BRCA1 and BRCA2 mutation carriers. Since these risk modifiers were originally identified as genetic risk factors for breast cancer in genome-wide association studies (GWASs),
Full Text Available Genome-wide association studies (GWAS have successfully identified a number of single-nucleotide polymorphisms (SNPs associated with colorectal cancer (CRC risk. However, these susceptibility loci known today explain only a small fraction of the genetic risk. Gene-gene interaction (GxG is considered to be one source of the missing heritability. To address this, we performed a genome-wide search for pair-wise GxG associated with CRC risk using 8,380 cases and 10,558 controls in the discovery phase and 2,527 cases and 2,658 controls in the replication phase. We developed a simple, but powerful method for testing interaction, which we term the Average Risk Due to Interaction (ARDI. With this method, we conducted a genome-wide search to identify SNPs showing evidence for GxG with previously identified CRC susceptibility loci from 14 independent regions. We also conducted a genome-wide search for GxG using the marginal association screening and examining interaction among SNPs that pass the screening threshold (p<10(-4. For the known locus rs10795668 (10p14, we found an interacting SNP rs367615 (5q21 with replication p = 0.01 and combined p = 4.19×10(-8. Among the top marginal SNPs after LD pruning (n = 163, we identified an interaction between rs1571218 (20p12.3 and rs10879357 (12q21.1 (nominal combined p = 2.51×10(-6; Bonferroni adjusted p = 0.03. Our study represents the first comprehensive search for GxG in CRC, and our results may provide new insight into the genetic etiology of CRC.
Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M
Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10(-9) ) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10(-16) ), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. © 2014 The British Pharmacological Society.
Schwarz, Erika N; Ruhlman, Tracey A; Weng, Mao-Lun; Khiyami, Mohammad A; Sabir, Jamal S M; Hajarah, Nahid H; Alharbi, Njud S; Rabah, Samar O; Jansen, Robert K
This study represents the most comprehensive plastome-wide comparison of nucleotide substitution rates across the three subfamilies of Fabaceae: Caesalpinioideae, Mimosoideae, and Papilionoideae. Caesalpinioid and mimosoid legumes have large, unrearranged plastomes compared with papilionoids, which exhibit varying levels of rearrangement including the loss of the inverted repeat (IR) in the IR-lacking clade (IRLC). Using 71 genes common to 39 legume taxa representing all the three subfamilies, we show that papilionoids consistently have higher nucleotide substitution rates than caesalpinioids and mimosoids, and rates in the IRLC papilionoids are generally higher than those in the IR-containing papilionoids. Unsurprisingly, this pattern was significantly correlated with growth habit as most papilionoids are herbaceous, whereas caesalpinioids and mimosoids are largely woody. Both nonsynonymous (dN) and synonymous (dS) substitution rates were also correlated with several biological features including plastome size and plastomic rearrangements such as the number of inversions and indels. In agreement with previous reports, we found that genes in the IR exhibit between three and fourfold reductions in the substitution rates relative to genes within the large single-copy or small single-copy regions. Furthermore, former IR genes in IR-lacking taxa exhibit accelerated rates compared with genes contained in the IR.
Hamshere, M L; Walters, J T R; Smith, R
The Schizophrenia Psychiatric Genome-Wide Association Study Consortium (PGC) highlighted 81 single-nucleotide polymorphisms (SNPs) with moderate evidence for association to schizophrenia. After follow-up in independent samples, seven loci attained genome-wide significance (GWS), but multi-locus t...... interval (CI) 78-100%) of the original set of 78 SNPs represent true associations. We also provide strong evidence for overlap in genetic risk between schizophrenia and bipolar disorder.Molecular Psychiatry advance online publication, 22 May 2012; doi:10.1038/mp.2012.67....
Neumann, Alexander; Direk, Nese; Crawford, Andrew A; Mirza, Saira; Adams, Hieab; Bolton, Jennifer; Hayward, Caroline; Strachan, David P; Payne, Erin K; Smith, Jennifer A; Milaneschi, Yuri; Penninx, Brenda; Hottenga, Jouke J; de Geus, Eco; Oldehinkel, Albertine J; van der Most, Peter J; de Rijke, Yolanda; Walker, Brian R; Tiemeier, Henning
Cortisol is an important stress hormone affected by a variety of biological and environmental factors, such as the circadian rhythm, exercise and psychological stress. Cortisol is mostly measured using blood or saliva samples. A number of genetic variants have been found to contribute to cortisol levels with these methods. While the effects of several specific single genetic variants is known, the joint genome-wide contribution to cortisol levels is unclear. Our aim was to estimate the amount of cortisol variance explained by common single nucleotide polymorphisms, i.e. the SNP heritability, using a variety of cortisol measures, cohorts and analysis approaches. We analyzed morning plasma (n=5705) and saliva levels (n=1717), as well as diurnal saliva levels (n=1541), in the Rotterdam Study using genomic restricted maximum likelihood estimation. Additionally, linkage disequilibrium score regression was fitted on the results of genome-wide association studies (GWAS) performed by the CORNET consortium on morning plasma cortisol (n=12,597) and saliva cortisol (n=7703). No significant SNP heritability was detected for any cortisol measure, sample or analysis approach. Point estimates ranged from 0% to 9%. Morning plasma cortisol in the CORNET cohorts, the sample with the most power, had a 6% [95%CI: 0-13%] SNP heritability. The results consistently suggest a low SNP heritability of these acute and short-term measures of cortisol. The low SNP heritability may reflect the substantial environmental and, in particular, situational component of these cortisol measures. Future GWAS will require very large sample sizes. Alternatively, more long-term cortisol measures such as hair cortisol samples are needed to discover further genetic pathways regulating cortisol concentrations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Shevidi, Saba; Uchida, Alicia; Schudrowitz, Natalie; Wessel, Gary M; Yajima, Mamiko
A single base pair mutation in the genome can result in many congenital disorders in humans. The recent gene editing approach using CRISPR/Cas9 has rapidly become a powerful tool to replicate or repair such mutations in the genome. These approaches rely on cleaving DNA, while presenting unexpected risks. In this study, we demonstrate a modified CRISPR/Cas9 system fused to cytosine deaminase (Cas9-DA), which induces a single nucleotide conversion in the genome. Cas9-DA was introduced into sea urchin eggs with sgRNAs targeted for SpAlx1, SpDsh, or SpPks, each of which is critical for skeletogenesis, embryonic axis formation, or pigment formation, respectively. We found that both Cas9 and Cas9-DA edit the genome, and cause predicted phenotypic changes at a similar efficiency. Cas9, however, resulted in significant deletions in the genome centered on the gRNA target sequence, whereas Cas9-DA resulted in single or double nucleotide editing of C to T conversions within the gRNA target sequence. These results suggest that the Cas9-DA approach may be useful for manipulating gene activity with decreased risks of genomic aberrations. Developmental Dynamics 246:1036-1046, 2017. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Rabinowicz Pablo D
Full Text Available Abstract Background Castor bean (Ricinus communis is an agricultural crop and garden ornamental that is widely cultivated and has been introduced worldwide. Understanding population structure and the distribution of castor bean cultivars has been challenging because of limited genetic variability. We analyzed the population genetics of R. communis in a worldwide collection of plants from germplasm and from naturalized populations in Florida, U.S. To assess genetic diversity we conducted survey sequencing of the genomes of seven diverse cultivars and compared the data to a reference genome assembly of a widespread cultivar (Hale. We determined the population genetic structure of 676 samples using single nucleotide polymorphisms (SNPs at 48 loci. Results Bayesian clustering indicated five main groups worldwide and a repeated pattern of mixed genotypes in most countries. High levels of population differentiation occurred between most populations but this structure was not geographically based. Most molecular variance occurred within populations (74% followed by 22% among populations, and 4% among continents. Samples from naturalized populations in Florida indicated significant population structuring consistent with local demes. There was significant population differentiation for 56 of 78 comparisons in Florida (pairwise population ϕPT values, p Conclusion Low levels of genetic diversity and mixing of genotypes have led to minimal geographic structuring of castor bean populations worldwide. Relatively few lineages occur and these are widely distributed. Our approach of determining population genetic structure using SNPs from genome-wide comparisons constitutes a framework for high-throughput analyses of genetic diversity in plants, particularly in species with limited genetic diversity.
Chen, X.; IJkel, W.F.J.; Tarchini, R.; Sun, X.; Sandbrink, H.; Wang, H.; Peters, S.; Zuidema, D.; Klein Lankhorst, R.; Vlak, J.M.; Hu, Z.
The nucleotide sequence of the Helicoverpa armigera single-nucleocapsid nucleopolyhedrovirus (HaSNPV) DNA genome was determined and analysed. The circular genome encompasses 131 403 bp, has a G C content of 39.1 molnd contains five homologous regions with a unique pattern of repeats.
Sheikh, Ishfaq A; Ahmad, Ejaz; Jamal, Mohammad S; Rehan, Mohd; Assidi, Mourad; Tayubi, Iftikhar A; AlBasri, Samera F; Bajouh, Osama S; Turki, Rola F; Abuzenadah, Adel M; Damanhouri, Ghazi A; Beg, Mohd A; Al-Qahtani, Mohammed
Preterm birth (PTB), birth at PTBs are spontaneous with about a half without any apparent cause and the other half associated with a number of risk factors. Genetic factors are one of the significant risks for PTB. The focus of this review is on single nucleotide gene polymorphisms (SNPs) that are reported to be associated with PTB. A comprehensive evaluation of studies on SNPs known to confer potential risk of PTB was done by performing a targeted PubMed search for the years 2007-2015 and systematically reviewing all relevant studies. Evaluation of 92 studies identified 119 candidate genes with SNPs that had potential association with PTB. The genes were associated with functions of a wide spectrum of tissue and cell types such as endocrine, tissue remodeling, vascular, metabolic, and immune and inflammatory systems. A number of potential functional candidate gene variants have been reported that predispose women for PTB. Understanding the complex genomic landscape of PTB needs high-throughput genome sequencing methods such as whole-exome sequencing and whole-genome sequencing approaches that will significantly enhance the understanding of PTB. Identification of high risk women, avoidance of possible risk factors, and provision of personalized health care are important to manage PTB.
Evangelou, Evangelos; Fellay, Jacques; Colombo, Sara
infected with human immunodeficiency virus type 1 (HIV-1) to assess whether differences in type of population (622 seroconverters vs. 636 seroprevalent subjects) or the number of measurements available for defining the phenotype resulted in differences in the effect sizes of associations between single...... nucleotide polymorphisms and the phenotype, HIV-1 viral load at set point. The effect estimate for the top 100 single nucleotide polymorphisms was 0.092 (95% confidence interval: 0.074, 0.110) log(10) viral load (log(10) copies of HIV-1 per mL of blood) greater in seroconverters than in seroprevalent...... available, particularly among seroconverters and for variants that achieved genome-wide significance. Differences in phenotype definition and ascertainment may affect the estimated magnitude of genetic effects and should be considered in optimizing power for discovering new associations....
Full Text Available Whole-genome single-nucleotide polymorphism (SNP markers are valuable genetic resources for the association and conservation studies. Genome-wide SNP development in many teleost species are still challenging because of the genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS provided an efficient reduced representative method to squeeze cost for SNP detection; however, most of recent GBS applications were reported on plant organisms. In this work, we used an EcoRI-NlaIII based GBS protocol to teleost large yellow croaker, an important commercial fish in China and East-Asia, and reported the first whole-genome SNP development for the species. 69,845 high quality SNP markers that evenly distributed along genome were detected in at least 80% of 500 individuals. Nearly 95% randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. The association studies with the muscle eicosapentaenoic acid (EPA and docosahexaenoic acid (DHA content discovered 39 significant SNP markers, contributing as high up to ∼63% genetic variance that explained by all markers. Functional genes that involved in fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, PPT2 Gene, previously identified in the association study of the plasma n-3 and n-6 polyunsaturated fatty acid level in human, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS could produce quality SNP markers in a cost-efficient manner in teleost genome. The developed SNP markers and the EPA and DHA associated SNP loci provided invaluable resources for the population structure, conservation genetics and genomic selection of large yellow croaker and other fish organisms.
Kuhn, R M; Karolchik, D; Zweig, A S
The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up t...
Full Text Available Prashanth Suravajhala,1 Alfredo Benso2 1Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark; 2Department of Control and Computer Engineering, Politecnico di Torino, Torino, Italy Abstract: Next-generation sequencing technology has provided resources to easily explore and identify candidate single-nucleotide polymorphisms (SNPs and variants. However, there remains a challenge in identifying and inferring the causal SNPs from sequence data. A problem with different methods that predict the effect of mutations is that they produce false positives. In this hypothesis, we provide an overview of methods known for identifying causal variants and discuss the challenges, fallacies, and prospects in discerning candidate SNPs. We then propose a three-point classification strategy, which could be an additional annotation method in identifying causalities. Keywords: clinical mastitis, single-nucleotide polymorphisms, variants, associations, diseases, linkage disequilibrium, GWAS
Zhang, Dong Feng; Pang, Zengchang; Li, Shuxia
The genetic loci affecting the commonly used BMI have been intensively investigated using linkage approaches in multiple populations. This study aims at performing the first genome-wide linkage scan on BMI in the Chinese population in mainland China with hypothesis that heterogeneity in genetic...... linkage could exist in different ethnic populations. BMI was measured from 126 dizygotic twins in Qingdao municipality who were genotyped using high-resolution Affymetrix Genome-Wide Human SNP arrays containing about 1 million single-nucleotide polymorphisms (SNPs). Nonparametric linkage analysis...... in western countries. Multiple loci showing suggestive linkage were found on chromosome 1 (lod score 2.38 at 242 cM), chromosome 8 (2.48 at 95 cM), and chromosome 14 (2.2 at 89.4 cM). The strong linkage identified in the Chinese subjects that is consistent with that found in populations of European origin...
Full Text Available Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus is a non-model species of ecological and commercial importance and widely distributed in northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE for 30 individuals from two populations. The nucleotide diversity (π for each population was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai, respectively. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001 and the putatively neutral SNPs (FST = 0.0347, P < 0.001. However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001. Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40% significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus.
Nguyen, Thao T.B.; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J.; Blair, David; Laha, Thewarach; Sripa, Banchob
Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. There are several conventional molecular markers have been used for identification and genetic diversity, however, no information about microsatellites of this liver fluke published so far. We here report microsatellite characterization and marker development for genetic diversity study in C. sinensis using genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥ 12 base pairs) were identified from genome database of C. sinensis with hexa-nucleotide motif being the most abundant (51%) followed by penta-nucleotide (18.3%) and tri-nucleotide (12.7%). The tetra-nucleotide, di-nucleotide and mononucleotide motifs accounted for 9.75 %, 7.63% and 0.14%, respectively. The total length of all microsatellites accounts for 0. 72 % of 547 Mb of the whole genome size and the frequency of microsatellites were found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri, and tetra-nucleotide, the repeat numbers redundant are six (28%), four (45%) and three (76%), respectively. The ATC repeat is the most abundant microsatellites followed by AT, AAT and AC, respectively. Within 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and O. viverrini. Seven out of 24 loci showed heterozygous with observed heterozygosity ranged from 0.467 to 1. Four-primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information of C. sinensis microsatellites and the genome-wide markers developed may be a useful tool for genetic study of C. sinensis. PMID:25782682
Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney
We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos.
Full Text Available Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian
Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
Wen, Zixiang; Boyse, John F; Song, Qijian; Cregan, Perry B; Wang, Dechun
Crop improvement always involves selection of specific alleles at genes controlling traits of agronomic importance, likely resulting in detectable signatures of selection within the genome of modern soybean (Glycine max L. Merr.). The identification of these signatures of selection is meaningful from the perspective of evolutionary biology and for uncovering the genetic architecture of agronomic traits. To this end, two populations of soybean, consisting of 342 landraces and 1062 improved lines, were genotyped with the SoySNP50K Illumina BeadChip containing 52,041 single nucleotide polymorphisms (SNPs), and systematically phenotyped for 9 agronomic traits. A cross-population composite likelihood ratio (XP-CLR) method was used to screen the signals of selective sweeps. A total of 125 candidate selection regions were identified, many of which harbored genes potentially involved in crop improvement. To further investigate whether these candidate regions were in fact enriched for genes affected by selection, genome-wide association studies (GWAS) were conducted on 7 selection traits targeted in soybean breeding (grain yield, plant height, lodging, maturity date, seed coat color, seed protein and oil content) and 2 non-selection traits (pubescence and flower color). Major genomic regions associated with selection traits overlapped with candidate selection regions, whereas no overlap of this kind occurred for the non-selection traits, suggesting that the selection sweeps identified are associated with traits of agronomic importance. Multiple novel loci and refined map locations of known loci related to these traits were also identified. These findings illustrate that comparative genomic analyses, especially when combined with GWAS, are a promising approach to dissect the genetic architecture of complex traits.
Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar
Equine guttural pouch tympany (GPT) is a hereditary condition affecting foals in their first months of life. Complex segregation analyses in Arabian and German warmblood horses showed the involvement of a major gene as very likely. Genome-wide linkage and association analyses including a high density marker set of single nucleotide polymorphisms (SNPs) were performed to map the genomic region harbouring the potential major gene for GPT. A total of 85 Arabian and 373 German warmblood horses were genotyped on the Illumina equine SNP50 beadchip. Non-parametric multipoint linkage analyses showed genome-wide significance on horse chromosomes (ECA) 3 for German warmblood at 16–26 Mb and 34–55 Mb and for Arabian on ECA15 at 64–65 Mb. Genome-wide association analyses confirmed the linked regions for both breeds. In Arabian, genome-wide association was detected at 64 Mb within the region with the highest linkage peak on ECA15. For German warmblood, signals for genome-wide association were close to the peak region of linkage at 52 Mb on ECA3. The odds ratio for the SNP with the highest genome-wide association was 0.12 for the Arabian. In conclusion, the refinement of the regions with the Illumina equine SNP50 beadchip is an important step to unravel the responsible mutations for GPT. PMID:22848553
Kühnisch, Jan; Thiering, Elisabeth; Heitmüller, Daniela; Tiesler, Carla M T; Grallert, Harald; Heinrich-Weltzien, Roswitha; Hickel, Reinhard; Heinrich, Joachim
This genome-wide association study (GWAS) investigated the relationship between molar-incisor hypomineralization (MIH) and possible genetic loci. Clinical and genetic data from the 10-year follow-up of 668 children from the Munich GINI-plus and LISA-plus birth cohort studies were analyzed. The dental examinations included the diagnosis of MIH according to the criteria of the European Academy of Paediatric Dentistry (EAPD). Children with MIH were categorized as those with a minimum of one hypomineralized first permanent molar. A GWAS was implemented following a quality-control step and an additive genetic effect was assumed. A total of 2,013,491 single-nucleotide polymorphisms (SNPs) were available for analysis. Rs13058467, which is located near the SCUBE1 gene on chromosome 22 (p MIH when using a threshold of p value MIH.
Zheng, Xiaoying; Hoffmann, Ary; Xi, Zhiyong; Zhang, Dongjing; Rasic, Gordana; Schmidt, Thomas
Aedes albopictus is a highly invasive disease vector with an expanding worldwide distribution. Genetic assays using low to medium resolution markers have found little evidence of spatial genetic structure even at broad geographic scales, suggesting frequent passive movement along human transportation networks. Here we analysed genetic structure of Ae. albopictus collected from 12 sample sites in Guangzhou, China, using thousands of genome-wide single nucleotide polymorphisms (SNPs). We found ...
Full Text Available Genome-wide association studies (GWAS using single nucleotide polymorphisms (SNPs have identified more than 50 loci associated with estimated glomerular filtration rate (eGFR, a measure of kidney function. However, significant SNPs account for a small proportion of eGFR variability. Other forms of genetic variation have not been comprehensively evaluated for association with eGFR. In this study, we assess whether changes in germline DNA copy number are associated with GFR estimated from serum creatinine, eGFRcrea. We used hidden Markov models (HMMs to identify copy number polymorphic regions (CNPs from high-throughput SNP arrays for 2,514 African (AA and 8,645 European ancestry (EA participants in the Atherosclerosis Risk in Communities (ARIC study. Separately for the EA and AA cohorts, we used Bayesian Gaussian mixture models to estimate copy number at regions identified by the HMM or previously reported in the HapMap Project. We identified 312 and 464 autosomal CNPs among individuals of EA and AA, respectively. Multivariate models adjusted for SNP-derived covariates of population structure identified one CNP in the EA cohort near genome-wide statistical significance (Bonferroni-adjusted p = 0.067 located on chromosome 5 (876-880kb. Overall, our findings suggest a limited role of CNPs in explaining eGFR variability.
Liu, Jin; Huang, Jian; Ma, Shuangge
Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. Simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods. PMID:23272092
Tsai, Chia-Ti; Hsieh, Chia-Shan; Chang, Sheng-Nan; Chuang, Eric Y.; Ueng, Kwo-Chang; Tsai, Chin-Feng; Lin, Tsung-Hsien; Wu, Cho-Kai; Lee, Jen-Kuang; Lin, Lian-Yu; Wang, Yi-Chih; Yu, Chih-Chieh; Lai, Ling-Ping; Tseng, Chuen-Den; Hwang, Juey-Jen; Chiang, Fu-Tien; Lin, Jiunn-Lee
Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Previous genome-wide association studies had identified single-nucleotide polymorphisms in several genomic regions to be associated with AF. In human genome, copy number variations (CNVs) are known to contribute to disease susceptibility. Using a genome-wide multistage approach to identify AF susceptibility CNVs, we here show a common 4,470-bp diallelic CNV in the first intron of potassium interacting channel 1 gene (KCNIP1) is strongly associated with AF in Taiwanese populations (odds ratio=2.27 for insertion allele; P=6.23 × 10−24). KCNIP1 insertion is associated with higher KCNIP1 mRNA expression. KCNIP1-encoded protein potassium interacting channel 1 (KCHIP1) is physically associated with potassium Kv channels and modulates atrial transient outward current in cardiac myocytes. Overexpression of KCNIP1 results in inducible AF in zebrafish. In conclusions, a common CNV in KCNIP1 gene is a genetic predictor of AF risk possibly pointing to a functional pathway. PMID:26831368
Bigdeli, Tim B.; Ripke, Stephan; Bacanu, Silviu-Alin
Genome-wide association studies (GWAS) of schizophrenia have yielded more than 100 common susceptibility variants, and strongly support a substantial polygenic contribution of a large number of small allelic effects. It has been hypothesized that familial schizophrenia is largely a consequence...... of inherited rather than environmental factors. We investigated the extent to which familiality of schizophrenia is associated with enrichment for common risk variants detectable in a large GWAS. We analyzed single nucleotide polymorphism (SNP) data for cases reporting a family history of psychotic illness (N...... history subgroup. Comparison of genome-wide polygenic risk scores based on GWAS summary statistics indicated a significant enrichment for SNP effects among family history positive compared to family history negative cases (Nagelkerke's R2=0.0021; P=0.00331; P-value threshold
Full Text Available Peanut (Arachis hypogaea consists of two subspecies, hypogaea and fastigiata, and has been cultivated worldwide for hundreds of years. Here, 158 peanut accessions were selected to dissect the molecular footprint of agronomic traits related to domestication using specific-locus amplified fragment sequencing (SLAF-seq method. Then, a total of 17,338 high-quality single nucleotide polymorphisms (SNPs in the whole peanut genome were revealed. Eleven agronomic traits in 158 peanut accessions were subsequently analyzed using genome-wide association studies (GWAS. Candidate genes responsible for corresponding traits were then analyzed in genomic regions surrounding the peak SNPs, and 1,429 genes were found within 200 kb windows centerd on GWAS-identified peak SNPs related to domestication. Highly differentiated genomic regions were observed between hypogaea and fastigiata accessions using FST values and sequence diversity (π ratios. Among the 1,429 genes, 662 were located on chromosome A3, suggesting the presence of major selective sweeps caused by artificial selection during long domestication. These findings provide a promising insight into the complicated genetic architecture of domestication-related traits in peanut, and reveal whole-genome SNP markers of beneficial candidate genes for marker-assisted selection (MAS in future breeding programs.
Somavilla, A L; Sonstegard, T S; Higa, R H; Rosa, A N; Siqueira, F; Silva, L O C; Torres Júnior, R A A; Coutinho, L L; Mudadu, M A; Alencar, M M; Regitano, L C A
Brazilian Nellore cattle (Bos indicus) have been selected for growth traits for over more than four decades. In recent years, reproductive and meat quality traits have become more important because of increasing consumption, exports and consumer demand. The identification of genome regions altered by artificial selection can potentially permit a better understanding of the biology of specific phenotypes that are useful for the development of tools designed to increase selection efficiency. Therefore, the aims of this study were to detect evidence of recent selection signatures in Nellore cattle using extended haplotype homozygosity methodology and BovineHD marker genotypes (>777,000 single nucleotide polymorphisms) as well as to identify corresponding genes underlying these signals. Thirty-one significant regions (P meat quality, fatty acid profiles and immunity. In addition, 545 genes were identified in regions harboring selection signatures. Within this group, 58 genes were associated with growth, muscle and adipose tissue metabolism, reproductive traits or the immune system. Using relative extended haplotype homozygosity to analyze high-density single nucleotide polymorphism marker data allowed for the identification of regions potentially under artificial selection pressure in the Nellore genome, which might be used to better understand autozygosity and the effects of selection on the Nellore genome. © 2014 Stichting International Foundation for Animal Genetics.
Full Text Available In the yeast Saccharomyces cerevisiae and most other eukaryotes, mitotic recombination is important for the repair of double-stranded DNA breaks (DSBs. Mitotic recombination between homologous chromosomes can result in loss of heterozygosity (LOH. In this study, LOH events induced by ultraviolet (UV light are mapped throughout the genome to a resolution of about 1 kb using single-nucleotide polymorphism (SNP microarrays. UV doses that have little effect on the viability of diploid cells stimulate crossovers more than 1000-fold in wild-type cells. In addition, UV stimulates recombination in G1-synchronized cells about 10-fold more efficiently than in G2-synchronized cells. Importantly, at high doses of UV, most conversion events reflect the repair of two sister chromatids that are broken at approximately the same position whereas at low doses, most conversion events reflect the repair of a single broken chromatid. Genome-wide mapping of about 380 unselected crossovers, break-induced replication (BIR events, and gene conversions shows that UV-induced recombination events occur throughout the genome without pronounced hotspots, although the ribosomal RNA gene cluster has a significantly lower frequency of crossovers.
Ishfaq A. Sheikh
Full Text Available Abstract Background Preterm birth (PTB, birth at <37 weeks of gestation, is a significant global public health problem. World-wide, about 15 million babies are born preterm each year resulting in more than a million deaths of children. Preterm neonates are more prone to problems and need intensive care hospitalization. Health issues may persist through early adulthood and even be carried on to the next generation. Majority (70 % of PTBs are spontaneous with about a half without any apparent cause and the other half associated with a number of risk factors. Genetic factors are one of the significant risks for PTB. The focus of this review is on single nucleotide gene polymorphisms (SNPs that are reported to be associated with PTB. Results A comprehensive evaluation of studies on SNPs known to confer potential risk of PTB was done by performing a targeted PubMed search for the years 2007–2015 and systematically reviewing all relevant studies. Evaluation of 92 studies identified 119 candidate genes with SNPs that had potential association with PTB. The genes were associated with functions of a wide spectrum of tissue and cell types such as endocrine, tissue remodeling, vascular, metabolic, and immune and inflammatory systems. Conclusions A number of potential functional candidate gene variants have been reported that predispose women for PTB. Understanding the complex genomic landscape of PTB needs high-throughput genome sequencing methods such as whole-exome sequencing and whole-genome sequencing approaches that will significantly enhance the understanding of PTB. Identification of high risk women, avoidance of possible risk factors, and provision of personalized health care are important to manage PTB.
Adkins, D E; Clark, S L; Åberg, K; Hettema, J M; Bukszár, J; McClay, J L; Souza, R P; van den Oord, E J C G
Affecting about 1 in 12 Americans annually, depression is a leading cause of the global disease burden. While a range of effective antidepressants are now available, failure and relapse rates remain substantial, with intolerable side effect burden the most commonly cited reason for discontinuation. Thus, understanding individual differences in susceptibility to antidepressant therapy side effects will be essential to optimize depression treatment. Here we perform genome-wide association studies (GWAS) to identify genetic variation influencing susceptibility to citalopram-induced side effects. The analysis sample consisted of 1762 depression patients, successfully genotyped for 421K single-nucleotide polymorphisms (SNPs), from the Sequenced Treatment Alternatives to Relieve Depression (STAR(*)D) study. Outcomes included five indicators of citalopram side effects: general side effect burden, overall tolerability, sexual side effects, dizziness and vision/hearing side effects. Two SNPs met our genome-wide significance criterion (qeffects of citalopram on vision/hearing side effects (P=3.27 × 10(-8), q=0.026). The second genome-wide significant finding, representing a haplotype spanning ∼30 kb and eight genotyped SNPs in a gene desert on chromosome 13, was associated with general side effect burden (P=3.22 × 10(-7), q=0.096). Suggestive findings were also found for SNPs at LAMA1, AOX2P, EGFLAM, FHIT and RTP2. Although our findings require replication and functional validation, this study demonstrates the potential of GWAS to discover genes and pathways that potentially mediate adverse effects of antidepressant medications.
Full Text Available Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of "mosaic genes" as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.
Barbara E Stranger
Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard
DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origins sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication for a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.
Sasayama, Daimei; Hattori, Kotaro; Ogawa, Shintaro; Yokota, Yuuki; Matsumura, Ryo; Teraishi, Toshiya; Hori, Hiroaki; Ota, Miho; Yoshida, Sumiko; Kunugi, Hiroshi
Cerebrospinal fluid (CSF) is virtually the only one accessible source of proteins derived from the central nervous system (CNS) of living humans and possibly reflects the pathophysiology of a variety of neuropsychiatric diseases. However, little is known regarding the genetic basis of variation in protein levels of human CSF. We examined CSF levels of 1,126 proteins in 133 subjects and performed a genome-wide association analysis of 514,227 single nucleotide polymorphisms (SNPs) to detect protein quantitative trait loci (pQTLs). To be conservative, Spearman's correlation was used to identify an association between genotypes of SNPs and protein levels. A total of 421 cis and 25 trans SNP-protein pairs were significantly correlated at a false discovery rate (FDR) of less than 0.01 (nominal P genome-wide association studies. The present findings suggest that genetic variations play an important role in the regulation of protein expression in the CNS. The obtained database may serve as a valuable resource to understand the genetic bases for CNS protein expression pattern in humans. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org.
Izumi, Kosuke; Santani, Avni B; Deardorff, Matthew A; Feret, Holly A; Tischler, Tanya; Thiel, Brian D; Mulchandani, Surabhi; Stolle, Catherine A; Spinner, Nancy B; Zackai, Elaine H; Conlin, Laura K
Prader-Willi syndrome is caused by the loss of paternal gene expression on 15q11.2-q13.2, and one of the mechanisms resulting in Prader-Willi syndrome phenotype is maternal uniparental disomy of chromosome 15. Various mechanisms including trisomy rescue, monosomy rescue, and post fertilization errors can lead to uniparental disomy, and its mechanism can be inferred from the pattern of uniparental hetero and isodisomy. Detection of a mosaic cell line provides a unique opportunity to understand the mechanism of uniparental disomy; however, mosaic uniparental disomy is a rare finding in patients with Prader-Willi syndrome. We report on two infants with Prader-Willi syndrome caused by mosaic maternal uniparental disomy 15. Patient 1 has mosaic uniparental isodisomy of the entire chromosome 15, and Patient 2 has mosaic uniparental mixed iso/heterodisomy 15. Genome-wide single-nucleotide polymorphism array was able to demonstrate the presence of chromosomally normal cell line in the Patient 1 and trisomic cell line in Patient 2, and provide the evidence that post-fertilization error and trisomy rescue as a mechanism of uniparental disomy in each case, respectively. Given its ability of detecting small percent mosaicism as well as its capability of identifying the loss of heterozygosity of chromosomal regions, genome-wide single-nucleotide polymorphism array should be utilized as an adjunct to the standard methylation analysis in the evaluation of Prader-Willi syndrome. Copyright © 2012 Wiley Periodicals, Inc.
Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMcahon, Katherine D.; Mamlstrom, Rex R.
Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ecotype model? of diversification, but not previously observed in natural populations.
Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.
Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.
Anna C Need
Full Text Available We report a genome-wide assessment of single nucleotide polymorphisms (SNPs and copy number variants (CNVs in schizophrenia. We investigated SNPs using 871 patients and 863 controls, following up the top hits in four independent cohorts comprising 1,460 patients and 12,995 controls, all of European origin. We found no genome-wide significant associations, nor could we provide support for any previously reported candidate gene or genome-wide associations. We went on to examine CNVs using a subset of 1,013 cases and 1,084 controls of European ancestry, and a further set of 60 cases and 64 controls of African ancestry. We found that eight cases and zero controls carried deletions greater than 2 Mb, of which two, at 8p22 and 16p13.11-p12.4, are newly reported here. A further evaluation of 1,378 controls identified no deletions greater than 2 Mb, suggesting a high prior probability of disease involvement when such deletions are observed in cases. We also provide further evidence for some smaller, previously reported, schizophrenia-associated CNVs, such as those in NRXN1 and APBA2. We could not provide strong support for the hypothesis that schizophrenia patients have a significantly greater "load" of large (>100 kb, rare CNVs, nor could we find common CNVs that associate with schizophrenia. Finally, we did not provide support for the suggestion that schizophrenia-associated CNVs may preferentially disrupt genes in neurodevelopmental pathways. Collectively, these analyses provide the first integrated study of SNPs and CNVs in schizophrenia and support the emerging view that rare deleterious variants may be more important in schizophrenia predisposition than common polymorphisms. While our analyses do not suggest that implicated CNVs impinge on particular key pathways, we do support the contribution of specific genomic regions in schizophrenia, presumably due to recurrent mutation. On balance, these data suggest that very few schizophrenia patients
Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr
Background RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver...
Although species from the genus Thunnus include some of the most commercially important and most severely overexploited fishes, the phylogeny of this genus is still unresolved, hampering evolutionary and traceability studies that could help improve conservation and management strategies for these species. Previous attempts based on mitochondrial and nuclear markers were unsuccessful in inferring a congruent and reliable phylogeny, probably due to mitochondrial introgression events and lack of enough phylogenetically informative markers. Here we infer the first genome-wide nuclear marker-based phylogeny of tunas using restriction site associated DNA sequencing (RAD-seq) data. Our results, derived from phylogenomic inferences obtained from 128 nucleotide matrices constructed using alternative data assembly procedures, support a single Thunnus evolutionary history that challenges previous assumptions based on morphological and molecular data.
Barker Michael S
Full Text Available Abstract Background There is increasing demand to test hypotheses that contrast the evolution of genes and gene families among genomes, using simulations that work across these levels of organization. The EvolSimulator program was developed recently to provide a highly flexible platform for forward simulations of amino acid evolution in multiple related lineages of haploid genomes, permitting copy number variation and lateral gene transfer. Synonymous nucleotide evolution is not currently supported, however, and would be highly advantageous for comparisons to full genome, transcriptome, and single nucleotide polymorphism (SNP datasets. In addition, EvolSimulator creates new genomes for each simulation, and does not allow the input of user-specified sequences and gene family information, limiting the incorporation of further biological realism and/or user manipulations of the data. Findings We present modified C++ source code for the EvolSimulator platform, which we provide as the extension module NU-IN. With NU-IN, synonymous and non-synonymous nucleotide evolution is fully implemented, and the user has the ability to use real or previously-simulated sequence data to initiate a simulation of one or more lineages. Gene family membership can be optionally specified, as well as gene retention probabilities that model biased gene retention. We provide PERL scripts to assist the user in deriving this information from previous simulations. We demonstrate the features of NU-IN by simulating genome duplication (polyploidy in the presence of ongoing copy number variation in an evolving lineage. This example is initiated with real genomic data, and produces output that we analyse directly with existing bioinformatic pipelines. Conclusions The NU-IN extension module is a publicly available open source software (GNU GPLv3 license extension to EvolSimulator. With the NU-IN module, users are now able to simulate both drift and selection at the nucleotide
Dunn, Joshua G; Weissman, Jonathan S
Next-generation sequencing (NGS) informs many biological questions with unprecedented depth and nucleotide resolution. These assays have created a need for analytical tools that enable users to manipulate data nucleotide-by-nucleotide robustly and easily. Furthermore, because many NGS assays encode information jointly within multiple properties of read alignments - for example, in ribosome profiling, the locations of ribosomes are jointly encoded in alignment coordinates and length - analytical tools are often required to extract the biological meaning from the alignments before analysis. Many assay-specific pipelines exist for this purpose, but there remains a need for user-friendly, generalized, nucleotide-resolution tools that are not limited to specific experimental regimes or analytical workflows. Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data. As such, Plastid is designed to extract assay-specific information from read alignments while retaining generality and extensibility to novel NGS assays. Plastid represents NGS and other biological data as arrays of values associated with genomic or transcriptomic positions, and contains configurable tools to convert data from a variety of sources to such arrays. Plastid also includes numerous tools to manipulate even discontinuous genomic features, such as spliced transcripts, with nucleotide precision. Plastid automatically handles conversion between genomic and feature-centric coordinates, accounting for splicing and strand, freeing users of burdensome accounting. Finally, Plastid's data models use consistent and familiar biological idioms, enabling even beginners to develop sophisticated analytical workflows with minimal effort. Plastid is a versatile toolkit that has been used to analyze data from multiple NGS assays, including RNA-seq, ribosome profiling, and DMS-seq. It forms the genomic engine of our ORF annotation tool, ORF-RATER, and is readily
Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J
The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53-3.14), P=1.9 × 10(-5)). Two polymorphisms at 6p21.2 LINC00951-LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37-1.85), P=1.6 × 10(-9)) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.
Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J
The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10-5). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967
Nakajima, Masahiro; Takahashi, Atsushi; Kou, Ikuyo; Rodriguez-Fontenla, Cristina; Gomez-Reino, Juan J.; Furuichi, Tatsuya; Dai, Jin; Sudo, Akihiro; Uchida, Atsumasa; Fukui, Naoshi; Kubo, Michiaki; Kamatani, Naoyuki; Tsunoda, Tatsuhiko; Malizos, Konstantinos N.; Tsezou, Aspasia; Gonzalez, Antonio; Nakamura, Yusuke; Ikegawa, Shiro
Osteoarthritis (OA) is a common disease that has a definite genetic component. Only a few OA susceptibility genes that have definite functional evidence and replication of association have been reported, however. Through a genome-wide association study and a replication using a total of ∼4,800 Japanese subjects, we identified two single nucleotide polymorphisms (SNPs) (rs7775228 and rs10947262) associated with susceptibility to knee OA. The two SNPs were in a region containing HLA class II/III genes and their association reached genome-wide significance (combined P = 2.43×10−8 for rs7775228 and 6.73×10−8 for rs10947262). Our results suggest that immunologic mechanism is implicated in the etiology of OA. PMID:20305777
Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10(-8). When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner\\'s curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10(-8) threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C.
Anney, Richard; Klei, Lambertus; Pinto, Dalila; Regan, Regina; Conroy, Judith; Magalhaes, Tiago R; Correia, Catarina; Abrahams, Brett S; Sykes, Nuala; Pagnamenta, Alistair T; Almeida, Joana; Bacchelli, Elena; Bailey, Anthony J; Baird, Gillian; Battaglia, Agatino; Berney, Tom; Bolshakova, Nadia; Bölte, Sven; Bolton, Patrick F; Bourgeron, Thomas; Brennan, Sean; Brian, Jessica; Carson, Andrew R; Casallo, Guillermo; Casey, Jillian; Chu, Su H; Cochrane, Lynne; Corsello, Christina; Crawford, Emily L; Crossett, Andrew; Dawson, Geraldine; de Jonge, Maretha; Delorme, Richard; Drmic, Irene; Duketis, Eftichia; Duque, Frederico; Estes, Annette; Farrar, Penny; Fernandez, Bridget A; Folstein, Susan E; Fombonne, Eric; Freitag, Christine M; Gilbert, John; Gillberg, Christopher; Glessner, Joseph T; Goldberg, Jeremy; Green, Jonathan; Guter, Stephen J; Hakonarson, Hakon; Heron, Elizabeth A; Hill, Matthew; Holt, Richard; Howe, Jennifer L; Hughes, Gillian; Hus, Vanessa; Igliozzi, Roberta; Kim, Cecilia; Klauck, Sabine M; Kolevzon, Alexander; Korvatska, Olena; Kustanovich, Vlad; Lajonchere, Clara M; Lamb, Janine A; Laskawiec, Magdalena; Leboyer, Marion; Le Couteur, Ann; Leventhal, Bennett L; Lionel, Anath C; Liu, Xiao-Qing; Lord, Catherine; Lotspeich, Linda; Lund, Sabata C; Maestrini, Elena; Mahoney, William; Mantoulan, Carine; Marshall, Christian R; McConachie, Helen; McDougle, Christopher J; McGrath, Jane; McMahon, William M; Melhem, Nadine M; Merikangas, Alison; Migita, Ohsuke; Minshew, Nancy J; Mirza, Ghazala K; Munson, Jeff; Nelson, Stanley F; Noakes, Carolyn; Noor, Abdul; Nygren, Gudrun; Oliveira, Guiomar; Papanikolaou, Katerina; Parr, Jeremy R; Parrini, Barbara; Paton, Tara; Pickles, Andrew; Piven, Joseph; Posey, David J; Poustka, Annemarie; Poustka, Fritz; Prasad, Aparna; Ragoussis, Jiannis; Renshaw, Katy; Rickaby, Jessica; Roberts, Wendy; Roeder, Kathryn; Roge, Bernadette; Rutter, Michael L; Bierut, Laura J; Rice, John P; Salt, Jeff; Sansom, Katherine; Sato, Daisuke; Segurado, Ricardo; Senman, Lili; Shah, Naisha; Sheffield, Val C; Soorya, Latha; Sousa, Inês; Stoppioni, Vera; Strawbridge, Christina; Tancredi, Raffaella; Tansey, Katherine; Thiruvahindrapduram, Bhooma; Thompson, Ann P; Thomson, Susanne; Tryfon, Ana; Tsiantis, John; Van Engeland, Herman; Vincent, John B; Volkmar, Fred; Wallace, Simon; Wang, Kai; Wang, Zhouzhi; Wassink, Thomas H; Wing, Kirsty; Wittemeyer, Kerstin; Wood, Shawn; Yaspan, Brian L; Zurawiecki, Danielle; Zwaigenbaum, Lonnie; Betancur, Catalina; Buxbaum, Joseph D; Cantor, Rita M; Cook, Edwin H; Coon, Hilary; Cuccaro, Michael L; Gallagher, Louise; Geschwind, Daniel H; Gill, Michael; Haines, Jonathan L; Miller, Judith; Monaco, Anthony P; Nurnberger, John I; Paterson, Andrew D; Pericak-Vance, Margaret A; Schellenberg, Gerard D; Scherer, Stephen W; Sutcliffe, James S; Szatmari, Peter; Vicente, Astrid M; Vieland, Veronica J; Wijsman, Ellen M; Devlin, Bernie; Ennis, Sean; Hallmayer, Joachim
Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10(-8). When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner's curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10(-8) threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C.
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs are the foundation of powerful complex trait and pharmacogenomic analyses. The availability of large SNP databases, however, has emphasized a need for inexpensive SNP genotyping methods of commensurate simplicity, robustness, and scalability. We describe a solution-based, microtiter plate method for SNP genotyping of human genomic DNA. The method is based upon allele discrimination by ligation of open circle probes followed by rolling circle amplification of the signal using fluorescent primers. Only the probe with a 3' base complementary to the SNP is circularized by ligation. Results SNP scoring by ligation was optimized to a 100,000 fold discrimination against probe mismatched to the SNP. The assay was used to genotype 10 SNPs from a set of 192 genomic DNA samples in a high-throughput format. Assay directly from genomic DNA eliminates the need to preamplify the target as done for many other genotyping methods. The sensitivity of the assay was demonstrated by genotyping from 1 ng of genomic DNA. We demonstrate that the assay can detect a single molecule of the circularized probe. Conclusions Compatibility with homogeneous formats and the ability to assay small amounts of genomic DNA meets the exacting requirements of automated, high-throughput SNP scoring.
Ottolini, Christian S; Capalbo, Antonio; Newnham, Louise
We have developed a protocol for the generation of genome-wide maps (meiomaps) of recombination and chromosome segregation for the three products of human female meiosis: the first and second polar bodies (PB1 and PB2) and the corresponding oocyte. PB1 is biopsied and the oocyte is artificially......-nucleotide polymorphisms (SNPs) genome-wide by microarray. Informative maternal heterozygous SNPs are phased using a haploid PB2 or oocyte as a reference. A simple algorithm is then used to identify the maternal haplotypes for each chromosome, in all of the products of meiosis for each oocyte. This allows mapping...
Dijkstra, Akkelies E; Smolonska, Joanna; van den Berge, Maarten
by replication and meta-analysis in 11 additional cohorts. In total 2,704 subjects with, and 7,624 subjects without CMH were included, all current or former heavy smokers (≥20 pack-years). Additional studies were performed to test the functional relevance of the most significant single nucleotide polymorphism...... (SNP). RESULTS: A strong association with CMH, consistent across all cohorts, was observed with rs6577641 (p = 4.25×10(-6), OR = 1.17), located in intron 9 of the special AT-rich sequence-binding protein 1 locus (SATB1) on chromosome 3. The risk allele (G) was associated with higher mRNA expression...... of smokers develops CMH. A plausible explanation for this phenomenon is a predisposing genetic constitution. Therefore, we performed a genome wide association (GWA) study of CMH in Caucasian populations. METHODS: GWA analysis was performed in the NELSON-study using the Illumina 610 array, followed...
Duncan, Laramie; Yilmaz, Zeynep; Gaspar, Helena; Walters, Raymond; Goldstein, Jackie; Anttila, Verneri; Bulik-Sullivan, Brendan; Ripke, Stephan; Thornton, Laura; Hinney, Anke; Daly, Mark; Sullivan, Patrick F; Zeggini, Eleftheria; Breen, Gerome; Bulik, Cynthia M
The authors conducted a genome-wide association study of anorexia nervosa and calculated genetic correlations with a series of psychiatric, educational, and metabolic phenotypes. Following uniform quality control and imputation procedures using the 1000 Genomes Project (phase 3) in 12 case-control cohorts comprising 3,495 anorexia nervosa cases and 10,982 controls, the authors performed standard association analysis followed by a meta-analysis across cohorts. Linkage disequilibrium score regression was used to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h 2 SNP ]), partitioned heritability, and genetic correlations (r g ) between anorexia nervosa and 159 other phenotypes. Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6. The h 2 SNP of anorexia nervosa was 0.20 (SE=0.02), suggesting that a substantial fraction of the twin-based heritability arises from common genetic variation. The authors identified one genome-wide significant locus on chromosome 12 (rs4622308) in a region harboring a previously reported type 1 diabetes and autoimmune disorder locus. Significant positive genetic correlations were observed between anorexia nervosa and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, and significant negative genetic correlations were observed between anorexia nervosa and body mass index, insulin, glucose, and lipid phenotypes. Anorexia nervosa is a complex heritable phenotype for which this study has uncovered the first genome-wide significant locus. Anorexia nervosa also has large and significant genetic correlations with both psychiatric phenotypes and metabolic traits. The study results encourage a reconceptualization of this frequently lethal disorder as one with both psychiatric and metabolic etiology.
Power, Robert A; Tansey, Katherine E; Buttenschøn, Henriette Nørmølle; Cohen-Woods, Sarah; Bigdeli, Tim; Hall, Lynsey S; Kutalik, Zoltán; Lee, S Hong; Ripke, Stephan; Steinberg, Stacy; Teumer, Alexander; Viktorin, Alexander; Wray, Naomi R; Arolt, Volker; Baune, Bernard T; Boomsma, Dorret I; Børglum, Anders D; Byrne, Enda M; Castelao, Enrique; Craddock, Nick; Craig, Ian W; Dannlowski, Udo; Deary, Ian J; Degenhardt, Franziska; Forstner, Andreas J; Gordon, Scott D; Grabe, Hans J; Grove, Jakob; Hamilton, Steven P; Hayward, Caroline; Heath, Andrew C; Hocking, Lynne J; Homuth, Georg; Hottenga, Jouke J; Kloiber, Stefan; Krogh, Jesper; Landén, Mikael; Lang, Maren; Levinson, Douglas F; Lichtenstein, Paul; Lucae, Susanne; MacIntyre, Donald J; Madden, Pamela; Magnusson, Patrik K E; Martin, Nicholas G; McIntosh, Andrew M; Middeldorp, Christel M; Milaneschi, Yuri; Montgomery, Grant W; Mors, Ole; Müller-Myhsok, Bertram; Nyholt, Dale R; Oskarsson, Hogni; Owen, Michael J; Padmanabhan, Sandosh; Penninx, Brenda W J H; Pergadia, Michele L; Porteous, David J; Potash, James B; Preisig, Martin; Rivera, Margarita; Shi, Jianxin; Shyn, Stanley I; Sigurdsson, Engilbert; Smit, Johannes H; Smith, Blair H; Stefansson, Hreinn; Stefansson, Kari; Strohmaier, Jana; Sullivan, Patrick F; Thomson, Pippa; Thorgeirsson, Thorgeir E; Van der Auwera, Sandra; Weissman, Myrna M; Breen, Gerome; Lewis, Cathryn M
Major depressive disorder (MDD) is a disabling mood disorder, and despite a known heritable component, a large meta-analysis of genome-wide association studies revealed no replicable genetic risk variants. Given prior evidence of heterogeneity by age at onset in MDD, we tested whether genome-wide significant risk variants for MDD could be identified in cases subdivided by age at onset. Discovery case-control genome-wide association studies were performed where cases were stratified using increasing/decreasing age-at-onset cutoffs; significant single nucleotide polymorphisms were tested in nine independent replication samples, giving a total sample of 22,158 cases and 133,749 control subjects for subsetting. Polygenic score analysis was used to examine whether differences in shared genetic risk exists between earlier and adult-onset MDD with commonly comorbid disorders of schizophrenia, bipolar disorder, Alzheimer's disease, and coronary artery disease. We identified one replicated genome-wide significant locus associated with adult-onset (>27 years) MDD (rs7647854, odds ratio: 1.16, 95% confidence interval: 1.11-1.21, p = 5.2 × 10 -11 ). Using polygenic score analyses, we show that earlier-onset MDD is genetically more similar to schizophrenia and bipolar disorder than adult-onset MDD. We demonstrate that using additional phenotype data previously collected by genetic studies to tackle phenotypic heterogeneity in MDD can successfully lead to the discovery of genetic risk factor despite reduced sample size. Furthermore, our results suggest that the genetic susceptibility to MDD differs between adult- and earlier-onset MDD, with earlier-onset cases having a greater genetic overlap with schizophrenia and bipolar disorder. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Zhang, G X; Fan, Q C; Wang, J Y; Zhang, T; Xue, Q; Shi, H Q
To identify molecular markers and candidate genes associated with reproductive traits, a genome-wide analysis was performed in Jinghai Yellow Chickens to analyze body weight at first oviposition (BWF), age at first oviposition (AFE), weight of the egg at first oviposition (FEW), egg weight at the age of 300 days (EW300), number of eggs produced by 300 days of age (EN300), egg hatchability (HA) and multiple selection index for egg production (MSI). The results showed that seven single nucleotide polymorphisms (SNPs) were associated with reproductive traits (Preproductive traits were identified (Preproductive traits will greatly advance the understanding of the genetic basis and molecular mechanisms underlying reproductive traits and may have practical significance in breeding programs for the improvements of reproductive traits in the Jinghai Yellow Chicken. Copyright © 2015 Elsevier B.V. All rights reserved.
Full Text Available Broadly neutralizing antibodies may protect against HIV-1 acquisition. In natural infection, only 10-30% of patients have cross-reactive neutralizing humoral immunity which may relate to viral and or host factors. To explore the role of host genetic markers in the formation of cross-reactive neutralizing activity (CrNA in HIV-1 infected individuals, we performed a genome-wide association study (GWAS, in participants of the Amsterdam Cohort Studies with known CrNA in their sera. Single-nucleotide polymorphisms (SNPs with the strongest P-values are located in the major histocompatibility complex (MHC region, close to MICA (P = 7.68 × 10(-7, HLA-B (P = 6.96 × 10(-6 and in the coding region of HCP5 (P = 1.34 × 10(-5. However, none of the signals reached genome-wide significance. Our findings underline the potential involvement of genes close or within the MHC region with the development of CrNA.
Euler, Zelda; van Gils, Marit J.; Boeser-Nunnink, Brigitte D.; Schuitemaker, Hanneke; van Manen, Daniëlle
Broadly neutralizing antibodies may protect against HIV-1 acquisition. In natural infection, only 10–30% of patients have cross-reactive neutralizing humoral immunity which may relate to viral and or host factors. To explore the role of host genetic markers in the formation of cross-reactive neutralizing activity (CrNA) in HIV-1 infected individuals, we performed a genome-wide association study (GWAS), in participants of the Amsterdam Cohort Studies with known CrNA in their sera. Single-nucleotide polymorphisms (SNPs) with the strongest P-values are located in the major histocompatibility complex (MHC) region, close to MICA (P = 7.68×10−7), HLA-B (P = 6.96×10−6) and in the coding region of HCP5 (P = 1.34×10−5). However, none of the signals reached genome-wide significance. Our findings underline the potential involvement of genes close or within the MHC region with the development of CrNA. PMID:23372753
Porter Christopher J
Full Text Available Abstract Background SNP microarrays are designed to genotype Single Nucleotide Polymorphisms (SNPs. These microarrays report hybridization of DNA fragments and therefore can be used for the purpose of detecting genomic fragments. Results Here, we demonstrate that a SNP microarray can be effectively used in this way to perform chromatin immunoprecipitation (ChIP on chip as an alternative to tiling microarrays. We illustrate this novel application by mapping whole genome histone H4 hyperacetylation in human myoblasts and myotubes. We detect clusters of hyperacetylated histone H4, often spanning across up to 300 kilobases of genomic sequence. Using complementary genome-wide analyses of gene expression by DNA microarray we demonstrate that these clusters of hyperacetylated histone H4 tend to be associated with expressed genes. Conclusion The use of a SNP array for a ChIP-on-chip application (ChIP on SNP-chip will be of great value to laboratories whose interest is the determination of general rules regarding the relationship of specific chromatin modifications to transcriptional status throughout the genome and to examine the asymmetric modification of chromatin at heterozygous loci.
Xu, P; Wu, X; Wang, B; Luo, J; Liu, Y; Ehlers, J D; Close, T J; Roberts, P A; Lu, Z; Wang, S; Li, G
Association mapping of important traits of crop plants relies on first understanding the extent and patterns of linkage disequilibrium (LD) in the particular germplasm being investigated. We characterize here the genetic diversity, population structure and genome wide LD patterns in a set of asparagus bean (Vigna. unguiculata ssp. sesquipedialis) germplasm from China. A diverse collection of 99 asparagus bean and normal cowpea accessions were genotyped with 1127 expressed sequence tag-derived single nucleotide polymorphism markers (SNPs). The proportion of polymorphic SNPs across the collection was relatively low (39%), with an average number of SNPs per locus of 1.33. Bayesian population structure analysis indicated two subdivisions within the collection sampled that generally represented the 'standard vegetable' type (subgroup SV) and the 'non-standard vegetable' type (subgroup NSV), respectively. Level of LD (r(2)) was higher and extent of LD persisted longer in subgroup SV than in subgroup NSV, whereas LD decayed rapidly (0-2 cM) in both subgroups. LD decay distance varied among chromosomes, with the longest (≈ 5 cM) five times longer than the shortest (≈ 1 cM). Partitioning of LD variance into within- and between-subgroup components coupled with comparative LD decay analysis suggested that linkage group 5, 7 and 10 may have undergone the most intensive epistatic selection toward traits favorable for vegetable use. This work provides a first population genetic insight into domestication history of asparagus bean and demonstrates the feasibility of mapping complex traits by genome wide association study in asparagus bean using a currently available cowpea SNPs marker platform.
Kulbrock, Maike; Lehner, Stefanie; Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar
Equine recurrent uveitis (ERU) is a common eye disease affecting up to 3–15% of the horse population. A genome-wide association study (GWAS) using the Illumina equine SNP50 bead chip was performed to identify loci conferring risk to ERU. The sample included a total of 144 German warmblood horses. A GWAS showed a significant single nucleotide polymorphism (SNP) on horse chromosome (ECA) 20 at 49.3 Mb, with IL-17A and IL-17F being the closest genes. This locus explained a fraction of 23% of the phenotypic variance for ERU. A GWAS taking into account the severity of ERU, revealed a SNP on ECA18 nearby to the crystalline gene cluster CRYGA-CRYGF. For both genomic regions on ECA18 and 20, significantly associated haplotypes containing the genome-wide significant SNPs could be demonstrated. In conclusion, our results are indicative for a genetic component regulating the possible critical role of IL-17A and IL-17F in the pathogenesis of ERU. The associated SNP on ECA18 may be indicative for cataract formation in the course of ERU. PMID:23977091
Cho, Seoae; Kim, Haseong; Oh, Sohee; Kim, Kyunga; Park, Taesung
The current trend in genome-wide association studies is to identify regions where the true disease-causing genes may lie by evaluating thousands of single-nucleotide polymorphisms (SNPs) across the whole genome. However, many challenges exist in detecting disease-causing genes among the thousands of SNPs. Examples include multicollinearity and multiple testing issues, especially when a large number of correlated SNPs are simultaneously tested. Multicollinearity can often occur when predictor variables in a multiple regression model are highly correlated, and can cause imprecise estimation of association. In this study, we propose a simple stepwise procedure that identifies disease-causing SNPs simultaneously by employing elastic-net regularization, a variable selection method that allows one to address multicollinearity. At Step 1, the single-marker association analysis was conducted to screen SNPs. At Step 2, the multiple-marker association was scanned based on the elastic-net regularization. The proposed approach was applied to the rheumatoid arthritis (RA) case-control data set of Genetic Analysis Workshop 16. While the selected SNPs at the screening step are located mostly on chromosome 6, the elastic-net approach identified putative RA-related SNPs on other chromosomes in an increased proportion. For some of those putative RA-related SNPs, we identified the interactions with sex, a well known factor affecting RA susceptibility.
Willour, Virginia L.; Seifuddin, Fayaz; Mahon, Pamela B.; Jancic, Dubravka; Pirooznia, Mehdi; Steele, Jo; Schweizer, Barbara; Goes, Fernando S.; Mondimore, Francis M.; MacKinnon, Dean F.; Perlis, Roy H.; Lee, Phil Hyoun; Huang, Jie; Kelsoe, John R.; Shilling, Paul D.; Rietschel, Marcella; Nöthen, Markus; Cichon, Sven; Gurling, Hugh; Purcell, Shaun; Smoller, Jordan W.; Craddock, Nicholas; DePaulo, J. Raymond; Schulze, Thomas G.; McMahon, Francis J.; Zandi, Peter P.; Potash, James B.
The heritable component to attempted and completed suicide is partly related to psychiatric disorders and also partly independent of them. While attempted suicide linkage regions have been identified on 2p11–12 and 6q25–26, there are likely many more such loci, the discovery of which will require a much higher resolution approach, such as the genome-wide association study (GWAS). With this in mind, we conducted an attempted suicide GWAS that compared the single nucleotide polymorphism (SNP) genotypes of 1,201 bipolar (BP) subjects with a history of suicide attempts to the genotypes of 1,497 BP subjects without a history of suicide attempts. 2,507 SNPs with evidence for association at p<0.001 were identified. These associated SNPs were subsequently tested for association in a large and independent BP sample set. None of these SNPs were significantly associated in the replication sample after correcting for multiple testing, but the combined analysis of the two sample sets produced an association signal on 2p25 (rs300774) at the threshold of genome-wide significance (p= 5.07 × 10−8). The associated SNPs on 2p25 fall in a large linkage disequilibrium block containing the ACP1 gene, a gene whose expression is significantly elevated in BP subjects who have completed suicide. Furthermore, the ACP1 protein is a tyrosine phosphatase that influences Wnt signaling, a pathway regulated by lithium, making ACP1 a functional candidate for involvement in the phenotype. Larger GWAS sample sets will be required to confirm the signal on 2p25 and to identify additional genetic risk factors increasing susceptibility for attempted suicide. PMID:21423239
Strawbridge, Rona; Dupuis, Josée; Prokopenko, Inga; Barker, Adam; Ahlqvist, Emma; Rybin, Denis; Petrie, John; Bouatia-Naji, Nabila; Dimas, Antigone; Wheeler, Eleanor; Chen, Han; Voight, Benjamin; Taneera, Jalal; Kanoni, Stavroula; Peden, John
textabstractOBJECTIVE - Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired b-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS - We have conducted a meta-analysis of genome-wide association tests of ;2.5 million genotyped or imputed single nucleotide polymorphisms...
Full Text Available Shanbeh Zienolddiny, Vidar SkaugSection for Toxicology and Biological Work Environment, National Institute of Occupational Health, Oslo, NorwayAbstract: Lung cancer is a major public health problem throughout the world. Among the most frequent cancer types (prostate, breast, colorectal, stomach, lung, lung cancer is the leading cause of cancer-related deaths worldwide. Among the two major subtypes of small cell lung cancer and nonsmall cell lung cancer (NSCLC, 85% of tumors belong to the NSCLC histological types. Small cell lung cancer is associated with the shortest survival time. Although tobacco smoking has been recognized as the major risk factor for lung cancer, there is a great interindividual and interethnic difference in risk of developing lung cancer given exposure to similar environmental and lifestyle factors. This may indicate that in addition to chemical and environmental factors, genetic variations in the genome may contribute to risk modification. A common type of genetic variation in the genome, known as single nucleotide polymorphism, has been found to be associated with susceptibility to lung cancer. Interestingly, many of these polymorphisms are found in the genes that regulate major pathways of carcinogen metabolism (cytochrome P450 genes, detoxification (glutathione S-transferases, adduct removal (DNA repair genes, cell growth/apoptosis (TP53/MDM2, the immune system (cytokines/chemokines, and membrane receptors (nicotinic acetylcholine and dopaminergic receptors. Some of these polymorphisms have been shown to alter the level of mRNA, and protein structure and function. In addition to being susceptibility markers, several of these polymorphisms are emerging to be important for response to chemotherapy/radiotherapy and survival of patients. Therefore, it is hypothesized that single nucleotide polymorphisms will be valuable genetic markers in individual-based prognosis and therapy in future. Here we will review some of the most
Parchman, Thomas L; Gompert, Zachariah; Mudge, Joann; Schilkey, Faye D; Benkman, Craig W; Buerkle, C Alex
Pine cones that remain closed and retain seeds until fire causes the cones to open (cone serotiny) represent a key adaptive trait in a variety of pine species. In lodgepole pine, there is substantial geographical variation in serotiny across the Rocky Mountain region. This variation in serotiny has evolved as a result of geographically divergent selection, with consequences that extend to forest communities and ecosystems. An understanding of the genetic architecture of this trait is of interest owing to the wide-reaching ecological consequences of serotiny and also because of the repeated evolution of the trait across the genus. Here, we present and utilize an inexpensive and time-effective method for generating population genomic data. The method uses restriction enzymes and PCR amplification to generate a library of fragments that can be sequenced with a high level of multiplexing. We obtained data for more than 95,000 single nucleotide polymorphisms across 98 serotinous and nonserotinous lodgepole pines from three populations. We used a Bayesian generalized linear model (GLM) to test for an association between genotypic variation at these loci and serotiny. The probability of serotiny varied by genotype at 11 loci, and the association between genotype and serotiny at these loci was consistent in each of the three populations of pines. Genetic variation across these 11 loci explained 50% of the phenotypic variation in serotiny. Our results provide a first genome-wide association map of serotiny in pines and demonstrate an inexpensive and efficient method for generating population genomic data. © 2012 Blackwell Publishing Ltd.
Charles F A de Bourcy
Full Text Available Single-cell sequencing is emerging as an important tool for studies of genomic heterogeneity. Whole genome amplification (WGA is a key step in single-cell sequencing workflows and a multitude of methods have been introduced. Here, we compare three state-of-the-art methods on both bulk and single-cell samples of E. coli DNA: Multiple Displacement Amplification (MDA, Multiple Annealing and Looping Based Amplification Cycles (MALBAC, and the PicoPLEX single-cell WGA kit (NEB-WGA. We considered the effects of reaction gain on coverage uniformity, error rates and the level of background contamination. We compared the suitability of the different WGA methods for the detection of copy-number variations, for the detection of single-nucleotide polymorphisms and for de-novo genome assembly. No single method performed best across all criteria and significant differences in characteristics were observed; the choice of which amplifier to use will depend strongly on the details of the type of question being asked in any given experiment.
vonHoldt, Bridgett M.; Pollinger, John P.; Lohmueller, Kirk E.; Han, Eunjung; Parker, Heidi G.; Quignon, Pascale; Degenhardt, Jeremiah D.; Boyko, Adam R.; Earl, Dent A.; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C.; Mosher, Dana S.; Spady, Tyrone C.; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G.; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-ping; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.
Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication1,2. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data3. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475
Anney, Richard; Klei, Lambertus; Pinto, Dalila; Regan, Regina; Conroy, Judith; Magalhaes, Tiago R.; Correia, Catarina; Abrahams, Brett S.; Sykes, Nuala; Pagnamenta, Alistair T.; Almeida, Joana; Bacchelli, Elena; Bailey, Anthony J.; Baird, Gillian; Battaglia, Agatino; Berney, Tom; Bolshakova, Nadia; Bölte, Sven; Bolton, Patrick F.; Bourgeron, Thomas; Brennan, Sean; Brian, Jessica; Carson, Andrew R.; Casallo, Guillermo; Casey, Jillian; Chu, Su H.; Cochrane, Lynne; Corsello, Christina; Crawford, Emily L.; Crossett, Andrew; Dawson, Geraldine; de Jonge, Maretha; Delorme, Richard; Drmic, Irene; Duketis, Eftichia; Duque, Frederico; Estes, Annette; Farrar, Penny; Fernandez, Bridget A.; Folstein, Susan E.; Fombonne, Eric; Freitag, Christine M.; Gilbert, John; Gillberg, Christopher; Glessner, Joseph T.; Goldberg, Jeremy; Green, Jonathan; Guter, Stephen J.; Hakonarson, Hakon; Heron, Elizabeth A.; Hill, Matthew; Holt, Richard; Howe, Jennifer L.; Hughes, Gillian; Hus, Vanessa; Igliozzi, Roberta; Kim, Cecilia; Klauck, Sabine M.; Kolevzon, Alexander; Korvatska, Olena; Kustanovich, Vlad; Lajonchere, Clara M.; Lamb, Janine A.; Laskawiec, Magdalena; Leboyer, Marion; Le Couteur, Ann; Leventhal, Bennett L.; Lionel, Anath C.; Liu, Xiao-Qing; Lord, Catherine; Lotspeich, Linda; Lund, Sabata C.; Maestrini, Elena; Mahoney, William; Mantoulan, Carine; Marshall, Christian R.; McConachie, Helen; McDougle, Christopher J.; McGrath, Jane; McMahon, William M.; Melhem, Nadine M.; Merikangas, Alison; Migita, Ohsuke; Minshew, Nancy J.; Mirza, Ghazala K.; Munson, Jeff; Nelson, Stanley F.; Noakes, Carolyn; Noor, Abdul; Nygren, Gudrun; Oliveira, Guiomar; Papanikolaou, Katerina; Parr, Jeremy R.; Parrini, Barbara; Paton, Tara; Pickles, Andrew; Piven, Joseph; Posey, David J; Poustka, Annemarie; Poustka, Fritz; Prasad, Aparna; Ragoussis, Jiannis; Renshaw, Katy; Rickaby, Jessica; Roberts, Wendy; Roeder, Kathryn; Roge, Bernadette; Rutter, Michael L.; Bierut, Laura J.; Rice, John P.; Salt, Jeff; Sansom, Katherine; Sato, Daisuke; Segurado, Ricardo; Senman, Lili; Shah, Naisha; Sheffield, Val C.; Soorya, Latha; Sousa, Inês; Stoppioni, Vera; Strawbridge, Christina; Tancredi, Raffaella; Tansey, Katherine; Thiruvahindrapduram, Bhooma; Thompson, Ann P.; Thomson, Susanne; Tryfon, Ana; Tsiantis, John; Van Engeland, Herman; Vincent, John B.; Volkmar, Fred; Wallace, Simon; Wang, Kai; Wang, Zhouzhi; Wassink, Thomas H.; Wing, Kirsty; Wittemeyer, Kerstin; Wood, Shawn; Yaspan, Brian L.; Zurawiecki, Danielle; Zwaigenbaum, Lonnie; Betancur, Catalina; Buxbaum, Joseph D.; Cantor, Rita M.; Cook, Edwin H.; Coon, Hilary; Cuccaro, Michael L.; Gallagher, Louise; Geschwind, Daniel H.; Gill, Michael; Haines, Jonathan L.; Miller, Judith; Monaco, Anthony P.; Nurnberger, John I.; Paterson, Andrew D.; Pericak-Vance, Margaret A.; Schellenberg, Gerard D.; Scherer, Stephen W.; Sutcliffe, James S.; Szatmari, Peter; Vicente, Astrid M.; Vieland, Veronica J.; Wijsman, Ellen M.; Devlin, Bernie; Ennis, Sean; Hallmayer, Joachim
Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10−8. When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner's curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10−8 threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C. PMID:20663923
Renton, Alan E.; Pliner, Hannah A.; Provenzano, Carlo; Evoli, Amelia; Ricciardi, Roberta; Nalls, Michael A.; Marangi, Giuseppe; Abramzon, Yevgeniya; Arepalli, Sampath; Chong, Sean; Hernandez, Dena G.; Johnson, Janel O.; Bartoccioni, Emanuela; Scuderi, Flavia; Maestri, Michelangelo; Raphael Gibbs, J.; Errichiello, Edoardo; Chiò, Adriano; Restagno, Gabriella; Sabatelli, Mario; Macek, Mark; Scholz, Sonja W.; Corse, Andrea; Chaudhry, Vinay; Benatar, Michael; Barohn, Richard J.; McVey, April; Pasnoor, Mamatha; Dimachkie, Mazen M.; Rowin, Julie; Kissel, John; Freimer, Miriam; Kaminski, Henry J.; Sanders, Donald B.; Lipscomb, Bernadette; Massey, Janice M.; Chopra, Manisha; Howard, James F.; Koopman, Wilma J.; Nicolle, Michael W.; Pascuzzi, Robert M.; Pestronk, Alan; Wulf, Charlie; Florence, Julaine; Blackmore, Derrick; Soloway, Aimee; Siddiqi, Zaeem; Muppidi, Srikanth; Wolfe, Gil; Richman, David; Mezei, Michelle M.; Jiwa, Theresa; Oger, Joel; Drachman, Daniel B.; Traynor, Bryan J.
IMPORTANCE Myasthenia gravis is a chronic, autoimmune, neuromuscular disease characterized by fluctuating weakness of voluntary muscle groups. Although genetic factors are known to play a role in this neuroimmunological condition, the genetic etiology underlying myasthenia gravis is not well understood. OBJECTIVE To identify genetic variants that alter susceptibility to myasthenia gravis, we performed a genome-wide association study. DESIGN, SETTING, AND PARTICIPANTS DNA was obtained from 1032 white individuals from North America diagnosed as having acetylcholine receptor antibody–positive myasthenia gravis and 1998 race/ethnicity-matched control individuals from January 2010 to January 2011. These samples were genotyped on Illumina OmniExpress single-nucleotide polymorphism arrays. An independent cohort of 423 Italian cases and 467 Italian control individuals were used for replication. MAIN OUTCOMES AND MEASURES We calculated P values for association between 8114394 genotyped and imputed variants across the genome and risk for developing myasthenia gravis using logistic regression modeling. A threshold P value of 5.0 × 10−8 was set for genome-wide significance after Bonferroni correction for multiple testing. RESULTS In the over all case-control cohort, we identified association signals at CTLA4 (rs231770; P = 3.98 × 10−8; odds ratio, 1.37; 95% CI, 1.25–1.49), HLA-DQA1 (rs9271871; P = 1.08 × 10−8; odds ratio, 2.31; 95% CI, 2.02 – 2.60), and TNFRSF11A (rs4263037; P = 1.60 × 10−9; odds ratio, 1.41; 95% CI, 1.29–1.53). These findings replicated for CTLA4 and HLA-DQA1 in an independent cohort of Italian cases and control individuals. Further analysis revealed distinct, but overlapping, disease-associated loci for early- and late-onset forms of myasthenia gravis. In the late-onset cases, we identified 2 association peaks: one was located in TNFRSF11A (rs4263037; P = 1.32 × 10−12; odds ratio, 1.56; 95% CI, 1.44–1.68) and the other was detected
Lu, Yi; Chen, Xiaoqing; Beesley, Jonathan
stage 1 GWAS rather than due to problems with the pooling approach. We conclude that there are unlikely to be any moderate or large effects on ovarian cancer risk untagged by less dense arrays. However, our study lacked power to make clear statements on the existence of hitherto untagged small......Recent Genome-Wide Association Studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate- or low-penetrance variants exist among the subset of single-nucleotide polymorphisms (SNPs) not well tagged by the genotyping arrays used...... in the previous studies, which would account for some of the remaining risk. We therefore conducted a time- and cost-effective stage 1 GWAS on 342 invasive serous cases and 643 controls genotyped on pooled DNA using the high-density Illumina 1M-Duo array. We followed up 20 of the most significantly associated...
Liang, Jingjing; Le, Thu H.; Edwards, Digna R. Velez; Tayo, Bamidele O.; Gaulton, Kyle J.; Smith, Jennifer A.; Lu, Yingchang; Jensen, Richard A.; Chen, Guanjie; Yanek, Lisa R.; Schwander, Karen; Tajuddin, Salman M.; Sofer, Tamar; Kim, Wonji; Kayima, James
© 2017 Public Library of Science. All Rights Reserved. Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genom...
Horai, Makiko; Mishima, Hiroyuki; Hayashida, Chisa; Kinoshita, Akira; Nakane, Yoshibumi; Matsuo, Tatsuki; Tsuruda, Kazuto; Yanagihara, Katsunori; Sato, Shinya; Imanishi, Daisuke; Imaizumi, Yoshitaka; Hata, Tomoko; Miyazaki, Yasushi; Yoshiura, Koh-Ichiro
Ionizing radiation released by the atomic bombs at Hiroshima and Nagasaki, Japan, in 1945 caused many long-term illnesses, including increased risks of malignancies such as leukemia and solid tumours. Radiation has demonstrated genetic effects in animal models, leading to concerns over the potential hereditary effects of atomic bomb-related radiation. However, no direct analyses of whole DNA have yet been reported. We therefore investigated de novo variants in offspring of atomic-bomb survivors by whole-genome sequencing (WGS). We collected peripheral blood from three trios, each comprising a father (atomic-bomb survivor with acute radiation symptoms), a non-exposed mother, and their child, none of whom had any past history of haematological disorders. One trio of non-exposed individuals was included as a control. DNA was extracted and the numbers of de novo single nucleotide variants in the children were counted by WGS with sequencing confirmation. Gross structural variants were also analysed. Written informed consent was obtained from all participants prior to the study. There were 62, 81, and 42 de novo single nucleotide variants in the children of atomic-bomb survivors, compared with 48 in the control trio. There were no gross structural variants in any trio. These findings are in accord with previously published results that also showed no significant genetic effects of atomic-bomb radiation on second-generation survivors.
Ripke, Stephan; Wray, Naomi R; Lewis, Cathryn M; Hamilton, Steven P; Weissman, Myrna M; Breen, Gerome; Byrne, Enda M; Blackwood, Douglas H R; Boomsma, Dorret I; Cichon, Sven; Heath, Andrew C; Holsboer, Florian; Lucae, Susanne; Madden, Pamela A F; Martin, Nicholas G; McGuffin, Peter; Muglia, Pierandrea; Noethen, Markus M; Penninx, Brenda P; Pergadia, Michele L; Potash, James B; Rietschel, Marcella; Lin, Danyu; Müller-Myhsok, Bertram; Shi, Jianxin; Steinberg, Stacy; Grabe, Hans J; Lichtenstein, Paul; Magnusson, Patrik; Perlis, Roy H; Preisig, Martin; Smoller, Jordan W; Stefansson, Kari; Uher, Rudolf; Kutalik, Zoltan; Tansey, Katherine E; Teumer, Alexander; Viktorin, Alexander; Barnes, Michael R; Bettecken, Thomas; Binder, Elisabeth B; Breuer, René; Castro, Victor M; Churchill, Susanne E; Coryell, William H; Craddock, Nick; Craig, Ian W; Czamara, Darina; De Geus, Eco J; Degenhardt, Franziska; Farmer, Anne E; Fava, Maurizio; Frank, Josef; Gainer, Vivian S; Gallagher, Patience J; Gordon, Scott D; Goryachev, Sergey; Gross, Magdalena; Guipponi, Michel; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hoefels, Susanne; Hoogendijk, Witte; Hottenga, Jouke Jan; Iosifescu, Dan V; Ising, Marcus; Jones, Ian; Jones, Lisa; Jung-Ying, Tzeng; Knowles, James A; Kohane, Isaac S; Kohli, Martin A; Korszun, Ania; Landen, Mikael; Lawson, William B; Lewis, Glyn; Macintyre, Donald; Maier, Wolfgang; Mattheisen, Manuel; McGrath, Patrick J; McIntosh, Andrew; McLean, Alan; Middeldorp, Christel M; Middleton, Lefkos; Montgomery, Grant M; Murphy, Shawn N; Nauck, Matthias; Nolen, Willem A; Nyholt, Dale R; O'Donovan, Michael; Oskarsson, Högni; Pedersen, Nancy; Scheftner, William A; Schulz, Andrea; Schulze, Thomas G; Shyn, Stanley I; Sigurdsson, Engilbert; Slager, Susan L; Smit, Johannes H; Stefansson, Hreinn; Steffens, Michael; Thorgeirsson, Thorgeir; Tozzi, Federica; Treutlein, Jens; Uhr, Manfred; van den Oord, Edwin J C G; Van Grootheest, Gerard; Völzke, Henry; Weilburg, Jeffrey B; Willemsen, Gonneke; Zitman, Frans G; Neale, Benjamin; Daly, Mark; Levinson, Douglas F; Sullivan, Patrick F
Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analyses of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10(-8)), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10(-9) at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, its high prevalence means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.
Richard A Jensen
Full Text Available Mild retinopathy (microaneurysms or dot-blot hemorrhages is observed in persons without diabetes or hypertension and may reflect microvascular disease in other organs. We conducted a genome-wide association study (GWAS of mild retinopathy in persons without diabetes.A working group agreed on phenotype harmonization, covariate selection and analytic plans for within-cohort GWAS. An inverse-variance weighted fixed effects meta-analysis was performed with GWAS results from six cohorts of 19,411 Caucasians. The primary analysis included individuals without diabetes and secondary analyses were stratified by hypertension status. We also singled out the results from single nucleotide polymorphisms (SNPs previously shown to be associated with diabetes and hypertension, the two most common causes of retinopathy.No SNPs reached genome-wide significance in the primary analysis or the secondary analysis of participants with hypertension. SNP, rs12155400, in the histone deacetylase 9 gene (HDAC9 on chromosome 7, was associated with retinopathy in analysis of participants without hypertension, -1.3±0.23 (beta ± standard error, p = 6.6×10(-9. Evidence suggests this was a false positive finding. The minor allele frequency was low (∼2%, the quality of the imputation was moderate (r(2 ∼0.7, and no other common variants in the HDAC9 gene were associated with the outcome. SNPs found to be associated with diabetes and hypertension in other GWAS were not associated with retinopathy in persons without diabetes or in subgroups with or without hypertension.This GWAS of retinopathy in individuals without diabetes showed little evidence of genetic associations. Further studies are needed to identify genes associated with these signs in order to help unravel novel pathways and determinants of microvascular diseases.
Full Text Available Abstract Background Vibration-induced white finger disease (VWF, also known as hand-arm vibration syndrome, is a secondary form of Raynaud’s disease, affecting the blood vessels and nerves. So far, little is known about the pathogenesisof the disease. VWF is associated with an episodic reduction in peripheral blood flow. Sirtuin 1, a class III histone deacetylase, has been described to regulate the endothelium dependent vasodilation by targeting endothelial nitric oxide synthase. We assessed Sirt1single nucleotide polymorphisms in patients with VWF to further elucidate the role of sirtuin 1 in the pathogenesis of VWF. Methods Peripheral blood samples were obtained from 74 patients with VWF (male 93.2%, female 6.8%, median age 53 years and from 317 healthy volunteers (gender equally distributed, below 30 years of age. Genomic DNA was extracted from peripheral blood mononuclear cells and screened for potential Sirt1single nucleotide polymorphisms. Four putative genetic polymorphisms out of 113 within the Sirt1 genomic region (NCBI Gene Reference: NM_012238.3 were assessed. Allelic discrimination was performed by TaqMan-polymerasechainreaction-based allele-specific genotyping single nucleotide polymorphism assays. Results Sirt1single nucleotide polymorphism A2191G (Assay C_25611590_10, rs35224060 was identified within Sirt1 exon 9 (amino acid position 731, Ile → Val, with differing allelic frequencies in the VWF population (A/A: 70.5%, A/G: 29.5%, G/G: 0% and the control population (A/A: 99.7%, A/G: 0.3%, G/G: 0.5%, with significance levels of P U test (two-tailed P t-test and Chi-square test with Yates correction (all two-tailed: P Conclusion We identified theSirt1A2191Gsingle nucleotide polymorphism as a diagnostic marker for VWF.
Zhan, Qimin; Hu, Zhibin; He, Zhonghu; Jia, Weihua; Zhou, Yifeng; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Zhao, Xue-Ke; Gao, She-Gan; Yuan, Zhi-Qing; Zhou, Fu-You; Fan, Zong-Min; Cui, Ji-Li; Lin, Hong-Li; Han, Xue-Na; Li, Bei; Chen, Xi; Dawsey, Sanford M.; Liao, Linda; Lee, Maxwell P.; Ding, Ti; Qiao, You-Lin; Liu, Zhihua; Liu, Yu; Yu, Dianke; Chang, Jiang; Wei, Lixuan; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Han, Jing-Jing; Zhou, Sheng-Li; Zhang, Peng; Zhang, Dong-Yun; Yuan, Yuan; Huang, Ying; Liu, Chunling; Zhai, Kan; Qiao, Yan; Jin, Guangfu; Guo, Chuanhai; Fu, Jianhua; Miao, Xiaoping; Lu, Changdong; Yang, Haijun; Wang, Chaoyu; Wheeler, William A.; Gail, Mitchell; Yeager, Meredith; Yuenger, Jeff; Guo, Er-Tao; Li, Ai-Li; Zhang, Wei; Li, Xue-Min; Sun, Liang-Dan; Ma, Bao-Gen; Li, Yan; Tang, Sa; Peng, Xiu-Qing; Liu, Jing; Hutchinson, Amy; Jacobs, Kevin; Giffen, Carol; Burdette, Laurie; Fraumeni, Joseph F.; Shen, Hongbing; Ke, Yang; Zeng, Yixin; Wu, Tangchun; Kraft, Peter; Chung, Charles C.; Tucker, Margaret A.; Hou, Zhi-Chao; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Wang, Li; Yuan, Guo; Chen, Li-Sha; Liu, Xiao; Ma, Teng; Meng, Hui; Sun, Li; Li, Xin-Min; Li, Xiu-Min; Ku, Jian-Wei; Zhou, Ying-Fa; Yang, Liu-Qin; Wang, Zhou; Li, Yin; Qige, Qirenwang; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Yuan, Ling; Yue, Wen-Bin; Wang, Ran; Wang, Lu-Wen; Fan, Xue-Ping; Zhu, Fang-Heng; Zhao, Wei-Xing; Mao, Yi-Min; Zhang, Mei; Xing, Guo-Lan; Li, Ji-Lin; Han, Min; Ren, Jing-Li; Liu, Bin; Ren, Shu-Wei; Kong, Qing-Peng; Li, Feng; Sheyhidin, Ilyar; Wei, Wu; Zhang, Yan-Rui; Feng, Chang-Wei; Wang, Jin; Yang, Yu-Hua; Hao, Hong-Zhang; Bao, Qi-De; Liu, Bao-Chi; Wu, Ai-Qun; Xie, Dong; Yang, Wan-Cai; Wang, Liang; Zhao, Xiao-Hang; Chen, Shu-Qing; Hong, Jun-Yan; Zhang, Xue-Jun; Freedman, Neal D; Goldstein, Alisa M.; Lin, Dongxin; Taylor, Philip R.; Wang, Li-Dong; Chanock, Stephen J.
We conducted a joint (pooled) analysis of three genome-wide association studies (GWAS) 1-3 of esophageal squamous cell carcinoma (ESCC) in ethnic Chinese (5,337 ESCC cases and 5,787 controls) with 9,654 ESCC cases and 10,058 controls for follow-up. In a logistic regression model adjusted for age, sex, study, and two eigenvectors, two new loci achieved genome-wide significance, marked by rs7447927 at 5q31.2 (per-allele odds ratio (OR) = 0.85, 95% CI 0.82-0.88; P=7.72x10−20) and rs1642764 at 17p13.1 (per-allele OR= 0.88, 95% CI 0.85-0.91; P=3.10x10−13). rs7447927 is a synonymous single nucleotide polymorphism (SNP) in TMEM173 and rs1642764 is an intronic SNP in ATP1B2, near TP53. Furthermore, a locus in the HLA class II region at 6p21.32 (rs35597309) achieved genome-wide significance in the two populations at highest risk for ESSC (OR=1.33, 95% CI 1.22-1.46; P=1.99x10−10). Our joint analysis identified new ESCC susceptibility loci overall as well as a new locus unique to the ESCC high risk Taihang Mountain region. PMID:25129146
Full Text Available Selenium is an essential trace element and circulating selenium concentrations have been associated with a wide range of diseases. Candidate gene studies suggest that circulating selenium concentrations may be impacted by genetic variation; however, no study has comprehensively investigated this hypothesis. Therefore, we conducted a two-stage genome-wide association study to identify genetic variants associated with serum selenium concentrations in 1203 European descents from two cohorts: the Prostate, Lung, Colorectal, and Ovarian (PLCO Cancer Screening and the Women’s Health Initiative (WHI. We tested association between 2,474,333 single nucleotide polymorphisms (SNPs and serum selenium concentrations using linear regression models. In the first stage (PLCO 41 SNPs clustered in 15 regions had p < 1 × 10−5. None of these 41 SNPs reached the significant threshold (p = 0.05/15 regions = 0.003 in the second stage (WHI. Three SNPs had p < 0.05 in the second stage (rs1395479 and rs1506807 in 4q34.3/AGA-NEIL3; and rs891684 in 17q24.3/SLC39A11 and had p between 2.62 × 10−7 and 4.04 × 10−7 in the combined analysis (PLCO + WHI. Additional studies are needed to replicate these findings. Identification of genetic variation that impacts selenium concentrations may contribute to a better understanding of which genes regulate circulating selenium concentrations.
Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter
Objective To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design Genome wide association study. Setting Nurses’ Health Study and Health Professionals Follow-up Study cohorts. Participants 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measure Participants were characterized as asparagus smellers if they strongly agreed with the prompt “after eating asparagus, you notice a strong characteristic odor in your urine,” and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusion A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. PMID:27965198
Lang, M; Leménager, T; Streit, F; Fauth-Bühler, M; Frank, J; Juraeva, D; Witt, S H; Degenhardt, F; Hofmann, A; Heilmann-Heimbach, S; Kiefer, F; Brors, B; Grabe, H-J; John, U; Bischof, A; Bischof, G; Völker, U; Homuth, G; Beutel, M; Lind, P A; Medland, S E; Slutske, W S; Martin, N G; Völzke, H; Nöthen, M M; Meyer, C; Rumpf, H-J; Wurst, F M; Rietschel, M; Mann, K F
Pathological gambling is a behavioural addiction with negative economic, social, and psychological consequences. Identification of contributing genes and pathways may improve understanding of aetiology and facilitate therapy and prevention. Here, we report the first genome-wide association study of pathological gambling. Our aims were to identify pathways involved in pathological gambling, and examine whether there is a genetic overlap between pathological gambling and alcohol dependence. Four hundred and forty-five individuals with a diagnosis of pathological gambling according to the Diagnostic and Statistical Manual of Mental Disorders were recruited in Germany, and 986 controls were drawn from a German general population sample. A genome-wide association study of pathological gambling comprising single marker, gene-based, and pathway analyses, was performed. Polygenic risk scores were generated using data from a German genome-wide association study of alcohol dependence. No genome-wide significant association with pathological gambling was found for single markers or genes. Pathways for Huntington's disease (P-value=6.63×10(-3)); 5'-adenosine monophosphate-activated protein kinase signalling (P-value=9.57×10(-3)); and apoptosis (P-value=1.75×10(-2)) were significant. Polygenic risk score analysis of the alcohol dependence dataset yielded a one-sided nominal significant P-value in subjects with pathological gambling, irrespective of comorbid alcohol dependence status. The present results accord with previous quantitative formal genetic studies which showed genetic overlap between non-substance- and substance-related addictions. Furthermore, pathway analysis suggests shared pathology between Huntington's disease and pathological gambling. This finding is consistent with previous imaging studies. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali
We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
Full Text Available Type 2 diabetes (T2D is one of the most frequent mortality causes in western countries, with rapidly increasing prevalence. Anti-diabetic drugs are the first therapeutic approach, although many patients develop drug resistance. Most drug responsiveness variability can be explained by genetic causes. Inter-individual variability is principally due to single nucleotide polymorphisms, and differential drug responsiveness has been correlated to alteration in genes involved in drug metabolism (CYP2C9 or insulin signaling (IRS1, ABCC8, KCNJ11 and PPARG. However, most genome-wide association studies did not provide clues about the contribution of DNA variations to impaired drug responsiveness. Thus, characterizing T2D drug responsiveness variants is needed to guide clinicians toward tailored therapeutic approaches. Here, we extensively investigated polymorphisms associated with altered drug response in T2D, predicting their effects in silico. Combining different computational approaches, we focused on the expression pattern of genes correlated to drug resistance and inferred evolutionary conservation of polymorphic residues, computationally predicting the biochemical properties of polymorphic proteins. Using RNA-Sequencing followed by targeted validation, we identified and experimentally confirmed that two nucleotide variations in the CAPN10 gene—currently annotated as intronic—fall within two new transcripts in this locus. Additionally, we found that a Single Nucleotide Polymorphism (SNP, currently reported as intergenic, maps to the intron of a new transcript, harboring CAPN10 and GPR35 genes, which undergoes non-sense mediated decay. Finally, we analyzed variants that fall into non-coding regulatory regions of yet underestimated functional significance, predicting that some of them can potentially affect gene expression and/or post-transcriptional regulation of mRNAs affecting the splicing.
Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K
Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using whole genome re-sequencing approach. A total of 192.19 Gb data, generated on 35 genotypes of chickpea, comprising 973.13 million reads, with an average sequencing depth of ~10 X for each line. On an average 92.18 % reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. Highest number of SNPs were identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for
Welderufael, B. G.; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L. G.; Fikse, W. F.
Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from economical and animal welfare point of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to – but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. Substitution effect of each SNP was tested with a t-test and a genome-wide significance level of P-value mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte developments (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammations (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to – and recoverability from mastitis, respectively. However, this is the first GWAS study for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to – and recoverability from mastitis. PMID:29755506
Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon
There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.
Marchal, Claire; Sasaki, Takayo; Vera, Daniel; Wilson, Korey; Sima, Jiao; Rivera-Mulia, Juan Carlos; Trevilla-García, Claudia; Nogues, Coralin; Nafie, Ebtesam; Gilbert, David M
This protocol is an extension to: Nat. Protoc. 6, 870-895 (2014); doi:10.1038/nprot.2011.328; published online 02 June 2011Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early- and late-replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and subnuclear position. Moreover, RT is regulated during development and is altered in diseases. Here, we describe E/L Repli-seq, an extension of our Repli-chip protocol. E/L Repli-seq is a rapid, robust and relatively inexpensive protocol for analyzing RT by next-generation sequencing (NGS), allowing genome-wide assessment of how cellular processes are linked to RT. Briefly, cells are pulse-labeled with BrdU, and early and late S-phase fractions are sorted by flow cytometry. Labeled nascent DNA is immunoprecipitated from both fractions and sequenced. Data processing leads to a single bedGraph file containing the ratio of nascent DNA from early versus late S-phase fractions. The results are comparable to those of Repli-chip, with the additional benefits of genome-wide sequence information and an increased dynamic range. We also provide computational pipelines for downstream analyses, for parsing phased genomes using single-nucleotide polymorphisms (SNPs) to analyze RT allelic asynchrony, and for direct comparison to Repli-chip data. This protocol can be performed in up to 3 d before sequencing, and requires basic cellular and molecular biology skills, as well as a basic understanding of Unix and R.
Avhashoni AA. Zwane
Sep 20, 2016 ... ... high temperatures and low-quality grass and for their resistance to ... growth rate, early marketability, grazing performance and good ... development of the Bonsmara breed (Scholtz, 2010). ..... transferability to water buffalo.
Sep 25, 2012 ... Codeine, Tramadol, Acetaminophen. CYP2C9. Celecoxib .... Pharmacogenet- ics of acute azathioprine toxicity: relationship to thiopurine ... Martinez C, Cueto R,. Garcia-Martin E. Pharmacogenomics in drug induced liver.
Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter; Mucci, Lorelei A
To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Genome wide association study. Nurses' Health Study and Health Professionals Follow-up Study cohorts. 6909 men and women of European-American descent with available genetic data from genome wide association studies. Participants were characterized as asparagus smellers if they strongly agreed with the prompt "after eating asparagus, you notice a strong characteristic odor in your urine," and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Full Text Available The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000-1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs. We estimate that the Roma harbor about 80% West Eurasian ancestry-derived from a combination of European and South Asian sources-and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe.
Full Text Available Abstract Background With the availability of large-scale genome-wide association study (GWAS data, choosing an optimal set of SNPs for disease susceptibility prediction is a challenging task. This study aimed to use single nucleotide polymorphisms (SNPs to predict psoriasis from searching GWAS data. Methods Totally we had 2,798 samples and 451,724 SNPs. Process for searching a set of SNPs to predict susceptibility for psoriasis consisted of two steps. The first one was to search top 1,000 SNPs with high accuracy for prediction of psoriasis from GWAS dataset. The second one was to search for an optimal SNP subset for predicting psoriasis. The sequential information bottleneck (sIB method was compared with classical linear discriminant analysis(LDA for classification performance. Results The best test harmonic mean of sensitivity and specificity for predicting psoriasis by sIB was 0.674(95% CI: 0.650-0.698, while only 0.520(95% CI: 0.472-0.524 was reported for predicting disease by LDA. Our results indicate that the new classifier sIB performs better than LDA in the study. Conclusions The fact that a small set of SNPs can predict disease status with average accuracy of 68% makes it possible to use SNP data for psoriasis prediction.
Gref, Anna; Merid, Simon K; Gruzieva, Olena; Ballereau, Stéphane; Becker, Allan; Bellander, Tom; Bergström, Anna; Bossé, Yohan; Bottai, Matteo; Chan-Yeung, Moira; Fuertes, Elaine; Ierodiakonou, Despo; Jiang, Ruiwei; Joly, Stéphane; Jones, Meaghan; Kobor, Michael S; Korek, Michal; Kozyrskyj, Anita L; Kumar, Ashish; Lemonnier, Nathanaël; MacIntyre, Elaina; Ménard, Camille; Nickle, David; Obeidat, Ma'en; Pellet, Johann; Standl, Marie; Sääf, Annika; Söderhäll, Cilla; Tiesler, Carla M T; van den Berge, Maarten; Vonk, Judith M; Vora, Hita; Xu, Cheng-Jian; Antó, Josep M; Auffray, Charles; Brauer, Michael; Bousquet, Jean; Brunekreef, Bert; Gauderman, W James; Heinrich, Joachim; Kere, Juha; Koppelman, Gerard H; Postma, Dirkje; Carlsten, Christopher; Pershagen, Göran; Melén, Erik
The evidence supporting an association between traffic-related air pollution exposure and incident childhood asthma is inconsistent and may depend on genetic factors. To identify gene-environment interaction effects on childhood asthma using genome-wide single-nucleotide polymorphism (SNP) data and air pollution exposure. Identified loci were further analyzed at epigenetic and transcriptomic levels. We used land use regression models to estimate individual air pollution exposure (represented by outdoor NO 2 levels) at the birth address and performed a genome-wide interaction study for doctors' diagnoses of asthma up to 8 years in three European birth cohorts (n = 1,534) with look-up for interaction in two separate North American cohorts, CHS (Children's Health Study) and CAPPS/SAGE (Canadian Asthma Primary Prevention Study/Study of Asthma, Genetics and Environment) (n = 1,602 and 186 subjects, respectively). We assessed expression quantitative trait locus effects in human lung specimens and blood, as well as associations among air pollution exposure, methylation, and transcriptomic patterns. In the European cohorts, 186 SNPs had an interaction P asthma development and provided supportive evidence for interaction with air pollution for ADCY2, B4GALT5, and DLG2.
Tabas-Madrid, Daniel; Méndez-Vigo, Belén; Arteaga, Noelia; Marcer, Arnald; Pascual-Montano, Alberto; Weigel, Detlef; Xavier Picó, F; Alonso-Blanco, Carlos
Current global change is fueling an interest to understand the genetic and molecular mechanisms of plant adaptation to climate. In particular, altered flowering time is a common strategy for escape from unfavourable climate temperature. In order to determine the genomic bases underlying flowering time adaptation to this climatic factor, we have systematically analysed a collection of 174 highly diverse Arabidopsis thaliana accessions from the Iberian Peninsula. Analyses of 1.88 million single nucleotide polymorphisms provide evidence for a spatially heterogeneous contribution of demographic and adaptive processes to geographic patterns of genetic variation. Mountains appear to be allele dispersal barriers, whereas the relationship between flowering time and temperature depended on the precise temperature range. Environmental genome-wide associations supported an overall genome adaptation to temperature, with 9.4% of the genes showing significant associations. Furthermore, phenotypic genome-wide associations provided a catalogue of candidate genes underlying flowering time variation. Finally, comparison of environmental and phenotypic genome-wide associations identified known (Twin Sister of FT, FRIGIDA-like 1, and Casein Kinase II Beta chain 1) and new (Epithiospecifer Modifier 1 and Voltage-Dependent Anion Channel 5) genes as candidates for adaptation to climate temperature by altered flowering time. Thus, this regional collection provides an excellent resource to address the spatial complexity of climate adaptation in annual plants. © 2018 John Wiley & Sons Ltd.
Mariette, Stéphanie; Wong Jun Tai, Fabienne; Roch, Guillaume; Barre, Aurélien; Chague, Aurélie; Decroocq, Stéphane; Groppi, Alexis; Laizet, Yec'han; Lambert, Patrick; Tricon, David; Nikolski, Macha; Audergon, Jean-Marc; Abbott, Albert G; Decroocq, Véronique
In fruit tree species, many important traits have been characterized genetically by using single-family descent mapping in progenies segregating for the traits. However, most mapped loci have not been sufficiently resolved to the individual genes due to insufficient progeny sizes for high resolution mapping and the previous lack of whole-genome sequence resources of the study species. To address this problem for Plum Pox Virus (PPV) candidate resistance gene identification in Prunus species, we implemented a genome-wide association (GWA) approach in apricot. This study exploited the broad genetic diversity of the apricot (Prunus armeniaca) germplasm containing resistance to PPV, next-generation sequence-based genotyping, and the high-quality peach (Prunus persica) genome reference sequence for single nucleotide polymorphism (SNP) identification. The results of this GWA study validated previously reported PPV resistance quantitative trait loci (QTL) intervals, highlighted other potential resistance loci, and resolved each to a limited set of candidate genes for further study. This work substantiates the association genetics approach for resolution of QTL to candidate genes in apricot and suggests that this approach could simplify identification of other candidate genes for other marked trait intervals in this germplasm. © 2015 INRA, UMR 1332 BFP New Phytologist © 2015 New Phytologist Trust.
Nelson, George W.; Lautenberger, James A.; Chinn, Leslie; McIntosh, Carl; Johnson, Randall C.; Sezgin, Efe; Kessing, Bailey; Malasky, Michael; Hendrickson, Sher L.; Pontius, Joan; Tang, Minzhong; An, Ping; Winkler, Cheryl A.; Limou, Sophie; Le Clerc, Sigrid; Delaneau, Olivier; Zagury, Jean-François; Schuitemaker, Hanneke; van Manen, Daniëlle; Bream, Jay H.; Gomperts, Edward D.; Buchbinder, Susan; Goedert, James J.; Kirk, Gregory D.; O'Brien, Stephen J.
Background. Host genetic variation influences human immunodeficiency virus (HIV) infection and progression to AIDS. Here we used clinically well-characterized subjects from 5 pretreatment HIV/AIDS cohorts for a genome-wide association study to identify gene associations with rate of AIDS progression. Methods. European American HIV seroconverters (n = 755) were interrogated for single-nucleotide polymorphisms (SNPs) (n = 700,022) associated with progression to AIDS 1987 (Cox proportional hazards regression analysis, co-dominant model). Results. Association with slower progression was observed for SNPs in the gene PARD3B. One of these, rs11884476, reached genome-wide significance (relative hazard = 0.3; P =3. 370 × 10−9) after statistical correction for 700,022 SNPs and contributes 4.52% of the overall variance in AIDS progression in this study. Nine of the top-ranked SNPs define a PARD3B haplotype that also displays significant association with progression to AIDS (hazard ratio, 0.3; P = 3.220 × 10−8). One of these SNPs, rs10185378, is a predicted exonic splicing enhancer; significant alteration in the expression profile of PARD3B splicing transcripts was observed in B cell lines with alternate rs10185378 genotypes. This SNP was typed in European cohorts of rapid progressors and was found to be protective for AIDS 1993 definition (odds ratio, 0.43, P = .025). Conclusions. These observations suggest a potential unsuspected pathway of host genetic influence on the dynamics of AIDS progression. PMID:21502085
Børglum, A D; Demontis, D; Grove, J; Pallesen, J; Hollegaard, M V; Pedersen, C B; Hedemand, A; Mattheisen, M; Uitterlinden, A; Nyegaard, M; Ørntoft, T; Wiuf, C; Didriksen, M; Nordentoft, M; Nöthen, M M; Rietschel, M; Ophoff, R A; Cichon, S; Yolken, R H; Hougaard, D M; Mortensen, P B; Mors, O
Genetic and environmental components as well as their interaction contribute to the risk of schizophrenia, making it highly relevant to include environmental factors in genetic studies of schizophrenia. This study comprises genome-wide association (GWA) and follow-up analyses of all individuals born in Denmark since 1981 and diagnosed with schizophrenia as well as controls from the same birth cohort. Furthermore, we present the first genome-wide interaction survey of single nucleotide polymorphisms (SNPs) and maternal cytomegalovirus (CMV) infection. The GWA analysis included 888 cases and 882 controls, and the follow-up investigation of the top GWA results was performed in independent Danish (1396 cases and 1803 controls) and German-Dutch (1169 cases, 3714 controls) samples. The SNPs most strongly associated in the single-marker analysis of the combined Danish samples were rs4757144 in ARNTL (P=3.78 × 10(-6)) and rs8057927 in CDH13 (P=1.39 × 10(-5)). Both genes have previously been linked to schizophrenia or other psychiatric disorders. The strongest associated SNP in the combined analysis, including Danish and German-Dutch samples, was rs12922317 in RUNDC2A (P=9.04 × 10(-7)). A region-based analysis summarizing independent signals in segments of 100 kb identified a new region-based genome-wide significant locus overlapping the gene ZEB1 (P=7.0 × 10(-7)). This signal was replicated in the follow-up analysis (P=2.3 × 10(-2)). Significant interaction with maternal CMV infection was found for rs7902091 (P(SNP × CMV)=7.3 × 10(-7)) in CTNNA3, a gene not previously implicated in schizophrenia, stressing the importance of including environmental factors in genetic studies.
Hong, Chang Bum; Kim, Young Jin; Moon, Sanghoon; Shin, Young-Ah; Go, Min Jin; Kim, Dong-Joon; Lee, Jong-Young; Cho, Yoon Shin
Recent advances in high-throughput genotyping technologies have enabled us to conduct a genome-wide association study (GWAS) on a large cohort. However, analyzing millions of single nucleotide polymorphisms (SNPs) is still a difficult task for researchers conducting a GWAS. Several difficulties such as compatibilities and dependencies are often encountered by researchers using analytical tools, during the installation of software. This is a huge obstacle to any research institute without computing facilities and specialists. Therefore, a proper research environment is an urgent need for researchers working on GWAS. We developed BioSMACK to provide a research environment for GWAS that requires no configuration and is easy to use. BioSMACK is based on the Ubuntu Live CD that offers a complete Linux-based operating system environment without installation. Moreover, we provide users with a GWAS manual consisting of a series of guidelines for GWAS and useful examples. BioSMACK is freely available at http://ksnp.cdc. go.kr/biosmack.
Sundqvist, J; Xu, H; Vodolazkaia, A; Fassbender, A; Kyama, C; Bokor, A; Gemzell-Danielsson, K; D'Hooghe, T M; Falconer, H
Is it possible to replicate the previously identified genetic association of four single-nucleotide polymorphisms (SNPs), rs12700667, rs7798431, rs1250248 and rs7521902, with endometriosis in a Caucasian population? A borderline association was observed for rs1250248 and endometriosis (P = 0.049). However, we could not replicate the other previously identified endometriosis-associated SNPs (rs12700667, rs7798431 and rs7521902) in the same population. Endometriosis is considered a complex disease, influenced by several genetic and environmental factors, as well as interactions between them. Previous studies have found genetic associations with endometriosis for SNPs at the 7p15 and 2q35 loci in a Caucasian population. Allele frequencies of SNPs were investigated in patients with endometriosis and controls. Blood samples and peritoneal biopsies were taken from a Caucasian female population consisting of 1129 patients with endometriosis and 831 controls. DNA was extracted for genotyping. The study was performed at a University hospital and research laboratories. A weak association with endometriosis (all stages) was observed for rs1250248 (P = 0.049). No significant associations were observed for the SNPs rs12700667, rs7798431 and rs7521902. A non-significant trend towards the association of rs1250248 with moderate/severe endometriosis was observed (odds ratio 1.18, 95% confidence interval 0.97-1.44). The inability to confirm all previous findings may result from differences between populations and type II errors. Our result demonstrates the difficulty of identifying common genetic variants in complex diseases. This study was supported by grants from the Karolinska Institutet and Stockholm City County/Karolinska Institutet (ALF), Stockholm, Sweden, Swedish Medical Research Council (K2007-54X-14212-06-3, K2010-54X-14212-09-3), Stockholm, Sweden, Leuven University Research Council (Onderzoeksraad KU Leuven), the Leuven University Hospitals Clinical Research Foundation
Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
In high-dimensional studies such as genome-wide association studies, the correction for multiple testing in order to control total type I error results in decreased power to detect modest effects. We present a new analytical approach based on the higher criticism statistic that allows identification of the presence of modest effects. We apply our method to the genome-wide study of rheumatoid arthritis provided in the Genetic Analysis Workshop 16 Problem 1 data set. There is evidence for unknown bias in this study that could be explained by the presence of undetected modest effects. We compared the asymptotic and empirical thresholds for the higher criticism statistic. Using the asymptotic threshold we detected the presence of modest effects genome-wide. We also detected modest effects using 90th percentile of the empirical null distribution as a threshold; however, there is no such evidence when the 95th and 99th percentiles were used. While the higher criticism method suggests that there is some evidence for modest effects, interpreting individual single-nucleotide polymorphisms with significant higher criticism statistics is of undermined value. The goal of higher criticism is to alert the researcher that genetic effects remain to be discovered and to promote the use of more targeted and powerful studies to detect the remaining effects. PMID:20018032
Perera, Minoli A; Cavallari, Larisa H; Limdi, Nita A; Gamazon, Eric R; Konkashbaev, Anuar; Daneshjou, Roxana; Pluzhnikov, Anna; Crawford, Dana C; Wang, Jelai; Liu, Nianjun; Tatonetti, Nicholas; Bourgeois, Stephane; Takahashi, Harumi; Bradford, Yukiko; Burkley, Benjamin M; Desnick, Robert J; Halperin, Jonathan L; Khalifa, Sherief I; Langaee, Taimour Y; Lubitz, Steven A; Nutescu, Edith A; Oetjens, Matthew; Shahin, Mohamed H; Patel, Shitalben R; Sagreiya, Hersh; Tector, Matthew; Weck, Karen E; Rieder, Mark J; Scott, Stuart A; Wu, Alan HB; Burmester, James K; Wadelius, Mia; Deloukas, Panos; Wagner, Michael J; Mushiroda, Taisei; Kubo, Michiaki; Roden, Dan M; Cox, Nancy J; Altman, Russ B; Klein, Teri E; Nakamura, Yusuke; Johnson, Julie A
Summary Background VKORC1 and CYP2C9 are important contributors to warfarin dose variability, but explain less variability for individuals of African descent than for those of European or Asian descent. We aimed to identify additional variants contributing to warfarin dose requirements in African Americans. Methods We did a genome-wide association study of discovery and replication cohorts. Samples from African-American adults (aged ≥18 years) who were taking a stable maintenance dose of warfarin were obtained at International Warfarin Pharmacogenetics Consortium (IWPC) sites and the University of Alabama at Birmingham (Birmingham, AL, USA). Patients enrolled at IWPC sites but who were not used for discovery made up the independent replication cohort. All participants were genotyped. We did a stepwise conditional analysis, conditioning first for VKORC1 −1639G→A, followed by the composite genotype of CYP2C9*2 and CYP2C9*3. We prespecified a genome-wide significance threshold of p<5×10−8 in the discovery cohort and p<0·0038 in the replication cohort. Findings The discovery cohort contained 533 participants and the replication cohort 432 participants. After the prespecified conditioning in the discovery cohort, we identified an association between a novel single nucleotide polymorphism in the CYP2C cluster on chromosome 10 (rs12777823) and warfarin dose requirement that reached genome-wide significance (p=1·51×10−8). This association was confirmed in the replication cohort (p=5·04×10−5); analysis of the two cohorts together produced a p value of 4·5×10−12. Individuals heterozygous for the rs12777823 A allele need a dose reduction of 6·92 mg/week and those homozygous 9·34 mg/week. Regression analysis showed that the inclusion of rs12777823 significantly improves warfarin dose variability explained by the IWPC dosing algorithm (21% relative improvement). Interpretation A novel CYP2C single nucleotide polymorphism exerts a clinically relevant
Li, Yongle; Ruperao, Pradeep; Batley, Jacqueline; Edwards, David; Khan, Tanveer; Colmer, Timothy D; Pang, Jiayin; Siddique, Kadambot H M; Sutton, Tim
Drought tolerance is a complex trait that involves numerous genes. Identifying key causal genes or linked molecular markers can facilitate the fast development of drought tolerant varieties. Using a whole-genome resequencing approach, we sequenced 132 chickpea varieties and advanced breeding lines and found more than 144,000 single nucleotide polymorphisms (SNPs). We measured 13 yield and yield-related traits in three drought-prone environments of Western Australia. The genotypic effects were significant for all traits, and many traits showed highly significant correlations, ranging from 0.83 between grain yield and biomass to -0.67 between seed weight and seed emergence rate. To identify candidate genes, the SNP and trait data were incorporated into the SUPER genome-wide association study (GWAS) model, a modified version of the linear mixed model. We found that several SNPs from auxin-related genes, including auxin efflux carrier protein (PIN3), p-glycoprotein, and nodulin MtN21/EamA-like transporter, were significantly associated with yield and yield-related traits under drought-prone environments. We identified four genetic regions containing SNPs significantly associated with several different traits, which was an indication of pleiotropic effects. We also investigated the possibility of incorporating the GWAS results into a genomic selection (GS) model, which is another approach to deal with complex traits. Compared to using all SNPs, application of the GS model using subsets of SNPs significantly associated with the traits under investigation increased the prediction accuracies of three yield and yield-related traits by more than twofold. This has important implication for implementing GS in plant breeding programs.
Nedenia Bonvino Stafuzza
Full Text Available Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose, Gyr, Girolando and Holstein (dairy production. A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated for each animal 10.7 to 16.4-fold genome coverage. A total of 27,441,279 single nucleotide variations (SNVs and 3,828,041 insertions/deletions (InDels were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis was performed and revealed that variants in the olfactory transduction pathway was over represented in all four cattle breeds, while the ECM-receptor interaction pathway was over represented in Girolando and Guzerat breeds, the ABC transporters pathway was over represented only in Holstein breed, and the metabolic pathways was over represented only in Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.
Full Text Available Background: We conducted a genome-wide association study (GWAS to identify specific genetic variants that underlie susceptibility to disease caused by Staphylococcus aureus in humans. Methods: Cases (n=309 and controls (n=2,925 were genotyped at 508,921 single nucleotide polymorphisms (SNPs. Cases had at least one laboratory and clinician confirmed disease caused by S. aureus whereas controls did not. R-package (for SNP association, EIGENSOFT (to estimate and adjust for population stratification and gene- (VEGAS and pathway-based (DAVID, PANTHER, and Ingenuity Pathway Analysis analyses were performed.Results: No SNP reached genome-wide significance. Four SNPs exceeded the pConclusion: We identified potential susceptibility genes for S. aureus diseases in this preliminary study but confirmation by other studies is needed. The observed associations could be relevant given the complexity of S. aureus as a pathogen and its ability to exploit multiple biological pathways to cause infections in humans.
Full Text Available The regenerative abilities and the immunosuppressive properties of mesenchymal stromal cells (MSCs make them potentially the ideal cellular product of choice for treatment of autoimmune and other immune mediated disorders. Although the usefulness of MSCs for therapeutic applications is in early phases, their potential clinical use remains of great interest. Current clinical evidence of use of MSCs from both autologous and allogeneic sources to treat autoimmune disorders confers conflicting clinical benefit outcomes. These varied results may possibly be due to MSC use across wide range of autoimmune disorders with clinical heterogeneity or due to variability of the cellular product. In the light of recent genome wide association studies (GWAS, linking predisposition of autoimmune diseases to single nucleotide polymorphisms (SNPs in the susceptible genetic loci, the clinical relevance of MSCs possessing SNPs in the critical effector molecules of immunosuppression is largely undiscussed. It is of further interest in the allogeneic setting, where SNPs in the target pathway of MSC's intervention may also modulate clinical outcome. In the present review, we have discussed the known critical SNPs predisposing to disease susceptibility in various autoimmune diseases and their significance in the immunomodulatory properties of MSCs.
Cheung, Sau W; Bi, Weimin
In 2004, the implementation of array comparative genomic hybridization (array comparative genome hybridization [CGH]) into clinical practice marked a new milestone for genetic diagnosis. Array CGH and single-nucleotide polymorphism (SNP) arrays enable genome-wide detection of copy number changes in a high resolution, and therefore microarray has been recognized as the first-tier test for patients with intellectual disability or multiple congenital anomalies, and has also been applied prenatally for detection of clinically relevant copy number variations in the fetus. Area covered: In this review, the authors summarize the evolution of array CGH technology from their diagnostic laboratory, highlighting exonic SNP arrays developed in the past decade which detect small intragenic copy number changes as well as large DNA segments for the region of heterozygosity. The applications of array CGH to human diseases with different modes of inheritance with the emphasis on autosomal recessive disorders are discussed. Expert commentary: An exonic array is a powerful and most efficient clinical tool in detecting genome wide small copy number variants in both dominant and recessive disorders. However, whole-genome sequencing may become the single integrated platform for detection of copy number changes, single-nucleotide changes as well as balanced chromosomal rearrangements in the near future.
Lane, Jérôme; McLaren, Paul J.; Dorrell, Lucy; Shianna, Kevin V.; Stemke, Amanda; Pelak, Kimberly; Moore, Stephen; Oldenburg, Johannes; Alvarez-Roman, Maria Teresa; Angelillo-Scherrer, Anne; Boehlen, Francoise; Bolton-Maggs, Paula H.B.; Brand, Brigit; Brown, Deborah; Chiang, Elaine; Cid-Haro, Ana Rosa; Clotet, Bonaventura; Collins, Peter; Colombo, Sara; Dalmau, Judith; Fogarty, Patrick; Giangrande, Paul; Gringeri, Alessandro; Iyer, Rathi; Katsarou, Olga; Kempton, Christine; Kuriakose, Philip; Lin, Judith; Makris, Mike; Manco-Johnson, Marilyn; Tsakiris, Dimitrios A.; Martinez-Picado, Javier; Mauser-Bunschoten, Evelien; Neff, Anne; Oka, Shinichi; Oyesiku, Lara; Parra, Rafael; Peter-Salonen, Kristiina; Powell, Jerry; Recht, Michael; Shapiro, Amy; Stine, Kimo; Talks, Katherine; Telenti, Amalio; Wilde, Jonathan; Yee, Thynn Thynn; Wolinsky, Steven M.; Martinson, Jeremy; Hussain, Shehnaz K.; Bream, Jay H.; Jacobson, Lisa P.; Carrington, Mary; Goedert, James J.; Haynes, Barton F.; McMichael, Andrew J.; Goldstein, David B.; Fellay, Jacques
Human genetic variation contributes to differences in susceptibility to HIV-1 infection. To search for novel host resistance factors, we performed a genome-wide association study (GWAS) in hemophilia patients highly exposed to potentially contaminated factor VIII infusions. Individuals with hemophilia A and a documented history of factor VIII infusions before the introduction of viral inactivation procedures (1979–1984) were recruited from 36 hemophilia treatment centers (HTCs), and their genome-wide genetic variants were compared with those from matched HIV-infected individuals. Homozygous carriers of known CCR5 resistance mutations were excluded. Single nucleotide polymorphisms (SNPs) and inferred copy number variants (CNVs) were tested using logistic regression. In addition, we performed a pathway enrichment analysis, a heritability analysis, and a search for epistatic interactions with CCR5 Δ32 heterozygosity. A total of 560 HIV-uninfected cases were recruited: 36 (6.4%) were homozygous for CCR5 Δ32 or m303. After quality control and SNP imputation, we tested 1 081 435 SNPs and 3686 CNVs for association with HIV-1 serostatus in 431 cases and 765 HIV-infected controls. No SNP or CNV reached genome-wide significance. The additional analyses did not reveal any strong genetic effect. Highly exposed, yet uninfected hemophiliacs form an ideal study group to investigate host resistance factors. Using a genome-wide approach, we did not detect any significant associations between SNPs and HIV-1 susceptibility, indicating that common genetic variants of major effect are unlikely to explain the observed resistance phenotype in this population. PMID:23372042
Toghiani, S; Aggrey, S E; Rekaya, R
Availability of high-density single nucleotide polymorphism (SNP) genotyping platforms provided unprecedented opportunities to enhance breeding programmes in livestock, poultry and plant species, and to better understand the genetic basis of complex traits. Using this genomic information, genomic breeding values (GEBVs), which are more accurate than conventional breeding values. The superiority of genomic selection is possible only when high-density SNP panels are used to track genes and QTLs affecting the trait. Unfortunately, even with the continuous decrease in genotyping costs, only a small fraction of the population has been genotyped with these high-density panels. It is often the case that a larger portion of the population is genotyped with low-density and low-cost SNP panels and then imputed to a higher density. Accuracy of SNP genotype imputation tends to be high when minimum requirements are met. Nevertheless, a certain rate of genotype imputation errors is unavoidable. Thus, it is reasonable to assume that the accuracy of GEBVs will be affected by imputation errors; especially, their cumulative effects over time. To evaluate the impact of multi-generational selection on the accuracy of SNP genotypes imputation and the reliability of resulting GEBVs, a simulation was carried out under varying updating of the reference population, distance between the reference and testing sets, and the approach used for the estimation of GEBVs. Using fixed reference populations, imputation accuracy decayed by about 0.5% per generation. In fact, after 25 generations, the accuracy was only 7% lower than the first generation. When the reference population was updated by either 1% or 5% of the top animals in the previous generations, decay of imputation accuracy was substantially reduced. These results indicate that low-density panels are useful, especially when the generational interval between reference and testing population is small. As the generational interval
Emily R Davenport
Full Text Available The bacterial composition of the human fecal microbiome is influenced by many lifestyle factors, notably diet. It is less clear, however, what role host genetics plays in dictating the composition of bacteria living in the gut. In this study, we examined the association of ~200K host genotypes with the relative abundance of fecal bacterial taxa in a founder population, the Hutterites, during two seasons (n = 91 summer, n = 93 winter, n = 57 individuals collected in both. These individuals live and eat communally, minimizing variation due to environmental exposures, including diet, which could potentially mask small genetic effects. Using a GWAS approach that takes into account the relatedness between subjects, we identified at least 8 bacterial taxa whose abundances were associated with single nucleotide polymorphisms in the host genome in each season (at genome-wide FDR of 20%. For example, we identified an association between a taxon known to affect obesity (genus Akkermansia and a variant near PLD1, a gene previously associated with body mass index. Moreover, we replicate a previously reported association from a quantitative trait locus (QTL mapping study of fecal microbiome abundance in mice (genus Lactococcus, rs3747113, P = 3.13 x 10-7. Finally, based on the significance distribution of the associated microbiome QTLs in our study with respect to chromatin accessibility profiles, we identified tissues in which host genetic variation may be acting to influence bacterial abundance in the gut.
Full Text Available Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10-8 for either systolic and diastolic blood pressure, hypertension, or for combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4 and multiple-trait analyses identified one novel locus (FRMD3 for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension.
Biernacka, Joanna M.; Geske, Jennifer; Jenkins, Gregory D.; Colby, Colin; Rider, David N.; Karpyak, Victor M.; Choi, Doo-Sup; Fridley, Brooke L.
It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the “Synthesis and Degradation of Ketone Bodies” pathway. Our results also support the potential involvement of the “Neuroactive Ligand Receptor Interaction” pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence. PMID:22717047
Atopic dermatitis (AD) is a common inflammatory skin disorder with a strong genetic component. Genome-wide association studies have been successful in the identification of common single nucleotide polymorphisms associated with AD, but their functional relevance has not been investigated yet. This work presents a comprehensive functional characterization of common and infrequent variants at the AD-associated C11orf30/LRRC32 locus. Analyses of cutaneous gene expression profiles in AD patients ...
Elbaz, Alexis; Nelson, Lorene M; Payami, Haydeh; Ioannidis, John P A; Fiske, Brian K; Annesi, Grazia; Belin, Andrea Carmine; Factor, Stewart A; Ferrarese, Carlo; Hadjigeorgiou, Georgios M; Higgins, Donald S; Kawakami, Hideshi; Krüger, Rejko; Marder, Karen S; Mayeux, Richard P; Mellick, George D; Nutt, John G; Ritz, Beate; Samii, Ali; Tanner, Caroline M; Van Broeckhoven, Christine; Van Den Eeden, Stephen K; Wirdefeldt, Karin; Zabetian, Cyrus P; Dehem, Marie; Montimurro, Jennifer S; Southwick, Audrey; Myers, Richard M; Trikalinos, Thomas A
Summary Background A genome-wide association study identified 13 single-nucleotide polymorphisms (SNPs) significantly associated with Parkinson’s disease. Small-scale replication studies were largely non-confirmatory, but a meta-analysis that included data from the original study could not exclude all SNP associations, leaving relevance of several markers uncertain. Methods Investigators from three Michael J Fox Foundation for Parkinson’s Research-funded genetics consortia—comprising 14 teams—contributed DNA samples from 5526 patients with Parkinson’s disease and 6682 controls, which were genotyped for the 13 SNPs. Most (88%) participants were of white, non-Hispanic descent. We assessed log-additive genetic effects using fixed and random effects models stratified by team and ethnic origin, and tested for heterogeneity across strata. A meta-analysis was undertaken that incorporated data from the original genome-wide study as well as subsequent replication studies. Findings In fixed and random-effects models no associations with any of the 13 SNPs were identified (odds ratios 0·89 to 1·09). Heterogeneity between studies and between ethnic groups was low for all SNPs. Subgroup analyses by age at study entry, ethnic origin, sex, and family history did not show any consistent associations. In our meta-analysis, no SNP showed significant association (summary odds ratios 0·95 to 1.08); there was little heterogeneity except for SNP rs7520966. Interpretation Our results do not lend support to the finding that the 13 SNPs reported in the original genome-wide association study are genetic susceptibility factors for Parkinson’s disease. PMID:17052658
White Frank F
Full Text Available Abstract Background Eight diverse sorghum (Sorghum bicolor L. Moench accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs. Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effect of library type and genomic diversity on SNP discovery and imputation are evaluated. Results Alignment of eight genome equivalents (6 Gb to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.
Jiao, Yan; Yan, Jian; Jiao, Feng; Yang, Hongbin; Donahue, Leah Rae; Li, Xinmin; Roe, Bruce A; Stuart, John; Gu, Weikuan
The long bone abnormality (lbab) mouse is a new autosomal recessive mutant characterized by overall smaller body size with proportionate dwarfing of all organs and shorter long bones. Previous linkage analysis has located the lbab mutation on chromosome 1 between the markers D1Mit9 and D1Mit488. A genome-based positional approach was used to identify a mutation associated with lbab disease. A total of 122 genes and expressed sequence tags at the lbab region were screened for possible mutation by using genomic DNA from lbabl/lbab, lbab/+, and +/+ B6 mice and high throughput temperature gradient capillary electrophoresis. A sequence difference was identified in one of the amplicons of gene Nppc between lbab/lbab and +/+ mice. One-step reverse transcriptase polymerase chain reaction was performed to validate the difference of Nppc in different types of mice at the mRNA level. The mutation of Nppc was unique in lbab/lbab mice among multiple mouse inbred strains. The mutation of Nppc is co-segregated with lbab disease in 200 progenies produced from heterozygous lbab/+ parents. A single nucleotide mutation of Nppc is associated with dwarfism in lbab/lbab mice. Current genome information and technology allow us to efficiently identify single nucleotide mutations from roughly mapped disease loci. The lbab mouse is a useful model for hereditary human achondroplasia.
Roe Bruce A
Full Text Available Abstract Background The long bone abnormality (lbab mouse is a new autosomal recessive mutant characterized by overall smaller body size with proportionate dwarfing of all organs and shorter long bones. Previous linkage analysis has located the lbab mutation on chromosome 1 between the markers D1Mit9 and D1Mit488. Results A genome-based positional approach was used to identify a mutation associated with lbab disease. A total of 122 genes and expressed sequence tags at the lbab region were screened for possible mutation by using genomic DNA from lbabl/lbab, lbab/+, and +/+ B6 mice and high throughput temperature gradient capillary electrophoresis. A sequence difference was identified in one of the amplicons of gene Nppc between lbab/lbab and +/+ mice. One-step reverse transcriptase polymerase chain reaction was performed to validate the difference of Nppc in different types of mice at the mRNA level. The mutation of Nppc was unique in lbab/lbab mice among multiple mouse inbred strains. The mutation of Nppc is co-segregated with lbab disease in 200 progenies produced from heterozygous lbab/+ parents. Conclusion A single nucleotide mutation of Nppc is associated with dwarfism in lbab/lbab mice. Current genome information and technology allow us to efficiently identify single nucleotide mutations from roughly mapped disease loci. The lbab mouse is a useful model for hereditary human achondroplasia.
Boueiz, Adel; Lutz, Sharon M; Cho, Michael H; Hersh, Craig P; Bowler, Russell P; Washko, George R; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M; Beaty, Terri H; Coxson, Harvey O; Crapo, James D; Silverman, Edwin K; Castaldi, Peter J; DeMeo, Dawn L
Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe-predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. To identify the genetic influences of emphysema distribution in non-alpha-1 antitrypsin-deficient smokers. A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism-, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. These findings may point to new biologic pathways on which to expand diagnostic and therapeutic
Allen, Alexandra M; Barker, Gary L A; Berry, Simon T; Coghill, Jane A; Gwilliam, Rhian; Kirby, Susan; Robinson, Phil; Brenchley, Rachel C; D'Amore, Rosalinda; McKenzie, Neil; Waite, Darren; Hall, Anthony; Bevan, Michael; Hall, Neil; Edwards, Keith J
Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Full Text Available Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumoniae, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs, UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, in which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures. The transcriptional units
Zhang, Jiaoping; Song, Qijian; Cregan, Perry B; Jiang, Guo-Liang
Twenty-two loci for soybean SW and candidate genes conditioning seed development were identified; and prediction accuracies of GS and MAS were estimated through cross-validation and validation with unrelated populations. Soybean (Glycine max) is a major crop for plant protein and oil production, and seed weight (SW) is important for yield and quality in food/vegetable uses of soybean. However, our knowledge of genes controlling SW remains limited. To better understand the molecular mechanism underlying the trait and explore marker-based breeding approaches, we conducted a genome-wide association study in a population of 309 soybean germplasm accessions using 31,045 single nucleotide polymorphisms (SNPs), and estimated the prediction accuracy of genomic selection (GS) and marker-assisted selection (MAS) for SW. Twenty-two loci of minor effect associated with SW were identified, including hotspots on Gm04 and Gm19. The mixed model containing these loci explained 83.4% of phenotypic variation. Candidate genes with Arabidopsis orthologs conditioning SW were also proposed. The prediction accuracies of GS and MAS by cross-validation were 0.75-0.87 and 0.62-0.75, respectively, depending on the number of SNPs used and the size of training population. GS also outperformed MAS when the validation was performed using unrelated panels across a wide range of maturities, with an average prediction accuracy of 0.74 versus 0.53. This study convincingly demonstrated that soybean SW is controlled by numerous minor-effect loci. It greatly enhances our understanding of the genetic basis of SW in soybean and facilitates the identification of genes controlling the trait. It also suggests that GS holds promise for accelerating soybean breeding progress. The results are helpful for genetic improvement and genomic prediction of yield in soybean.
Full Text Available Abstract Background We conducted a genome-wide association study (GWAS and validation study for left ventricular (LV mass in the Family Blood Pressure Program – HyperGEN population. LV mass is a sensitive predictor of cardiovascular mortality and morbidity in all genders, races, and ages. Polymorphisms of candidate genes in diverse pathways have been associated with LV mass. However, subsequent studies have often failed to replicate these associations. Genome-wide association studies have unprecedented power to identify potential genes with modest effects on left LV mass. We describe here a GWAS for LV mass in Caucasians using the Affymetrix GeneChip Human Mapping 100 k Set. Cases (N = 101 and controls (N = 101 were selected from extreme tails of the LV mass index distribution from 906 individuals in the HyperGEN study. Eleven of 12 promising (Q Results Despite the relatively small sample, we identified 12 promising SNPs in the GWAS. Eleven SNPs were successfully genotyped in the validation study of 704 Caucasians and 1467 African Americans; 5 SNPs on chromosomes 5, 12, and 20 were significantly (P ≤ 0.05 associated with LV mass after correction for multiple testing. One SNP (rs756529 is intragenic within KCNB1, which is dephosphorylated by calcineurin, a previously reported candidate gene for LV hypertrophy within this population. Conclusion These findings suggest KCNB1 may be involved in the development of LV hypertrophy in humans.
Full Text Available The tsetse fly Glossina fuscipes fuscipes (Gff is the insect vector of the two forms of Human African Trypanosomiasis (HAT that exist in Uganda. Understanding Gff population dynamics, and the underlying genetics of epidemiologically relevant phenotypes is key to reducing disease transmission. Using ddRAD sequence technology, complemented with whole-genome sequencing, we developed a panel of ∼73,000 single-nucleotide polymorphisms (SNPs distributed across the Gff genome that can be used for population genomics and to perform genome-wide-association studies. We used these markers to estimate genomic patterns of linkage disequilibrium (LD in Gff, and used the information, in combination with outlier-locus detection tests, to identify candidate regions of the genome under selection. LD in individual populations decays to half of its maximum value (r2max/2 between 1359 and 2429 bp. The overall LD estimated for the species reaches r2max/2 at 708 bp, an order of magnitude slower than in Drosophila. Using 53 infected (Trypanosoma spp. and uninfected flies from four genetically distinct Ugandan populations adapted to different environmental conditions, we were able to identify SNPs associated with the infection status of the fly and local environmental adaptation. The extent of LD in Gff likely facilitated the detection of loci under selection, despite the small sample size. Furthermore, it is probable that LD in the regions identified is much higher than the average genomic LD due to strong selection. Our results show that even modest sample sizes can reveal significant genetic associations in this species, which has implications for future studies given the difficulties of collecting field specimens with contrasting phenotypes for association analysis.
Klos, Kathy Esvelt; Yimer, Belayneh A; Babiker, Ebrahiem M; Beattie, Aaron D; Bonman, J Michael; Carson, Martin L; Chong, James; Harrison, Stephen A; Ibrahim, Amir M H; Kolb, Frederic L; McCartney, Curt A; McMullen, Michael; Fetch, Jennifer Mitchell; Mohammadi, Mohsen; Murphy, J Paul; Tinker, Nicholas A
Oat crown rust, caused by f. sp. , is a major constraint to oat ( L.) production in many parts of the world. In this first comprehensive multienvironment genome-wide association map of oat crown rust, we used 2972 single-nucleotide polymorphisms (SNPs) genotyped on 631 oat lines for association mapping of quantitative trait loci (QTL). Seedling reaction to crown rust in these lines was assessed as infection type (IT) with each of 10 crown rust isolates. Adult plant reaction was assessed in the field in a total of 10 location-years as percentage severity (SV) and as infection reaction (IR) in a 0-to-1 scale. Overall, 29 SNPs on 12 linkage groups were predictive of crown rust reaction in at least one experiment at a genome-wide level of statistical significance. The QTL identified here include those in regions previously shown to be linked with seedling resistance genes , , , , , and and also with adult-plant resistance and adaptation-related QTL. In addition, QTL on linkage groups Mrg03, Mrg08, and Mrg23 were identified in regions not previously associated with crown rust resistance. Evaluation of marker genotypes in a set of crown rust differential lines supported as the identity of . The SNPs with rare alleles associated with lower disease scores may be suitable for use in marker-assisted selection of oat lines for crown rust resistance. Copyright © 2017 Crop Science Society of America.
Sebastiaan M Bol
Full Text Available HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART, macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages.Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96 or high (n = 96 p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10(-5. While the association was not genome-wide significant (p<1 × 10(-7, we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034. Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10(-6. In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048.These findings suggest that the kinase DYRK1A is involved in the replication of HIV-1, in vitro in macrophages
Jarislav von Zitzewitz
Full Text Available Winterhardiness is a complex trait that involves low temperature tolerance (LTT, vernalization sensitivity, and photoperiod sensitivity. Quantitative trait loci (QTL for these traits were first identified using biparental mapping populations; candidate genes for all loci have since been identified and characterized. In this research we used a set of 148 accessions consisting of advanced breeding lines from the Oregon barley ( L. subsp breeding program and selected cultivars that were extensively phenotyped and genotyped with single nucleotide polymorphisms. Using these data for genome-wide association mapping we detected the same QTL and genes that have been systematically characterized using biparental populations over nearly two decades of intensive research. In this sample of germplasm, maximum LTT can be achieved with facultative growth habit, which can be predicted using a three-locus haplotype involving , , and . The and LTT QTL explained 25% of the phenotypic variation, offering the prospect that additional gains from selection can be achieved once favorable alleles are fixed at these loci.
Li, Jingyun; Zhang, Yuan; Zhang, Luo
Allergic rhinitis and allergy are complex conditions, in which both genetic and environmental factors contribute to the pathogenesis. Genome-wide association studies (GWASs) employing common single-nucleotide polymorphisms have accelerated the search for novel and interesting genes, and also confirmed the role of some previously described genes which may be involved in the cause of allergic rhinitis and allergy. The aim of this review is to provide an overview of the genetic basis of allergic rhinitis and the associated allergic phenotypes, with particular focus on GWASs. The last decade has been marked by the publication of more than 20 GWASs of allergic rhinitis and the associated allergic phenotypes. Allergic diseases and traits have been shown to share a large number of genetic susceptibility loci, of which IL33/IL1RL1, IL-13-RAD50 and C11orf30/LRRC32 appear to be important for more than two allergic phenotypes. GWASs have further reflected the genetic heterogeneity underlying allergic phenotypes. Large-scale genome-wide association strategies are underway to discover new susceptibility variants for allergic rhinitis and allergic phenotypes. Characterization of the underlying genetics provides us with an insight into the potential targets for future studies and the corresponding interventions.
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs have emerged as the genetic marker of choice for mapping disease loci and candidate gene association studies, because of their high density and relatively even distribution in the human genomes. There is a need for systems allowing medium multiplexing (ten to hundreds of SNPs with high throughput, which can efficiently and cost-effectively generate genotypes for a very large sample set (thousands of individuals. Methods that are flexible, fast, accurate and cost-effective are urgently needed. This is also important for those who work on high throughput genotyping in non-model systems where off-the-shelf assays are not available and a flexible platform is needed. Results We demonstrate the use of a nanofluidic Integrated Fluidic Circuit (IFC - based genotyping system for medium-throughput multiplexing known as the Dynamic Array, by genotyping 994 individual human DNA samples on 47 different SNP assays, using nanoliter volumes of reagents. Call rates of greater than 99.5% and call accuracies of greater than 99.8% were achieved from our study, which demonstrates that this is a formidable genotyping platform. The experimental set up is very simple, with a time-to-result for each sample of about 3 hours. Conclusion Our results demonstrate that the Dynamic Array is an excellent genotyping system for medium-throughput multiplexing (30-300 SNPs, which is simple to use and combines rapid throughput with excellent call rates, high concordance and low cost. The exceptional call rates and call accuracy obtained may be of particular interest to those working on validation and replication of genome- wide- association (GWA studies.
Beaty, Terri H; Ruczinski, Ingo; Murray, Jeffrey C
Nonsyndromic cleft palate (CP) is a common birth defect with a complex and heterogeneous etiology involving both genetic and environmental risk factors. We conducted a genome-wide association study (GWAS) using 550 case-parent trios, ascertained through a CP case collected in an international...... consortium. Family-based association tests of single nucleotide polymorphisms (SNP) and three common maternal exposures (maternal smoking, alcohol consumption, and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G × E) interaction simultaneously, plus...... multiple SNPs associated with higher risk of CP in the presence of maternal smoking. Additional evidence of reduced risk due to G × E interaction in the presence of multivitamin supplementation was observed for SNPs in BAALC on chr. 8. These results emphasize the need to consider G × E interaction when...
Hwang, Eun-Young; Song, Qijian; Jia, Gaofeng; Specht, James E; Hyten, David L; Costa, Jose; Cregan, Perry B
Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r2) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. The narrower GWAS-defined genome regions will allow more precise
Okbay, Aysu; P. Beauchamp, Jonathan; Alan Fontana, Mark
-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural......Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals1. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends...... development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals...
Multani, Shaleen; Saranath, Dhananjaya
Globocan 2012 reports the global oral cancer incidence of 300,373 new oral cancer cases annually, contributing to 2.1 % of the world cancer burden. The major well-established risk factors for oral cancer include tobacco, betel/areca nut, alcohol and high-risk oncogenic human papilloma virus (HPV) 16/18. However, only 5-10 % of individuals with high-risk lifestyle develop oral cancer. Thus, genomic variants in individuals represented as single nucleotide polymorphisms (SNPs) influence susceptibility to oral cancer. With a view to understanding the role of genomic variants in oral cancer, we reviewed SNPs in case-control studies with a minimum of 100 cases and 100 controls. PubMed and HuGE navigator search engines were used to obtain data published from 1990 to 2015, which identified 67 articles investigating the role of SNPs in oral cancer. Single publications reported 93 SNPs in 55 genes, with 34 SNPs associated with a risk of oral cancer. Meta-analysis of data in multiple studies defined nine SNPs associated with a risk of oral cancer. The genes were associated with critical functions deregulated in cancers, including cell proliferation, immune function, inflammation, transcription, DNA repair and xenobiotic metabolism.
Marcelletti, Simone; Scortichini, Marco
A total of 21 Xylella fastidiosa strains were assessed by comparing their genomes to infer their taxonomic relationships. The whole-genome-based average nucleotide identity and tetranucleotide frequency correlation coefficient analyses were performed. In addition, a consensus tree based on comparisons of 956 core gene families, and a genome-wide phylogenetic tree and a Neighbor-net network were constructed with 820,088 nucleotides (i.e., approximately 30-33 % of the entire X. fastidiosa genome). All approaches revealed the occurrence of three well-demarcated genetic clusters that represent X. fastidiosa subspecies fastidiosa, multiplex and pauca, with the latter appeared to diverge. We suggest that the proposed but never formally described subspecies 'sandyi' and 'morus' are instead members of the subspecies fastidiosa. These analyses support the view that the Xylella strain isolated from Pyrus pyrifolia in Taiwan is likely to be a new species. A widely used multilocus sequence typing analysis yielded conflicting results.
Abbott, Diana; Li, Yi-Ju; Guggenheim, Jeremy A
To investigate quantitative trait loci linked to refractive error, we performed a genome-wide quantitative trait linkage analysis using single nucleotide polymorphism markers and family data from five international sites....
van Hemert, Formijn; Berkhout, Ben
RNA viruses have genomes with a distinct nucleotide composition and codon usage. We present the global characteristics of the RNA genome of Zika virus (ZIKV), an emerging pathogen within the Flavivirus genus. ZIKV was first isolated in 1947 in Uganda, caused a widespread epidemic in South and
Full Text Available Drought tolerance is a complex trait that involves numerous genes. Identifying key causal genes or linked molecular markers can facilitate the fast development of drought tolerant varieties. Using a whole-genome resequencing approach, we sequenced 132 chickpea varieties and advanced breeding lines and found more than 144,000 single nucleotide polymorphisms (SNPs. We measured 13 yield and yield-related traits in three drought-prone environments of Western Australia. The genotypic effects were significant for all traits, and many traits showed highly significant correlations, ranging from 0.83 between grain yield and biomass to -0.67 between seed weight and seed emergence rate. To identify candidate genes, the SNP and trait data were incorporated into the SUPER genome-wide association study (GWAS model, a modified version of the linear mixed model. We found that several SNPs from auxin-related genes, including auxin efflux carrier protein (PIN3, p-glycoprotein, and nodulin MtN21/EamA-like transporter, were significantly associated with yield and yield-related traits under drought-prone environments. We identified four genetic regions containing SNPs significantly associated with several different traits, which was an indication of pleiotropic effects. We also investigated the possibility of incorporating the GWAS results into a genomic selection (GS model, which is another approach to deal with complex traits. Compared to using all SNPs, application of the GS model using subsets of SNPs significantly associated with the traits under investigation increased the prediction accuracies of three yield and yield-related traits by more than twofold. This has important implication for implementing GS in plant breeding programs.
Nagano, Takashi; Lubling, Yaniv; Yaffe, Eitan; Wingett, Steven W; Dean, Wendy; Tanay, Amos; Fraser, Peter
Hi-C is a powerful method that provides pairwise information on genomic regions in spatial proximity in the nucleus. Hi-C requires millions of cells as input and, as genome organization varies from cell to cell, a limitation of Hi-C is that it only provides a population average of genome conformations. We developed single-cell Hi-C to create snapshots of thousands of chromatin interactions that occur simultaneously in a single cell. To adapt Hi-C to single-cell analysis, we modified the protocol to include in-nucleus ligation. This enables the isolation of single nuclei carrying Hi-C-ligated DNA into separate tubes, followed by reversal of cross-links, capture of biotinylated ligation junctions on streptavidin-coated magnetic beads and PCR amplification of single-cell Hi-C libraries. The entire laboratory protocol can be carried out in 1 week, and although we have demonstrated its use in mouse T helper (TH1) cells, it should be applicable to any cell type or species for which standard Hi-C has been successful. We also developed an analysis pipeline to filter noise and assess the quality of data sets in a few hours. Although the interactome maps produced by single-cell Hi-C are sparse, the data provide useful information to understand cellular variability in nuclear genome organization and chromosome structure. Standard wet and dry laboratory skills in molecular biology and computational analysis are required.
Full Text Available Global-genome nucleotide excision repair (GG-NER prevents genome instability by excising a wide range of structurally unrelated DNA base adducts and crosslinks induced by chemical carcinogens, ultraviolet (UV radiation or intracellular metabolic by-products. As a versatile damage sensor, xeroderma pigmentosum group C (XPC protein initiates this generic defense reaction by locating the damage and recruiting the subunits of a large lesion demarcation complex that, in turn, triggers the excision of aberrant DNA by endonucleases. In the very special case of a DNA repair response to UV radiation, the function of this XPC initiator is tightly controlled by the dual action of cullin-type CRL4DDB2 and sumo-targeted RNF111 ubiquitin ligases. This twofold protein ubiquitination system promotes GG-NER reactions by spatially and temporally regulating the interaction of XPC protein with damaged DNA across the nucleosome landscape of chromatin. In the absence of either CRL4DDB2 or RNF111, the DNA excision repair of UV lesions is inefficient, indicating that these two ubiquitin ligases play a critical role in mitigating the adverse biological effects of UV light in the exposed skin.
Shaheen, T.; Zafar, Y.; Rahman, M.
Single nucleotide polymorphism analysis is an expedient way to study polymorphisms at genomic level. In the present study we have explored Ubiquitin extension protein gene of G. arboreum (A2) and G. herbaceum (A1) of cotton which is a multiple copy gene. We have found SNPs at 16 positions in 200 bp region within A genome of cotton indicating frequency of SNPs 1/13 bp. Both sequences from cotton have shown maximum similarity with UBQ5 and UBQ6 of Arabidopsis thaliana. Sequence obtained from G. arboreum has shown SNPs at 28 positions in comparison with each UBQ5 and UBQ6 of Arabidopsis thaliana while sequence obtained from G. herbaceum has shown SNPs at 31 positions in comparison with each UBQ5 and UBQ6 of Arabidopsis thaliana. In conclusion although during pace of evolution ubiquitin extension protein genes of both A genome species have got some mutations from nature but still most of their sequence is similar. Single nucleotide polymorphism study can prove a vital tool to identify gene type in case of Multicopy genes. (author)
Melka Melkaye G
Full Text Available Abstract Background Studies of genetic diversity are essential in understanding the extent of differentiation between breeds, and in designing successful diversity conservation strategies. The objective of this study was to evaluate the level of genetic diversity within and between North American Brown Swiss (BS, n = 900, Jersey (JE, n = 2,922 and Holstein (HO, n = 3,535 cattle, using genotyped bulls. GENEPOP and FSTAT software were used to evaluate the level of genetic diversity within each breed and between each pair of the three breeds based on genome-wide SNP markers (n = 50,972. Results Hardy-Weinberg equilibrium (HWE exact test within breeds showed a significant deviation from equilibrium within each population (P st indicated that the combination of BS and HO in an ideally amalgamated population had higher genetic diversity than the other pairs of breeds. Conclusion Results suggest that the three bull populations have substantially different gene pools. BS and HO show the largest gene differentiation and jointly the highest total expected gene diversity compared to when JE is considered. If the loss of genetic diversity within breeds worsens in the future, the use of crossbreeding might be an option to recover genetic diversity, especially for the breeds with small population size.
Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David
We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.
Nucleotide compositional asymmetry (NCA) between leading and lagging strands (LeS and LaS) is dynamic and diverse among eubacterial genomes due to different mutation and selection forces. A thorough investigation is needed in order to study the relationship between nucleotide composition dynamics and gene distribution biases. Based on a collection of 364 eubacterial genomes that were grouped according to a DnaE-based scheme (DnaE1-DnaE1, DnaE2-DnaE1, and DnaE3-PolC), we investigated NCA and nucleotide composition gradients at three codon positions and found that there was universal G-enrichment on LeS among all groups. This was due to a strong selection for G-heading (codon position1 or cp1) codons and mutation pressure that led to more G-ending (cp3) codons. Moreover, a slight T-enrichment of LeS due to the mutation of cytosine deamination at cp3 was universal among DnaE1-DnaE1 and DnaE2-DnaE1 genomes, but was not clearly seen among DnaE3-PolC genomes, in which A-enrichment of LeS was proposed to be the effect of selections unique to polC and a mutation bias toward A-richness at cp1 that may be a result of transcription-coupled DNA repair mechanisms. Furthermore, strand-biased gene distribution enhances the purine-richness of LeS for DnaE3-PolC genomes and T-richness of LeS for DnaE1-DnaE1 and DnaE2-dnaE1 genomes. © 2010 Institut Pasteur.
Qu, Hongzhu; Wu, Hao; Zhang, Tongwu; Zhang, Zhang; Hu, Songnian; Yu, Jun
Nucleotide compositional asymmetry (NCA) between leading and lagging strands (LeS and LaS) is dynamic and diverse among eubacterial genomes due to different mutation and selection forces. A thorough investigation is needed in order to study the relationship between nucleotide composition dynamics and gene distribution biases. Based on a collection of 364 eubacterial genomes that were grouped according to a DnaE-based scheme (DnaE1-DnaE1, DnaE2-DnaE1, and DnaE3-PolC), we investigated NCA and nucleotide composition gradients at three codon positions and found that there was universal G-enrichment on LeS among all groups. This was due to a strong selection for G-heading (codon position1 or cp1) codons and mutation pressure that led to more G-ending (cp3) codons. Moreover, a slight T-enrichment of LeS due to the mutation of cytosine deamination at cp3 was universal among DnaE1-DnaE1 and DnaE2-DnaE1 genomes, but was not clearly seen among DnaE3-PolC genomes, in which A-enrichment of LeS was proposed to be the effect of selections unique to polC and a mutation bias toward A-richness at cp1 that may be a result of transcription-coupled DNA repair mechanisms. Furthermore, strand-biased gene distribution enhances the purine-richness of LeS for DnaE3-PolC genomes and T-richness of LeS for DnaE1-DnaE1 and DnaE2-dnaE1 genomes. © 2010 Institut Pasteur.
Full Text Available Abstract Background To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs have been used for single nucleotide polymorphism (SNP discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA broodstock population. Results The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends. Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183 of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In
Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E
To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the sequences from the
Beaty, Terri H.; Ruczinski, Ingo; Murray, Jeffrey C.; Marazita, Mary L.; Munger, Ronald G.; Hetmanski, Jacqueline B.; Murray, Tanda; Redett, Richard J.; Fallin, M. Daniele; Liang, Kung Yee; Wu, Tao; Patel, Poorav J.; Jin, Sheng C.; Zhang, Tian Xiao; Schwender, Holger; Wu-Chou, Yah Huei; Chen, Philip K; Chong, Samuel S; Cheah, Felicia; Yeow, Vincent; Ye, Xiaoqian; Wang, Hong; Huang, Shangzhi; Jabs, Ethylin W.; Shi, Bing; Wilcox, Allen J.; Lie, Rolv T.; Jee, Sun Ha; Christensen, Kaare; Doheny, Kimberley F.; Pugh, Elizabeth W.; Ling, Hua; Scott, Alan F.
Non-syndromic cleft palate (CP) is a common birth defect with a complex and heterogeneous etiology involving both genetic and environmental risk factors. We conducted a genome wide association study (GWAS) using 550 case-parent trios, ascertained through a CP case collected in an international consortium. Family based association tests of single nucleotide polymorphisms (SNP) and three common maternal exposures (maternal smoking, alcohol consumption and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G×E) interaction simultaneously, plus a separate 1 df test for G×E interaction alone. Conditional logistic regression models were used to estimate effects on risk to exposed and unexposed children. While no SNP achieved genome wide significance when considered alone, markers in several genes attained or approached genome wide significance when G×E interaction was included. Among these, MLLT3 and SMC2 on chromosome 9 showed multiple SNPs resulting in increased risk if the mother consumed alcohol during the peri-conceptual period (3 months prior to conception through the first trimester). TBK1 on chr. 12 and ZNF236 on chr. 18 showed multiple SNPs associated with higher risk of CP in the presence of maternal smoking. Additional evidence of reduced risk due to G×E interaction in the presence of multivitamin supplementation was observed for SNPs in BAALC on chr. 8. These results emphasize the need to consider G×E interaction when searching for genes influencing risk to complex and heterogeneous disorders, such as non-syndromic CP. PMID:21618603
Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin
Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.
Hayden, Lystra P; Cho, Michael H; McDonald, Merry-Lynn N; Crapo, James D; Beaty, Terri H; Silverman, Edwin K; Hersh, Craig P
Previous studies have indicated that in adult smokers, a history of childhood pneumonia is associated with reduced lung function and chronic obstructive pulmonary disease. There have been few previous investigations using genome-wide association studies to investigate genetic predisposition to pneumonia. This study aims to identify the genetic variants associated with the development of pneumonia during childhood and over the course of the lifetime. Study subjects included current and former smokers with and without chronic obstructive pulmonary disease participating in the COPDGene Study. Pneumonia was defined by subject self-report, with childhood pneumonia categorized as having the first episode at pneumonia (843 cases, 9,091 control subjects) and lifetime pneumonia (3,766 cases, 5,659 control subjects) were performed separately in non-Hispanic whites and African Americans. Non-Hispanic white and African American populations were combined in the meta-analysis. Top genetic variants from childhood pneumonia were assessed in network analysis. No single-nucleotide polymorphisms reached genome-wide significance, although we identified potential regions of interest. In the childhood pneumonia analysis, this included variants in NGR1 (P = 6.3 × 10 -8 ), PAK6 (P = 3.3 × 10 -7 ), and near MATN1 (P = 2.8 × 10 -7 ). In the lifetime pneumonia analysis, this included variants in LOC339862 (P = 8.7 × 10 -7 ), RAPGEF2 (P = 8.4 × 10 -7 ), PHACTR1 (P = 6.1 × 10 -7 ), near PRR27 (P = 4.3 × 10 -7 ), and near MCPH1 (P = 2.7 × 10 -7 ). Network analysis of the genes associated with childhood pneumonia included top networks related to development, blood vessel morphogenesis, muscle contraction, WNT signaling, DNA damage, apoptosis, inflammation, and immune response (P ≤ 0.05). We have identified genes potentially associated with the risk of pneumonia. Further research will be required to confirm these
Wei, Lijuan; Qu, Cunmin; Xu, Xinfu; Lu, Kun; Qian, Wei; Li, Jiana; Li, Maoteng; Liu, Liezhao
A stable yellow-seeded variety is the breeding goal for obtaining the ideal rapeseed (Brassica napus L.) plant, and the amount of acid detergent lignin (ADL) in the seeds and the hull content (HC) are often used as yellow-seeded rapeseed screening indices. In this study, a genome-wide association analysis of 520 accessions was performed using the Q + K model with a total of 31,839 single-nucleotide polymorphism (SNP) sites. As a result, three significant associations on the B. napus chromosomes A05, A09, and C05 were detected for seed ADL content. The peak SNPs were within 9.27, 14.22, and 20.86 kb of the key genes BnaA.PAL4, BnaA.CAD2/BnaA.CAD3, and BnaC.CCR1, respectively. Further analyses were performed on the major locus of A05, which was also detected in the seed HC examination. A comparison of our genome-wide association study (GWAS) results and previous linkage mappings revealed a common chromosomal region on A09, which indicates that GWAS can be used as a powerful complementary strategy for dissecting complex traits in B. napus. Genomic selection (GS) utilizing the significant SNP markers based on the GWAS results exhibited increased predictive ability, indicating that the predictive ability of a given model can be substantially improved by using GWAS and GS. PMID:26673885
Full Text Available Sorghum [ (L Moench], an important grain and forage crop, is receiving significant attention as a lignocellulosic feedstock because of its water-use efficiency and high biomass yield potential. Because of the advancement of genotyping and sequencing technologies, genome-wide association study (GWAS has become a routinely used method to investigate the genetic mechanisms underlying natural phenotypic variation. In this study, we performed a GWAS for nine grain and biomass-related plant architecture traits to determine their overall genetic architecture and the specific association of allelic variants in gibberellin (GA biosynthesis and signaling genes with these phenotypes. A total of 101 single-nucleotide polymorphism (SNP representative regions were associated with at least one of the nine traits, and two of the significant markers correspond to GA candidate genes, ( and (, affecting plant height and seed number, respectively. The resolution of a previously reported quantitative trait loci (QTL for leaf angle on chromosome 7 was increased to a 1.67 Mb region containing seven candidate genes with good prospects for further investigation. This study provides new knowledge of the association of GA genes with plant architecture traits and the genomic regions controlling variation in leaf angle, stem circumference, internode number, tiller number, seed number, panicle exsertion, and panicle length. The GA gene affecting seed number variation ( and the genomic region on chromosome 7 associated with variation in leaf angle are also important outcomes of this study and represent the foundation of future validation studies needed to apply this knowledge in breeding programs.
Bol, Sebastiaan M.; Moerland, Perry D.; Limou, Sophie; van Remmerden, Yvonne; Coulonges, Cédric; van Manen, Daniëlle; Herbeck, Joshua T.; Fellay, Jacques; Sieberer, Margit; Sietzema, Jantine G.; van 't Slot, Ruben; Martinson, Jeremy; Zagury, Jean-François; Schuitemaker, Hanneke; van 't Wout, Angélique B.
Background HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. Methodology/Principal Findings Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16×10−5). While the association was not genome-wide significant (p<1×10−7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84×10−6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). Conclusions/Significance These findings suggest that
Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro
Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.
Full Text Available Abstract Background Specific genetic contributions for preeclampsia (PE are currently unknown. This genome-wide association study (GWAS aims to identify maternal single nucleotide polymorphisms (SNPs and copy-number variants (CNVs involved in the etiology of PE. Methods A genome-wide scan was performed on 177 PE cases (diagnosed according to National Heart, Lung and Blood Institute guidelines and 116 normotensive controls. White female study subjects from Iowa were genotyped on Affymetrix SNP 6.0 microarrays. CNV calls made using a combination of four detection algorithms (Birdseye, Canary, PennCNV, and QuantiSNP were merged using CNVision and screened with stringent prioritization criteria. Due to limited DNA quantities and the deleterious nature of copy-number deletions, it was decided a priori that only deletions would be selected for assay on the entire case-control dataset using quantitative real-time PCR. Results The top four SNP candidates had an allelic or genotypic p-value between 10-5 and 10-6, however, none surpassed the Bonferroni-corrected significance threshold. Three recurrent rare deletions meeting prioritization criteria detected in multiple cases were selected for targeted genotyping. A locus of particular interest was found showing an enrichment of case deletions in 19q13.31 (5/169 cases and 1/114 controls, which encompasses the PSG11 gene contiguous to a highly plastic genomic region. All algorithm calls for these regions were assay confirmed. Conclusions CNVs may confer risk for PE and represent interesting regions that warrant further investigation. Top SNP candidates identified from the GWAS, although not genome-wide significant, may be useful to inform future studies in PE genetics.
Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming
Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. The whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computation approaches such as in silico PCR. Here we described genome wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then use in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type representing 34.09 % of the total SSR loci and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers with each having a single amplicon in the cucumber and melon draft genome, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C.lanatus var. citorides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis
Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M
Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
Marcio P. Arruda
Full Text Available Fusarium head blight (FHB is one of the most important wheat ( L. diseases worldwide, and host resistance displays complex genetic control. A genome-wide association study (GWAS was performed on 273 winter wheat breeding lines from the midwestern and eastern regions of the United States to identify chromosomal regions associated with FHB resistance. Genotyping-by-sequencing (GBS was used to identify 19,992 single-nucleotide polymorphisms (SNPs covering all 21 wheat chromosomes. Marker–trait associations were performed with different statistical models, the most appropriate being a compressed mixed linear model (cMLM controlling for relatedness and population structure. Ten significant SNP–trait associations were detected on chromosomes 4A, 6A, 7A, 1D, 4D, and 7D, and multiple SNPs were associated with on chromosome 3B. Although combination of favorable alleles of these SNPs resulted in lower levels of severity (SEV, incidence (INC, and deoxynivalenol concentration (DON, lines carrying multiple beneficial alleles were in very low frequency for most traits. These SNPs can now be used for creating new breeding lines with different combinations of favorable alleles. This is one of the first GWAS using genomic resources from the International Wheat Genome Sequencing Consortium (IWGSC.
Turner, Adam W; Martinuk, Amy; Silva, Anada; Lau, Paulina; Nikpay, Majid; Eriksson, Per; Folkersen, Lasse; Perisic, Ljubica; Hedin, Ulf; Soubeyrand, Sebastien; McPherson, Ruth
A recent genome-wide association study meta-analysis identified an intronic single nucleotide polymorphism in SMAD3, rs56062135C>T, the minor allele (T) which associates with protection from coronary artery disease. Relevant to atherosclerosis, SMAD3 is a key contributor to transforming growth factor-β pathway signaling. Here, we seek to identify ≥1 causal coronary artery disease-associated single nucleotide polymorphisms at the SMAD3 locus and characterize mechanisms whereby the risk allele(s) contribute to coronary artery disease risk. By genetic and epigenetic fine mapping, we identified a candidate causal single nucleotide polymorphism rs17293632C>T (D', 0.97; r(2), 0.94 with rs56062135) in intron 1 of SMAD3 with predicted functional effects. We show that the sequence encompassing rs17293632 acts as a strong enhancer in human arterial smooth muscle cells. The common allele (C) preserves an activator protein (AP)-1 site and enhancer function, whereas the protective (T) allele disrupts the AP-1 site and significantly reduces enhancer activity (Pto the (C) allele. We show that rs17293632 is an expression quantitative trait locus for SMAD3 in blood and atherosclerotic plaque with reduced expression of SMAD3 in carriers of the protective allele. Finally, siRNA knockdown of SMAD3 in human arterial smooth muscle cells increases cell viability, consistent with an antiproliferative role. The coronary artery disease-associated rs17293632C>T single nucleotide polymorphism represents a novel functional cis-acting element at the SMAD3 locus. The protective (T) allele of rs17293632 disrupts a consensus AP-1 binding site in a SMAD3 intron 1 enhancer, reduces enhancer activity and SMAD3 expression, altering human arterial smooth muscle cell proliferation. © 2016 American Heart Association, Inc.
Hanson, Robert L; Muller, Yunhua L; Kobes, Sayuko; Guo, Tingwei; Bian, Li; Ossowski, Victoria; Wiedrich, Kim; Sutherland, Jeffrey; Wiedrich, Christopher; Mahkee, Darin; Huang, Ke; Abdussamad, Maryam; Traurig, Michael; Weil, E Jennifer; Nelson, Robert G; Bennett, Peter H; Knowler, William C; Bogardus, Clifton; Baier, Leslie J
Most genetic variants associated with type 2 diabetes mellitus (T2DM) have been identified through genome-wide association studies (GWASs) in Europeans. The current study reports a GWAS for young-onset T2DM in American Indians. Participants were selected from a longitudinal study conducted in Pima Indians and included 278 cases with diabetes with onset before 25 years of age, 295 nondiabetic controls ≥45 years of age, and 267 siblings of cases or controls. Individuals were genotyped on a ∼1M single nucleotide polymorphism (SNP) array, resulting in 453,654 SNPs with minor allele frequency >0.05. SNPs were analyzed for association in cases and controls, and a family-based association test was conducted. Tag SNPs (n = 311) were selected for 499 SNPs associated with diabetes (P associated with T2DM (odds ratio = 1.29 per copy of the T allele; P = 6.6 × 10(-8), which represents genome-wide significance accounting for the number of effectively independent SNPs analyzed). Transfection studies in murine pancreatic β-cells suggested that DNER regulates expression of notch signaling pathway genes. These studies implicate DNER as a susceptibility gene for T2DM in American Indians.
Ng, MYM; Levinson, DF; Faraone, SV; Suarez, BK; DeLisi, LE; Arinami, T; Riley, B; Paunio, T; Pulver, AE; Irmansyah; Holmans, PA; Escamilla, M; Wildenauer, DB; Williams, NM; Laurent, C; Mowry, BJ; Brzustowicz, LM; Maziade, M; Sklar, P; Garver, DL; Abecasis, GR; Lerer, B; Fallin, MD; Gurling, HMD; Gejman, PV; Lindholm, E; Moises, HW; Byerley, W; Wijsman, EM; Forabosco, P; Tsuang, MT; Hwu, H-G; Okazaki, Y; Kendler, KS; Wormley, B; Fanous, A; Walsh, D; O’Neill, FA; Peltonen, L; Nestadt, G; Lasseter, VK; Liang, KY; Papadimitriou, GM; Dikeos, DG; Schwab, SG; Owen, MJ; O’Donovan, MC; Norton, N; Hare, E; Raventos, H; Nicolini, H; Albus, M; Maier, W; Nimgaonkar, VL; Terenius, L; Mallet, J; Jay, M; Godard, S; Nertney, D; Alexander, M; Crowe, RR; Silverman, JM; Bassett, AS; Roy, M-A; Mérette, C; Pato, CN; Pato, MT; Roos, J Louw; Kohn, Y; Amann-Zalcenstein, D; Kalsi, G; McQuillin, A; Curtis, D; Brynjolfson, J; Sigmundsson, T; Petursson, H; Sanders, AR; Duan, J; Jazin, E; Myles-Worsley, M; Karayiorgou, M; Lewis, CM
A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summed rank (PSR) by simulation. Suggestive evidence for linkage was observed in two single bins, on chromosomes 5q (142-168 Mb) and 2q (103-134 Mb). Genome-wide evidence for linkage was detected on chromosome 2q (119-152 Mb) when bin boundaries were shifted to the middle of the previous bins. The primary analysis met empirical criteria for ‘aggregate’ genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q. In a secondary analysis of 22 studies of European-ancestry samples, suggestive evidence for linkage was observed on chromosome 8p (16-33 Mb). Although the newer genome-wide association methodology has greater power to detect weak associations to single common DNA sequence variants, linkage analysis can detect diverse genetic effects that segregate in families, including multiple rare variants within one locus or several weakly associated loci in the same region. Therefore, the regions supported by this meta-analysis deserve close attention in future studies. PMID:19349958
Easton, Douglas F.; Pooley, Karen A.; Dunning, Alison M.; Pharoah, Paul D. P.; Thompson, Deborah; Ballinger, Dennis G.; Struewing, Jeffery P.; Morrison, Jonathan; Field, Helen; Luben, Robert; Wareham, Nicholas; Ahmed, Shahana; Healey, Catherine S.; Bowman, Richard; Meyer, Kerstin B.; Haiman, Christopher A.; Kolonel, Laurence K.; Henderson, Brian E.; Marchand, Loic Le; Brennan, Paul; Sangrajrang, Suleeporn; Gaborieau, Valerie; Odefrey, Fabrice; Shen, Chen-Yang; Wu, Pei-Ei; Wang, Hui-Chun; Eccles, Diana; Evans, D. Gareth; Peto, Julian; Fletcher, Olivia; Johnson, Nichola; Seal, Sheila; Stratton, Michael R.; Rahman, Nazneen; Chenevix-Trench, Georgia; Bojesen, Stig E.; Nordestgaard, Børge G.; Axelsson, Christen K.; Garcia-Closas, Montserrat; Brinton, Louise; Chanock, Stephen; Lissowska, Jolanta; Peplonska, Beata; Nevanlinna, Heli; Fagerholm, Rainer; Eerola, Hannaleena; Kang, Daehee; Yoo, Keun-Young; Noh, Dong-Young; Ahn, Sei-Hyun; Hunter, David J.; Hankinson, Susan E.; Cox, David G.; Hall, Per; Wedren, Sara; Liu, Jianjun; Low, Yen-Ling; Bogdanova, Natalia; Schürmann, Peter; Dörk, Thilo; Tollenaar, Rob A. E. M.; Jacobi, Catharina E.; Devilee, Peter; Klijn, Jan G. M.; Sigurdson, Alice J.; Doody, Michele M.; Alexander, Bruce H.; Zhang, Jinghui; Cox, Angela; Brock, Ian W.; MacPherson, Gordon; Reed, Malcolm W. R.; Couch, Fergus J.; Goode, Ellen L.; Olson, Janet E.; Meijers-Heijboer, Hanne; van den Ouweland, Ans; Uitterlinden, André; Rivadeneira, Fernando; Milne, Roger L.; Ribas, Gloria; Gonzalez-Neira, Anna; Benitez, Javier; Hopper, John L.; McCredie, Margaret; Southey, Melissa; Giles, Graham G.; Schroen, Chris; Justenhoven, Christina; Brauch, Hiltrud; Hamann, Ute; Ko, Yon-Dschun; Spurdle, Amanda B.; Beesley, Jonathan; Chen, Xiaoqing; Mannermaa, Arto; Kosma, Veli-Matti; Kataja, Vesa; Hartikainen, Jaana; Day, Nicholas E.; Cox, David R.; Ponder, Bruce A. J.; Luccarini, Craig; Conroy, Don; Shah, Mitul; Munday, Hannah; Jordan, Clare; Perkins, Barbara; West, Judy; Redman, Karen; Driver, Kristy; Aghmesheh, Morteza; Amor, David; Andrews, Lesley; Antill, Yoland; Armes, Jane; Armitage, Shane; Arnold, Leanne; Balleine, Rosemary; Begley, Glenn; Beilby, John; Bennett, Ian; Bennett, Barbara; Berry, Geoffrey; Blackburn, Anneke; Brennan, Meagan; Brown, Melissa; Buckley, Michael; Burke, Jo; Butow, Phyllis; Byron, Keith; Callen, David; Campbell, Ian; Chenevix-Trench, Georgia; Clarke, Christine; Colley, Alison; Cotton, Dick; Cui, Jisheng; Culling, Bronwyn; Cummings, Margaret; Dawson, Sarah-Jane; Dixon, Joanne; Dobrovic, Alexander; Dudding, Tracy; Edkins, Ted; Eisenbruch, Maurice; Farshid, Gelareh; Fawcett, Susan; Field, Michael; Firgaira, Frank; Fleming, Jean; Forbes, John; Friedlander, Michael; Gaff, Clara; Gardner, Mac; Gattas, Mike; George, Peter; Giles, Graham; Gill, Grantley; Goldblatt, Jack; Greening, Sian; Grist, Scott; Haan, Eric; Harris, Marion; Hart, Stewart; Hayward, Nick; Hopper, John; Humphrey, Evelyn; Jenkins, Mark; Jones, Alison; Kefford, Rick; Kirk, Judy; Kollias, James; Kovalenko, Sergey; Lakhani, Sunil; Leary, Jennifer; Lim, Jacqueline; Lindeman, Geoff; Lipton, Lara; Lobb, Liz; Maclurcan, Mariette; Mann, Graham; Marsh, Deborah; McCredie, Margaret; McKay, Michael; McLachlan, Sue Anne; Meiser, Bettina; Milne, Roger; Mitchell, Gillian; Newman, Beth; O'Loughlin, Imelda; Osborne, Richard; Peters, Lester; Phillips, Kelly; Price, Melanie; Reeve, Jeanne; Reeve, Tony; Richards, Robert; Rinehart, Gina; Robinson, Bridget; Rudzki, Barney; Salisbury, Elizabeth; Sambrook, Joe; Saunders, Christobel; Scott, Clare; Scott, Elizabeth; Scott, Rodney; Seshadri, Ram; Shelling, Andrew; Southey, Melissa; Spurdle, Amanda; Suthers, Graeme; Taylor, Donna; Tennant, Christopher; Thorne, Heather; Townshend, Sharron; Tucker, Kathy; Tyler, Janet; Venter, Deon; Visvader, Jane; Walpole, Ian; Ward, Robin; Waring, Paul; Warner, Bev; Warren, Graham; Watson, Elizabeth; Williams, Rachael; Wilson, Judy; Winship, Ingrid; Young, Mary Ann; Bowtell, David; Green, Adele; deFazio, Anna; Chenevix-Trench, Georgia; Gertig, Dorota; Webb, Penny
Breast cancer exhibits familial aggregation, consistent with variation in genetic susceptibility to the disease. Known susceptibility genes account for less than 25% of the familial risk of breast cancer, and the residual genetic variance is likely to be due to variants conferring more moderate risks. To identify further susceptibility alleles, we conducted a two-stage genome-wide association study in 4,398 breast cancer cases and 4,316 controls, followed by a third stage in which 30 single nucleotide polymorphisms (SNPs) were tested for confirmation in 21,860 cases and 22,578 controls from 22 studies. We used 227,876 SNPs that were estimated to correlate with 77% of known common SNPs in Europeans at r2>0.5. SNPs in five novel independent loci exhibited strong and consistent evidence of association with breast cancer (P<10−7). Four of these contain plausible causative genes (FGFR2, TNRC9, MAP3K1 and LSP1). At the second stage, 1,792 SNPs were significant at the P<0.05 level compared with an estimated 1,343 that would be expected by chance, indicating that many additional common susceptibility alleles may be identifiable by this approach. PMID:17529967
Gerry Norman P
Full Text Available Abstract Background Uric acid is the primary byproduct of purine metabolism. Hyperuricemia is associated with body mass index (BMI, sex, and multiple complex diseases including gout, hypertension (HTN, renal disease, and type 2 diabetes (T2D. Multiple genome-wide association studies (GWAS in individuals of European ancestry (EA have reported associations between serum uric acid levels (SUAL and specific genomic loci. The purposes of this study were: 1 to replicate major signals reported in EA populations; and 2 to use the weak LD pattern in African ancestry population to better localize (fine-map reported loci and 3 to explore the identification of novel findings cognizant of the moderate sample size. Methods African American (AA participants (n = 1,017 from the Howard University Family Study were included in this study. Genotyping was performed using the Affymetrix® Genome-wide Human SNP Array 6.0. Imputation was performed using MACH and the HapMap reference panels for CEU and YRI. A total of 2,400,542 single nucleotide polymorphisms (SNPs were assessed for association with serum uric acid under the additive genetic model with adjustment for age, sex, BMI, glomerular filtration rate, HTN, T2D, and the top two principal components identified in the assessment of admixture and population stratification. Results Four variants in the gene SLC2A9 achieved genome-wide significance for association with SUAL (p-values ranging from 8.88 × 10-9 to 1.38 × 10-9. Fine-mapping of the SLC2A9 signals identified a 263 kb interval of linkage disequilibrium in the HapMap CEU sample. This interval was reduced to 37 kb in our AA and the HapMap YRI samples. Conclusions The most strongly associated locus for SUAL in EA populations was also the most strongly associated locus in this AA sample. This finding provides evidence for the role of SLC2A9 in uric acid metabolism across human populations. Additionally, our findings demonstrate the utility of following-up EA
Croteau-Chonka, Damien C; Marvelle, Amanda F; Lange, Ethan M; Lee, Nanette R; Adair, Linda S; Lange, Leslie A; Mohlke, Karen L
Increased values of multiple adiposity-related anthropometric traits are important risk factors for many common complex diseases. We performed a genome-wide association (GWA) study for four quantitative traits related to body size and adiposity (BMI, weight, waist circumference, and height) in a cohort of 1,792 adult Filipino women from the Cebu Longitudinal Health and Nutrition Survey (CLHNS). This is the first GWA study of anthropometric traits in Filipinos, a population experiencing a rapid transition into a more obesogenic environment. In addition to identifying suggestive evidence of additional single-nucleotide polymorphism (SNP) association signals (P Filipinos and provide further insight into the effects of BDNF, FTO, and MC4R on BMI.
Full Text Available The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs with minor allele frequency (MAF ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10−12. Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized.
Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing
Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphisms (SNP) markers. The average polymorphsim information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.
Seyerle, Amanda A; Lin, Henry J; Gogarten, Stephanie M; Stilp, Adrienne; Méndez Giráldez, Raul; Soliman, Elsayed; Baldassari, Antoine; Graff, Mariaelisa; Heckbert, Susan; Kerr, Kathleen F; Kooperberg, Charles; Rodriguez, Carlos; Guo, Xiuqing; Yao, Jie; Sotoodehnia, Nona; Taylor, Kent D; Whitsel, Eric A; Rotter, Jerome I; Laurie, Cathy C; Avery, Christy L
PR interval (PR) is a heritable electrocardiographic measure of atrial and atrioventricular nodal conduction. Changes in PR duration may be associated with atrial fibrillation, heart failure and all-cause mortality. Hispanic/Latino populations have high burdens of cardiovascular morbidity and mortality, are highly admixed and represent exceptional opportunities for novel locus identification. However, they remain chronically understudied. We present the first genome-wide association study (GWAS) of PR in 14 756 participants of Hispanic/Latino ancestry from three studies. Study-specific summary results of the association between 1000 Genomes Phase 1 imputed single-nucleotide polymorphisms (SNPs) and PR assumed an additive genetic model and were adjusted for global ancestry, study centre/region and clinical covariates. Results were combined using fixed-effects, inverse variance weighted meta-analysis. Sequential conditional analyses were used to identify independent signals. Replication of novel loci was performed in populations of Asian, African and European descent. ENCODE and RoadMap data were used to annotate results. We identified a novel genome-wide association (PPR at ID2 (rs6730558), which replicated in Asian and European populations (PPR loci to Hispanics/Latinos. Bioinformatics annotation provided evidence for regulatory function in cardiac tissue. Further, for six loci that generalised, the Hispanic/Latino index SNP was genome-wide significant and identical to (or in high linkage disequilibrium with) the previously identified GWAS lead SNP. Our results suggest that genetic determinants of PR are consistent across race/ethnicity, but extending studies to admixed populations can identify novel associations, underscoring the importance of conducting genetic studies in diverse populations. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise
Full Text Available Abstract Background The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS, which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%. This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate
Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.
Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspertate as a potential new therapeutic target in bladder cancer. PMID:24469795
Fabbri, Elena; Caniglia, R.; Mucci, Nadia
Single nucleotide polymorphisms (SNPs) which represent the most widespread source of sequence variation in genomes, are becoming a routine application in several fields such as forensics, ecology and conservation genetics. Their use, requiring short amplifications, may allow a more efficient geno....... We evaluated the cost, laboratory effort and reliability of these different markers and discuss the possible future use of VeraCode, SNPlex and Fluidigm EP1 system in wild population monitoring....
Elisabeth M van Leeuwen
Full Text Available Genome-wide association studies (GWAS have revealed 74 single nucleotide polymorphisms (SNPs associated with high-density lipoprotein cholesterol (HDL blood levels. This study is, to our knowledge, the first genome-wide interaction study (GWIS to identify SNP×SNP interactions associated with HDL levels. We performed a GWIS in the Rotterdam Study (RS cohort I (RS-I using the GLIDE tool which leverages the massively parallel computing power of Graphics Processing Units (GPUs to perform linear regression on all genome-wide pairs of SNPs. By performing a meta-analysis together with Rotterdam Study cohorts II and III (RS-II and RS-III, we were able to filter 181 interaction terms with a p-value<1 · 10-8 that replicated in the two independent cohorts. We were not able to replicate any of these interaction term in the AGES, ARIC, CHS, ERF, FHS and NFBC-66 cohorts (Ntotal = 30,011 when adjusting for multiple testing. Our GWIS resulted in the consistent finding of a possible interaction between rs774801 in ARMC8 (ENSG00000114098 and rs12442098 in SPATA8 (ENSG00000185594 being associated with HDL levels. However, p-values do not reach the preset Bonferroni correction of the p-values. Our study suggest that even for highly genetically determined traits such as HDL the sample sizes needed to detect SNP×SNP interactions are large and the 2-step filtering approaches do not yield a solution. Here we present our analysis plan and our reservations concerning GWIS.
Lee, Sang Y; Zhu, Junjia; Salzberg, Anna C; Zhang, Bo; Liu, Dajiang J; Muscat, Joshua E; Langan, Sara T; Connor, James R
Human hemochromatosis protein (HFE) is involved in iron metabolism. Two major HFE polymorphisms, H63D and C282Y, have been associated with an increased risk of cancers. Previously, we reported decreased gender effects in overall survival based on H63D or C282Y HFE polymorphisms patients with glioblastoma multiforme (GBM). However, the effect of other single nucleotide variation (SNV) in the HFE gene on the cancer development and progression has not been systematically studied. To expand our finding in a larger sample, and to identify other HFE SNV, we analyzed the frequency of somatic SNV in HFE gene and its relationship to survival in GBM patients using The Cancer Genome Atlas (TCGA) GBM (Caucasian only) database. We found 9 SNVs with increased frequency in blood normal of TCGA GBM patients compared to the 1000Genome. Among 9 SNVs, 7 SNVs were located in the intron and 2 SNVs (i.e., H63D, C282Y) in the exon of HFE gene. The statistical analysis demonstrated that blood normal samples of TCGA GBM have more H63D (p = 0.0002, 95% Confidence interval (CI): 0.2119-0.3223) or C282Y (p = 0.0129, 95% CI: 0.0474-0.1159) HFE polymorphisms than 1000Genome. The Kaplan-Meier survival curve for the 264 GBM samples revealed no difference between wild type (WT) HFE and H63D, and WT HFE and C282Y GBM patients. In addition, there was no difference in the survival of male/female GBM patients based on HFE genotype. There was no correlation between HFE expression and survival. In conclusion, the current results suggest that somatic HFE polymorphisms do not impact GBM patients' survival in the TCGA data set of GBM.
Sang Y Lee
Full Text Available Human hemochromatosis protein (HFE is involved in iron metabolism. Two major HFE polymorphisms, H63D and C282Y, have been associated with an increased risk of cancers. Previously, we reported decreased gender effects in overall survival based on H63D or C282Y HFE polymorphisms patients with glioblastoma multiforme (GBM. However, the effect of other single nucleotide variation (SNV in the HFE gene on the cancer development and progression has not been systematically studied. To expand our finding in a larger sample, and to identify other HFE SNV, we analyzed the frequency of somatic SNV in HFE gene and its relationship to survival in GBM patients using The Cancer Genome Atlas (TCGA GBM (Caucasian only database. We found 9 SNVs with increased frequency in blood normal of TCGA GBM patients compared to the 1000Genome. Among 9 SNVs, 7 SNVs were located in the intron and 2 SNVs (i.e., H63D, C282Y in the exon of HFE gene. The statistical analysis demonstrated that blood normal samples of TCGA GBM have more H63D (p = 0.0002, 95% Confidence interval (CI: 0.2119-0.3223 or C282Y (p = 0.0129, 95% CI: 0.0474-0.1159 HFE polymorphisms than 1000Genome. The Kaplan-Meier survival curve for the 264 GBM samples revealed no difference between wild type (WT HFE and H63D, and WT HFE and C282Y GBM patients. In addition, there was no difference in the survival of male/female GBM patients based on HFE genotype. There was no correlation between HFE expression and survival. In conclusion, the current results suggest that somatic HFE polymorphisms do not impact GBM patients' survival in the TCGA data set of GBM.
Full Text Available Genome-wide association study (GWAS aims to discover genetic factors underlying phenotypic traits. The large number of genetic factors poses both computational and statistical challenges. Various computational approaches have been developed for large scale GWAS. In this chapter, we will discuss several widely used computational approaches in GWAS. The following topics will be covered: (1 An introduction to the background of GWAS. (2 The existing computational approaches that are widely used in GWAS. This will cover single-locus, epistasis detection, and machine learning methods that have been recently developed in biology, statistic, and computer science communities. This part will be the main focus of this chapter. (3 The limitations of current approaches and future directions.
Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre
Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.
Christine E McLaren
Full Text Available The existence of multiple inherited disorders of iron metabolism in man, rodents and other vertebrates suggests genetic contributions to iron deficiency. To identify new genomic locations associated with iron deficiency, a genome-wide association study (GWAS was performed using DNA collected from white men aged≥25 y and women≥50 y in the Hemochromatosis and Iron Overload Screening (HEIRS Study with serum ferritin (SF≤12 µg/L (cases and iron replete controls (SF>100 µg/L in men, SF>50 µg/L in women. Regression analysis was used to examine the association between case-control status (336 cases, 343 controls and quantitative serum iron measures and 331,060 single nucleotide polymorphism (SNP genotypes, with replication analyses performed in a sample of 71 cases and 161 controls from a population of white male and female veterans screened at a US Veterans Affairs (VA medical center. Five SNPs identified in the GWAS met genome-wide statistical significance for association with at least one iron measure, rs2698530 on chr. 2p14; rs3811647 on chr. 3q22, a known SNP in the transferrin (TF gene region; rs1800562 on chr. 6p22, the C282Y mutation in the HFE gene; rs7787204 on chr. 7p21; and rs987710 on chr. 22q11 (GWAS observed P<1.51×10(-7 for all. An association between total iron binding capacity and SNP rs3811647 in the TF gene (GWAS observed P=7.0×10(-9, corrected P=0.012 was replicated within the VA samples (observed P=0.012. Associations with the C282Y mutation in the HFE gene also were replicated. The joint analysis of the HEIRS and VA samples revealed strong associations between rs2698530 on chr. 2p14 and iron status outcomes. These results confirm a previously-described TF polymorphism and implicate one potential new locus as a target for gene identification.
Cheng, Yu-Ching; Stanne, Tara M; Giese, Anne-Katrin; Ho, Weang Kee; Traylor, Matthew; Amouyel, Philippe; Holliday, Elizabeth G; Malik, Rainer; Xu, Huichun; Kittner, Steven J; Cole, John W; O'Connell, Jeffrey R; Danesh, John; Rasheed, Asif; Zhao, Wei; Engelter, Stefan; Grond-Ginsbach, Caspar; Kamatani, Yoichiro; Lathrop, Mark; Leys, Didier; Thijs, Vincent; Metso, Tiina M; Tatlisumak, Turgut; Pezzini, Alessandro; Parati, Eugenio A; Norrving, Bo; Bevan, Steve; Rothwell, Peter M; Sudlow, Cathie; Slowik, Agnieszka; Lindgren, Arne; Walters, Matthew R; Jannes, Jim; Shen, Jess; Crosslin, David; Doheny, Kimberly; Laurie, Cathy C; Kanse, Sandip M; Bis, Joshua C; Fornage, Myriam; Mosley, Thomas H; Hopewell, Jemma C; Strauch, Konstantin; Müller-Nurasyid, Martina; Gieger, Christian; Waldenberger, Melanie; Peters, Annette; Meisinger, Christine; Ikram, M Arfan; Longstreth, W T; Meschia, James F; Seshadri, Sudha; Sharma, Pankaj; Worrall, Bradford; Jern, Christina; Levi, Christopher; Dichgans, Martin; Boncoraglio, Giorgio B; Markus, Hugh S; Debette, Stephanie; Rolfs, Arndt; Saleheen, Danish; Mitchell, Braxton D
Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a 2-stage meta-analysis of genome-wide association studies, focusing on stroke cases with an age of onset genetic variants at loci with association Pstroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the discovery and follow-up stages (rs11196288; odds ratio =1.41; P=9.5×10(-9)). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that 2 single nucleotide polymorphisms in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2. HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke. © 2016 American Heart Association, Inc.
Chávez-Galarza, Julio; Henriques, Dora; Johnston, J Spencer; Azevedo, João C; Patton, John C; Muñoz, Irene; De la Rúa, Pilar; Pinto, M Alice
Understanding the genetic mechanisms of adaptive population divergence is one of the most fundamental endeavours in evolutionary biology and is becoming increasingly important as it will allow predictions about how organisms will respond to global environmental crisis. This is particularly important for the honey bee, a species of unquestionable ecological and economical importance that has been exposed to increasing human-mediated selection pressures. Here, we conducted a single nucleotide polymorphism (SNP)-based genome scan in honey bees collected across an environmental gradient in Iberia and used four FST -based outlier tests to identify genomic regions exhibiting signatures of selection. Additionally, we analysed associations between genetic and environmental data for the identification of factors that might be correlated or act as selective pressures. With these approaches, 4.4% (17 of 383) of outlier loci were cross-validated by four FST -based methods, and 8.9% (34 of 383) were cross-validated by at least three methods. Of the 34 outliers, 15 were found to be strongly associated with one or more environmental variables. Further support for selection, provided by functional genomic information, was particularly compelling for SNP outliers mapped to different genes putatively involved in the same function such as vision, xenobiotic detoxification and innate immune response. This study enabled a more rigorous consideration of selection as the underlying cause of diversity patterns in Iberian honey bees, representing an important first step towards the identification of polymorphisms implicated in local adaptation and possibly in response to recent human-mediated environmental changes. © 2013 John Wiley & Sons Ltd.
Full Text Available Detection of genetic diversity is important for characterisation of crop plant collections in order to detect the presence of valuable trait variation for use in breeding programs. A collection of faba bean (Vicia faba L. genotypes was evaluated for intra- and inter-population diversity using a set of 768 genome-wide distributed single nucleotide polymorphism (SNP markers, of which 657 obtained successful amplification and detected polymorphisms. Gene diversity and polymorphism information content (PIC values varied between 0.022–0.500 and 0.023–1.00, with averages of 0.363 and 0.287, respectively. The genetic structure of the germplasm collection was analysed and a neighbour-joining (NJ dendrogram was constructed. The faba bean accessions grouped into two major groups, with several additional smaller sub-groups, predominantly on the basis of geographical origin. These results were further supported by principal co-ordinate analysis (PCoA, deriving two major groupings which were differentiated on the basis of site of origin and pedigree relationships. In general, high levels of heterozygosity were observed, presumably due to the partially allogamous nature of the species. The results will facilitate targeted crossing strategies in future faba bean breeding programs in order to achieve genetic gain.
Davies, G; Harris, S E; Reynolds, C A; Payton, A; Knight, H M; Liewald, D C; Lopez, L M; Luciano, M; Gow, A J; Corley, J; Henderson, R; Murray, C; Pattie, A; Fox, H C; Redmond, P; Lutz, M W; Chiba-Falek, O; Linnertz, C; Saith, S; Haggarty, P; McNeill, G; Ke, X; Ollier, W; Horan, M; Roses, A D; Ponting, C P; Porteous, D J; Tenesa, A; Pickles, A; Starr, J M; Whalley, L J; Pedersen, N L; Pendleton, N; Visscher, P M; Deary, I J
Cognitive decline is a feared aspect of growing old. It is a major contributor to lower quality of life and loss of independence in old age. We investigated the genetic contribution to individual differences in nonpathological cognitive ageing in five cohorts of older adults. We undertook a genome-wide association analysis using 549 692 single-nucleotide polymorphisms (SNPs) in 3511 unrelated adults in the Cognitive Ageing Genetics in England and Scotland (CAGES) project. These individuals have detailed longitudinal cognitive data from which phenotypes measuring each individual's cognitive changes were constructed. One SNP--rs2075650, located in TOMM40 (translocase of the outer mitochondrial membrane 40 homolog)--had a genome-wide significant association with cognitive ageing (P=2.5 × 10(-8)). This result was replicated in a meta-analysis of three independent Swedish cohorts (P=2.41 × 10(-6)). An Apolipoprotein E (APOE) haplotype (adjacent to TOMM40), previously associated with cognitive ageing, had a significant effect on cognitive ageing in the CAGES sample (P=2.18 × 10(-8); females, P=1.66 × 10(-11); males, P=0.01). Fine SNP mapping of the TOMM40/APOE region identified both APOE (rs429358; P=3.66 × 10(-11)) and TOMM40 (rs11556505; P=2.45 × 10(-8)) as loci that were associated with cognitive ageing. Imputation and conditional analyses in the discovery and replication cohorts strongly suggest that this effect is due to APOE (rs429358). Functional genomic analysis indicated that SNPs in the TOMM40/APOE region have a functional, regulatory non-protein-coding effect. The APOE region is significantly associated with nonpathological cognitive ageing. The identity and mechanism of one or multiple causal variants remain unclear.
Matsuda, Fumio; Nakabayashi, Ryo; Yang, Zhigang; Okazaki, Yozo; Yonemaru, Jun-ichi; Ebana, Kaworu; Yano, Masahiro; Saito, Kazuki
Plants produce structurally diverse secondary (specialized) metabolites to increase their fitness for survival under adverse environments. Several bioactive compounds for new drugs have been identified through screening of plant extracts. In this study, genome-wide association studies (GWAS) were conducted to investigate the genetic architecture behind the natural variation of rice secondary metabolites. GWAS using the metabolome data of 175 rice accessions successfully identified 323 associations among 143 single nucleotide polymorphisms (SNPs) and 89 metabolites. The data analysis highlighted that levels of many metabolites are tightly associated with a small number of strong quantitative trait loci (QTLs). The tight association may be a mechanism generating strains with distinct metabolic composition through the crossing of two different strains. The results indicate that one plant species produces more diverse phytochemicals than previously expected, and plants still contain many useful compounds for human applications. PMID:25267402
Jinam, Timothy A; Phipps, Maude E; Saitou, Naruya
Southeast Asia houses various culturally and linguistically diverse ethnic groups. In Malaysia, where the Malay, Chinese, and Indian ethnic groups form the majority, there exist minority groups such as the "negritos" who are believed to be descendants of the earliest settlers of Southeast Asia. Here we report patterns of genetic substructure and admixture in two Malaysian negrito populations (Jehai and Kensiu), using ~50,000 genome-wide single-nucleotide polymorphism (SNP) data. We found traces of recent admixture in both the negrito populations, particularly in the Jehai, with the Malay through principal component analysis and STRUCTURE analysis software, which suggested that the admixture was as recent as one generation ago. We also identified significantly differentiated nonsynonymous SNPs and haplotype blocks related to intracellular transport, metabolic processes, and detection of stimulus. These results highlight the different levels of admixture experienced by the two Malaysian negritos. Delineating admixture and differentiated genomic regions should be of importance in designing and interpretation of molecular anthropology and disease association studies. Copyright © 2013 Wayne State University Press, Detroit, Michigan 48201-1309.
Elijah R Behr
Full Text Available Marked prolongation of the QT interval on the electrocardiogram associated with the polymorphic ventricular tachycardia Torsades de Pointes is a serious adverse event during treatment with antiarrhythmic drugs and other culprit medications, and is a common cause for drug relabeling and withdrawal. Although clinical risk factors have been identified, the syndrome remains unpredictable in an individual patient. Here we used genome-wide association analysis to search for common predisposing genetic variants. Cases of drug-induced Torsades de Pointes (diTdP, treatment tolerant controls, and general population controls were ascertained across multiple sites using common definitions, and genotyped on the Illumina 610k or 1M-Duo BeadChips. Principal Components Analysis was used to select 216 Northwestern European diTdP cases and 771 ancestry-matched controls, including treatment-tolerant and general population subjects. With these sample sizes, there is 80% power to detect a variant at genome-wide significance with minor allele frequency of 10% and conferring an odds ratio of ≥2.7. Tests of association were carried out for each single nucleotide polymorphism (SNP by logistic regression adjusting for gender and population structure. No SNP reached genome wide-significance; the variant with the lowest P value was rs2276314, a non-synonymous coding variant in C18orf21 (p = 3×10(-7, odds ratio = 2, 95% confidence intervals: 1.5-2.6. The haplotype formed by rs2276314 and a second SNP, rs767531, was significantly more frequent in controls than cases (p = 3×10(-9. Expanding the number of controls and a gene-based analysis did not yield significant associations. This study argues that common genomic variants do not contribute importantly to risk for drug-induced Torsades de Pointes across multiple drugs.
Full Text Available The establishment of DNA methylation patterns in oocytes is a highly dynamic process marking gene-regulatory events during fertilization, embryonic development, and adulthood. However, after epigenetic reprogramming in primordial germ cells, how and when DNA methylation is re-established in developing human oocytes remains to be characterized. Here, using single-cell whole-genome bisulfite sequencing, we describe DNA methylation patterns in three different maturation stages of human oocytes. We found that while broad-scale patterns of CpG methylation have been largely established by the immature germinal vesicle stage, localized changes continue into later development. Non-CpG methylation, on the other hand, undergoes a large-scale, generalized remodeling through the final stage of maturation, with the net overall result being the accumulation of methylation as oocytes mature. The role of the genome-wide, non-CpG methylation remodeling in the final stage of oocyte maturation deserves further investigation.
Danjou, Fabrice; Fozza, Claudio; Zoledziewska, Magdalena; Mulas, Antonella; Corda, Giovanna; Contini, Salvatore; Dore, Fausto; Galleu, Antonio; Di Tucci, Anna Angela; Caocci, Giovanni; Gaviano, Eleonora; Latte, Giancarlo; Gabbas, Attilio; Casula, Paolo; Delogu, Lucia Gemma; La Nasa, Giorgio; Angelucci, Emanuele; Cucca, Francesco; Longinotti, Maurizio
Because different findings suggest that an immune dysregulation plays a role in the pathogenesis of myelodysplastic syndrome (MDS), we analyzed a large cohort of patients from a homogeneous Sardinian population using ImmunoChip, a genotyping array exploring 147,954 single-nucleotide polymorphisms (SNPs) localized in genomic regions displaying some degree of association with immune-mediated diseases or pathways. The population studied included 133 cases and 3,894 controls, and a total of 153,978 autosomal markers and 971 non-autosomal markers were genotyped. After association analysis, only one variant passed the genome-wide significance threshold: rs71325459 (p = 1.16 × 10 -12 ), which is situated on chromosome 20. The variant is in high linkage disequilibrium with rs35640778, an untested missense variant situated in the RTEL1 gene, an interesting candidate that encodes for an ATP-dependent DNA helicase implicated in telomere-length regulation, DNA repair, and maintenance of genomic stability. The second most associated signal is composed of five variants that fall slightly below the genome-wide significance threshold but point out another interesting gene candidate. These SNPs, with p values between 2.53 × 10 -6 and 3.34 × 10 -6 , are situated in the methylene tetrahydrofolate reductase (MTHFR) gene. The most associated of these variants, rs1537514, presents an increased frequency of the derived C allele in cases, with 11.4% versus 4.4% in controls. MTHFR is the rate-limiting enzyme in the methyl cycle and genetic variations in this gene have been strongly associated with the risk of neoplastic diseases. The current understanding of the MDS biology, which is based on the hypothesis of the sequential development of multiple subclonal molecular lesions, fits very well with the demonstration of a possible role for RTEL1 and MTHFR gene polymorphisms, both of which are related to a variable risk of genomic instability. Copyright © 2016 ISEH - International
Full Text Available Abstract Pre-harvest sprouting (PHS is a major abiotic factor affecting grain weight and quality, and is caused by an early break in seed dormancy. Association mapping (AM is used to detect correlations between phenotypes and genotypes based on linkage disequilibrium (LD in wheat breeding programs. We evaluated seed dormancy in 80 Chinese wheat founder parents in five environments and performed a genome-wide association study using 6,057 markers, including 93 simple sequence repeat (SSR, 1,472 diversity array technology (DArT, and 4,492 single nucleotide polymorphism (SNP markers. The general linear model (GLM and the mixed linear model (MLM were used in this study, and two significant markers (tPt-7980 and wPt-6457 were identified. Both markers were located on Chromosome 1B, with wPt-6457 having been identified in a previously reported chromosomal position. The significantly associated loci contain essential information for cloning genes related to resistance to PHS and can be used in wheat breeding programs.
Weidinger, Stephan; Willis-Owen, Saffron A G; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M; Winge, Mårten C G; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I; McLean, W H Irwin; Brown, Sara J; Cookson, William O C; Lathrop, G Mark; Irvine, Alan D; Moffatt, Miriam F
Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci.
Mathew J Barber
Full Text Available Statins effectively lower total and plasma LDL-cholesterol, but the magnitude of decrease varies among individuals. To identify single nucleotide polymorphisms (SNPs contributing to this variation, we performed a combined analysis of genome-wide association (GWA results from three trials of statin efficacy.Bayesian and standard frequentist association analyses were performed on untreated and statin-mediated changes in LDL-cholesterol, total cholesterol, HDL-cholesterol, and triglyceride on a total of 3932 subjects using data from three studies: Cholesterol and Pharmacogenetics (40 mg/day simvastatin, 6 weeks, Pravastatin/Inflammation CRP Evaluation (40 mg/day pravastatin, 24 weeks, and Treating to New Targets (10 mg/day atorvastatin, 8 weeks. Genotype imputation was used to maximize genomic coverage and to combine information across studies. Phenotypes were normalized within each study to account for systematic differences among studies, and fixed-effects combined analysis of the combined sample were performed to detect consistent effects across studies. Two SNP associations were assessed as having posterior probability greater than 50%, indicating that they were more likely than not to be genuinely associated with statin-mediated lipid response. SNP rs8014194, located within the CLMN gene on chromosome 14, was strongly associated with statin-mediated change in total cholesterol with an 84% probability by Bayesian analysis, and a p-value exceeding conventional levels of genome-wide significance by frequentist analysis (P = 1.8 x 10(-8. This SNP was less significantly associated with change in LDL-cholesterol (posterior probability = 0.16, P = 4.0 x 10(-6. Bayesian analysis also assigned a 51% probability that rs4420638, located in APOC1 and near APOE, was associated with change in LDL-cholesterol.Using combined GWA analysis from three clinical trials involving nearly 4,000 individuals treated with simvastatin, pravastatin, or atorvastatin, we
Full Text Available Young-onset hypertension has a stronger genetic component than late-onset counterpart; thus, the identification of genes related to its susceptibility is a critical issue for the prevention and management of this disease. We carried out a two-stage association scan to map young-onset hypertension susceptibility genes. The first-stage analysis, a genome-wide association study, analyzed 175 matched case-control pairs; the second-stage analysis, a confirmatory association study, verified the results at the first stage based on a total of 1,008 patients and 1,008 controls. Single-locus association tests, multilocus association tests and pair-wise gene-gene interaction tests were performed to identify young-onset hypertension susceptibility genes. After considering stringent adjustments of multiple testing, gene annotation and single-nucleotide polymorphism (SNP quality, four SNPs from two SNP triplets with strong association signals (-log(10(p>7 and 13 SNPs from 8 interactive SNP pairs with strong interactive signals (-log(10(p>8 were carefully re-examined. The confirmatory study verified the association for a SNP quartet 219 kb and 495 kb downstream of LOC344371 (a hypothetical gene and RASGRP3 on chromosome 2p22.3, respectively. The latter has been implicated in the abnormal vascular responsiveness to endothelin-1 and angiotensin II in diabetic-hypertensive rats. Intrinsic synergy involving IMPG1 on chromosome 6q14.2-q15 was also verified. IMPG1 encodes interphotoreceptor matrix proteoglycan 1 which has cation binding capacity. The genes are novel hypertension targets identified in this first genome-wide hypertension association study of the Han Chinese population.
B. G. Welderufael
Full Text Available Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from economical and animal welfare point of view. Here we have performed genome-wide association studies (GWAS to identify associated single nucleotide polymorphisms (SNPs and investigate the genetic background not only for susceptibility to – but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. Substitution effect of each SNP was tested with a t-test and a genome-wide significance level of P-value < 10-4 was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated either with susceptibility to – or recoverability from mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte developments (e.g., MAST3 and STAB2 and genes involved in macrophage recruitment and regulation of inflammations (PDGFD and PTX3 were suggested as possible causal genes for susceptibility to – and recoverability from mastitis, respectively. However, this is the first GWAS study for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to – and recoverability from mastitis.
Volkov, Petr; Olsson, Anders H; Gillberg, Linn
Little is known about the extent to which interactions between genetics and epigenetics may affect the risk of complex metabolic diseases and/or their intermediary phenotypes. We performed a genome-wide DNA methylation quantitative trait locus (mQTL) analysis in human adipose tissue of 119 men, w...... and epigenetic variation in both cis and trans positions influencing gene expression in adipose tissue and in vivo (dys)metabolic traits associated with the development of obesity and diabetes.......Little is known about the extent to which interactions between genetics and epigenetics may affect the risk of complex metabolic diseases and/or their intermediary phenotypes. We performed a genome-wide DNA methylation quantitative trait locus (mQTL) analysis in human adipose tissue of 119 men......, where 592,794 single nucleotide polymorphisms (SNPs) were related to DNA methylation of 477,891 CpG sites, covering 99% of RefSeq genes. SNPs in significant mQTLs were further related to gene expression in adipose tissue and obesity related traits. We found 101,911 SNP-CpG pairs (mQTLs) in cis and 5...
Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap, the causative agent of Johne’s disease (JD, infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium strains isolated from various hosts and environments. Using Next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98% to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77 showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels were not detected, and only unique single nucleotide polymorphisms (SNPs were observed among the 6 M. ap strains. While most of the SNPs (~100 in M. ap genomes were non-synonymous, a total of ~ 6000 SNPs were detected among M. avium genomes, most of them were synonymous suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNPs-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10 strain while the human isolate (M. ap 4B is closely related to the environmental strains, indicating environmental source to human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.
Full Text Available The genetic basis of autoantibody production is largely unknown outside of associations located in the major histocompatibility complex (MHC human leukocyte antigen (HLA region. The aim of this study is the discovery of new genetic associations with autoantibody positivity using genome-wide association scan single nucleotide polymorphism (SNP data in type 1 diabetes (T1D patients with autoantibody measurements. We measured two anti-islet autoantibodies, glutamate decarboxylase (GADA, n = 2,506, insulinoma-associated antigen 2 (IA-2A, n = 2,498, antibodies to the autoimmune thyroid (Graves' disease (AITD autoantigen thyroid peroxidase (TPOA, n = 8,300, and antibodies against gastric parietal cells (PCA, n = 4,328 that are associated with autoimmune gastritis. Two loci passed a stringent genome-wide significance level (p<10(-10: 1q23/FCRL3 with IA-2A and 9q34/ABO with PCA. Eleven of 52 non-MHC T1D loci showed evidence of association with at least one autoantibody at a false discovery rate of 16%: 16p11/IL27-IA-2A, 2q24/IFIH1-IA-2A and PCA, 2q32/STAT4-TPOA, 10p15/IL2RA-GADA, 6q15/BACH2-TPOA, 21q22/UBASH3A-TPOA, 1p13/PTPN22-TPOA, 2q33/CTLA4-TPOA, 4q27/IL2/TPOA, 15q14/RASGRP1/TPOA, and 12q24/SH2B3-GADA and TPOA. Analysis of the TPOA-associated loci in 2,477 cases with Graves' disease identified two new AITD loci (BACH2 and UBASH3A.
Erk, Susanne; Meyer-Lindenberg, Andreas; Schnell, Knut; Opitz von Boberfeld, Carola; Esslinger, Christine; Kirsch, Peter; Grimm, Oliver; Arnold, Claudia; Haddad, Leila; Witt, Stephanie H; Cichon, Sven; Nöthen, Markus M; Rietschel, Marcella; Walter, Henrik
The neural abnormalities underlying genetic risk for bipolar disorder, a severe, common, and highly heritable psychiatric condition, are largely unknown. An opportunity to define these mechanisms is provided by the recent discovery, through genome-wide association, of a single-nucleotide polymorphism (rs1006737) strongly associated with bipolar disorder within the CACNA1C gene, encoding the alpha subunit of the L-type voltage-dependent calcium channel Ca(v)1.2. To determine whether the genetic risk associated with rs1006737 is mediated through hippocampal function. Functional magnetic resonance imaging study. University hospital. A total of 110 healthy volunteers of both sexes and of German descent in the Hardy-Weinberg equilibrium for rs1006737. Blood oxygen level-dependent signal during an episodic memory task and behavioral and psychopathological measures. Using an intermediate phenotype approach, we show that healthy carriers of the CACNA1C risk variant exhibit a pronounced reduction of bilateral hippocampal activation during episodic memory recall and diminished functional coupling between left and right hippocampal regions. Furthermore, risk allele carriers exhibit activation deficits of the subgenual anterior cingulate cortex, a region repeatedly associated with affective disorders and the mediation of adaptive stress-related responses. The relevance of these findings for affective disorders is supported by significantly higher psychopathology scores for depression, anxiety, obsessive-compulsive thoughts, interpersonal sensitivity, and neuroticism in risk allele carriers, correlating negatively with the observed regional brain activation. Our data demonstrate that rs1006737 or genetic variants in linkage disequilibrium with it are functional in the human brain and provide a neurogenetic risk mechanism for bipolar disorder backed by genome-wide evidence.
Boraska, Vesna; Jerončić, Ana; Colonna, Vincenza; Southam, Lorraine; Nyholt, Dale R.; William Rayner, Nigel; Perry, John R.B.; Toniolo, Daniela; Albrecht, Eva; Ang, Wei; Bandinelli, Stefania; Barbalic, Maja; Barroso, Inês; Beckmann, Jacques S.; Biffar, Reiner; Boomsma, Dorret; Campbell, Harry; Corre, Tanguy; Erdmann, Jeanette; Esko, Tõnu; Fischer, Krista; Franceschini, Nora; Frayling, Timothy M.; Girotto, Giorgia; Gonzalez, Juan R.; Harris, Tamara B.; Heath, Andrew C.; Heid, Iris M.; Hoffmann, Wolfgang; Hofman, Albert; Horikoshi, Momoko; Hua Zhao, Jing; Jackson, Anne U.; Hottenga, Jouke-Jan; Jula, Antti; Kähönen, Mika; Khaw, Kay-Tee; Kiemeney, Lambertus A.; Klopp, Norman; Kutalik, Zoltán; Lagou, Vasiliki; Launer, Lenore J.; Lehtimäki, Terho; Lemire, Mathieu; Lokki, Marja-Liisa; Loley, Christina; Luan, Jian'an; Mangino, Massimo; Mateo Leach, Irene; Medland, Sarah E.; Mihailov, Evelin; Montgomery, Grant W.; Navis, Gerjan; Newnham, John; Nieminen, Markku S.; Palotie, Aarno; Panoutsopoulou, Kalliope; Peters, Annette; Pirastu, Nicola; Polašek, Ozren; Rehnström, Karola; Ripatti, Samuli; Ritchie, Graham R.S.; Rivadeneira, Fernando; Robino, Antonietta; Samani, Nilesh J.; Shin, So-Youn; Sinisalo, Juha; Smit, Johannes H.; Soranzo, Nicole; Stolk, Lisette; Swinkels, Dorine W.; Tanaka, Toshiko; Teumer, Alexander; Tönjes, Anke; Traglia, Michela; Tuomilehto, Jaakko; Valsesia, Armand; van Gilst, Wiek H.; van Meurs, Joyce B.J.; Smith, Albert Vernon; Viikari, Jorma; Vink, Jacqueline M.; Waeber, Gerard; Warrington, Nicole M.; Widen, Elisabeth; Willemsen, Gonneke; Wright, Alan F.; Zanke, Brent W.; Zgaga, Lina; Boehnke, Michael; d'Adamo, Adamo Pio; de Geus, Eco; Demerath, Ellen W.; den Heijer, Martin; Eriksson, Johan G.; Ferrucci, Luigi; Gieger, Christian; Gudnason, Vilmundur; Hayward, Caroline; Hengstenberg, Christian; Hudson, Thomas J.; Järvelin, Marjo-Riitta; Kogevinas, Manolis; Loos, Ruth J.F.; Martin, Nicholas G.; Metspalu, Andres; Pennell, Craig E.; Penninx, Brenda W.; Perola, Markus; Raitakari, Olli; Salomaa, Veikko; Schreiber, Stefan; Schunkert, Heribert; Spector, Tim D.; Stumvoll, Michael; Uitterlinden, André G.; Ulivi, Sheila; van der Harst, Pim; Vollenweider, Peter; Völzke, Henry; Wareham, Nicholas J.; Wichmann, H.-Erich; Wilson, James F.; Rudan, Igor; Xue, Yali; Zeggini, Eleftheria
The male-to-female sex ratio at birth is constant across world populations with an average of 1.06 (106 male to 100 female live births) for populations of European descent. The sex ratio is considered to be affected by numerous biological and environmental factors and to have a heritable component. The aim of this study was to investigate the presence of common allele modest effects at autosomal and chromosome X variants that could explain the observed sex ratio at birth. We conducted a large-scale genome-wide association scan (GWAS) meta-analysis across 51 studies, comprising overall 114 863 individuals (61 094 women and 53 769 men) of European ancestry and 2 623 828 common (minor allele frequency >0.05) single-nucleotide polymorphisms (SNPs). Allele frequencies were compared between men and women for directly-typed and imputed variants within each study. Forward-time simulations for unlinked, neutral, autosomal, common loci were performed under the demographic model for European populations with a fixed sex ratio and a random mating scheme to assess the probability of detecting significant allele frequency differences. We do not detect any genome-wide significant (P < 5 × 10−8) common SNP differences between men and women in this well-powered meta-analysis. The simulated data provided results entirely consistent with these findings. This large-scale investigation across ∼115 000 individuals shows no detectable contribution from common genetic variants to the observed skew in the sex ratio. The absence of sex-specific differences is useful in guiding genetic association study design, for example when using mixed controls for sex-biased traits. PMID:22843499
Watson, Corey T; Roussos, Panos; Garg, Paras; Ho, Daniel J; Azam, Nidha; Katsel, Pavel L; Haroutunian, Vahram; Sharp, Andrew J
Alzheimer's disease affects ~13% of people in the United States 65 years and older, making it the most common neurodegenerative disorder. Recent work has identified roles for environmental, genetic, and epigenetic factors in Alzheimer's disease risk. We performed a genome-wide screen of DNA methylation using the Illumina Infinium HumanMethylation450 platform on bulk tissue samples from the superior temporal gyrus of patients with Alzheimer's disease and non-demented controls. We paired a sliding window approach with multivariate linear regression to characterize Alzheimer's disease-associated differentially methylated regions (DMRs). We identified 479 DMRs exhibiting a strong bias for hypermethylated changes, a subset of which were independently associated with aging. DMR intervals overlapped 475 RefSeq genes enriched for gene ontology categories with relevant roles in neuron function and development, as well as cellular metabolism, and included genes reported in Alzheimer's disease genome-wide and epigenome-wide association studies. DMRs were enriched for brain-specific histone signatures and for binding motifs of transcription factors with roles in the brain and Alzheimer's disease pathology. Notably, hypermethylated DMRs preferentially overlapped poised promoter regions, marked by H3K27me3 and H3K4me3, previously shown to co-localize with aging-associated hypermethylation. Finally, the integration of DMR-associated single nucleotide polymorphisms with Alzheimer's disease genome-wide association study risk loci and brain expression quantitative trait loci highlights multiple potential DMRs of interest for further functional analysis. We have characterized changes in DNA methylation in the superior temporal gyrus of patients with Alzheimer's disease, highlighting novel loci that facilitate better characterization of pathways and mechanisms underlying Alzheimer's disease pathogenesis, and improve our understanding of epigenetic signatures that may contribute to the
Berkhout, Ben; van Hemert, Formijn
We investigated the nucleotide composition of the RNA genome of the six human coronaviruses. Some general coronavirus characteristics were apparent (e.g. high U, low C count), but we also detected species-specific signatures. Most strikingly, the high U and low C proportions are quite variable and
DeWoody, J Andrew; Fernandez, Nadia B; Brüniche-Olsen, Anna; Antonides, Jennifer D; Doyle, Jacqueline M; San Miguel, Phillip; Westerman, Rick; Vertyankin, Vladimir V; Godard-Codding, Céline A J; Bickham, John W
Genetic and genomic approaches have much to offer in terms of ecology, evolution, and conservation. To better understand the biology of the gray whale Eschrichtius robustus (Lilljeborg, 1861), we sequenced the genome and produced an assembly that contains ∼95% of the genes known to be highly conserved among eukaryotes. From this assembly, we annotated 22,711 genes and identified 2,057,254 single-nucleotide polymorphisms (SNPs). Using this assembly, we generated a curated list of candidate genes potentially subject to strong natural selection, including genes associated with osmoregulation, oxygen binding and delivery, and other aspects of marine life. From these candidate genes, we queried 92 autosomal protein-coding markers with a panel of 96 SNPs that also included 2 sexing and 2 mitochondrial markers. Genotyping error rates, calculated across loci and across 69 intentional replicate samples, were low (0.021%), and observed heterozygosity was 0.33 averaged over all autosomal markers. This level of variability provides substantial discriminatory power across loci (mean probability of identity of 1.6 × 10 -25 and mean probability of exclusion >0.999 with neither parent known), indicating that these markers provide a powerful means to assess parentage and relatedness in gray whales. We found 29 unique multilocus genotypes represented among our 36 biopsies (indicating that we inadvertently sampled 7 whales twice). In total, we compiled an individual data set of 28 western gray whales (WGSs) and 1 presumptive eastern gray whale (EGW). The lone EGW we sampled was no more or less related to the WGWs than expected by chance alone. The gray whale genomes reported here will enable comparative studies of natural selection in cetaceans, and the SNP markers should be highly informative for future studies of gray whale evolution, population structure, demography, and relatedness.
Shiotani, Akiko; Murao, Takahisa; Fujita, Yoshihiko; Fujimura, Yoshinori; Sakakibara, Takashi; Nishio, Kazuto; Haruma, Ken
In our previous study, the SLCO1B1 521TT genotype and the SLCO1B1*1b haplotype were significantly associated with the risk of peptic ulcer in patients taking low-dose aspirin (LDA). The aim of the present study was to investigate pharmacogenomic profile of LDA-induced peptic ulcer and ulcer bleeding. Patients taking 100 mg of enteric-coated aspirin for cardiovascular diseases and with a peptic ulcer or ulcer bleeding and patients who also participated in endoscopic surveillance were studied. Genome-wide analysis of single nucleotide polymorphisms (SNPs) was performed using the Affymetrix DME Plus Premier Pack. SLCO1B1*1b haplotype and candidate genotypes of genes associated with ulcer bleeding or small bowel bleeding identified by genome-wide analysis were determined using TaqMan SNP Genotyping Assay kits, polymerase chain reaction-restriction fragment length polymorphism, and direct sequencing. Of 593 patients enrolled, 111 patients had a peptic ulcer and 45 had ulcer bleeding. The frequencies of the SLCO1B1*1b haplotype and CHST2 2082 T allele were significantly greater in patients with peptic ulcer and ulcer bleeding compared to the controls. After adjustment for significant factors, the SLCO1B1*1b haplotype was associated with peptic ulcer (OR 2.20, 95% CI 1.24-3.89) and CHST2 2082 T allele with ulcer bleeding (2.57, 1.07-6.17). The CHST2 2082 T allele as well as SLCO1B1*1b haplotype may identify patients at increased risk for aspirin-induced peptic ulcer or ulcer bleeding. © 2014 Journal of Gastroenterology and Hepatology Foundation and Wiley Publishing Asia Pty Ltd.
Sahakyan, Aleksandr B; Balasubramanian, Shankar
Accurate knowledge of the core components of substitution rates is of vital importance to understand genome evolution and dynamics. By performing a single-genome and direct analysis of 39,894 retrotransposon remnants, we reveal sequence context-dependent germline nucleotide substitution rates for the human genome. The rates are characterised through rate constants in a time-domain, and are made available through a dedicated program (Trek) and a stand-alone database. Due to the nature of the method design and the imposed stringency criteria, we expect our rate constants to be good estimates for the rates of spontaneous mutations. Benefiting from such data, we study the short-range nucleotide (up to 7-mer) organisation and the germline basal substitution propensity (BSP) profile of the human genome; characterise novel, CpG-independent, substitution prone and resistant motifs; confirm a decreased tendency of moieties with low BSP to undergo somatic mutations in a number of cancer types; and, produce a Trek-based estimate of the overall mutation rate in human. The extended set of rate constants we report may enrich our resources and help advance our understanding of genome dynamics and evolution, with possible implications for the role of spontaneous mutations in the emergence of pathological genotypes and neutral evolution of proteomes.
Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong
Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R 2 > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.
Bigdeli, Tim B.; Ripke, Stephan; Bacanu, Silviu-Alin; Lee, Sang Hong; Wray, Naomi R.; Gejman, Pablo V.; Rietschel, Marcella; Cichon, Sven; St Clair, David; Corvin, Aiden; Kirov, George; McQuillin, Andrew; Gurling, Hugh; Rujescu, Dan; Andreassen, Ole A.; Werge, Thomas; Blackwood, Douglas H.R.; Pato, Carlos N.; Pato, Michele T.; Malhotra, Anil K.; O’Donovan, Michael C.; Kendler, Kenneth S.; Fanous, Ayman H.
Genome-wide association studies (GWAS) of schizophrenia have yielded more than 100 common susceptibility variants, and strongly support a substantial polygenic contribution of a large number of small allelic effects. It has been hypothesized that familial schizophrenia is largely a consequence of inherited rather than environmental factors. We investigated the extent to which familiality of schizophrenia is associated with enrichment for common risk variants detectable in a large GWAS. We analyzed single nucleotide polymorphism (SNP) data for cases reporting a family history of psychotic illness (N = 978), cases reporting no such family history (N = 4,503), and unscreened controls (N = 8,285) from the Psychiatric Genomics Consortium (PGC1) study of schizophrenia. We used a multinomial logistic regression approach with model-fitting to detect allelic effects specific to either family history subgroup. We also considered a polygenic model, in which we tested whether family history positive subjects carried more schizophrenia risk alleles than family history negative subjects, on average. Several individual SNPs attained suggestive but not genome-wide significant association with either family history subgroup. Comparison of genome-wide polygenic risk scores based on GWAS summary statistics indicated a significant enrichment for SNP effects among family history positive compared to family history negative cases (Nagelkerke’s R2 = 0.0021; P = 0.00331; P-value threshold history positive compared to family history negative cases (0.32 and 0.22, respectively; P = 0.031).We found suggestive evidence of allelic effects detectable in large GWAS of schizophrenia that might be specific to particular family history subgroups. However, consideration of a polygenic risk score indicated a significant enrichment among family history positive cases for common allelic effects. Familial illness might, therefore, represent a more heritable form of schizophrenia, as suggested by
Hossain, Zenat Zebin; Leekitcharoenphon, Pimlapas; Dalsgaard, Anders
patients was co-infected with two V. cholerae strains (VC-1 and VC-3). Major virulence factors, biotype and antimicrobial resistance genes were identified by WGS. A global phylogenetic tree was inferred using genome wide SNPs (Single Nucleotide Polymorphism) analysis. RESULTS: All the V. cholerae strains...
Pain, Oliver; Dudbridge, Frank; Cardno, Alastair G; Freeman, Daniel; Lu, Yi; Lundstrom, Sebastian; Lichtenstein, Paul; Ronald, Angelica
This study aimed to test for overlap in genetic influences between psychotic-like experience traits shown by adolescents in the community, and clinically-recognized psychiatric disorders in adulthood, specifically schizophrenia, bipolar disorder, and major depression. The full spectra of psychotic-like experience domains, both in terms of their severity and type (positive, cognitive, and negative), were assessed using self- and parent-ratings in three European community samples aged 15-19 years (Final N incl. siblings = 6,297-10,098). A mega-genome-wide association study (mega-GWAS) for each psychotic-like experience domain was performed. Single nucleotide polymorphism (SNP)-heritability of each psychotic-like experience domain was estimated using genomic-relatedness-based restricted maximum-likelihood (GREML) and linkage disequilibrium- (LD-) score regression. Genetic overlap between specific psychotic-like experience domains and schizophrenia, bipolar disorder, and major depression was assessed using polygenic risk score (PRS) and LD-score regression. GREML returned SNP-heritability estimates of 3-9% for psychotic-like experience trait domains, with higher estimates for less skewed traits (Anhedonia, Cognitive Disorganization) than for more skewed traits (Paranoia and Hallucinations, Parent-rated Negative Symptoms). Mega-GWAS analysis identified one genome-wide significant association for Anhedonia within IDO2 but which did not replicate in an independent sample. PRS analysis revealed that the schizophrenia PRS significantly predicted all adolescent psychotic-like experience trait domains (Paranoia and Hallucinations only in non-zero scorers). The major depression PRS significantly predicted Anhedonia and Parent-rated Negative Symptoms in adolescence. Psychotic-like experiences during adolescence in the community show additive genetic effects and partly share genetic influences with clinically-recognized psychiatric disorders, specifically schizophrenia and
Hong, Joon Ki; Jeong, Yong Dae; Cho, Eun Seok; Choi, Tae Jeong; Kim, Yong Min; Cho, Kyu Ho; Lee, Jae Bong; Lim, Hyun Tae; Lee, Deuk Hwan
The genetic effects of an individual on the phenotypes of its social partners, such as its pen mates, are known as social genetic effects. This study aims to identify the candidate genes for social (pen-mates') average daily gain (ADG) in pigs by using the genome-wide association approach. Social ADG (sADG) was the average ADG of unrelated pen-mates (strangers). We used the phenotype data (16,802 records) after correcting for batch (week), sex, pen, number of strangers (1 to 7 pigs) in the pen, full-sib rate (0% to 80%) within pen, and age at the end of the test. A total of 1,041 pigs from Landrace breeds were genotyped using the Illumina PorcineSNP60 v2 BeadChip panel, which comprised 61,565 single nucleotide polymorphism (SNP) markers. After quality control, 909 individuals and 39,837 markers remained for sADG in genome-wide association study. We detected five new SNPs, all on chromosome 6, which have not been associated with social ADG or other growth traits to date. One SNP was inside the prostaglandin F2α receptor ( PTGFR ) gene, another SNP was located 22 kb upstream of gene interferon-induced protein 44 ( IFI44 ), and the last three SNPs were between 161 kb and 191 kb upstream of the EGF latrophilin and seven transmembrane domain-containing protein 1 ( ELTD1 ) gene. PTGFR, IFI44, and ELTD1 were never associated with social interaction and social genetic effects in any of the previous studies. The identification of several genomic regions, and candidate genes associated with social genetic effects reported here, could contribute to a better understanding of the genetic basis of interaction traits for ADG. In conclusion, we suggest that the PTGFR, IFI44, and ELTD1 may be used as a molecular marker for sADG, although their functional effect was not defined yet. Thus, it will be of interest to execute association studies in those genes.
Kim, Tae-Sung; He, Qiang; Kim, Kyu-Won; Yoon, Min-Young; Ra, Won-Hee; Li, Feng Peng; Tong, Wei; Yu, Jie; Oo, Win Htet; Choi, Buung; Heo, Eun-Beom; Yun, Byoung-Kook; Kwon, Soon-Jae; Kwon, Soon-Wook; Cho, Yoo-Hyun; Lee, Chang-Yong; Park, Beom-Seok; Park, Yong-Jin
Rice germplasm collections continue to grow in number and size around the world. Since maintaining and screening such massive resources remains challenging, it is important to establish practical methods to manage them. A core collection, by definition, refers to a subset of the entire population that preserves the majority of genetic diversity, enhancing the efficiency of germplasm utilization. Here, we report whole-genome resequencing of the 137 rice mini core collection or Korean rice core set (KRICE_CORE) that represents 25,604 rice germplasms deposited in the Korean genebank of the Rural Development Administration (RDA). We implemented the Illumina HiSeq 2000 and 2500 platform to produce short reads and then assembled those with 9.8 depths using Nipponbare as a reference. Comparisons of the sequences with the reference genome yielded more than 15 million (M) single nucleotide polymorphisms (SNPs) and 1.3 M INDELs. Phylogenetic and population analyses using 2,046,529 high-quality SNPs successfully assigned rice accessions to the relevant rice subgroups, suggesting that these SNPs capture evolutionary signatures that have accumulated in rice subpopulations. Furthermore, genome-wide association studies (GWAS) for four exemplary agronomic traits in the KRIC_CORE manifest the utility of KRICE_CORE; that is, identifying previously defined genes or novel genetic factors that potentially regulate important phenotypes. This study provides strong evidence that the size of KRICE_CORE is small but contains high genetic and functional diversity across the genome. Thus, our resequencing results will be useful for future breeding, as well as functional and evolutionary studies, in the post-genomic era.
Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D
Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas-70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Ye Seul Bae
Full Text Available Osteoporosis is a medical condition of global concern, with increasing incidence in both sexes. Bone mineral density (BMD, a highly heritable trait, has been proven a useful diagnostic factor in predicting fracture. Because medical information is lacking about male osteoporotic genetics, we conducted a genome-wide association study of BMD in Korean men. With 1,176 participants, we analyzed 4,414,664 single nucleotide polymorphisms (SNPs after genomic imputation, and identified five SNPs and three loci correlated with bone density and strength. Multivariate linear regression models were applied to adjust for age and body mass index interference. Rs17124500 (p = 6.42 × 10-7, rs34594869 (p = 6.53 × 10-7 and rs17124504 (p = 6.53 × 10-7 in 14q31.3 and rs140155614 (p = 8.64 × 10-7 in 15q25.1 were significantly associated with lumbar spine BMD (LS-BMD, while rs111822233 (p = 6.35 × 10-7 was linked with the femur total BMD (FT-BMD. Additionally, we analyzed the relationship between BMD and five genes previously identified in Korean men. Rs61382873 (p = 0.0009 in LRP5, rs9567003 (p = 0.0033 in TNFSF11 and rs9935828 (p = 0.0248 in FOXL1 were observed for LS-BMD. Furthermore, rs33997547 (p = 0.0057 in ZBTB and rs1664496 (p = 0.0012 in MEF2C were found to influence FT-BMD and rs61769193 (p = 0.0114 in ZBTB to influence femur neck BMD. We identified five SNPs and three genomic regions, associated with BMD. The significance of our results lies in the discovery of new loci, while also affirming a previously significant locus, as potential osteoporotic factors in the Korean male population.
Full Text Available Plasma fibrinogen is an acute phase protein playing an important role in the blood coagulation cascade having strong associations with smoking, alcohol consumption and body mass index (BMI. Genome-wide association studies (GWAS have identified a variety of gene regions associated with elevated plasma fibrinogen concentrations. However, little is yet known about how associations between environmental factors and fibrinogen might be modified by genetic variation. Therefore, we conducted large-scale meta-analyses of genome-wide interaction studies to identify possible interactions of genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentration. The present study included 80,607 subjects of European ancestry from 22 studies. Genome-wide interaction analyses were performed separately in each study for about 2.6 million single nucleotide polymorphisms (SNPs across the 22 autosomal chromosomes. For each SNP and risk factor, we performed a linear regression under an additive genetic model including an interaction term between SNP and risk factor. Interaction estimates were meta-analysed using a fixed-effects model. No genome-wide significant interaction with smoking status, alcohol consumption or BMI was observed in the meta-analyses. The most suggestive interaction was found for smoking and rs10519203, located in the LOC123688 region on chromosome 15, with a p value of 6.2 × 10(-8. This large genome-wide interaction study including 80,607 participants found no strong evidence of interaction between genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentrations. Further studies are needed to yield deeper insight in the interplay between environmental factors and gene variants on the regulation of fibrinogen concentrations.
A single nucleotide polymorphism (SNP) assay for population stratification test ... phenotypes and unlinked candidate loci in case-control and cohort studies of ... Key words: Chinese, Japanese, population stratification, ancestry informative ...
Full Text Available BACKGROUND: The rs12807809 single-nucleotide polymorphism in NRGN is a genetic risk variant with genome-wide significance for schizophrenia. The frequency of the T allele of rs12807809 is higher in individuals with schizophrenia than in those without the disorder. Reduced immunoreactivity of NRGN, which is expressed exclusively in the brain, has been observed in Brodmann areas (BA 9 and 32 of the prefrontal cortex in postmortem brains from patients with schizophrenia compared with those in controls. METHODS: Genotype effects of rs12807809 were investigated on gray matter (GM and white matter (WM volumes using magnetic resonance imaging (MRI with a voxel-based morphometry (VBM technique in a sample of 99 Japanese patients with schizophrenia and 263 healthy controls. RESULTS: Although significant genotype-diagnosis interaction either on GM or WM volume was not observed, there was a trend of genotype-diagnosis interaction on GM volume in the left anterior cingulate cortex (ACC. Thus, the effects of NRGN genotype on GM volume of patients with schizophrenia and healthy controls were separately investigated. In patients with schizophrenia, carriers of the risk T allele had a smaller GM volume in the left ACC (BA32 than did carriers of the non-risk C allele. Significant genotype effect on other regions of the GM or WM was not observed for either the patients or controls. CONCLUSIONS: Our findings suggest that the genome-wide associated genetic risk variant in the NRGN gene may be related to a small GM volume in the ACC in the left hemisphere in patients with schizophrenia.
Preimplantation genetic diagnosis (PGD) aims to help couples with heritable genetic disorders to avoid the birth of diseased offspring or the recurrence of loss of conception. Following in vitro fertilization, one or a few cells are biopsied from each human preimplantation embryo for genetic testing, allowing diagnosis and selection of healthy embryos for uterine transfer. Although classical methods, including single-cell PCR and fluorescent in situ hybridization, enable PGD for many genetic disorders, they have limitations. They often require family-specific designs and can be labor intensive, resulting in long waiting lists. Furthermore, certain types of genetic anomalies are not easy to diagnose using these classical approaches, and healthy offspring carrying the parental mutant allele(s) can result. Recently, state-of-the-art methods for single-cell genomics have flourished, which may overcome the limitations associated with classical PGD, and these underpin the development of generic assays for PGD that enable selection of embryos not only for the familial genetic disorder in question, but also for various other genetic aberrations and traits at once. Here, we discuss the latest single-cell genomics methodologies based on DNA microarrays, single-nucleotide polymorphism arrays or next-generation sequence analysis. We focus on their strengths, their validation status, their weaknesses and the challenges for implementing them in PGD. PMID:23998893
Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu
Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. http://www.maizegenetics.net/GAPIT. email@example.com Supplementary data are available at Bioinformatics online.
Ben J Hayes
Full Text Available Continued production of food in areas predicted to be most affected by climate change, such as dairy farming regions of Australia, will be a major challenge in coming decades. Along with rising temperatures and water shortages, scarcity of inputs such as high energy feeds is predicted. With the motivation of selecting cattle adapted to these changing environments, we conducted a genome wide association study to detect DNA markers (single nucleotide polymorphisms associated with the sensitivity of milk production to environmental conditions. To do this we combined historical milk production and weather records with dense marker genotypes on dairy sires with many daughters milking across a wide range of production environments in Australia. Markers associated with sensitivity of milk production to feeding level and sensitivity of milk production to temperature humidity index on chromosome nine and twenty nine respectively were validated in two independent populations, one a different breed of cattle. As the extent of linkage disequilibrium across cattle breeds is limited, the underlying causative mutations have been mapped to a small genomic interval containing two promising candidate genes. The validated marker panels we have reported here will aid selection for high milk production under anticipated climate change scenarios, for example selection of sires whose daughters will be most productive at low levels of feeding.
P. J. Maughan
Full Text Available Quinoa ( Willd. is an important seed crop throughout the Andean region of South America. It is important as a regional food security crop for millions of impoverished rural inhabitants of the Andean Altiplano (high plains. Efforts to improve the crop have led to an increased focus on genetic research. We report the identification of 14,178 putative single nucleotide polymorphisms (SNPs using a genomic reduction protocol as well as the development of 511 functional SNP assays. The SNP assays are based on KASPar genotyping chemistry and were detected using the Fluidigm dynamic array platform. A diversity screen of 113 quinoa accessions showed that the minor allele frequency (MAF of the SNPs ranged from 0.02 to 0.50, with an average MAF of 0.28. Structure analysis of the quinoa diversity panel uncovered the two major subgroups corresponding to the Andean and coastal quinoa ecotypes. Linkage mapping of the SNPs in two recombinant inbred line populations produced an integrated linkage map consisting of 29 linkage groups with 20 large linkage groups, spanning 1404 cM with a marker density of 3.1 cM per SNP marker. The SNPs identified here represent important genomic tools needed in emerging plant breeding programs for advanced genetic analysis of agronomic traits in quinoa.
de Moor, Marleen H.M.; van den Berg, Stéphanie M.; Verweij, Karin J.H.; Krueger, Robert F.; Luciano, Michelle; Vasquez, Alejandro Arias; Matteson, Lindsay K.; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D.; Hansell, Narelle K.; Hart, Amy B.; Seppälä, Ilkka; Huffman, Jennifer E.; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abecasis, Goncalo R.; Adkins, Daniel E.; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B.; Busonero, Fabio; Campbell, Harry; Costa, Paul T.; Smith, George Davey; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E.; Eriksson, Johan G.; Fedko, Iryna O.; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M.; Heath, Andrew C.; Heinonen, Kati; Henders, Anjali K.; Homuth, Georg; Hottenga, Jouke-Jan; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P.; Kirkpatrick, Matthew G.; Latvala, Antti; Lehtimäki, Terho; Liewald, David C.; Madden, Pamela A.F.; Magri, Chiara; Magnusson, Patrik K.E.; Marten, Jonathan; Maschio, Andrea; Medland, Sarah E.; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W.; Nauck, Matthias; Ouwens, Klaasjan G.; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T.; Realo, Anu; Rose, Richard J.; Ruggiero, Daniela; Schmidt, Carsten O.; Slutske, Wendy S.; Sorice, Rossella; Starr, John M.; Pourcain, Beate St; Sutin, Angelina R.; Timpson, Nicholas J.; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J.; Zgaga, Lina; Scotland, Generation; Porteous, David; Minelli, Alessandra; Palmer, Abraham A.; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J.; Räikkönen, Katri; Wilson, James F.; Keltikangas-Järvinen, Liisa; Bierut, Laura J.; Hettema, John M.; Grabe, Hans J.; van Duijn, Cornelia M.; Evans, David M.; Schlessinger, David; Pedersen, Nancy L.; Terracciano, Antonio; McGue, Matt; Penninx, Brenda W.J.H.; Martin, Nicholas G.; Boomsma, Dorret I.
Importance Neuroticism is a personality trait that is briefly defined by emotional instability. It is a robust genetic risk factor for Major Depressive Disorder (MDD) and other psychiatric disorders. Hence, neuroticism is an important phenotype for psychiatric genetics. The Genetics of Personality Consortium (GPC) has created a resource for genome-wide association analyses of personality traits in over 63,000 participants (including MDD cases). Objective To identify genetic variants associated with neuroticism by performing a meta-analysis of genome-wide association (GWA) results based on 1000Genomes imputation, to evaluate if common genetic variants as assessed by Single Nucleotide Polymorphisms (SNPs) explain variation in neuroticism by estimating SNP-based heritability, and to examine whether SNPs that predict neuroticism also predict MDD. Setting 30 cohorts with genome-wide genotype, personality and MDD data from the GPC. Participants The study included 63,661 participants from 29 discovery cohorts and 9,786 participants from a replication cohort. Participants came from Europe, the United States or Australia. Main outcome measure(s) Neuroticism scores harmonized across all cohorts by Item Response Theory (IRT) analysis, and clinically assessed MDD case-control status. Results A genome-wide significant SNP was found in the MAGI1 gene (rs35855737; P=9.26 × 10−9 in the discovery meta-analysis, and P=2.38 × 10−8 in the meta-analysis of all 30 cohorts). Common genetic variants explain 15% of the variance in neuroticism. Polygenic scores based on the meta-analysis of neuroticism in 27 of the discovery cohorts significantly predicted neuroticism in 2 independent cohorts. Importantly, polygenic scores also predicted MDD in these cohorts. Conclusions and relevance This study identifies a novel locus for neuroticism. The variant is located in a known gene that has been associated with bipolar disorder and schizophrenia in previous studies. In addition, the study
Ferrari, Raffaele; Hernandez, Dena G; Nalls, Michael A; Rohrer, Jonathan D; Ramasamy, Adaikalavan; Kwok, John B J; Dobson-Stone, Carol; Brooks, William S; Schofield, Peter R; Halliday, Glenda M; Hodges, John R; Piguet, Olivier; Bartley, Lauren; Thompson, Elizabeth; Haan, Eric; Hernández, Isabel; Ruiz, Agustín; Boada, Mercè; Borroni, Barbara; Padovani, Alessandro; Cruchaga, Carlos; Cairns, Nigel J; Benussi, Luisa; Binetti, Giuliano; Ghidoni, Roberta; Forloni, Gianluigi; Galimberti, Daniela; Fenoglio, Chiara; Serpente, Maria; Scarpini, Elio; Clarimón, Jordi; Lleó, Alberto; Blesa, Rafael; Waldö, Maria Landqvist; Nilsson, Karin; Nilsson, Christer; Mackenzie, Ian R A; Hsiung, Ging-Yuek R; Mann, David M A; Grafman, Jordan; Morris, Christopher M; Attems, Johannes; Griffiths, Timothy D; McKeith, Ian G; Thomas, Alan J; Pietrini, P; Huey, Edward D; Wassermann, Eric M; Baborie, Atik; Jaros, Evelyn; Tierney, Michael C; Pastor, Pau; Razquin, Cristina; Ortega-Cubero, Sara; Alonso, Elena; Perneczky, Robert; Diehl-Schmid, Janine; Alexopoulos, Panagiotis; Kurz, Alexander; Rainero, Innocenzo; Rubino, Elisa; Pinessi, Lorenzo; Rogaeva, Ekaterina; George-Hyslop, Peter St; Rossi, Giacomina; Tagliavini, Fabrizio; Giaccone, Giorgio; Rowe, James B; Schlachetzki, J C M; Uphill, James; Collinge, John; Mead, S; Danek, Adrian; Van Deerlin, Vivianna M; Grossman, Murray; Trojanowsk, John Q; van der Zee, Julie; Deschamps, William; Van Langenhove, Tim; Cruts, Marc; Van Broeckhoven, Christine; Cappa, Stefano F; Le Ber, Isabelle; Hannequin, Didier; Golfier, Véronique; Vercelletto, Martine; Brice, Alexis; Nacmias, Benedetta; Sorbi, Sandro; Bagnoli, Silvia; Piaceri, Irene; Nielsen, Jørgen E; Hjermind, Lena E; Riemenschneider, Matthias; Mayhaus, Manuel; Ibach, Bernd; Gasparoni, Gilles; Pichler, Sabrina; Gu, Wei; Rossor, Martin N; Fox, Nick C; Warren, Jason D; Spillantini, Maria Grazia; Morris, Huw R; Rizzu, Patrizia; Heutink, Peter; Snowden, Julie S; Rollinson, Sara; Richardson, Anna; Gerhard, Alexander; Bruni, Amalia C; Maletta, Raffaele; Frangipane, Francesca; Cupidi, Chiara; Bernardi, Livia; Anfossi, Maria; Gallo, Maura; Conidi, Maria Elena; Smirne, Nicoletta; Rademakers, Rosa; Baker, Matt; Dickson, Dennis W; Graff-Radford, Neill R; Petersen, Ronald C; Knopman, David; Josephs, Keith A; Boeve, Bradley F; Parisi, Joseph E; Seeley, William W; Miller, Bruce L; Karydas, Anna M; Rosen, Howard; van Swieten, John C; Dopper, Elise G P; Seelaar, Harro; Pijnenburg, Yolande AL; Scheltens, Philip; Logroscino, Giancarlo; Capozzo, Rosa; Novelli, Valeria; Puca, Annibale A; Franceschi, M; Postiglione, Alfredo; Milan, Graziella; Sorrentino, Paolo; Kristiansen, Mark; Chiang, Huei-Hsin; Graff, Caroline; Pasquier, Florence; Rollin, Adeline; Deramecourt, Vincent; Lebert, Florence; Kapogiannis, Dimitrios; Ferrucci, Luigi; Pickering-Brown, Stuart; Singleton, Andrew B; Hardy, John; Momeni, Parastoo
Summary Background Frontotemporal dementia (FTD) is a complex disorder characterised by a broad range of clinical manifestations, differential pathological signatures, and genetic variability. Mutations in three genes—MAPT, GRN, and C9orf72—have been associated with FTD. We sought to identify novel genetic risk loci associated with the disorder. Methods We did a two-stage genome-wide association study on clinical FTD, analysing samples from 3526 patients with FTD and 9402 healthy controls. All participants had European ancestry. In the discovery phase (samples from 2154 patients with FTD and 4308 controls), we did separate association analyses for each FTD subtype (behavioural variant FTD, semantic dementia, progressive non-fluent aphasia, and FTD overlapping with motor neuron disease [FTD-MND]), followed by a meta-analysis of the entire dataset. We carried forward replication of the novel suggestive loci in an independent sample series (samples from 1372 patients and 5094 controls) and then did joint phase and brain expression and methylation quantitative trait loci analyses for the associated (p<5 × 10−8) and suggestive single-nucleotide polymorphisms. Findings We identified novel associations exceeding the genome-wide significance threshold (p<5 × 10−8) that encompassed the HLA locus at 6p21.3 in the entire cohort. We also identified a potential novel locus at 11q14, encompassing RAB38/CTSC, for the behavioural FTD subtype. Analysis of expression and methylation quantitative trait loci data suggested that these loci might affect expression and methylation incis. Interpretation Our findings suggest that immune system processes (link to 6p21.3) and possibly lysosomal and autophagy pathways (link to 11q14) are potentially involved in FTD. Our findings need to be replicated to better define the association of the newly identified loci with disease and possibly to shed light on the pathomechanisms contributing to FTD. Funding The National Institute of
Full Text Available Single nucleotide polymorphism (SNP markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean ( L. by comparing sequences from coding and noncoding regions obtained from the GenBank and genomic DNA and to compare sequencing results with those obtained using single base extension (SBE assays on the Luminex-100 system for use in high-throughput germplasm evaluation. We assessed the frequency of SNPs in 47 fragments of common bean DNA, using SBE as the evaluation methodology. We conducted a sequence analysis of 10 genotypes of cultivated and wild beans belonging to the Mesoamerican and Andean genetic pools of . For the 10 genotypes evaluated, a total of 20,964 bp of sequence were analyzed in each genotype and compared, resulting in the discovery of 239 SNPs and 133 InDels, giving an average SNP frequency of one per 88 bp and an InDel frequency of one per 157 bp. This is the equivalent of a nucleotide diversity (θ of 6.27 × 10. Comparisons with the SNP genotypes previously obtained by direct sequencing showed that the SBE assays on the Luminex-100 were accurate, with 2.5% being miscalled and 1% showing no signal. These results indicate that the Luminex-100 provides a high-throughput system that can be used to analyze SNPs in large samples of genotypes both for purposes of assessing diversity and also for mapping studies.
García-Sanz, Ramón; Corchete, Luis Antonio; Alcoceba, Miguel; Chillon, María Carmen; Jiménez, Cristina; Prieto, Isabel; García-Álvarez, María; Puig, Noemi; Rapado, Immaculada; Barrio, Santiago; Oriol, Albert; Blanchard, María Jesús; de la Rubia, Javier; Martínez, Rafael; Lahuerta, Juan José; González Díaz, Marcos; Mateos, María Victoria; San Miguel, Jesús Fernando; Martínez-López, Joaquín; Sarasquete, María Eugenia
Bortezomib- and thalidomide-based therapies have significantly contributed to improved survival of multiple myeloma (MM) patients. However, treatment-induced peripheral neuropathy (TiPN) is a common adverse event associated with them. Risk factors for TiPN in MM patients include advanced age, prior neuropathy, and other drugs, but there are conflicting results about the role of genetics in predicting the risk of TiPN. Thus, we carried out a genome-wide association study based on more than 300 000 exome single nucleotide polymorphisms in 172 MM patients receiving therapy involving bortezomib and thalidomide. We compared patients developing and not developing TiPN under similar treatment conditions (GEM05MAS65, NCT00443235). The highest-ranking single nucleotide polymorphism was rs45443101, located in the PLCG2 gene, but no significant differences were found after multiple comparison correction (adjusted P = .1708). Prediction analyses, cytoband enrichment, and pathway analyses were also performed, but none yielded any significant findings. A copy number approach was also explored, but this gave no significant results either. In summary, our study did not find a consistent genetic component associated with TiPN under bortezomib and thalidomide therapies that could be used for prediction, which makes clinical judgment essential in the practical management of MM treatment. Copyright © 2016 John Wiley & Sons, Ltd.
Mulder, H A; Crump, R E; Calus, M P L; Veerkamp, R F
In recent years, it has been shown that not only is the phenotype under genetic control, but also the environmental variance. Very little, however, is known about the genetic architecture of environmental variance. The main objective of this study was to unravel the genetic architecture of the mean and environmental variance of somatic cell score (SCS) by identifying genome-wide associations for mean and environmental variance of SCS in dairy cows and by quantifying the accuracy of genome-wide breeding values. Somatic cell score was used because previous research has shown that the environmental variance of SCS is partly under genetic control and reduction of the variance of SCS by selection is desirable. In this study, we used 37,590 single nucleotide polymorphism (SNP) genotypes and 46,353 test-day records of 1,642 cows at experimental research farms in 4 countries in Europe. We used a genomic relationship matrix in a double hierarchical generalized linear model to estimate genome-wide breeding values and genetic parameters. The estimated mean and environmental variance per cow was used in a Bayesian multi-locus model to identify SNP associated with either the mean or the environmental variance of SCS. Based on the obtained accuracy of genome-wide breeding values, 985 and 541 independent chromosome segments affecting the mean and environmental variance of SCS, respectively, were identified. Using a genomic relationship matrix increased the accuracy of breeding values relative to using a pedigree relationship matrix. In total, 43 SNP were significantly associated with either the mean (22) or the environmental variance of SCS (21). The SNP with the highest Bayes factor was on chromosome 9 (Hapmap31053-BTA-111664) explaining approximately 3% of the genetic variance of the environmental variance of SCS. Other significant SNP explained less than 1% of the genetic variance. It can be concluded that fewer genomic regions affect the environmental variance of SCS than the
McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H
Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichilet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.
Dana B Hancock
Full Text Available Genome-wide association studies have identified numerous genetic loci for spirometic measures of pulmonary function, forced expiratory volume in one second (FEV(1, and its ratio to forced vital capacity (FEV(1/FVC. Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA of single nucleotide polymorphism (SNP and SNP-by-smoking (ever-smoking or pack-years associations on FEV(1 and FEV(1/FVC across 19 studies (total N = 50,047. We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest P(JMA = 5.00×10(-11, HLA-DQB1 and HLA-DQA2 (smallest P(JMA = 4.35×10(-9, and KCNJ2 and SOX9 (smallest P(JMA = 1.28×10(-8 were associated with FEV(1/FVC or FEV(1 in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects.
Full Text Available The effect of the cellular reprogramming process per se on mutation load remains unclear. To address this issue, we performed whole exome sequencing analysis of induced pluripotent stem cells (iPSCs reprogrammed from human cord blood (CB CD34(+ cells. Cells from a single donor and improved lentiviral vectors for high-efficiency (2-14% reprogramming were used to examine the effects of three different combinations of reprogramming factors: OCT4 and SOX2 (OS, OS and ZSCAN4 (OSZ, OS and MYC and KLF4 (OSMK. Five clones from each group were subject to whole exome sequencing analysis. We identified 14, 11, and 9 single nucleotide variations (SNVs, in exomes, including untranslated regions (UTR, in the five clones of OSMK, OS, and OSZ iPSC lines. Only 8, 7, and 4 of these, respectively, were protein-coding mutations. An average of 1.3 coding mutations per CB iPSC line is remarkably lower than previous studies using fibroblasts and low-efficiency reprogramming approaches. These data demonstrate that point nucleotide mutations during cord blood reprogramming are negligible and that the inclusion of genome stabilizers like ZSCAN4 during reprogramming may further decrease reprogramming-associated mutations. Our findings provide evidence that CB is a superior source of cells for iPSC banking.
Full Text Available The primary objective of this study was to create a genome-wide high resolution map (i.e., >100 bp of 'rearrangement hotspots' which can facilitate the identification of regions capable of mediating de novo deletions or duplications in humans. A hierarchical method was employed to fragment segmental duplications (SDs into multiple smaller SD units. Combining an end space free pairwise alignment algorithm with a 'seed and extend' approach, we have exhaustively searched 409 million alignments to detect complex structural rearrangements within the reference-guided assembly of the NA18507 human genome (18× coverage, including the previously identified novel 4.8 Mb sequence from de novo assembly within this genome. We have identified 1,963 rearrangement hotspots within SDs which encompass 166 genes and display an enrichment of duplicated gene nucleotide variants (DNVs. These regions are correlated with increased non-allelic homologous recombination (NAHR event frequency which presumably represents the origin of copy number variations (CNVs and pathogenic duplications/deletions. Analysis revealed that 20% of the detected hotspots are clustered within the proximal and distal SD breakpoints flanked by the pathogenic deletions/duplications that have been mapped for 24 NAHR-mediated genomic disorders. FISH Validation of selected complex regions revealed 94% concordance with in silico localization of the highly homologous derivatives. Other results from this study indicate that intra-chromosomal recombination is enhanced in genic compared with agenic duplicated regions, and that gene desert regions comprising SDs may represent reservoirs for creation of novel genes. The generation of genome-wide signatures of 'rearrangement hotspots', which likely serve as templates for NAHR, may provide a powerful approach towards understanding the underlying mutational mechanism(s for development of constitutional and acquired diseases.
Nguyen, Thanh-Tung; Huang, Joshua; Wu, Qingyao; Nguyen, Thuy; Li, Mark
Single-nucleotide polymorphisms (SNPs) selection and identification are the most important tasks in Genome-wide association data analysis. The problem is difficult because genome-wide association data is very high dimensional and a large portion of SNPs in the data is irrelevant to the disease. Advanced machine learning methods have been successfully used in Genome-wide association studies (GWAS) for identification of genetic variants that have relatively big effects in some common, complex diseases. Among them, the most successful one is Random Forests (RF). Despite of performing well in terms of prediction accuracy in some data sets with moderate size, RF still suffers from working in GWAS for selecting informative SNPs and building accurate prediction models. In this paper, we propose to use a new two-stage quality-based sampling method in random forests, named ts-RF, for SNP subspace selection for GWAS. The method first applies p-value assessment to find a cut-off point that separates informative and irrelevant SNPs in two groups. The informative SNPs group is further divided into two sub-groups: highly informative and weak informative SNPs. When sampling the SNP subspace for building trees for the forest, only those SNPs from the two sub-groups are taken into account. The feature subspaces always contain highly informative SNPs when used to split a node at a tree. This approach enables one to generate more accurate trees with a lower prediction error, meanwhile possibly avoiding overfitting. It allows one to detect interactions of multiple SNPs with the diseases, and to reduce the dimensionality and the amount of Genome-wide association data needed for learning the RF model. Extensive experiments on two genome-wide SNP data sets (Parkinson case-control data comprised of 408,803 SNPs and Alzheimer case-control data comprised of 380,157 SNPs) and 10 gene data sets have demonstrated that the proposed model significantly reduced prediction errors and outperformed
Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid B; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Kao, W H Linda; Fox, Caroline S; Köttgen, Anna
In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphism (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10(-9)) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10(-4)-2.2 × 10(-7). Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.
The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated to the ε4 allele in the gene APOE, and more recently to variants in more than two-dozen additional genes identified in the large-scale genome-wide association studies (GWAS) and meta-analyses reports. Taken together however, although the heritability in AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98·10−08), rs1347297 in the gene OSBPL6 (P-value = 4.53·10−08), and rs1513625 near PDCL3 (P-value = 4.28·10−08). In addition, rs72953347 in OSBPL6 (P-value = 6.36·10−07) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value: 4.76·10−07; rs62400067, P-value: 3.54·10−07). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138
Full Text Available Genetic diagnosis of inherited metabolic disease is conventionally achieved through syndrome recognition and targeted gene sequencing, but many patients receive no specific diagnosis. Next generation sequencing allied to capture of expressed sequences from genomic DNA now offers a powerful new diagnostic approach. Barriers to routine diagnostic use include cost, and the complexity of interpreting results arising from simultaneous identification of large numbers of variants. We applied exome-wide sequencing to an individual, 16 year old daughter of consanguineous parents with a novel syndrome of short stature, severe insulin resistance, ptosis and microcephaly. Pulldown of expressed sequences from genomic DNA followed by massively parallel sequencing was undertaken. Single nucleotide variants (SNVs were called using SAMtools prior to filtering based on sequence quality and existence in control genomes and exomes. Of 485 genetic variants predicted to alter protein sequence and absent from control data, 24 were homozygous in the patient. One mutation – the p.Arg732X mutation in the WRN gene – has previously been reported in Werner’s syndrome (WS. On re-evaluation of the patient several early features of WS were detected including loss of fat from the extremities and frontal hair thinning. Lymphoblastoid cells from the proband exhibited a defective decatenation checkpoint, consistent with loss of WRN activity. We have thus diagnosed WS some 15 years earlier than average, permitting aggressive prophylactic therapy and screening for WS complications, illustrating the potential of exome-wide sequencing to achieve early diagnosis and change management of rare autosomal recessive disease, even in individual patients of consanguineous parentage with apparently novel syndromes.
Galvan, Antonella; Falvella, Felicia S; Frullanti, Elisa; Spinola, Monica; Incarbone, Matteo; Nosotti, Mario; Santambrogio, Luigi; Conti, Barbara; Pastorino, Ugo; Gonzalez-Neira, Anna; Dragani, Tommaso A
We analyzed a series of young (median age = 52 years) non-smoker lung cancer patients and their unaffected siblings as controls, using a genome-wide 620 901 single-nucleotide polymorphism (SNP) array analysis and a case-control DNA pooling approach. We identified 82 putatively associated SNPs that were retested by individual genotyping followed by use of the sib transmission disequilibrium test, pointing to 36 SNPs associated with lung cancer risk in the discordant sibs series. Analysis of these 36 SNPs in a polygenic model characterized by additive and interchangeable effects of rare alleles revealed a highly statistically significant dosage-dependent association between risk allele carrier status and proportion of cancer cases. Replication of the same 36 SNPs in a population-based series confirmed the association with lung cancer for three SNPs, suggesting that phenocopies and genetic heterogeneity can play a major role in the complex genetics of lung cancer risk in the general population.
C. G. Dang
Full Text Available Significant SNPs associated with Warner-Bratzler (WB shear force and sensory traits were confirmed for Hanwoo beef (Korean cattle. A Bonferroni-corrected genome-wide significant association (p<1.3×10−6 was detected with only one single nucleotide polymorphism (SNP on chromosome 5 for WB shear force. A slightly higher number of SNPs was significantly (p<0.001 associated with WB shear force than with other sensory traits. Further, 50, 25, 29, and 34 SNPs were significantly associated with WB shear force, tenderness, juiciness, and flavor likeness, respectively. The SNPs between p = 0.001 and p = 0.0001 thresholds explained 3% to 9% of the phenotypic variance, while the most significant SNPs accounted for 7% to 12% of the phenotypic variance. In conclusion, because WB shear force and sensory evaluation were moderately affected by a few loci and minimally affected by other loci, further studies are required by using a large sample size and high marker density.
McElroy, Michel S; Navarro, Alberto J R; Mustiga, Guiliana; Stack, Conrad; Gezan, Salvador; Peña, Geover; Sarabia, Widem; Saquicela, Diego; Sotomayor, Ignacio; Douglas, Gavin M; Migicovsky, Zoë; Amores, Freddy; Tarqui, Omar; Myles, Sean; Motamayor, Juan C
Cacao ( Theobroma cacao ) is a globally important crop, and its yield is severely restricted by disease. Two of the most damaging diseases, witches' broom disease (WBD) and frosty pod rot disease (FPRD), are caused by a pair of related fungi: Moniliophthora perniciosa and Moniliophthora roreri , respectively. Resistant cultivars are the most effective long-term strategy to address Moniliophthora diseases, but efficiently generating resistant and productive new cultivars will require robust methods for screening germplasm before field testing. Marker-assisted selection (MAS) and genomic selection (GS) provide two potential avenues for predicting the performance of new genotypes, potentially increasing the selection gain per unit time. To test the effectiveness of these two approaches, we performed a genome-wide association study (GWAS) and GS on three related populations of cacao in Ecuador genotyped with a 15K single nucleotide polymorphism (SNP) microarray for three measures of WBD infection (vegetative broom, cushion broom, and chirimoya pod), one of FPRD (monilia pod) and two productivity traits (total fresh weight of pods and % healthy pods produced). GWAS yielded several SNPs associated with disease resistance in each population, but none were significantly correlated with the same trait in other populations. Genomic selection, using one population as a training set to estimate the phenotypes of the remaining two (composed of different families), varied among traits, from a mean prediction accuracy of 0.46 (vegetative broom) to 0.15 (monilia pod), and varied between training populations. Simulations demonstrated that selecting seedlings using GWAS markers alone generates no improvement over selecting at random, but that GS improves the selection process significantly. Our results suggest that the GWAS markers discovered here are not sufficiently predictive across diverse germplasm to be useful for MAS, but that using all markers in a GS framework holds
Michel S. McElroy
Full Text Available Cacao (Theobroma cacao is a globally important crop, and its yield is severely restricted by disease. Two of the most damaging diseases, witches’ broom disease (WBD and frosty pod rot disease (FPRD, are caused by a pair of related fungi: Moniliophthora perniciosa and Moniliophthora roreri, respectively. Resistant cultivars are the most effective long-term strategy to address Moniliophthora diseases, but efficiently generating resistant and productive new cultivars will require robust methods for screening germplasm before field testing. Marker-assisted selection (MAS and genomic selection (GS provide two potential avenues for predicting the performance of new genotypes, potentially increasing the selection gain per unit time. To test the effectiveness of these two approaches, we performed a genome-wide association study (GWAS and GS on three related populations of cacao in Ecuador genotyped with a 15K single nucleotide polymorphism (SNP microarray for three measures of WBD infection (vegetative broom, cushion broom, and chirimoya pod, one of FPRD (monilia pod and two productivity traits (total fresh weight of pods and % healthy pods produced. GWAS yielded several SNPs associated with disease resistance in each population, but none were significantly correlated with the same trait in other populations. Genomic selection, using one population as a training set to estimate the phenotypes of the remaining two (composed of different families, varied among traits, from a mean prediction accuracy of 0.46 (vegetative broom to 0.15 (monilia pod, and varied between training populations. Simulations demonstrated that selecting seedlings using GWAS markers alone generates no improvement over selecting at random, but that GS improves the selection process significantly. Our results suggest that the GWAS markers discovered here are not sufficiently predictive across diverse germplasm to be useful for MAS, but that using all markers in a GS framework holds
Full Text Available In Brassica napus breeding, traits related to commercial success are of highest importance for plant breeders. However, such traits can only be assessed in an advanced developmental stage. % as well as require high experimental effort due to their quantitative inheritance and the importance of genotype*environment interaction. Molecular markers genetically linked to such traits have the potential to accelerate the breeding process of B. napus by marker-assisted selection. Therefore, the objectives of this study were to identify (i genome regions associated with the examined agronomic and seed quality traits, (ii the interrelationship of population structure and the detected associations, and (iii candidate genes for the revealed associations. The diversity set used in this study consisted of 405 Brassica napus inbred lines which were genotyped using a 6K single nucleotide polymorphism (SNP array and phenotyped for agronomic and seed quality traits in field trials. In a genome-wide association study, we detected a total of 112 associations between SNPs and the seed quality traits as well as 46 SNP-trait associations for the agronomic traits with a P-value 100 and a sequence identity of > 70 % to A. thaliana or B. rapa could be found for the agronomic SNP-trait associations and 187 hits of potential candidate genes for the seed quality SNP-trait associations.
Davies, G; Marioni, R E; Liewald, D C; Hill, W D; Hagenaars, S P; Harris, S E; Ritchie, S J; Luciano, M; Fawns-Ritchie, C; Lyall, D; Cullen, B; Cox, S R; Hayward, C; Porteous, D J; Evans, J; McIntosh, A M; Gallacher, J; Craddock, N; Pell, J P; Smith, D J; Gale, C R; Deary, I J
People's differences in cognitive functions are partly heritable and are associated with important life outcomes. Previous genome-wide association (GWA) studies of cognitive functions have found evidence for polygenic effects yet, to date, there are few replicated genetic associations. Here we use data from the UK Biobank sample to investigate the genetic contributions to variation in tests of three cognitive functions and in educational attainment. GWA analyses were performed for verbal–numerical reasoning (N=36 035), memory (N=112 067), reaction time (N=111 483) and for the attainment of a college or a university degree (N=111 114). We report genome-wide significant single-nucleotide polymorphism (SNP)-based associations in 20 genomic regions, and significant gene-based findings in 46 regions. These include findings in the ATXN2, CYP2DG, APBA1 and CADM2 genes. We report replication of these hits in published GWA studies of cognitive function, educational attainment and childhood intelligence. There is also replication, in UK Biobank, of SNP hits reported previously in GWA studies of educational attainment and cognitive function. GCTA-GREML analyses, using common SNPs (minor allele frequency>0.01), indicated significant SNP-based heritabilities of 31% (s.e.m.=1.8%) for verbal–numerical reasoning, 5% (s.e.m.=0.6%) for memory, 11% (s.e.m.=0.6%) for reaction time and 21% (s.e.m.=0.6%) for educational attainment. Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort. The genomic regions identified include several novel loci, some of which have been associated with intracranial volume, neurodegeneration, Alzheimer's disease and schizophrenia. PMID:27046643
Strauss, Chloe; Long, Hongan; Patterson, Caitlyn E; Te, Ronald; Lynch, Michael
Recent application of mutation accumulation techniques combined with whole-genome sequencing (MA/WGS) has greatly promoted studies of spontaneous mutation. However, such explorations have rarely been conducted on marine organisms, and it is unclear how marine habitats have influenced genome stability. This report resolves the mutation rate and spectrum of the coral reef pathogen Vibrio shilonii , which causes coral bleaching and endangers the biodiversity maintained by coral reefs. We found that its mutation rate and spectrum are highly similar to those of other studied bacteria from various habitats, despite the saline environment. The mutational properties of this marine bacterium are thus controlled by other general evolutionary forces such as natural selection and genetic drift. We also found that as pH drops, the mutation rate decreases and the mutation spectrum is biased in the direction of generating G/C nucleotides. This implies that evolutionary features of this organism and perhaps other marine microbes might be altered by the increasingly acidic ocean water caused by excess CO 2 emission. Nonetheless, further exploration is needed as the pH range tested in this study was rather narrow and many other possible mutation determinants, such as carbonate increase, are associated with ocean acidification. IMPORTANCE This study explored the pH dependence of a bacterial genome-wide mutation rate. We discovered that the genome-wide rates of appearance of most mutation types decrease linearly and that the mutation spectrum is biased in generating more G/C nucleotides with pH drop in the coral reef pathogen V. shilonii . Copyright © 2017 Strauss et al.
Alabdullah, Abdulkader; Minafra, Angelantonio; Elbeaino, Toufic; Saponari, Maria; Savino, Vito; Martelli, Giovanni P
The complete nucleotide sequence and the genome organization were determined of a putative new member of the family Tymoviridae, tentatively named Olive latent virus 3 (OLV-3), recovered in southern Italy from a symptomless olive tree. The sequenced ssRNA genome comprises 7148 nucleotides excluding the poly(A) tail and contains four open reading frames (ORFs). ORF1 encodes a polyprotein of 221.6kDa in size, containing the conserved signatures of the methyltransferase (MTR), papain-like protease (PRO), helicase (HEL) and RNA-dependent RNA polymerase (RdRp) domains of the replication-associated proteins of positive-strand RNA viruses. ORF2 overlaps completely ORF1 and encodes a putative protein of 43.33kDa showing limited sequence similarity with the putative movement protein of Maize rayado fino virus (MRFV). ORF3 codes for a protein with predicted molecular mass of 28.46kDa, identified as the coat protein (CP), whereas ORF4 overlaps ORF3 and encodes a putative protein of 16kDa with sequence similarity to the p16 and p31 proteins of Citrus sudden death-associated virus (CSDaV) and Grapevine fleck virus (GFkV), respectively. Within the family Tymoviridae, OLV-3 genome has the closest identity level (49-52%) with members of the genus Marafivirus, from which, however, it differs because of the diverse genome organization and the presence of a single type of CP subunits. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Full Text Available Oluwadamilare Falola,1 Victor Chukwudi Osamor,1,2 Marion Adebiyi,1,2 Ezekiel Adebiyi1,2 1Covenant University Bioinformatics Research (CUBRe, 2Department of Computer and Information Sciences, College of Science and Technology, Covenant University, Ota, Ogun State, Nigeria Background: Schizophrenia is a severe mental disorder affecting >21 million people worldwide. Some genetic studies reported that single nucleotide polymorphism (SNP involving variant rs1344706 from the ZNF804A gene in human beings is associated with the risk of schizophrenia in several populations. Similar results tend to conflict with other reports in literature, indicating that no true significant association exists between rs1344706 and schizophrenia. We seek to determine the level of association of this SNP with schizophrenia in the Asian population using more recent genome-wide association study (GWAS datasets. Methods: Applying a computational approach with inclusion of more recent GWAS datasets, we conducted a meta-analysis to examine the level of association of SNP rs1344706 and the risk of schizophrenia disorder among the Asian population constituting Chinese, Indonesians, Japanese, Kazakhs and Singaporeans. For a total of 21 genetic studies, including a total of 28,842 cases and 35,630 controls, regression analysis, publication bias, Cochran’s Q and I2 tests were performed. The DerSimonian and Laird random-effects model was used to assess the association of the genetic variant to schizophrenia. Leave-one-out sensitivity analysis was also conducted to determine the influence of each study on the final outcome of the association study. Results: Our summarized analysis for Asian population revealed a pooled odds ratio of 1.06, 95% confidence interval of 1.01–1.11 and two-tailed P-value of 0.0228. Our test for heterogeneity showed the presence of large heterogeneity (I2=53.44%, P =0.00207 and Egger’s regression test (P =0.8763 and Begg’s test (P =0
Olsen Kenneth M
Full Text Available Abstract Background Weedy rice (red rice, a conspecific weed of cultivated rice (Oryza sativa L., is a significant problem throughout the world and an emerging threat in regions where it was previously absent. Despite belonging to the same species complex as domesticated rice and its wild relatives, the evolutionary origins of weedy rice remain unclear. We use genome-wide patterns of single nucleotide polymorphism (SNP variation in a broad geographic sample of weedy, domesticated, and wild Oryza samples to infer the origin and demographic processes influencing U.S. weedy rice evolution. Results We find greater population structure than has been previously reported for U.S. weedy rice, and that the multiple, genetically divergent populations have separate origins. The two main U.S. weedy rice populations share genetic backgrounds with cultivated O. sativa varietal groups not grown commercially in the U.S., suggesting weed origins from domesticated ancestors. Hybridization between weedy groups and between weedy rice and local crops has also led to the evolution of distinct U.S. weedy rice populations. Demographic simulations indicate differences among the main weedy groups in the impact of bottlenecks on their establishment in the U.S., and in the timing of divergence from their cultivated relatives. Conclusions Unlike prior research, we did not find unambiguous evidence for U.S. weedy rice originating via hybridization between cultivated and wild Oryza species. Our results demonstrate the potential for weedy life-histories to evolve directly from within domesticated lineages. The diverse origins of U.S. weedy rice populations demonstrate the multiplicity of evolutionary forces that can influence the emergence of weeds from a single species complex.
Full Text Available Abstract Background Complementary single-nucleotide polymorphisms (SNPs may not be distributed equally between two DNA strands if the strands are functionally distinct, such as in transcribed genes. In introns, an excess of A↔G over the complementary C↔T substitutions had previously been found and attributed to transcription-coupled repair (TCR, demonstrating the valuable functional clues that can be obtained by studying such asymmetry. Here we studied asymmetry of human synonymous SNPs (sSNPs in the fourfold degenerate (FFD sites as compared to intronic SNPs (iSNPs. Results The identities of the ancestral bases and the direction of mutations were inferred from human-chimpanzee genomic alignment. After correction for background nucleotide composition, excess of A→G over the complementary T→C polymorphisms, which was observed previously and can be explained by TCR, was confirmed in FFD SNPs and iSNPs. However, when SNPs were separately examined according to whether they mapped to a CpG dinucleotide or not, an excess of C→T over G→A polymorphisms was found in non-CpG site FFD SNPs but was absent from iSNPs and CpG site FFD SNPs. Conclusion The genome-wide discrepancy of human FFD SNPs provides novel evidence for widespread selective pressure due to functional effects of sSNPs. The similar asymmetry pattern of FFD SNPs and iSNPs that map to a CpG can be explained by transcription-coupled mechanisms, including TCR and transcription-coupled mutation. Because of the hypermutability of CpG sites, more CpG site FFD SNPs are relatively younger and have confronted less selection effect than non-CpG FFD SNPs, which can explain the asymmetric discrepancy of CpG site FFD SNPs vs. non-CpG site FFD SNPs.
Schork, Andrew J; Thompson, Wesley K; Pham, Phillip
Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False...... Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate...... in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate...
Full Text Available Most of the previously reported loci for total immunoglobulin E (IgE levels are related to Th2 cell-dependent pathways. We undertook a genome-wide association study (GWAS to identify genetic loci responsible for IgE regulation. A total of 479,940 single nucleotide polymorphisms (SNPs were tested for association with total serum IgE levels in 1180 Japanese adults. Fine-mapping with SNP imputation demonstrated 6 candidate regions: the PYHIN1/IFI16, MHC classes I and II, LEMD2, GRAMD1B, and chr13∶60576338 regions. Replication of these candidate loci in each region was assessed in 2 independent Japanese cohorts (n = 1110 and 1364, respectively. SNP rs3130941 in the HLA-C region was consistently associated with total IgE levels in 3 independent populations, and the meta-analysis yielded genome-wide significance (P = 1.07×10(-10. Using our GWAS results, we also assessed the reproducibility of previously reported gene associations with total IgE levels. Nine of 32 candidate genes identified by a literature search were associated with total IgE levels after correction for multiple testing. Our findings demonstrate that SNPs in the HLA-C region are strongly associated with total serum IgE levels in the Japanese population and that some of the previously reported genetic associations are replicated across ethnic groups.
Full Text Available Once considered a single species, the whitefly, Bemisia tabaci, is a complex of numerous morphologically indistinguishable species. Within the last three decades, two of its members (MED and MEAM1 have become some of the world's most damaging agricultural pests invading countries across Europe, Africa, Asia and the Americas and affecting a vast range of agriculturally important food and fiber crops through both feeding-related damage and the transmission of numerous plant viruses. For some time now, researchers have relied on a single mitochondrial gene and/or a handful of nuclear markers to study this species complex. Here, we move beyond this by using 38,041 genome-wide Single Nucleotide Polymorphisms, and show that the two invasive members of the complex are closely related species with signatures of introgression with a third species (IO. Gene flow patterns were traced between contemporary invasive populations within MED and MEAM1 species and these were best explained by recent international trade. These findings have profound implications for delineating the B. tabaci species status and will impact quarantine measures and future management strategies of this global pest.
Full Text Available Schizophrenia is a devastating neuropsychiatric disorder with genetically complex traits. Genetic variants should explain a considerable portion of the risk for schizophrenia, and genome-wide association study (GWAS is a potentially powerful tool for identifying the risk variants that underlie the disease. Here, we report the results of a three-stage analysis of three independent cohorts consisting of a total of 2,535 samples from Japanese and Chinese populations for searching schizophrenia susceptibility genes using a GWAS approach. Firstly, we examined 115,770 single nucleotide polymorphisms (SNPs in 120 patient-parents trio samples from Japanese schizophrenia pedigrees. In stage II, we evaluated 1,632 SNPs (1,159 SNPs of p<0.01 and 473 SNPs of p<0.05 that located in previously reported linkage regions. The second sample consisted of 1,012 case-control samples of Japanese origin. The most significant p value was obtained for the SNP in the ELAVL2 [(embryonic lethal, abnormal vision, Drosophila-like 2] gene located on 9p21.3 (p = 0.00087. In stage III, we scrutinized the ELAVL2 gene by genotyping gene-centric tagSNPs in the third sample set of 293 family samples (1,163 individuals of Chinese descent and the SNP in the gene showed a nominal association with schizophrenia in Chinese population (p = 0.026. The current data in Asian population would be helpful for deciphering ethnic diversity of schizophrenia etiology.
Desta, Zeratsion Abera; Ortiz, Rodomiro
Association analysis is used to measure relations between markers and quantitative trait loci (QTL). Their estimation ignores genes with small effects that trigger underpinning quantitative traits. By contrast, genome-wide selection estimates marker effects across the whole genome on the target population based on a prediction model developed in the training population (TP). Whole-genome prediction models estimate all marker effects in all loci and capture small QTL effects. Here, we review several genomic selection (GS) models with respect to both the prediction accuracy and genetic gain from selection. Phenotypic selection or marker-assisted breeding protocols can be replaced by selection, based on whole-genome predictions in which phenotyping updates the model to build up the prediction accuracy. Copyright © 2014 Elsevier Ltd. All rights reserved.
Hari Deo eUpadhyaya
Full Text Available Identification of potential genes/alleles governing complex seed-protein content (SPC trait is essential in marker-assisted breeding for quality trait improvement of chickpea. Henceforth, the present study utilized an integrated genomics-assisted breeding strategy encompassing trait association analysis, selective genotyping in traditional bi-parental mapping population and differential expression profiling for the first-time to understand the complex genetic architecture of quantitative SPC trait in chickpea. For GWAS (genome-wide association study, high-throughput genotyping information of 16376 genome-based SNPs (single nucleotide polymorphism discovered from a structured population of 336 sequenced desi and kabuli accessions [with 150-200 kb LD (linkage disequilibrium decay] was utilized. This led to identification of seven most effective genomic loci (genes associated [10 to 20% with 41% combined PVE (phenotypic variation explained] with SPC trait in chickpea. Regardless of the diverse desi and kabuli genetic backgrounds, a comparable level of association potential of the identified seven genomic loci with SPC trait was observed. Five SPC-associated genes were validated successfully in parental accessions and homozygous individuals of an intra-specific desi RIL (recombinant inbred line mapping population (ICC 12299 x ICC 4958 by selective genotyping. The seed-specific expression, including differential up-regulation (> 4-fold of six SPC-associated genes particularly in accessions, parents and homozygous individuals of the aforementioned mapping population with high level of contrasting seed-protein content (21-22% was evident. Collectively, the integrated genomic approach delineated diverse naturally occurring novel functional SNP allelic variants in six potential candidate genes regulating SPC trait in chickpea. Of these, a non-synonymous SNP allele-carrying zinc finger transcription factor gene exhibiting strong association with SPC trait
J Brent Richards
Full Text Available The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D and coronary heart disease (CHD. We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531 and sought validation of the lead single nucleotide polymorphisms (SNPs in 5 additional cohorts (n = 6,202. Five SNPs were genome-wide significant in their relationship with adiponectin (P< or =5x10(-8. We then tested whether these 5 SNPs were associated with risk of T2D and CHD using a Bonferroni-corrected threshold of P< or =0.011 to declare statistical significance for these disease associations. SNPs at the adiponectin-encoding ADIPOQ locus demonstrated the strongest associations with adiponectin levels (P-combined = 9.2x10(-19 for lead SNP, rs266717, n = 14,733. A novel variant in the ARL15 (ADP-ribosylation factor-like 15 gene was associated with lower circulating levels of adiponectin (rs4311394-G, P-combined = 2.9x10(-8, n = 14,733. This same risk allele at ARL15 was also associated with a higher risk of CHD (odds ratio [OR] = 1.12, P = 8.5x10(-6, n = 22,421 more nominally, an increased risk of T2D (OR = 1.11, P = 3.2x10(-3, n = 10,128, and several metabolic traits. Expression studies in humans indicated that ARL15 is well-expressed in skeletal muscle. These findings identify a novel protein, ARL15, which influences circulating adiponectin levels and may impact upon CHD risk.
Bertram, Lars; Lange, Christoph; Mullin, Kristina; Parkinson, Michele; Hsiao, Monica; Hogan, Meghan F; Schjeide, Brit M M; Hooli, Basavaraj; Divito, Jason; Ionita, Iuliana; Jiang, Hongyu; Laird, Nan; Moscarillo, Thomas; Ohlsen, Kari L; Elliott, Kathryn; Wang, Xin; Hu-Lince, Diane; Ryder, Marie; Murphy, Amy; Wagner, Steven L; Blacker, Deborah; Becker, K David; Tanzi, Rudolph E
Alzheimer's disease (AD) is a genetically complex and heterogeneous disorder. To date four genes have been established to either cause early-onset autosomal-dominant AD (APP, PSEN1, and PSEN2(1-4)) or to increase susceptibility for late-onset AD (APOE5). However, the heritability of late-onset AD is as high as 80%, (6) and much of the phenotypic variance remains unexplained to date. We performed a genome-wide association (GWA) analysis using 484,522 single-nucleotide polymorphisms (SNPs) on a large (1,376 samples from 410 families) sample of AD families of self-reported European descent. We identified five SNPs showing either significant or marginally significant genome-wide association with a multivariate phenotype combining affection status and onset age. One of these signals (p = 5.7 x 10(-14)) was elicited by SNP rs4420638 and probably reflects APOE-epsilon4, which maps 11 kb proximal (r2 = 0.78). The other four signals were tested in three additional independent AD family samples composed of nearly 2700 individuals from almost 900 families. Two of these SNPs showed significant association in the replication samples (combined p values 0.007 and 0.00002). The SNP (rs11159647, on chromosome 14q31) with the strongest association signal also showed evidence of association with the same allele in GWA data generated in an independent sample of approximately 1,400 AD cases and controls (p = 0.04). Although the precise identity of the underlying locus(i) remains elusive, our study provides compelling evidence for the existence of at least one previously undescribed AD gene that, like APOE-epsilon4, primarily acts as a modifier of onset age.
Sreekumar G Pillai
Full Text Available There is considerable variability in the susceptibility of smokers to develop chronic obstructive pulmonary disease (COPD. The only known genetic risk factor is severe deficiency of alpha(1-antitrypsin, which is present in 1-2% of individuals with COPD. We conducted a genome-wide association study (GWAS in a homogenous case-control cohort from Bergen, Norway (823 COPD cases and 810 smoking controls and evaluated the top 100 single nucleotide polymorphisms (SNPs in the family-based International COPD Genetics Network (ICGN; 1891 Caucasian individuals from 606 pedigrees study. The polymorphisms that showed replication were further evaluated in 389 subjects from the US National Emphysema Treatment Trial (NETT and 472 controls from the Normative Aging Study (NAS and then in a fourth cohort of 949 individuals from 127 extended pedigrees from the Boston Early-Onset COPD population. Logistic regression models with adjustments of covariates were used to analyze the case-control populations. Family-based association analyses were conducted for a diagnosis of COPD and lung function in the family populations. Two SNPs at the alpha-nicotinic acetylcholine receptor (CHRNA 3/5 locus were identified in the genome-wide association study. They showed unambiguous replication in the ICGN family-based analysis and in the NETT case-control analysis with combined p-values of 1.48 x 10(-10, (rs8034191 and 5.74 x 10(-10 (rs1051730. Furthermore, these SNPs were significantly associated with lung function in both the ICGN and Boston Early-Onset COPD populations. The C allele of the rs8034191 SNP was estimated to have a population attributable risk for COPD of 12.2%. The association of hedgehog interacting protein (HHIP locus on chromosome 4 was also consistently replicated, but did not reach genome-wide significance levels. Genome-wide significant association of the HHIP locus with lung function was identified in the Framingham Heart study (Wilk et al., companion article
Zhang, Xinjun; Li, Meng; Lin, Hai; Rao, Xi; Feng, Weixing; Yang, Yuedong; Mort, Matthew; Cooper, David N; Wang, Yue; Wang, Yadong; Wells, Clark; Zhou, Yaoqi; Liu, Yunlong
While synonymous single-nucleotide variants (sSNVs) have largely been unstudied, since they do not alter protein sequence, mounting evidence suggests that they may affect RNA conformation, splicing, and the stability of nascent-mRNAs to promote various diseases. Accurately prioritizing deleterious sSNVs from a pool of neutral ones can significantly improve our ability of selecting functional genetic variants identified from various genome-sequencing projects, and, therefore, advance our understanding of disease etiology. In this study, we develop a computational algorithm to prioritize sSNVs based on their impact on mRNA splicing and protein function. In addition to genomic features that potentially affect splicing regulation, our proposed algorithm also includes dozens structural features that characterize the functions of alternatively spliced exons on protein function. Our systematical evaluation on thousands of sSNVs suggests that several structural features, including intrinsic disorder protein scores, solvent accessible surface areas, protein secondary structures, and known and predicted protein family domains, show significant differences between disease-causing and neutral sSNVs. Our result suggests that the protein structure features offer an added dimension of information while distinguishing disease-causing and neutral synonymous variants. The inclusion of structural features increases the predictive accuracy for functional sSNV prioritization.
Lee, Myoungsook; Kwon, Dae Young; Kim, Myung-Sunny; Choi, Chong Ran; Park, Mi-Young; Kim, Ae-Jung
This is the first study to identify common genetic factors associated with the basal metabolic rate (BMR) and body mass index (BMI) in obese Korean women including overweight. This will be a basic study for future research of obese gene-BMR interaction. The experimental design was 2 by 2 with variables of BMR and BMI. A genome-wide association study (GWAS) of single nucleotide polymorphisms (SNPs) was conducted in the overweight and obesity (BMI > 23 kg/m(2)) compared to the normality, and in women with low BMR (BMR. A total of 140 SNPs reached formal genome-wide statistical significance in this study (P BMR (rs10786764; P = 8.0 × 10(-7), rs1040675; 2.3 × 10(-6)) and BMI (rs10786764; P = 2.5 × 10(-5), rs10786764; 6.57 × 10(-5)). The other genes related to BMI (HSD52, TMA16, MARCH1, NRG1, NRXN3, and STK4) yielded P BMR and BMI, including NRG3, OR8U8, BCL2L2-PABPN1, PABPN1, and SLC22A17 were identified in obese Korean women (P BMR- and BMI-related genes using GWAS. Although most of these newly established loci were not previously associated with obesity, they may provide new insights into body weight regulation. Our findings of five common genes associated with BMR and BMI in Koreans will serve as a reference for replication and validation of future studies on the metabolic rate.
Full Text Available Summary: A number of mitochondrial diseases arise from single-nucleotide variant (SNV accumulation in multiple mitochondria. Here, we present a method for identification of variants present at the single-mitochondrion level in individual mouse and human neuronal cells, allowing for extremely high-resolution study of mitochondrial mutation dynamics. We identified extensive heteroplasmy between individual mitochondrion, along with three high-confidence variants in mouse and one in human that were present in multiple mitochondria across cells. The pattern of variation revealed by single-mitochondrion data shows surprisingly pervasive levels of heteroplasmy in inbred mice. Distribution of SNV loci suggests inheritance of variants across generations, resulting in Poisson jackpot lines with large SNV load. Comparison of human and mouse variants suggests that theÂ two species might employ distinct modes of somatic segregation. Single-mitochondrion resolution revealed mitochondria mutational dynamics that we hypothesize to affect risk probabilities for mutations reaching disease thresholds. : Morris etÂ al. use independent sequencing of multiple individual mitochondria from mouse and human brain cells to show high pervasiveness of mutations. The mutations are heteroplasmic within single mitochondria and within and between cells. These findings suggest mechanisms by which mutations accumulate over time, resulting in mitochondrial dysfunction and disease. Keywords: single mitochondrion, single cell, human neuron, mouse neuron, single-nucleotide variation
Brandon L Pierce
Full Text Available Arsenic contamination of drinking water is a major public health issue in many countries, increasing risk for a wide array of diseases, including cancer. There is inter-individual variation in arsenic metabolism efficiency and susceptibility to arsenic toxicity; however, the basis of this variation is not well understood. Here, we have performed the first genome-wide association study (GWAS of arsenic-related metabolism and toxicity phenotypes to improve our understanding of the mechanisms by which arsenic affects health. Using data on urinary arsenic metabolite concentrations and approximately 300,000 genome-wide single nucleotide polymorphisms (SNPs for 1,313 arsenic-exposed Bangladeshi individuals, we identified genome-wide significant association signals (P<5×10(-8 for percentages of both monomethylarsonic acid (MMA and dimethylarsinic acid (DMA near the AS3MT gene (arsenite methyltransferase; 10q24.32, with five genetic variants showing independent associations. In a follow-up analysis of 1,085 individuals with arsenic-induced premalignant skin lesions (the classical sign of arsenic toxicity and 1,794 controls, we show that one of these five variants (rs9527 is also associated with skin lesion risk (P = 0.0005. Using a subset of individuals with prospectively measured arsenic (n = 769, we show that rs9527 interacts with arsenic to influence incident skin lesion risk (P = 0.01. Expression quantitative trait locus (eQTL analyses of genome-wide expression data from 950 individual's lymphocyte RNA suggest that several of our lead SNPs represent cis-eQTLs for AS3MT (P = 10(-12 and neighboring gene C10orf32 (P = 10(-44, which are involved in C10orf32-AS3MT read-through transcription. This is the largest and most comprehensive genomic investigation of arsenic metabolism and toxicity to date, the only GWAS of any arsenic-related trait, and the first study to implicate 10q24.32 variants in both arsenic metabolism and arsenical
Full Text Available We demonstrate that Au-cluster-decorated single-walled carbon nanotubes (SWNTs may be used to discriminate single nucleotide polymorphism (SNP. Nanoscale Au clusters were formed on the side walls of carbon nanotubes in a transistor geometry using electrochemical deposition. The effect of Au cluster decoration appeared as hole doping when electrical transport characteristics were examined. Thiolated single-stranded probe peptide nucleic acid (PNA was successfully immobilized on Au clusters decorating single-walled carbon nanotube field-effect transistors (SWNT-FETs, resulting in a conductance decrease that could be explained by a decrease in Au work function upon adsorption of thiolated PNA. Although a target single-stranded DNA (ssDNA with a single mismatch did not cause any change in electrical conductance, a clear decrease in conductance was observed with matched ssDNA, thereby showing the possibility of SNP (single nucleotide polymorphism detection using Au-cluster-decorated SWNT-FETs. However, a power to discriminate SNP target is lost in high ionic environment. We can conclude that observed SNP discrimination in low ionic environment is due to the hampered binding of SNP target on nanoscale surfaces in low ionic conditions.
Full Text Available DNA methylation plays a central role in regulating many aspects of growth and development in mammals through regulating gene expression. The development of next generation sequencing technologies have paved the way for genome-wide, high resolution analysis of DNA methylation landscapes using methodology known as reduced representation bisulfite sequencing (RRBS. While RRBS has proven to be effective in understanding DNA methylation landscapes in humans, mice, and rats, to date, few studies have utilised this powerful method for investigating DNA methylation in agricultural animals. Here we describe the utilisation of RRBS to investigate DNA methylation in sheep Longissimus dorsi muscles. RRBS analysis of ∼1% of the genome from Longissimus dorsi muscles provided data of suitably high precision and accuracy for DNA methylation analysis, at all levels of resolution from genome-wide to individual nucleotides. Combining RRBS data with mRNAseq data allowed the sheep Longissimus dorsi muscle methylome to be compared with methylomes from other species. While some species differences were identified, many similarities were observed between DNA methylation patterns in sheep and other more commonly studied species. The RRBS data presented here highlights the complexity of epigenetic regulation of genes. However, the similarities observed across species are promising, in that knowledge gained from epigenetic studies in human and mice may be applied, with caution, to agricultural species. The ability to accurately measure DNA methylation in agricultural animals will contribute an additional layer of information to the genetic analyses currently being used to maximise production gains in these species.
Mota, R R; Guimarães, S E F; Fortes, M R S; Hayes, B; Silva, F F; Verardo, L L; Kelly, M J; de Campos, C F; Guimarães, J D; Wenceslau, R R; Penitente-Filho, J M; Garcia, J F; Moore, S
We performed a genome-wide mapping for the age at first calving (AFC) with the goal of annotating candidate genes that regulate fertility in Nellore cattle. Phenotypic data from 762 cows and 777k SNP genotypes from 2,992 bulls and cows were used. Single nucleotide polymorphism (SNP) effects based on the single-step GBLUP methodology were blocked into adjacent windows of 1 Megabase (Mb) to explain the genetic variance. SNP windows explaining more than 0.40% of the AFC genetic variance were identified on chromosomes 2, 8, 9, 14, 16 and 17. From these windows, we identified 123 coding protein genes that were used to build gene networks. From the association study and derived gene networks, putative candidate genes (e.g., PAPPA, PREP, FER1L6, TPR, NMNAT1, ACAD10, PCMTD1, CRH, OPKR1, NPBWR1 and NCOA2) and transcription factors (TF) (STAT1, STAT3, RELA, E2F1 and EGR1) were strongly associated with female fertility (e.g., negative regulation of luteinizing hormone secretion, folliculogenesis and establishment of uterine receptivity). Evidence suggests that AFC inheritance is complex and controlled by multiple loci across the genome. As several windows explaining higher proportion of the genetic variance were identified on chromosome 14, further studies investigating the interaction across haplotypes to better understand the molecular architecture behind AFC in Nellore cattle should be undertaken. © 2017 Blackwell Verlag GmbH.
Friedrich, Juliane; Brand, Bodo; Ponsuksili, Siriluck; Graunke, Katharina L; Langbein, Jan; Knaust, Jacqueline; Kühn, Christa; Schwerin, Manfred
Behaviour traits of cattle have been reported to affect important production traits, such as meat quality and milk performance as well as reproduction and health. Genetic predisposition is, together with environmental stimuli, undoubtedly involved in the development of behaviour phenotypes. Underlying molecular mechanisms affecting behaviour in general and behaviour and productions traits in particular still have to be studied in detail. Therefore, we performed a genome-wide association study in an F2 Charolais × German Holstein cross-breed population to identify genetic variants that affect behaviour-related traits assessed in an open-field and novel-object test and analysed their putative impact on milk performance. Of 37,201 tested single nucleotide polymorphism (SNPs), four showed a genome-wide and 37 a chromosome-wide significant association with behaviour traits assessed in both tests. Nine of the SNPs that were associated with behaviour traits likewise showed a nominal significant association with milk performance traits. On chromosomes 14 and 29, six SNPs were identified to be associated with exploratory behaviour and inactivity during the novel-object test as well as with milk yield traits. Least squares means for behaviour and milk performance traits for these SNPs revealed that genotypes associated with higher inactivity and less exploratory behaviour promote higher milk yields. Whether these results are due to molecular mechanisms simultaneously affecting behaviour and milk performance or due to a behaviour predisposition, which causes indirect effects on milk performance by influencing individual reactivity, needs further investigation. © 2015 Stichting International Foundation for Animal Genetics.
Full Text Available Abstract Background Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.
Nagao, Yumiko; Nishida, Nao; Toyo-Oka, Licht; Kawaguchi, Atsushi; Amoroso, Antonio; Carrozzo, Marco; Sata, Michio; Mizokami, Masashi; Tokunaga, Katsushi; Tanaka, Yasuhito
There is a close relationship between hepatitis C virus (HCV) infection and lichen planus, a chronic inflammatory mucocutaneous disease. We performed a genome-wide association study (GWAS) to identify genetic variants associated with HCV-related lichen planus. We conducted a GWAS of 261 patients with HCV infection treated at a tertiary medical center in Japan from October 2007 through January 2013; a total of 71 had lichen planus and 190 had normal oral mucosa. We validated our findings in a GWAS of 38 patients with HCV-associated lichen planus and 7 HCV-infected patients with normal oral mucosa treated at a medical center in Italy. Single-nucleotide polymorphisms in NRP2 (rs884000) and IGFBP4 (rs538399) were associated with risk of HCV-associated lichen planus (P lichen planus. The odds ratios for the minor alleles of rs884000, rs538399, and rs9461799 were 3.25 (95% confidence interval, 1.95-5.41), 0.40 (95% confidence interval, 0.25-0.63), and 2.15 (95% confidence interval, 1.41-3.28), respectively. In a GWAS of Japanese patients with HCV infection, we replicated associations between previously reported polymorphisms in HLA class II genes and risk for lichen planus. We also identified single-nucleotide polymorphisms in NRP2 and IGFBP4 loci that increase and reduce risk of lichen planus, respectively. These genetic variants might be used to identify patients with HCV infection who are at risk for lichen planus. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
Full Text Available Numerous hybrid and polypoid species are found within the Triticeae. It has been suggested that the H subgenome of allopolyploid Elymus (wheatgrass species originated from diploid Hordeum (barley species, but the role of hybridization between polyploid Elymus and Hordeum has not been studied. It is not clear whether gene flow across polyploid Hordeum and Elymus species has occurred following polyploid speciation. Answering these questions will provide new insights into the formation of these polyploid species, and the potential role of gene flow among polyploid species during polyploid evolution. In order to address these questions, disrupted meiotic cDNA1 (DMC1 data from the allopolyploid StH Elymus are analyzed together with diploid and polyploid Hordeum species. Phylogenetic analysis revealed that the H copies of DMC1 sequence in some Elymus are very close to the H copies of DMC1 sequence in some polyploid Hordeum species, indicating either that the H genome in theses Elymus and polyploid Hordeum species originated from same diploid donor or that gene flow has occurred among them. Our analysis also suggested that the H genomes in Elymus species originated from limited gene pool, while H genomes in Hordeum polyploids have originated from broad gene pools. Nucleotide diversity (π of the DMC1 sequences on H genome from polyploid species (π = 0.02083 in Elymus, π = 0.01680 in polyploid Hordeum is higher than that in diploid Hordeum (π = 0.01488. The estimates of Tajima's D were significantly departure from the equilibrium neutral model at this locus in diploid Hordeum species (P<0.05, suggesting an excess of rare variants in diploid species which may not contribute to the origination of polyploids. Nucleotide diversity (π of the DMC1 sequences in Elymus polyploid species (π = 0.02083 is higher than that in polyploid Hordeum (π = 0.01680, suggesting that the degree of relationships between two parents of a polyploid might be a factor
Hamzić, Edin; Buitenhuis, Bart; Hérault, Frédéric; Hawken, Rachel; Abrahamsen, Mitchel S; Servin, Bertrand; Elsen, Jean-Michel; Pinard-van der Laan, Marie-Hélène; Bed'Hom, Bertrand
Coccidiosis is the most common and costly disease in the poultry industry and is caused by protozoans of the Eimeria genus. The current control of coccidiosis, based on the use of anticoccidial drugs and vaccination, faces serious obstacles such as drug resistance and the high costs for the development of efficient vaccines, respectively. Therefore, the current control programs must be expanded with complementary approaches such as the use of genetics to improve the host response to Eimeria infections. Recently, we have performed a large-scale challenge study on Cobb500 broilers using E. maxima for which we investigated variability among animals in response to the challenge. As a follow-up to this challenge study, we performed a genome-wide association study (GWAS) to identify genomic regions underlying variability of the measured traits in the response to Eimeria maxima in broilers. Furthermore, we conducted a post-GWAS functional analysis to increase our biological understanding of the underlying response to Eimeria maxima challenge. In total, we identified 22 single nucleotide polymorphisms (SNPs) with q value Eimeria maxima in broilers. Furthermore, the post-GWAS functional analysis indicates that biological pathways and networks involved in tissue proliferation and repair along with the primary innate immune response may play the most important role during the early stage of Eimeria maxima infection in broilers.
Su, Guosheng; Christensen, Ole Fredslund; Ostersen, Tage
of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1) a simple additive genetic model (MA), 2) a model including both additive and additive by additive epistatic genetic effects (MAE), 3) a model including both additive and dominance genetic effects...
Goel, Ridhi; Pandey, Ashutosh; Trivedi, Prabodh K; Asif, Mehar H
The WRKY gene family plays an important role in the development and stress responses in plants. As information is not available on the WRKY gene family in Musa species, genome-wide analysis has been carried out in this study using available genomic information from two species, Musa acuminata and Musa balbisiana. Analysis identified 147 and 132 members of the WRKY gene family in M. acuminata and M. balbisiana, respectively. Evolutionary analysis suggests that the WRKY gene family expanded much before the speciation in both the species. Most of the orthologs retained in two species were from the γ duplication event which occurred prior to α and β genome-wide duplication (GWD) events. Analysis also suggests that subtle changes in nucleotide sequences during the course of evolution have led to the development of new motifs which might be involved in neo-functionalization of different WRKY members in two species. Expression and cis-regulatory motif analysis suggest possible involvement of Group II and Group III WRKY members during various stresses and growth/development including fruit ripening process respectively.
Full Text Available The WRKY gene family plays an important role in the development and stress responses in plants. As information is not available on the WRKY gene family in Musa species, genome-wide analysis has been carried out in this study using available genomic information from two species, Musa acuminata and Musa balbisiana. Analysis identified 147 and 132 members of the WRKY gene family in M. acuminata and M. balbisiana respectively. Evolutionary analysis suggests that the WRKY gene family expanded much before the speciation in both the species. Most of the orthologs retained in two species were from the γ duplication event which occurred prior to α and β genome-wide duplication (GWD events. Analysis also suggests that subtle changes in nucleotide sequences during the course of evolution have led to the development of new motifs which might be involved in neo-functionalization of different WRKY members in two species. Expression and cis-regulatory motif analysis suggest possible involvement of Group II and Group III WRKY members during various stresses and growth/ development including fruit ripening process respectively.
Peters, Ulrike; Jiao, Shuo; Schumacher, Fredrick R; Hutter, Carolyn M; Aragaki, Aaron K; Baron, John A; Berndt, Sonja I; Bézieau, Stéphane; Brenner, Hermann; Butterbach, Katja; Caan, Bette J; Campbell, Peter T; Carlson, Christopher S; Casey, Graham; Chan, Andrew T; Chang-Claude, Jenny; Chanock, Stephen J; Chen, Lin S; Coetzee, Gerhard A; Coetzee, Simon G; Conti, David V; Curtis, Keith R; Duggan, David; Edwards, Todd; Fuchs, Charles S; Gallinger, Steven; Giovannucci, Edward L; Gogarten, Stephanie M; Gruber, Stephen B; Haile, Robert W; Harrison, Tabitha A; Hayes, Richard B; Henderson, Brian E; Hoffmeister, Michael; Hopper, John L; Hudson, Thomas J; Hunter, David J; Jackson, Rebecca D; Jee, Sun Ha; Jenkins, Mark A; Jia, Wei-Hua; Kolonel, Laurence N; Kooperberg, Charles; Küry, Sébastien; Lacroix, Andrea Z; Laurie, Cathy C; Laurie, Cecelia A; Le Marchand, Loic; Lemire, Mathieu; Levine, David; Lindor, Noralane M; Liu, Yan; Ma, Jing; Makar, Karen W; Matsuo, Keitaro; Newcomb, Polly A; Potter, John D; Prentice, Ross L; Qu, Conghui; Rohan, Thomas; Rosse, Stephanie A; Schoen, Robert E; Seminara, Daniela; Shrubsole, Martha; Shu, Xiao-Ou; Slattery, Martha L; Taverna, Darin; Thibodeau, Stephen N; Ulrich, Cornelia M; White, Emily; Xiang, Yongbing; Zanke, Brent W; Zeng, Yi-Xin; Zhang, Ben; Zheng, Wei; Hsu, Li
Heritable factors contribute to the development of colorectal cancer. Identifying the genetic loci associated with colorectal tumor formation could elucidate the mechanisms of pathogenesis. We conducted a genome-wide association study that included 14 studies, 12,696 cases of colorectal tumors (11,870 cancer, 826 adenoma), and 15,113 controls of European descent. The 10 most statistically significant, previously unreported findings were followed up in 6 studies; these included 3056 colorectal tumor cases (2098 cancer, 958 adenoma) and 6658 controls of European and Asian descent. Based on the combined analysis, we identified a locus that reached the conventional genome-wide significance level at less than 5.0 × 10(-8): an intergenic region on chromosome 2q32.3, close to nucleic acid binding protein 1 (most significant single nucleotide polymorphism: rs11903757; odds ratio [OR], 1.15 per risk allele; P = 3.7 × 10(-8)). We also found evidence for 3 additional loci with P values less than 5.0 × 10(-7): a locus within the laminin gamma 1 gene on chromosome 1q25.3 (rs10911251; OR, 1.10 per risk allele; P = 9.5 × 10(-8)), a locus within the cyclin D2 gene on chromosome 12p13.32 (rs3217810 per risk allele; OR, 0.84; P = 5.9 × 10(-8)), and a locus in the T-box 3 gene on chromosome 12q24.21 (rs59336; OR, 0.91 per risk allele; P = 3.7 × 10(-7)). In a large genome-wide association study, we associated polymorphisms close to nucleic acid binding protein 1 (which encodes a DNA-binding protein involved in DNA repair) with colorectal tumor risk. We also provided evidence for an association between colorectal tumor risk and polymorphisms in laminin gamma 1 (this is the second gene in the laminin family to be associated with colorectal cancers), cyclin D2 (which encodes for cyclin D2), and T-box 3 (which encodes a T-box transcription factor and is a target of Wnt signaling to β-catenin). The roles of these genes and their products in cancer pathogenesis warrant further
Sharma, Swarkar; Gao, Xiaochong; Londono, Douglas; Devroy, Shonn E.; Mauldin, Kristen N.; Frankel, Jessica T.; Brandon, January M.; Zhang, Dongping; Li, Quan-Zhen; Dobbs, Matthew B.; Gurnett, Christina A.; Grant, Struan F.A.; Hakonarson, Hakon; Dormans, John P.; Herring, John A.; Gordon, Derek; Wise, Carol A.
Adolescent idiopathic scoliosis (AIS) is an unexplained and common spinal deformity seen in otherwise healthy children. Its pathophysiology is poorly understood despite intensive investigation. Although genetic underpinnings are clear, replicated susceptibility loci that could provide insight into etiology have not been forthcoming. To address these issues, we performed genome-wide association studies (GWAS) of ∼327 000 single nucleotide polymorphisms (SNPs) in 419 AIS families. We found strongest evidence of association with chromosome 3p26.3 SNPs in the proximity of the CHL1 gene (P protein related to Robo3. Mutations in the Robo3 protein cause horizontal gaze palsy with progressive scoliosis (HGPPS), a rare disease marked by severe scoliosis. Other top associations in our GWAS were with SNPs in the DSCAM gene encoding an axon guidance protein in the same structural class with Chl1 and Robo3. We additionally found AIS associations with loci in CNTNAP2, supporting a previous study linking this gene with AIS. Cntnap2 is also of functional interest, as it interacts directly with L1 and Robo class proteins and participates in axon pathfinding. Our results suggest the relevance of axon guidance pathways in AIS susceptibility, although these findings require further study, particularly given the apparent genetic heterogeneity in this disease. PMID:21216876
Nivard, Michel G; Middeldorp, Christel M; Lubke, Gitta; Hottenga, Jouke-Jan; Abdellaoui, Abdel; Boomsma, Dorret I; Dolan, Conor V
Heritability may be estimated using phenotypic data collected in relatives or in distantly related individuals using genome-wide single nucleotide polymorphism (SNP) data. We combined these approaches by re-parameterizing the model proposed by Zaitlen et al and extended this model to include moderation of (total and SNP-based) genetic and environmental variance components by a measured moderator. By means of data simulation, we demonstrated that the type 1 error rates of the proposed test are correct and parameter estimates are accurate. As an application, we considered the moderation by age or year of birth of variance components associated with body mass index (BMI), height, attention problems (AP), and symptoms of anxiety and depression. The genetic variance of BMI was found to increase with age, but the environmental variance displayed a greater increase with age, resulting in a proportional decrease of the heritability of BMI. Environmental variance of height increased with year of birth. The environmental variance of AP increased with age. These results illustrate the assessment of moderation of environmental and genetic effects, when estimating heritability from combined SNP and family data. The assessment of moderation of genetic and environmental variance will enhance our understanding of the genetic architecture of complex traits. PMID:27436263
Derks, E M; Zwinderman, A H; Gamazon, E R
Population divergence impacts the degree of population stratification in Genome Wide Association Studies. We aim to: (i) investigate type-I error rate as a function of population divergence (F ST ) in multi-ethnic (admixed) populations; (ii) evaluate the statistical power and effect size estimates; and (iii) investigate the impact of population stratification on the results of gene-based analyses. Quantitative phenotypes were simulated. Type-I error rate was investigated for Single Nucleotide Polymorphisms (SNPs) with varying levels of F ST between the ancestral European and African populations. Type-II error rate was investigated for a SNP characterized by a high value of F ST . In all tests, genomic MDS components were included to correct for population stratification. Type-I and type-II error rate was adequately controlled in a population that included two distinct ethnic populations but not in admixed samples. Statistical power was reduced in the admixed samples. Gene-based tests showed no residual inflation in type-I error rate.
Full Text Available Abstract Background Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improving power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. Results We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. Conclusions The GWAMA (Genome-Wide Association Meta-Analysis software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.
Full Text Available Copy number variation (CNV or single nucleotide phlyorphism (SNP is useful genetic resource to aid in understanding complex phenotypes or deseases susceptibility. Although thousands of CNVs and SNPs are currently avaliable in the public databases, they are somewhat difficult to use for analyses without visualization tools. We developed a web-based tool called the VCS (visualization of CNV or SNP to visualize the CNV or SNP detected. The VCS tool can assist to easily interpret a biological meaning from the numerical value of CNV and SNP. The VCS provides six visualization tools: i the enrichment of genome contents in CNV; ii the physical distribution of CNV or SNP on chromosomes; iii the distribution of log2 ratio of CNVs with criteria of interested; iv the number of CNV or SNP per binning unit; v the distribution of homozygosity of SNP genotype; and vi cytomap of genes within CNV or SNP region.
Full Text Available Obesity is a major health problem. Although heritability is substantial, genetic mechanisms predisposing to obesity are not very well understood. We have performed a genome wide association study (GWA for early onset (extreme obesity.a GWA (Genome-Wide Human SNP Array 5.0 comprising 440,794 single nucleotide polymorphisms for early onset extreme obesity based on 487 extremely obese young German individuals and 442 healthy lean German controls; b confirmatory analyses on 644 independent families with at least one obese offspring and both parents. We aimed to identify and subsequently confirm the 15 SNPs (minor allele frequency > or =10% with the lowest p-values of the GWA by four genetic models: additive, recessive, dominant and allelic. Six single nucleotide polymorphisms (SNPs in FTO (fat mass and obesity associated gene within one linkage disequilibrium (LD block including the GWA SNP rendering the lowest p-value (rs1121980; log-additive model: nominal p = 1.13 x 10(-7, corrected p = 0.0494; odds ratio (OR(CT 1.67, 95% confidence interval (CI 1.22-2.27; OR(TT 2.76, 95% CI 1.88-4.03 belonged to the 15 SNPs showing the strongest evidence for association with obesity. For confirmation we genotyped 11 of these in the 644 independent families (of the six FTO SNPs we chose only two representing the LD bock. For both FTO SNPs the initial association was confirmed (both Bonferroni corrected p<0.01. However, none of the nine non-FTO SNPs revealed significant transmission disequilibrium.Our GWA for extreme early onset obesity substantiates that variation in FTO strongly contributes to early onset obesity. This is a further proof of concept for GWA to detect genes relevant for highly complex phenotypes. We concurrently show that nine additional SNPs with initially low p-values in the GWA were not confirmed in our family study, thus suggesting that of the best 15 SNPs in the GWA only the FTO SNPs represent true positive findings.
Full Text Available Using accumulating SNP (Single-Nucleotide Polymorphism data, we performed a genome-wide search for polypeptide hormone ligands showing changes in the mature regions to elucidate genotype/phenotype diversity among various human populations. Neuropeptide S (NPS, a brain peptide hormone highly conserved in vertebrates, has diverse physiological effects on anxiety, fear, hyperactivity, food intake, and sleeping time through its cognate receptor-NPSR. Here, we report a SNP rs4751440 (L(6-NPS causing non-synonymous substitution on the 6(th position (V to L of the NPS mature peptide region. L(6-NPS has a higher allele frequency in Europeans than other populations and probably originated from European ancestors ~25,000 yrs ago based on haplotype analysis and Approximate Bayesian Computation. Functional analyses indicate that L(6-NPS exhibits a significant lower bioactivity than the wild type NPS, with ~20-fold higher EC50 values in the stimulation of NPSR. Additional evolutionary and mutagenesis studies further demonstrate the importance of the valine residue in the 6(th position for NPS functions. Given the known physiological roles of NPS receptor in inflammatory bowel diseases, asthma pathogenesis, macrophage immune responses, and brain functions, our study provides the basis to elucidate NPS evolution and signaling diversity among human populations.
Shankaranarayanan, P.; Mendoza-Parra, M.A.; Gool, van W.; Trindade, L.M.; Gronemeyer, H.
Linear amplification of DNA (LinDA) by T7 polymerase is a versatile and robust method for generating sufficient amounts of DNA for genome-wide studies with minute amounts of cells. LinDA can be coupled to a great number of global profiling technologies. Indeed, chromatin immunoprecipitation coupled
Hsu, Yi-Hsiang; Liu, Youfang; Hannan, Marian T.; Maixner, William; Smith, Shad B.; Diatchenko, Luda; Golightly, Yvonne M.; Menz, Hylton B.; Kraus, Virginia B.; Doherty, Michael; Wilson, A.G.; Jordan, Joanne M.
Objective Hallux valgus (HV) affects ~36% of Caucasian adults. Although considered highly heritable, the underlying genetic determinants are unclear. We conducted the first genome-wide association study (GWAS) aimed to identify genetic variants associated with HV. Methods HV was assessed in 3 Caucasian cohorts (n=2,263, n=915, and n=1,231 participants, respectively). In each cohort, a GWAS was conducted using 2.5M imputed single nucleotide polymorphisms (SNPs). Mixed-effect regression with the additive genetic model adjusted for age, sex, weight and within-family correlations was used for both sex-specific and combined analyses. To combine GWAS results across cohorts, fixed-effect inverse-variance meta-analyses were used. Following meta-analyses, top-associated findings were also examined in an African American cohort (n=327). Results The proportion of HV variance explained by genome-wide genotyped SNPs was 50% in men and 48% in women. A higher proportion of genetic determinants of HV was sex-specific. The most significantly associated SNP in men was rs9675316 located on chr17q23-a24 near the AXIN2 gene (p=5.46×10−7); the most significantly associated SNP in women was rs7996797 located on chr13q14.1-q14.2 near the ESD gene (p=7.21×10−7). Genome-wide significant SNP-by-sex interaction was found for SNP rs1563374 located on chr11p15.1 near the MRGPRX3 gene (interaction p-value =4.1×10−9). The association signals diminished when combining men and women. Conclusion Findings suggest that the potential pathophysiological mechanisms of HV are complex and strongly underlined by sex-specific interactions. The identified genetic variants imply contribution of biological pathways observed in osteoarthritis as well as new pathways, influencing skeletal development and inflammation. PMID:26337638
Full Text Available Nucleotide Analysis Japonica genome blast search result Result of blastn search against jap...onica genome sequence kome_japonica_genome_blast_search_result.zip kome_japonica_genome_blast_search_result ...
Wang, Jia; Xian, Xiaohua; Xu, Xinfu; Qu, Cunmin; Lu, Kun; Li, Jiana; Liu, Liezhao
Seed coat color is an extremely important breeding characteristic of Brassica napus. To elucidate the factors affecting the genetic architecture of seed coat color, a genome-wide association study (GWAS) of seed coat color was conducted with a diversity panel comprising 520 B. napus cultivars and inbred lines. In total, 22 single-nucleotide polymorphisms (SNPs) distributed on 7 chromosomes were found to be associated with seed coat color. The most significant SNPs were found in 2014 near Bn-scaff_15763_1-p233999, only 43.42 kb away from BnaC06g17050D, which is orthologous to Arabidopsis thaliana TRANSPARENT TESTA 12 (TT12), an important gene involved in the transportation of proanthocyanidin precursors into the vacuole. Two of eight repeatedly detected SNPs can be identified and digested by restriction enzymes. Candidate gene mining revealed that the relevant regions of significant SNP loci on the A09 and C08 chromosomes are highly homologous. Moreover, a comparison of the GWAS results to those of previous quantitative trait locus (QTL) studies showed that 11 SNPs were located in the confidence intervals of the QTLs identified in previous studies based on linkage analyses or association mapping. Our results provide insights into the genetic basis of seed coat color in B. napus, and the beneficial allele, SNP information, and candidate genes should be useful for selecting yellow seeds in B. napus breeding.
Full Text Available BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS. This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP data. SNPpy and its dependencies are open source software. RESULTS: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. CONCLUSIONS: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data.
Full Text Available Although balancing selection with the sickle-cell trait and other red blood cell disorders has emphasized the interaction between malaria and human genetics, no systematic approach has so far been undertaken towards a comprehensive search for human genome variants influencing malaria. By screening 2,551 families in rural Ghana, West Africa, 108 nuclear families were identified who were exposed to hyperendemic malaria transmission and were homozygous wild-type for the established malaria resistance factors of hemoglobin (HbS, HbC, alpha(+ thalassemia, and glucose-6-phosphate-dehydrogenase deficiency. Of these families, 392 siblings aged 0.5-11 y were characterized for malaria susceptibility by closely monitoring parasite counts, malaria fever episodes, and anemia over 8 mo. An autosome-wide linkage analysis based on 10,000 single-nucleotide polymorphisms was conducted in 68 selected families including 241 siblings forming 330 sib pairs. Several regions were identified which showed evidence for linkage to the parasitological and clinical phenotypes studied, among them a prominent signal on Chromosome 10p15 obtained with malaria fever episodes (asymptotic z score = 4.37, empirical p-value = 4.0 x 10(-5, locus-specific heritability of 37.7%; 95% confidence interval, 15.7%-59.7%. The identification of genetic variants underlying the linkage signals may reveal as yet unrecognized pathways influencing human resistance to malaria.
Michelle D Johnson
Full Text Available Epigenetic marks such as cytosine methylation are important determinants of cellular and whole-body phenotypes. However, the extent of, and reasons for inter-individual differences in cytosine methylation, and their association with phenotypic variation are poorly characterised. Here we present the first genome-wide study of cytosine methylation at single-nucleotide resolution in an animal model of human disease. We used whole-genome bisulfite sequencing in the spontaneously hypertensive rat (SHR, a model of cardiovascular disease, and the Brown Norway (BN control strain, to define the genetic architecture of cytosine methylation in the mammalian heart and to test for association between methylation and pathophysiological phenotypes. Analysis of 10.6 million CpG dinucleotides identified 77,088 CpGs that were differentially methylated between the strains. In F1 hybrids we found 38,152 CpGs showing allele-specific methylation and 145 regions with parent-of-origin effects on methylation. Cis-linkage explained almost 60% of inter-strain variation in methylation at a subset of loci tested for linkage in a panel of recombinant inbred (RI strains. Methylation analysis in isolated cardiomyocytes showed that in the majority of cases methylation differences in cardiomyocytes and non-cardiomyocytes were strain-dependent, confirming a strong genetic component for cytosine methylation. We observed preferential nucleotide usage associated with increased and decreased methylation that is remarkably conserved across species, suggesting a common mechanism for germline control of inter-individual variation in CpG methylation. In the RI strain panel, we found significant correlation of CpG methylation and levels of serum chromogranin B (CgB, a proposed biomarker of heart failure, which is evidence for a link between germline DNA sequence variation, CpG methylation differences and pathophysiological phenotypes in the SHR strain. Together, these results will
Full Text Available Sheep are among the major economically important livestock species worldwide because the animals produce milk, wool, skin, and meat. In the present study, the Illumina OvineSNP50 BeadChip was used to investigate genetic diversity and genome selection among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds from the United States. After quality-control filtering of SNPs (single nucleotide polymorphisms, we used 48,026 SNPs, including 46,850 SNPs on autosomes that were in Hardy-Weinberg equilibrium and 1,176 SNPs on chromosome × for analysis. Phylogenetic analysis based on all 46,850 SNPs clearly separated Suffolk from Rambouillet, Columbia, Polypay, and Targhee, which was not surprising as Rambouillet contributed to the synthesis of the later three breeds. Based on pair-wise estimates of F(ST, significant genetic differentiation appeared between Suffolk and Rambouillet (F(ST = 0.1621, while Rambouillet and Targhee had the closest relationship (F(ST = 0.0681. A scan of the genome revealed 45 and 41 differentially selected regions (DSRs between Suffolk and Rambouillet and among Rambouillet-related breed populations, respectively. Our data indicated that regions 13 and 24 between Suffolk and Rambouillet might be good candidates for evaluating breed differences. Furthermore, ovine genome v3.1 assembly was used as reference to link functionally known homologous genes to economically important traits covered by these differentially selected regions. In brief, our present study provides a comprehensive genome-wide view on within- and between-breed genetic differentiation, biodiversity, and evolution among Suffolk, Rambouillet, Columbia, Polypay, and Targhee sheep breeds. These results may provide new guidance for the synthesis of new breeds with different breeding objectives.
Full Text Available Immunosuppression is cornerstone treatment of antineutrophil cytoplasmic antibody associated vasculitis (AAV but is later complicated by infection, cancer, cardiovascular and chronic kidney disease. Caveolin-1 is an essential structural protein for small cell membrane invaginations known as caveolae. Its functional role has been associated with these complications. For the first time, caveolin-1 (CAV1 gene variation is studied in AAV.CAV1 single nucleotide polymorphism rs4730751 was analysed in genomic DNA from 187 white patients with AAV from Birmingham, United Kingdom. The primary outcome measure was the composite endpoint of time to all-cause mortality or renal replacement therapy. Secondary endpoints included time to all-cause mortality, death from sepsis or vascular disease, cancer and renal replacement therapy. Validation of results was sought from 589 white AAV patients, from two European cohorts.The primary outcome occurred in 41.7% of Birmingham patients. In a multivariate model, non-CC genotype variation at the studied single nucleotide polymorphism was associated with increased risk from: the primary outcome measure [HR 1.86; 95% CI: 1.14-3.04; p=0.013], all-cause mortality [HR:1.83; 95% CI: 1.02-3.27; p=0.042], death from infection [HR:3.71; 95% CI: 1.28-10.77; p=0.016], death from vascular disease [HR:3.13; 95% CI: 1.07-9.10; p=0.037], and cancer [HR:5.55; 95% CI: 1.59-19.31; p=0.007]. In the validation cohort, the primary outcome rate was far lower (10.4%; no association between genotype and the studied endpoints was evident.The presence of a CC genotype in Birmingham is associated with protection from adverse outcomes of immunosuppression treated AAV. Lack of replication in the European cohort may have resulted from low clinical event rates. These findings are worthy of further study in larger cohorts.
Wattacheril, Julia; Lavine, Joel E; Chalasani, Naga P; Guo, Xiuqing; Kwon, Soonil; Schwimmer, Jeffrey; Molleston, Jean P; Loomba, Rohit; Brunt, Elizabeth M; Chen, Yii-Der Ida; Goodarzi, Mark O; Taylor, Kent D; Yates, Katherine P; Tonascia, James; Rotter, Jerome I
To identify genetic loci associated with features of histologic severity of nonalcoholic fatty liver disease in a cohort of Hispanic boys. There were 234 eligible Hispanic boys age 2-17 years with clinical, laboratory, and histologic data enrolled in the Nonalcoholic Steatohepatitis Clinical Research Network included in the analysis of 624 297 single nucleotide polymorphisms (SNPs). After the elimination of 4 outliers and 22 boys with cryptic relatedness, association analyses were performed on 208 DNA samples with corresponding liver histology. Logistic regression analyses were carried out for qualitative traits and linear regression analyses were applied for quantitative traits. The median age and body mass index z-score were 12.0 years (IQR, 11.0-14.0) and 2.4 (IQR, 2.1-2.6), respectively. The nonalcoholic fatty liver disease activity score (scores 1-4 vs 5-8) was associated with SNP rs11166927 on chromosome 8 in the TRAPPC9 region (P = 8.7 -07 ). Fibrosis stage was associated with SNP rs6128907 on chromosome 20, near actin related protein 5 homolog (p = 9.9 -07 ). In comparing our results in Hispanic boys with those of previously reported SNPs in adult nonalcoholic steatohepatitis, 2 of 26 susceptibility loci were associated with nonalcoholic fatty liver disease activity score and 2 were associated with fibrosis stage. In this discovery genome-wide association study, we found significant novel gene effects on histologic traits associated with nonalcoholic fatty liver disease activity score and fibrosis that are distinct from those previously recognized by adult nonalcoholic fatty liver disease genome-wide association studies. Copyright © 2017 Elsevier Inc. All rights reserved.
Ferrari, Raffaele; Hernandez, Dena G; Nalls, Michael A; Rohrer, Jonathan D; Ramasamy, Adaikalavan; Kwok, John B J; Dobson-Stone, Carol; Brooks, William S; Schofield, Peter R; Halliday, Glenda M; Hodges, John R; Piguet, Olivier; Bartley, Lauren; Thompson, Elizabeth; Haan, Eric; Hernández, Isabel; Ruiz, Agustín; Boada, Mercè; Borroni, Barbara; Padovani, Alessandro; Cruchaga, Carlos; Cairns, Nigel J; Benussi, Luisa; Binetti, Giuliano; Ghidoni, Roberta; Forloni, Gianluigi; Galimberti, Daniela; Fenoglio, Chiara; Serpente, Maria; Scarpini, Elio; Clarimón, Jordi; Lleó, Alberto; Blesa, Rafael; Waldö, Maria Landqvist; Nilsson, Karin; Nilsson, Christer; Mackenzie, Ian R A; Hsiung, Ging-Yuek R; Mann, David M A; Grafman, Jordan; Morris, Christopher M; Attems, Johannes; Griffiths, Timothy D; McKeith, Ian G; Thomas, Alan J; Pietrini, P; Huey, Edward D; Wassermann, Eric M; Baborie, Atik; Jaros, Evelyn; Tierney, Michael C; Pastor, Pau; Razquin, Cristina; Ortega-Cubero, Sara; Alonso, Elena; Perneczky, Robert; Diehl-Schmid, Janine; Alexopoulos, Panagiotis; Kurz, Alexander; Rainero, Innocenzo; Rubino, Elisa; Pinessi, Lorenzo; Rogaeva, Ekaterina; St George-Hyslop, Peter; Rossi, Giacomina; Tagliavini, Fabrizio; Giaccone, Giorgio; Rowe, James B; Schlachetzki, Johannes C M; Uphill, James; Collinge, John; Mead, Simon; Danek, Adrian; Van Deerlin, Vivianna M; Grossman, Murray; Trojanowski, John Q; van der Zee, Julie; Deschamps, William; Van Langenhove, Tim; Cruts, Marc; Van Broeckhoven, Christine; Cappa, Stefano F; Le Ber, Isabelle; Hannequin, Didier; Golfier, Véronique; Vercelletto, Martine; Brice, Alexis; Nacmias, Benedetta; Sorbi, Sandro; Bagnoli, Silvia; Piaceri, Irene; Nielsen, Jørgen E; Hjermind, Lena E; Riemenschneider, Matthias; Mayhaus, Manuel; Ibach, Bernd; Gasparoni, Gilles; Pichler, Sabrina; Gu, Wei; Rossor, Martin N; Fox, Nick C; Warren, Jason D; Spillantini, Maria Grazia; Morris, Huw R; Rizzu, Patrizia; Heutink, Peter; Snowden, Julie S; Rollinson, Sara; Richardson, Anna; Gerhard, Alexander; Bruni, Amalia C; Maletta, Raffaele; Frangipane, Francesca; Cupidi, Chiara; Bernardi, Livia; Anfossi, Maria; Gallo, Maura; Conidi, Maria Elena; Smirne, Nicoletta; Rademakers, Rosa; Baker, Matt; Dickson, Dennis W; Graff-Radford, Neill R; Petersen, Ronald C; Knopman, David; Josephs, Keith A; Boeve, Bradley F; Parisi, Joseph E; Seeley, William W; Miller, Bruce L; Karydas, Anna M; Rosen, Howard; van Swieten, John C; Dopper, Elise G P; Seelaar, Harro; Pijnenburg, Yolande A L; Scheltens, Philip; Logroscino, Giancarlo; Capozzo, Rosa; Novelli, Valeria; Puca, Annibale A; Franceschi, Massimo; Postiglione, Alfredo; Milan, Graziella; Sorrentino, Paolo; Kristiansen, Mark; Chiang, Huei-Hsin; Graff, Caroline; Pasquier, Florence; Rollin, Adeline; Deramecourt, Vincent; Lebert, Florence; Kapogiannis, Dimitrios; Ferrucci, Luigi; Pickering-Brown, Stuart; Singleton, Andrew B; Hardy, John; Momeni, Parastoo
Frontotemporal dementia (FTD) is a complex disorder characterised by a broad range of clinical manifestations, differential pathological signatures, and genetic variability. Mutations in three genes-MAPT, GRN, and C9orf72--have been associated with FTD. We sought to identify novel genetic risk loci associated with the disorder. We did a two-stage genome-wide association study on clinical FTD, analysing samples from 3526 patients with FTD and 9402 healthy controls. To reduce genetic heterogeneity, all participants were of European ancestry. In the discovery phase (samples from 2154 patients with FTD and 4308 controls), we did separate association analyses for each FTD subtype (behavioural variant FTD, semantic dementia, progressive non-fluent aphasia, and FTD overlapping with motor neuron disease [FTD-MND]), followed by a meta-analysis of the entire dataset. We carried forward replication of the novel suggestive loci in an independent sample series (samples from 1372 patients and 5094 controls) and then did joint phase and brain expression and methylation quantitative trait loci analyses for the associated (p<5 × 10(-8)) single-nucleotide polymorphisms. We identified novel associations exceeding the genome-wide significance threshold (p<5 × 10(-8)). Combined (joint) analyses of discovery and replication phases showed genome-wide significant association at 6p21.3, HLA locus (immune system), for rs9268877 (p=1·05 × 10(-8); odds ratio=1·204 [95% CI 1·11-1·30]), rs9268856 (p=5·51 × 10(-9); 0·809 [0·76-0·86]) and rs1980493 (p value=1·57 × 10(-8), 0·775 [0·69-0·86]) in the entire cohort. We also identified a potential novel locus at 11q14, encompassing RAB38/CTSC (the transcripts of which are related to lysosomal biology), for the behavioural FTD subtype for which joint analyses showed suggestive association for rs302668 (p=2·44 × 10(-7); 0·814 [0·71-0·92]). Analysis of expression and methylation quantitative trait loci data
Full Text Available Objective Holsteins are known as the world’s highest-milk producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein using Korean Holstein data. Methods This study was performed using single nucleotide polymorphism (SNP chip data (Illumina BovineSNP50 Beadchip of 911 Korean Holstein individuals. We inferred each genomic estimated breeding values based on best linear unbiased prediction (BLUP and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported in the genetic association with milk traits of Holstein. Conclusion This study complements a recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.
Dresch, Jacqueline M; Zellers, Rowan G; Bork, Daniel K; Drewell, Robert A
A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.
Full Text Available Abstract Background Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequences assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.
Jeffrey A Gross
Full Text Available Suicide and suicide attempts are complex behaviors that result from the interaction of different factors, including genetic variants that increase the predisposition to suicidal behaviors. Copy number variations (CNVs are deletions or duplications of a segment of DNA usually larger than one kilobase. These structural genetic changes, although quite rare, have been associated with genetic liability to mental disorders, such as autism, schizophrenia, and bipolar disorder. No genome-wide level studies have been published investigating the potential role of CNVs in suicidal behaviors. Based on single-nucleotide polymorphism array data, we followed the Penn-CNV standards to detect CNVs in 1,608 subjects, comprising 475 suicide and suicide attempt cases and 1,133 controls. Although the initial algorithms determined the presence of CNVs on chromosomes 6 and 12 in seven and eight cases, respectively, compared with none of the controls, visual inspection of the raw data did not support this finding. Furthermore we were unable to validate these findings by CNV-specific real-time polymerase chain reaction. Additionally, rare CNV burden analysis did not find an association between the frequency or length of rare CNVs and suicidal behavior in our sample population. Although our findings suggest CNVs do not play an important role in the etiology of suicidal behaviors, they are not inconsistent with the strong evidence from the literature suggesting that other genetic variants account for a portion of the total phenotypic variability in suicidal behavior.
Chaste, Pauline; Klei, Lambertus; Sanders, Stephan J; Hus, Vanessa; Murtha, Michael T; Lowe, Jennifer K; Willsey, A Jeremy; Moreno-De-Luca, Daniel; Yu, Timothy W; Fombonne, Eric; Geschwind, Daniel; Grice, Dorothy E; Ledbetter, David H; Mane, Shrikant M; Martin, Donna M; Morrow, Eric M; Walsh, Christopher A; Sutcliffe, James S; Lese Martin, Christa; Beaudet, Arthur L; Lord, Catherine; State, Matthew W; Cook, Edwin H; Devlin, Bernie
Phenotypic heterogeneity in autism has long been conjectured to be a major hindrance to the discovery of genetic risk factors, leading to numerous attempts to stratify children based on phenotype to increase power of discovery studies. This approach, however, is based on the hypothesis that phenotypic heterogeneity closely maps to genetic variation, which has not been tested. Our study examines the impact of subphenotyping of a well-characterized autism spectrum disorder (ASD) sample on genetic homogeneity and the ability to discover common genetic variants conferring liability to ASD. Genome-wide genotypic data of 2576 families from the Simons Simplex Collection were analyzed in the overall sample and phenotypic subgroups defined on the basis of diagnosis, IQ, and symptom profiles. We conducted a family-based association study, as well as estimating heritability and evaluating allele scores for each phenotypic subgroup. Association analyses revealed no genome-wide significant association signal. Subphenotyping did not increase power substantially. Moreover, allele scores built from the most associated single nucleotide polymorphisms, based on the odds ratio in the full sample, predicted case status in subsets of the sample equally well and heritability estimates were very similar for all subgroups. In genome-wide association analysis of the Simons Simplex Collection sample, reducing phenotypic heterogeneity had at most a modest impact on genetic homogeneity. Our results are based on a relatively small sample, one with greater homogeneity than the entire population; if they apply more broadly, they imply that analysis of subphenotypes is not a productive path forward for discovering genetic risk variants in ASD. Copyright © 2015 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Kim, Jin C.; Ha, Ye J.; Roh, Seon A.; Cho, Dong H.; Choi, Eun Y.; Kim, Tae W.; Kim, Jong H.; Kang, Tae W.; Kim, Seon Y.; Kim, Yong S.
Purpose: Studies aimed at predicting individual responsiveness to preoperative chemoradiation therapy (CRT) are urgently needed, especially considering the risks associated with poorly responsive patients. Methods and Materials: A 3-step strategy for the determination of CRT sensitivity is proposed based on (1) the screening of a human genome-wide single-nucleotide polymorphism (SNP) array in correlation with histopathologic tumor regression grade (TRG); (2) clinical association analysis of 113 patients treated with preoperative CRT; and (3) a cell-based functional assay for biological validation. Results: Genome-wide screening identified 9 SNPs associated with preoperative CRT responses. Positive responses (TRG 1-3) were obtained more frequently in patients carrying the reference allele (C) of the SNP CORO2A rs1985859 than in those with the substitution allele (T) (P=.01). Downregulation of CORO2A was significantly associated with reduced early apoptosis by 27% (P=.048) and 39% (P=.023) in RKO and COLO320DM colorectal cancer cells, respectively, as determined by flow cytometry. Reduced radiosensitivity was confirmed by colony-forming assays in the 2 colorectal cancer cells (P=.034 and .015, respectively). The SNP FAM101A rs7955740 was not associated with radiosensitivity in the clinical association analysis. However, downregulation of FAM101A significantly reduced early apoptosis by 29% in RKO cells (P=.047), and it enhanced colony formation in RKO cells (P=.001) and COLO320DM cells (P=.002). Conclusion: CRT-sensitive SNP markers were identified using a novel 3-step process. The candidate marker CORO2A rs1985859 and the putative marker FAM101A rs7955740 may be of value for the prediction of radiosensitivity to preoperative CRT, although further validation is needed in large cohorts
DeVilbiss, Andrew W; Sanalkumar, Rajendran; Johnson, Kirby D; Keles, Sunduz; Bresnick, Emery H
Hematopoiesis is an exquisitely regulated process in which stem cells in the developing embryo and the adult generate progenitor cells that give rise to all blood lineages. Master regulatory transcription factors control hematopoiesis by integrating signals from the microenvironment and dynamically establishing and maintaining genetic networks. One of the most rudimentary aspects of cell type-specific transcription factor function, how they occupy a highly restricted cohort of cis-elements in chromatin, remains poorly understood. Transformative technologic advances involving the coupling of next-generation DNA sequencing technology with the chromatin immunoprecipitation assay (ChIP-seq) have enabled genome-wide mapping of factor occupancy patterns. However, formidable problems remain; notably, ChIP-seq analysis yields hundreds to thousands of chromatin sites occupied by a given transcription factor, and only a fraction of the sites appear to be endowed with critical, non-redundant function. It has become en vogue to map transcription factor occupancy patterns genome-wide, while using powerful statistical tools to establish correlations to inform biology and mechanisms. With the advent of revolutionary genome editing technologies, one can now reach beyond correlations to conduct definitive hypothesis testing. This review focuses on key discoveries that have emerged during the path from single loci to genome-wide analyses, specifically in the context of hematopoietic transcriptional mechanisms. Copyright © 2014 ISEH - International Society for Experimental Hematology. Published by Elsevier Inc. All rights reserved.
O'Brien, Katie M; Sandler, Dale P; Shi, Min; Harmon, Quaker E; Taylor, Jack A; Weinberg, Clarice R
Genetic factors likely influence individuals' concentrations of 25-hydroxyvitamin D [25(OH)D], a biomarker of vitamin D exposure previously linked to reduced risk of several chronic diseases. We conducted a genome-wide association study of serum 25(OH)D (assessed using liquid chromatography-tandem mass spectrometry) and 386,449 single nucleotide polymorphisms (SNPs). Our sample consisted of 1,829 participants randomly selected from the Sister Study, a cohort of women who had a sister with breast cancer but had never had breast cancer themselves. 19,741 SNPs were associated with 25(OH)D ( p < 0.05). We re-assessed these hits in an independent sample of 1,534 participants who later developed breast cancer. After pooling, 32 SNPs had genome-wide significant associations ( p < 5 × 10 -8 ). These were located in or near GC , the vitamin D binding protein, or CYP2R1 , a cytochrome P450 enzyme that hydroxylates vitamin D to form 25(OH)D. The top hit was rs4588, a missense GC polymorphism associated with a 3.5 ng/mL decrease in 25(OH)D per copy of the minor allele (95% confidence interval [CI]: -4.1, -3.0; p = 4.5 × 10 -38 ). The strongest SNP near CYP2R1 was rs12794714, a synonymous variant ( p = 3.8 × 10 -12 ; β = 1.8 ng/mL decrease in 25(OH)D per minor allele [CI: -2.2, -1.3]). Serum 25(OH)D concentrations from samples collected from some participants 3-10 years after baseline (811 cases, 780 non-cases) were also strongly associated with both loci. These findings augment our understanding of genetic influences on 25(OH)D and the possible role of vitamin D binding proteins and cytochrome P450 enzymes in determining measured levels. These results may help to identify individuals genetically predisposed to vitamin D insufficiency.
Nadeem, Amina; Mumtaz, Sadaf; Naveed, Abdul Khaliq; Aslam, Muhammad; Siddiqui, Arif; Lodhi, Ghulam Mustafa; Ahmad, Tausif
Inflammation plays a significant role in the etiology of type 2 diabetes mellitus (T2DM). The rise in the pro-inflammatory cytokines is the essential step in glucotoxicity and lipotoxicity induced mitochondrial injury, oxidative stress and beta cell apoptosis in T2DM. Among the recognized markers are interleukin (IL)-6, IL-1, IL-10, IL-18, tissue necrosis factor-alpha (TNF-α), C-reactive protein, resistin, adiponectin, tissue plasminogen activator, fibrinogen and heptoglobins. Diabetes mellitus has firm genetic and very strong environmental influence; exhibiting a polygenic mode of inheritance. Many single nucleotide polymorphisms (SNPs) in various genes including those of pro and anti-inflammatory cytokines have been reported as a risk for T2DM. Not all the SNPs have been confirmed by unifying results in different studies and wide variations have been reported in various ethnic groups. The inter-ethnic variations can be explained by the fact that gene expression may be regulated by gene-gene, gene-environment and gene-nutrient interactions. This review highlights the impact of these interactions on determining the role of single nucleotide polymorphism of IL-6, TNF-α, resistin and adiponectin in pathogenesis of T2DM.
Full Text Available There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates likely functional derived (i.e., non-ancestral, based on the chimp genome single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ~5.5-6.1 million total derived variants, of which ~12,000 are likely to have a functional effect (~5000 coding and ~7000 non-coding. We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.
Full Text Available Bipolar disorder is a common and severe mental illness with unsolved pathophysiology. A genome-wide association study (GWAS has been used to find a number of risk genes, but it is difficult for a GWAS to find genes indirectly associated with a disease. To find core hub genes, we introduce a network analysis after the GWAS was conducted. Six thousand four hundred fifty eight single nucleotide polymorphisms (SNPs with p < 0.01 were sifted out from Wellcome Trust Case Control Consortium (WTCCC dataset and mapped to 2045 genes, which are then compared with the protein–protein network. One hundred twelve genes with a degree >17 were chosen as hub genes from which five significant modules and four core hub genes (FBXL13, WDFY2, bFGF, and MTHFD1L were found. These core hub genes have not been reported to be directly associated with BD but may function by interacting with genes directly related to BD. Our method engenders new thoughts on finding genes indirectly associated with, but important for, complex diseases.
Bradley J Foresman
Full Text Available Barley yellow dwarf viruses (BYDVs are responsible for the disease barley yellow dwarf (BYD and affect many cereals including oat (Avena sativa L.. Until recently, the molecular marker technology in oat has not allowed for many marker-trait association studies to determine the genetic mechanisms for tolerance. A genome-wide association study (GWAS was performed on 428 spring oat lines using a recently developed high-density oat single nucleotide polymorphism (SNP array as well as a SNP-based consensus map. Marker-trait associations were performed using a Q-K mixed model approach to control for population structure and relatedness. Six significant SNP-trait associations representing two QTL were found on chromosomes 3C (Mrg17 and 18D (Mrg04. This is the first report of BYDV tolerance QTL on chromosome 3C (Mrg17 and 18D (Mrg04. Haplotypes using the two QTL were evaluated and distinct classes for tolerance were identified based on the number of favorable alleles. A large number of lines carrying both favorable alleles were observed in the panel.
Nguyen H. Nguyen
Full Text Available The genetic resources available for the commercially important fish species Yellowtail kingfish (YTK (Seriola lalandi are relative sparse. To overcome this, we aimed (1 to develop a linkage map for this species, and (2 to identify markers/variants associated with economically important traits in kingfish (with an emphasis on body weight. Genetic and genomic analyses were conducted using 13,898 single nucleotide polymorphisms (SNPs generated from a new high-throughput genotyping by sequencing platform, Diversity Arrays Technology (DArTseqTM in a pedigreed population comprising 752 animals. The linkage analysis enabled to map about 4,000 markers to 24 linkage groups (LGs, with an average density of 3.4 SNPs per cM. The linkage map was integrated into a genome-wide association study (GWAS and identified six variants/SNPs associated with body weight (P < 5e-8 when a multi-locus mixed model was used. Two out of the six significant markers were mapped to LGs 17 and 23, and collectively they explained 5.8% of the total genetic variance. It is concluded that the newly developed linkage map and the significantly associated markers with body weight provide fundamental information to characterize genetic architecture of growth-related traits in this population of YTK S. lalandi.
Nguyen, Nguyen H; Rastas, Pasi M A; Premachandra, H K A; Knibb, Wayne
The genetic resources available for the commercially important fish species Yellowtail kingfish (YTK) ( Seriola lalandi) are relative sparse. To overcome this, we aimed (1) to develop a linkage map for this species, and (2) to identify markers/variants associated with economically important traits in kingfish (with an emphasis on body weight). Genetic and genomic analyses were conducted using 13,898 single nucleotide polymorphisms (SNPs) generated from a new high-throughput genotyping by sequencing platform, Diversity Arrays Technology (DArTseq TM ) in a pedigreed population comprising 752 animals. The linkage analysis enabled to map about 4,000 markers to 24 linkage groups (LGs), with an average density of 3.4 SNPs per cM. The linkage map was integrated into a genome-wide association study (GWAS) and identified six variants/SNPs associated with body weight ( P 5e -8 ) when a multi-locus mixed model was used. Two out of the six significant markers were mapped to LGs 17 and 23, and collectively they explained 5.8% of the total genetic variance. It is concluded that the newly developed linkage map and the significantly associated markers with body weight provide fundamental information to characterize genetic architecture of growth-related traits in this population of YTK S. lalandi .
Ma, Yu; Coyne, Clarice J; Grusak, Michael A; Mazourek, Michael; Cheng, Peng; Main, Dorrie; McGee, Rebecca J
Marker-assisted breeding is now routinely used in major crops to facilitate more efficient cultivar improvement. This has been significantly enabled by the use of next-generation sequencing technology to identify loci and markers associated with traits of interest. While rich in a range of nutritional components, such as protein, mineral nutrients, carbohydrates and several vitamins, pea (Pisum sativum L.), one of the oldest domesticated crops in the world, remains behind many other crops in the availability of genomic and genetic resources. To further improve mineral nutrient levels in pea seeds requires the development of genome-wide tools. The objectives of this research were to develop these tools by: identifying genome-wide single nucleotide polymorphisms (SNPs) using genotyping by sequencing (GBS); constructing a high-density linkage map and comparative maps with other legumes, and identifying quantitative trait loci (QTL) for levels of boron, calcium, iron, potassium, magnesium, manganese, molybdenum, phosphorous, sulfur, and zinc in the seed, as well as for seed weight. In this study, 1609 high quality SNPs were found to be polymorphic between 'Kiflica' and 'Aragorn', two parents of an F 6 -derived recombinant inbred line (RIL) population. Mapping 1683 markers including 75 previously published markers and 1608 SNPs developed from the present study generated a linkage map of size 1310.1 cM. Comparative mapping with other legumes demonstrated that the highest level of synteny was observed between pea and the genome of Medicago truncatula. QTL analysis of the RIL population across two locations revealed at least one QTL for each of the mineral nutrient traits. In total, 46 seed mineral concentration QTLs, 37 seed mineral content QTLs, and 6 seed weight QTLs were discovered. The QTLs explained from 2.4% to 43.3% of the phenotypic variance. The genome-wide SNPs and the genetic linkage map developed in this study permitted QTL identification for pea seed mineral
de Oliveira, Thais C; Rodrigues, Priscila T; Menezes, Maria José; Gonçalves-Lopes, Raquel M; Bastos, Melissa S; Lima, Nathália F; Barbosa, Susana; Gerber, Alexandra L; Loss de Morais, Guilherme; Berná, Luisa; Phelan, Jody; Robello, Carlos; de Vasconcelos, Ana Tereza R; Alves, João Marcelo P; Ferreira, Marcelo U
The Americas were the last continent colonized by humans carrying malaria parasites. Plasmodium falciparum from the New World shows very little genetic diversity and greater linkage disequilibrium, compared with its African counterparts, and is clearly subdivided into local, highly divergent populations. However, limited available data have revealed extensive genetic diversity in American populations of another major human malaria parasite, P. vivax. We used an improved sample preparation strategy and next-generation sequencing to characterize 9 high-quality P. vivax genome sequences from northwestern Brazil. These new data were compared with publicly available sequences from recently sampled clinical P. vivax isolates from Brazil (BRA, total n = 11 sequences), Peru (PER, n = 23), Colombia (COL, n = 31), and Mexico (MEX, n = 19). We found that New World populations of P. vivax are as diverse (nucleotide diversity π between 5.2 × 10-4 and 6.2 × 10-4) as P. vivax populations from Southeast Asia, where malaria transmission is substantially more intense. They display several non-synonymous nucleotide substitutions (some of them previously undescribed) in genes known or suspected to be involved in antimalarial drug resistance, such as dhfr, dhps, mdr1, mrp1, and mrp-2, but not in the chloroquine resistance transporter ortholog (crt-o) gene. Moreover, P. vivax in the Americas is much less geographically substructured than local P. falciparum populations, with relatively little between-population genome-wide differentiation (pairwise FST values ranging between 0.025 and 0.092). Finally, P. vivax populations show a rapid decline in linkage disequilibrium with increasing distance between pairs of polymorphic sites, consistent with very frequent outcrossing. We hypothesize that the high diversity of present-day P. vivax lineages in the Americas originated from successive migratory waves and subsequent admixture between parasite lineages from geographically diverse sites
Yang, Cheng-Hong; Wu, Kuo-Chuan; Dahms, Hans-Uwe; Chuang, Li-Yeh; Chang, Hsueh-Wei
DNA barcodes are widely used in taxonomy, systematics, species identification, food safety, and forensic science. Most of the conventional DNA barcode sequences contain the whole information of a given barcoding gene. Most of the sequence information does not vary and is uninformative for a given group of taxa within a monophylum. We suggest here a method that reduces the amount of noninformative nucleotides in a given barcoding sequence of a major taxon, like the prokaryotes, or eukaryotic animals, plants, or fungi. The actual differences in genetic sequences, called single nucleotide polymorphism (SNP) genotyping, provide a tool for developing a rapid, reliable, and high-throughput assay for the discrimination between known species. Here, we investigated SNPs as robust markers of genetic variation for identifying different pigeon species based on available cytochrome c oxidase I (COI) data. We propose here a decision tree-based SNP barcoding (DTSB) algorithm where SNP patterns are selected from the DNA barcoding sequence of several evolutionarily related species in order to identify a single species with pigeons as an example. This approach can make use of any established barcoding system. We here firstly used as an example the mitochondrial gene COI information of 17 pigeon species (Columbidae, Aves) using DTSB after sequence trimming and alignment. SNPs were chosen which followed the rule of decision tree and species-specific SNP barcodes. The shortest barcode of about 11 bp was then generated for discriminating 17 pigeon species using the DTSB method. This method provides a sequence alignment and tree decision approach to parsimoniously assign a unique and shortest SNP barcode for any known species of a chosen monophyletic taxon where a barcoding sequence is available.
Stephanie N Lewis
Full Text Available Genome wide association studies (GWAS have proven useful as a method for identifying genetic variations associated with diseases. In this study, we analyzed GWAS data for 61 diseases and phenotypes to elucidate common associations based on single nucleotide polymorphisms (SNP. The study was an expansion on a previous study on identifying disease associations via data from a single GWAS on seven diseases.Adjustments to the originally reported study included expansion of the SNP dataset using Linkage Disequilibrium (LD and refinement of the four levels of analysis to encompass SNP, SNP block, gene, and pathway level comparisons. A pair-wise comparison between diseases and phenotypes was performed at each level and the Jaccard similarity index was used to measure the degree of association between two diseases/phenotypes. Disease relatedness networks (DRNs were used to visualize our results. We saw predominant relatedness between Multiple Sclerosis, type 1 diabetes, and rheumatoid arthritis for the first three levels of analysis. Expected relatedness was also seen between lipid- and blood-related traits.The predominant associations between Multiple Sclerosis, type 1 diabetes, and rheumatoid arthritis can be validated by clinical studies. The diseases have been proposed to share a systemic inflammation phenotype that can result in progression of additional diseases in patients with one of these three diseases. We also noticed unexpected relationships between metabolic and neurological diseases at the pathway comparison level. The less significant relationships found between diseases require a more detailed literature review to determine validity of the predictions. The results from this study serve as a first step towards a better understanding of seemingly unrelated diseases and phenotypes with similar symptoms or modes of treatment.
Frackelton Edward C
Full Text Available Abstract Background Human height is considered highly heritable and correlated with certain disorders, such as type 2 diabetes and cancer. Despite environmental influences, genetic factors are known to play an important role in stature determination. A number of genetic determinants of adult height have already been established through genome wide association studies. Methods To examine 51 single nucleotide polymorphisms (SNPs corresponding to the 46 previously reported genomic loci for height in 8,184 European American children with height measurements. We leveraged genotyping data from our ongoing GWA study of height variation in children in order to query the 51 SNPs in this pediatric cohort. Results Sixteen of these SNPs yielded at least nominally significant association to height, representing fifteen different loci including EFEMP1-PNPT1, GPR126, C6orf173, SPAG17, Histone class 1, HLA class III and GDF5-UQCC. Other loci revealed no evidence for association, including HMGA1 and HMGA2. For the 16 associated variants, the genotype score explained 1.64% of the total variation for height z-score. Conclusion Among 46 loci that have been reported to associate with adult height to date, at least 15 also contribute to the determination of height in childhood.
Shi, Yingyao; Gao, Lingling; Wu, Zhichao; Zhang, Xiaojing; Wang, Mingming; Zhang, Congshun; Zhang, Fan; Zhou, Yongli; Li, Zhikang
Improving the salt tolerance of direct-seeding rice at the seed germination stage is a major breeding goal in many Asian rice-growing countries, where seedlings must often establish in soils with a high salt content. Thus, it is important to understand the genetic mechanisms of salt tolerance in rice and to screen for germplasm with salt tolerance at the seed germination stage. Here, we investigated seven seed germination-related traits under control and salt-stress conditions and conducted a genome-wide association study based on the re-sequencing of 478 diverse rice accessions. The analysis used a mixed linear model and was based on 6,361,920 single nucleotide polymorphisms in 478 rice accessions grouped into whole, indica, and non-indica panels. Eleven loci containing 22 significant salt tolerance-associated single nucleotide polymorphisms were identified based on the stress-susceptibility indices (SSIs) of vigor index (VI) and mean germination time (MGT). From the SSI of VI, six major loci were identified, explaining 20.2% of the phenotypic variation. From the SSI of MGT, five major loci were detected, explaining 26.4% of the phenotypic variation. Of these, seven loci on chromosomes 1, 5, 6, 11, and 12 were close to six previously identified quantitative gene loci/genes related to tolerance to salinity or other abiotic stresses. The strongest association region for the SSI of MGT was identified in a ~ 13.3 kb interval (15450039-15,463,330) on chromosome 1, near salt-tolerance quantitative trait loci controlling the Na + : K + ratio, total Na + uptake, and total K + concentration. The strongest association region for the SSI of VI was detected in a ~ 164.2 kb interval (526662-690,854) on chromosome 2 harboring two nitrate transporter family genes (OsNRT2.1 and OsNRT2.2), which affect gene expression under salt stress. The haplotype analysis indicated that OsNRT2.2 was associated with subpopulation differentiation and its minor/rare tolerant haplotype was
The nature of the single nucleotide polymorphism (SNP) marker was validated by DNA sequencing of the parental PCR products. Using high resolution melt (HRM) profiles and normalised difference plots, we successfully differentiated the homozygous dominant (wild type), homozygous recessive (LPA) and heterozygous ...
Full Text Available Although great progress in genome-wide association studies (GWAS has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked. The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.
Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions;3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507
Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y
This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.
Hand Melanie L
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs provide essential tools for the advancement of research in plant genomics, and the development of SNP resources for many species has been accelerated by the capabilities of second-generation sequencing technologies. The current study aimed to develop and use a novel bioinformatic pipeline to generate a comprehensive collection of SNP markers within the agriculturally important pasture grass tall fescue; an outbreeding allopolyploid species displaying three distinct morphotypes: Continental, Mediterranean and rhizomatous. Results A bioinformatic pipeline was developed that successfully identified SNPs within genotypes from distinct tall fescue morphotypes, following the sequencing of 414 polymerase chain reaction (PCR – generated amplicons using 454 GS FLX technology. Equivalent amplicon sets were derived from representative genotypes of each morphotype, including six Continental, five Mediterranean and one rhizomatous. A total of 8,584 and 2,292 SNPs were identified with high confidence within the Continental and Mediterranean morphotypes respectively. The success of the bioinformatic approach was demonstrated through validation (at a rate of 70% of a subset of 141 SNPs using both SNaPshot™ and GoldenGate™ assay chemistries. Furthermore, the quantitative genotyping capability of the GoldenGate™ assay revealed that approximately 30% of the putative SNPs were accessible to co-dominant scoring, despite the hexaploid genome structure. The sub-genome-specific origin of each SNP validated from Continental tall fescue was predicted using a phylogenetic approach based on comparison with orthologous sequences from predicted progenitor species. Conclusions Using the appropriate bioinformatic approach, amplicon resequencing based on 454 GS FLX technology is an effective method for the identification of polymorphic SNPs within the genomes of Continental and Mediterranean tall fescue. The
Rachel Maree Jones
Full Text Available Research has proposed that autistic-like traits in the general population lie on a continuum, with clinical Autism Spectrum Disorder (ASD representing the extreme end of this distribution. Inherent in this proposal is that biological mechanisms associated with clinical ASD may also underpin variation in autistic-like traits within the general population. A genome-wide association study u