WorldWideScience

Sample records for genome level identifies

  1. Genome-wide association study for levels of total serum IgE identifies HLA-C in a Japanese population.

    Directory of Open Access Journals (Sweden)

    Yohei Yatagai

    Most of the previously reported loci for total immunoglobulin E (IgE) levels are related to Th2 cell-dependent pathways. We undertook a genome-wide association study (GWAS) to identify genetic loci responsible for IgE regulation. A total of 479,940 single nucleotide polymorphisms (SNPs) were tested for association with total serum IgE levels in 1180 Japanese adults. Fine-mapping with SNP imputation demonstrated 6 candidate regions: the PYHIN1/IFI16, MHC classes I and II, LEMD2, GRAMD1B, and chr13:60576338 regions. Replication of these candidate loci in each region was assessed in 2 independent Japanese cohorts (n = 1110 and 1364, respectively). SNP rs3130941 in the HLA-C region was consistently associated with total IgE levels in 3 independent populations, and the meta-analysis yielded genome-wide significance (P = 1.07×10(-10)). Using our GWAS results, we also assessed the reproducibility of previously reported gene associations with total IgE levels. Nine of 32 candidate genes identified by a literature search were associated with total IgE levels after correction for multiple testing. Our findings demonstrate that SNPs in the HLA-C region are strongly associated with total serum IgE levels in the Japanese population and that some of the previously reported genetic associations are replicated across ethnic groups.

  2. Comparative analyses identified species-specific functional roles in oral microbial genomes

    Science.gov (United States)

    Chen, Tsute; Gajare, Prasad; Olsen, Ingar; Dewhirst, Floyd E.

    2017-01-01

    The advent of next generation sequencing is producing more genomic sequences for various strains of many human oral microbial species and allows for insightful functional comparisons at both intra- and inter-species levels. This study performed in-silico functional comparisons for currently available genomic sequences of major species associated with periodontitis including Aggregatibacter actinomycetemcomitans (AA), Porphyromonas gingivalis (PG), Treponema denticola (TD), and Tannerella forsythia (TF), as well as several cariogenic and commensal streptococcal species. Complete or draft sequences were annotated with RAST to infer structured functional subsystems for each genome. The subsystem profiles were clustered into groups of functions with similar patterns. Functional enrichment and depletion were evaluated based on the hypergeometric distribution to identify subsystems that are unique or missing between two groups of genomes. Unique or missing metabolic pathways and biological functions were identified in different species. For example, components involved in flagellar motility were found only in the motile species TD, as expected, with few exceptions scattered in several streptococcal species, likely associated with chemotaxis. Transposable elements were only found in the two Bacteroidales species PG and TF, and half of the AA genomes. Genes involved in CRISPR were prevalent in most oral species. Furthermore, prophage related subsystems were also commonly found in most species except for PG and Streptococcus mutans, in which very few genomes contain prophage components. Comparisons between pathogenic (P) and nonpathogenic (NP) genomes also identified genes potentially important for virulence. Two such comparisons were performed between AA (P) and several A. aphrophilus (NP) strains, and between S. mutans + S. sobrinus (P) and other oral streptococcal species (NP). This comparative genomics approach can be readily used to identify functions unique to
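
    The hypergeometric enrichment and depletion test mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the example counts are hypothetical, and the sketch assumes the simplest setup, in which each genome either contains a subsystem or does not, and one group of genomes is compared against the full collection.

```python
from math import comb


def enrichment_p(total, with_feature, group_size, group_with_feature):
    """Upper-tail hypergeometric probability P(X >= group_with_feature).

    total              -- number of genomes in the whole collection
    with_feature       -- genomes in the collection carrying the subsystem
    group_size         -- genomes in the group of interest
    group_with_feature -- genomes in that group carrying the subsystem
    """
    upper = min(with_feature, group_size)
    tail = sum(
        comb(with_feature, k) * comb(total - with_feature, group_size - k)
        for k in range(group_with_feature, upper + 1)
    )
    return tail / comb(total, group_size)


def depletion_p(total, with_feature, group_size, group_with_feature):
    """Lower-tail hypergeometric probability P(X <= group_with_feature)."""
    tail = sum(
        comb(with_feature, k) * comb(total - with_feature, group_size - k)
        for k in range(0, group_with_feature + 1)
    )
    return tail / comb(total, group_size)


# Hypothetical counts: 20 genomes, 5 carry the subsystem,
# and all 5 carriers fall inside a group of 5 genomes.
p_enriched = enrichment_p(20, 5, 5, 5)
p_depleted = depletion_p(20, 5, 5, 0)
```

    A small enrichment probability suggests the subsystem is over-represented in the group; a small depletion probability suggests it is under-represented. With many subsystems tested, a multiple-testing correction would normally follow.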

  3. Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes.

    Science.gov (United States)

    Ruth, Katherine S; Campbell, Purdey J; Chew, Shelby; Lim, Ee Mun; Hadlow, Narelle; Stuckey, Bronwyn G A; Brown, Suzanne J; Feenstra, Bjarke; Joseph, John; Surdulescu, Gabriela L; Zheng, Hou Feng; Richards, J Brent; Murray, Anna; Spector, Tim D; Wilson, Scott G; Perry, John R B

    2016-02-01

    Genetic factors contribute strongly to sex hormone levels, yet knowledge of the regulatory mechanisms remains incomplete. Genome-wide association studies (GWAS) have identified only a small number of loci associated with sex hormone levels, with several reproductive hormones yet to be assessed. The aim of the study was to identify novel genetic variants contributing to the regulation of sex hormones. We performed GWAS using genotypes imputed from the 1000 Genomes reference panel. The study used genotype and phenotype data from a UK twin register. We included 2913 individuals (up to 294 males) from the Twins UK study, excluding individuals receiving hormone treatment. Phenotypes were standardised for age, sex, BMI, stage of menstrual cycle and menopausal status. We tested 7,879,351 autosomal SNPs for association with levels of dehydroepiandrosterone sulphate (DHEAS), oestradiol, free androgen index (FAI), follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, progesterone, sex hormone-binding globulin and testosterone. Eight independent genetic variants reached genome-wide significance (P<5 × 10(-8)), with minor allele frequencies of 1.3-23.9%. Novel signals included variants for progesterone (P=7.68 × 10(-12)), oestradiol (P=1.63 × 10(-8)) and FAI (P=1.50 × 10(-8)). A genetic variant near the FSHB gene was identified which influenced both FSH (P=1.74 × 10(-8)) and LH (P=3.94 × 10(-9)) levels. A separate locus on chromosome 7 was associated with both DHEAS (P=1.82 × 10(-14)) and progesterone (P=6.09 × 10(-14)). This study highlights loci that are relevant to reproductive function and suggests overlap in the genetic basis of hormone regulation.

  4. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

    Science.gov (United States)

    van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Deelen, Joris; Isaacs, Aaron; Medina-Gomez, Carolina; Mbarek, Hamdi; Kanterakis, Alexandros; Trompet, Stella; Postmus, Iris; Verweij, Niek; van Enckevort, David J.; Huffman, Jennifer E.; White, Charles C.; Feitosa, Mary F.; Bartz, Traci M.; Manichaikul, Ani; Joshi, Peter K.; Peloso, Gina M.; Deelen, Patrick; van Dijk, Freerk; Willemsen, Gonneke; de Geus, Eco J.; Milaneschi, Yuri; Penninx, Brenda W.J.H.; Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Rivadeneira, Fernando; Hofman, Albert; Oostra, Ben A.; Franco, Oscar H.; Leach, Irene Mateo; Beekman, Marian; de Craen, Anton J.M.; Uh, Hae-Won; Trochet, Holly; Hocking, Lynne J.; Porteous, David J.; Sattar, Naveed; Packard, Chris J.; Buckley, Brendan M.; Brody, Jennifer A.; Bis, Joshua C.; Rotter, Jerome I.; Mychaleckyj, Josyf C.; Campbell, Harry; Duan, Qing; Lange, Leslie A.; Wilson, James F.; Hayward, Caroline; Polasek, Ozren; Vitart, Veronique; Rudan, Igor; Wright, Alan F.; Rich, Stephen S.; Psaty, Bruce M.; Borecki, Ingrid B.; Kearney, Patricia M.; Stott, David J.; Adrienne Cupples, L.; Neerincx, Pieter B.T.; Elbers, Clara C.; Francesco Palamara, Pier; Pe'er, Itsik; Abdellaoui, Abdel; Kloosterman, Wigard P.; van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F.J.; Stoneking, Mark; de Knijff, Peter; Kayser, Manfred; Veldink, Jan H.; van den Berg, Leonard H.; Byelas, Heorhiy; den Dunnen, Johan T.; Dijkstra, Martijn; Amin, Najaf; Joeri van der Velde, K.; van Setten, Jessica; Kattenberg, Mathijs; van Schaik, Barbera D.C.; Bot, Jan; Nijman, Isaäc J.; Mei, Hailiang; Koval, Vyacheslav; Ye, Kai; Lameijer, Eric-Wubbo; Moed, Matthijs H.; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Sunyaev, Shamil R.; Sohail, Mashaal; Hormozdiari, Fereydoun; Marschall, Tobias; Schönhuth, Alexander; Guryev, Victor; Suchiman, H. Eka D.; Wolffenbuttel, Bruce H.; Platteel, Mathieu; Pitts, Steven J.; Potluri, Shobha; Cox, David R.; Li, Qibin; Li, Yingrui; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Li, Ning; Cao, Sujie; Wang, Jun; Bovenberg, Jasper A.; Jukema, J. Wouter; van der Harst, Pim; Sijbrands, Eric J.; Hottenga, Jouke-Jan; Uitterlinden, Andre G.; Swertz, Morris A.; van Ommen, Gert-Jan B.; de Bakker, Paul I.W.; Eline Slagboom, P.; Boomsma, Dorret I.; Wijmenga, Cisca; van Duijn, Cornelia M.

    2015-01-01

    Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (~35,000 samples) with the population-specific reference panel created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value <6.61 × 10(-4)), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted to be deleterious. The frequency of this ABCA6 variant is 3.65-fold increased in the Dutch and its effect (βLDL-C=0.135, βTC=0.140) is estimated to be very similar to those observed for single variants in well-known lipid genes, such as LDLR. PMID:25751400

  5. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis and a simpler genotype: WT tumor protein p53, relatively few SNVs (average of 5.9 per megabase), and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  6. Integrated genomic and gene expression profiling identifies two major genomic circuits in urothelial carcinoma.

    Directory of Open Access Journals (Sweden)

    David Lindgren

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21) and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E2F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development.

  7. Use of Whole-Genus Genome Sequence Data To Develop a Multilocus Sequence Typing Tool That Accurately Identifies Yersinia Isolates to the Species and Subspecies Levels

    Science.gov (United States)

    Hall, Miquette; Chattaway, Marie A.; Reuter, Sandra; Savin, Cyril; Strauch, Eckhard; Carniel, Elisabeth; Connor, Thomas; Van Damme, Inge; Rajakaruna, Lakshani; Rajendram, Dunstan; Jenkins, Claire; Thomson, Nicholas R.

    2014-01-01

    The genus Yersinia is a large and diverse bacterial genus consisting of human-pathogenic species, a fish-pathogenic species, and a large number of environmental species. Recently, the phylogenetic and population structure of the entire genus was elucidated through the genome sequence data of 241 strains encompassing every known species in the genus. Here we report the mining of this enormous data set to create a multilocus sequence typing-based scheme that can identify Yersinia strains to the species level with a resolution equal to that of whole-genome sequencing. Our assay is designed to be able to accurately subtype the important human-pathogenic species Yersinia enterocolitica to whole-genome resolution levels. We also report the validation of the scheme on 386 strains from reference laboratory collections across Europe. We propose that the scheme is an important molecular typing system to allow accurate and reproducible identification of Yersinia isolates to the species level, a process often inconsistent in nonspecialist laboratories. Additionally, our assay is the most phylogenetically informative typing scheme available for Y. enterocolitica. PMID:25339391

  8. TCGA study identifies genomic features of cervical cancer

    Science.gov (United States)

    Investigators with The Cancer Genome Atlas (TCGA) Research Network have identified novel genomic and molecular characteristics of cervical cancer that will aid in subclassification of the disease and may help target therapies that are most appropriate for each patient.

  9. Novel genomes and genome constitutions identified by GISH and 5S rDNA and knotted1 genomic sequences in the genus Setaria.

    Science.gov (United States)

    Zhao, Meicheng; Zhi, Hui; Doust, Andrew N; Li, Wei; Wang, Yongfang; Li, Haiquan; Jia, Guanqing; Wang, Yongqiang; Zhang, Ning; Diao, Xianmin

    2013-04-11

    The Setaria genus is increasingly of interest to researchers, as its two species, S. viridis and S. italica, are being developed as models for understanding C4 photosynthesis and plant functional genomics. The genome constitution of Setaria species has been studied in the diploid species S. viridis, S. adhaerens and S. grisebachii, where three genomes A, B and C were identified respectively. Two allotetraploid species, S. verticillata and S. faberi, were found to have AABB genomes, and one autotetraploid species, S. queenslandica, with an AAAA genome, has also been identified. The genomes and genome constitutions of most other species remain unknown, even though the genus is thought to contain approximately 125 species distributed worldwide. GISH was performed to detect the genome constitutions of the Eurasian species S. glauca, S. plicata, and S. arenaria, with the known A, B and C genomes as probes. No or very poor hybridization signal was detected, indicating that their genomes are different from those already described. GISH was also performed reciprocally between S. glauca, S. plicata, and S. arenaria genomes, but no hybridization signals between each other were found. The two sets of chromosomes of S. lachnea both hybridized strongly only with the known C genome of S. grisebachii. Chromosomes of Qing 9, an accession formerly considered as S. viridis, hybridized strongly only to the B genome of S. adhaerens. Phylogenetic trees constructed with 5S rDNA and knotted1 markers clearly classify the samples in this study into six clusters, matching the GISH results, and suggesting that the F genome of S. arenaria is basal in the genus. Three novel genomes in the Setaria genus were identified and designated as genome D (S. glauca), E (S. plicata) and F (S. arenaria) respectively. The genome constitution of tetraploid S. lachnea is putatively CCC'C'. Qing 9 is a B genome species indigenous to China and is hypothesized to be a newly identified species. The

  10. Genomic suppression subtractive hybridization as a tool to identify differences in mycorrhizal fungal genomes.

    Science.gov (United States)

    Murat, Claude; Zampieri, Elisa; Vallino, Marta; Daghino, Stefania; Perotto, Silvia; Bonfante, Paola

    2011-05-01

    Characterization of genomic variation among different microbial species, or different strains of the same species, is a field of significant interest with a wide range of potential applications. We have investigated the genomic variation in mycorrhizal fungal genomes through genomic suppressive subtractive hybridization. The comparison was between phylogenetically distant and close truffle species (Tuber spp.), and between isolates of the ericoid mycorrhizal fungus Oidiodendron maius featuring different degrees of metal tolerance. In the interspecies experiment, almost all the sequences that were identified in the Tuber melanosporum genome and absent in Tuber borchii and Tuber indicum corresponded to transposable elements. In the intraspecies comparison, some specific sequences corresponded to regions coding for enzymes, among them a glutathione synthetase known to be involved in metal tolerance. This approach is a quick and rather inexpensive tool to develop molecular markers for mycorrhizal fungi tracking and barcoding, to identify functional genes and to investigate the genome plasticity, adaptation and evolution. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  11. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    Science.gov (United States)

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina Infinium HumanMethylation450 BeadChip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2'-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2'-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  12. Whole genome DNA copy number changes identified by high density oligonucleotide arrays

    Directory of Open Access Journals (Sweden)

    Huang Jing

    2004-05-01

    Changes in DNA copy number are one of the hallmarks of the genetic instability common to most human cancers. Previous micro-array-based methods have been used to identify chromosomal gains and losses; however, they are unable to genotype alleles at the level of single nucleotide polymorphisms (SNPs). Here we describe a novel algorithm that uses a recently developed high-density oligonucleotide array-based SNP genotyping method, whole genome sampling analysis (WGSA), to identify genome-wide chromosomal gains and losses at high resolution. WGSA simultaneously genotypes over 10,000 SNPs by allele-specific hybridisation to perfect match (PM) and mismatch (MM) probes synthesised on a single array. The copy number algorithm jointly uses PM intensity and discrimination ratios between paired PM and MM intensity values to identify and estimate genetic copy number changes. Values from an experimental sample are compared with SNP-specific distributions derived from a reference set containing over 100 normal individuals to gain statistical power. Genomic regions with statistically significant copy number changes can be identified using both single point analysis and contiguous point analysis of SNP intensities. We identified multiple regions of amplification and deletion using a panel of human breast cancer cell lines. We verified these results using an independent method based on quantitative polymerase chain reaction and found that our approach is both sensitive and specific and can tolerate samples which contain a mixture of both tumour and normal DNA. In addition, by using known allele frequencies from the reference set, statistically significant genomic intervals can be identified containing contiguous stretches of homozygous markers, potentially allowing the detection of regions undergoing loss of heterozygosity (LOH) without the need for a matched normal control sample. The coupling of LOH analysis, via SNP genotyping, with copy number

  13. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

    DEFF Research Database (Denmark)

    Van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Deelen, Joris

    2015-01-01

    created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value <6.61 × 10(-4)), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted...

  14. Limitations of variable number of tandem repeat typing identified through whole genome sequencing of Mycobacterium avium subsp. paratuberculosis on a national and herd level.

    Science.gov (United States)

    Ahlstrom, Christina; Barkema, Herman W; Stevenson, Karen; Zadoks, Ruth N; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P; McKenna, Shawn L B; De Buck, Jeroen

    2015-03-08

    Mycobacterium avium subsp. paratuberculosis (MAP), the causative bacterium of Johne's disease in dairy cattle, is widespread in the Canadian dairy industry and has significant economic and animal welfare implications. An understanding of the population dynamics of MAP can be used to identify introduction events, improve control efforts and target transmission pathways, although this requires an adequate understanding of MAP diversity and distribution between herds and across the country. Whole genome sequencing (WGS) offers a detailed assessment of the SNP-level diversity and genetic relationship of isolates, whereas several molecular typing techniques used to investigate the molecular epidemiology of MAP, such as variable number of tandem repeat (VNTR) typing, target relatively unstable repetitive elements in the genome that may be too unpredictable to draw accurate conclusions. The objective of this study was to evaluate the diversity of bovine MAP isolates in Canadian dairy herds using WGS and then determine if VNTR typing can distinguish truly related and unrelated isolates. Phylogenetic analysis based on 3,039 SNPs identified through WGS of 124 MAP isolates identified eight genetically distinct subtypes in dairy herds from seven Canadian provinces, with the dominant type including over 80% of MAP isolates. VNTR typing of 527 MAP isolates identified 12 types, including "bison type" isolates, from seven different herds. At a national level, MAP isolates differed from each other by 1-2 to 239-240 SNPs, regardless of whether they belonged to the same or different VNTR types. A herd-level analysis of MAP isolates demonstrated that VNTR typing may both over-estimate and under-estimate the relatedness of MAP isolates found within a single herd. The presence of multiple MAP subtypes in Canada suggests multiple introductions into the country including what has now become one dominant type, an important finding for Johne's disease control. VNTR typing often failed to

  15. Computational approaches to identify functional genetic variants in cancer genomes

    DEFF Research Database (Denmark)

    Gonzalez-Perez, Abel; Mustonen, Ville; Reva, Boris

    2013-01-01

    The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.

  16. A genome-wide association study identifies rs2000999 as a strong genetic determinant of circulating haptoglobin levels.

    Directory of Open Access Journals (Sweden)

    Philippe Froguel

    Haptoglobin is an acute phase inflammatory marker. Its main function is to bind hemoglobin released from erythrocytes to aid its elimination, and thereby haptoglobin prevents the generation of reactive oxygen species in the blood. Haptoglobin levels have been repeatedly associated with a variety of inflammation-linked infectious and non-infectious diseases, including malaria, tuberculosis, human immunodeficiency virus, hepatitis C, diabetes, carotid atherosclerosis, and acute myocardial infarction. However, a comprehensive genetic assessment of the inter-individual variability of circulating haptoglobin levels has not been conducted so far. We used a genome-wide association study initially conducted in 631 French children followed by a replication in three additional European sample sets and we identified a common single nucleotide polymorphism (SNP), rs2000999, located in the haptoglobin gene (HP), as a strong genetic predictor of circulating haptoglobin levels (P(overall) = 8.1 × 10(-59)), explaining 45.4% of its genetic variability (11.8% of Hp global variance). The functional relevance of rs2000999 was further demonstrated by its specific association with HP mRNA levels (β = 0.23 ± 0.08, P = 0.007). Finally, SNP rs2000999 was associated with decreased total and low-density lipoprotein cholesterol in 8,789 European children (P(total cholesterol) = 0.002 and P(LDL) = 0.0008). Given the central position of haptoglobin in many inflammation-related metabolic pathways, the relevance of rs2000999 genotyping when evaluating haptoglobin concentration should be further investigated in order to improve its diagnostic/therapeutic and/or prevention impact.

  17. A genome-wide association study identifies protein quantitative trait loci (pQTLs).

    Directory of Open Access Journals (Sweden)

    David Melzer

    2008-05-01

    There is considerable evidence that human genetic variation influences gene expression. Genome-wide studies have revealed that mRNA levels are associated with genetic variation in or close to the gene coding for those mRNA transcripts - cis effects, and elsewhere in the genome - trans effects. The role of genetic variation in determining protein levels has not been systematically assessed. Using a genome-wide association approach we show that common genetic variation influences levels of clinically relevant proteins in human serum and plasma. We evaluated the role of 496,032 polymorphisms on levels of 42 proteins measured in 1200 fasting individuals from the population based InCHIANTI study. Proteins included insulin, several interleukins, adipokines, chemokines, and liver function markers that are implicated in many common diseases including metabolic, inflammatory, and infectious conditions. We identified eight cis effects, including variants in or near the IL6R (p = 1.8x10(-57)), CCL4L1 (p = 3.9x10(-21)), IL18 (p = 6.8x10(-13)), LPA (p = 4.4x10(-10)), GGT1 (p = 1.5x10(-7)), SHBG (p = 3.1x10(-7)), CRP (p = 6.4x10(-6)) and IL1RN (p = 7.3x10(-6)) genes, all associated with their respective protein products with effect sizes ranging from 0.19 to 0.69 standard deviations per allele. Mechanisms implicated include altered rates of cleavage of bound to unbound soluble receptor (IL6R), altered secretion rates of different sized proteins (LPA), variation in gene copy number (CCL4L1) and altered transcription (GGT1). We identified one novel trans effect that was an association between ABO blood group and tumour necrosis factor alpha (TNF-alpha) levels (p = 6.8x10(-40)), but this finding was not present when TNF-alpha was measured using a different assay, or in a second study, suggesting an assay-specific association. Our results show that protein levels share some of the features of the genetics of gene expression. These include the presence of strong genetic effects in cis

  18. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  19. Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.

    Science.gov (United States)

    Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris

    2004-07-14

    With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
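    The nucleotide-level comparison described above can be illustrated with a minimal sketch. This is not Base-By-Base code (the tool itself is a Java application); the function name `nucleotide_differences` and the two short aligned sequences are hypothetical, and the sketch assumes the sequences are already aligned to equal length with `-` marking gaps.

```python
from typing import List, Tuple


def nucleotide_differences(ref: str, other: str) -> List[Tuple[int, str, str]]:
    """Return (1-based alignment column, ref base, other base) for each mismatch.

    Both inputs are assumed to be rows of the same alignment, so they must
    have equal length; '-' denotes a gap.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    for col, (a, b) in enumerate(zip(ref.upper(), other.upper()), start=1):
        if a != b:
            diffs.append((col, a, b))
    return diffs


# Hypothetical aligned fragments, for illustration only.
aligned_a = "ATGGC-TTA"
aligned_b = "ATGACGTTA"

for col, a, b in nucleotide_differences(aligned_a, aligned_b):
    # A gap on either side is an insertion/deletion; otherwise a substitution.
    kind = "indel" if "-" in (a, b) else "substitution"
    print(col, a, b, kind)
```

    A real tool would additionally map each column back to gene annotations to decide whether a difference falls in a promoter or coding region; that step is omitted here.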

  20. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed Missael Vargas

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence...... data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery...

  1. MVisAGe Identifies Concordant and Discordant Genomic Alterations of Driver Genes in Squamous Tumors.

    Science.gov (United States)

    Walter, Vonn; Du, Ying; Danilova, Ludmila; Hayward, Michele C; Hayes, D Neil

    2018-06-15

    Integrated analyses of multiple genomic datatypes are now common in cancer profiling studies. Such data present opportunities for numerous computational experiments, yet analytic pipelines are limited. Tools such as the cBioPortal and Regulome Explorer, although useful, are not easy to access programmatically or to implement locally. Here, we introduce the MVisAGe R package, which allows users to quantify gene-level associations between two genomic datatypes to investigate the effect of genomic alterations (e.g., DNA copy number changes) on gene expression. Visualizing Pearson/Spearman correlation coefficients according to the genomic positions of the underlying genes provides a powerful yet novel tool for conducting exploratory analyses. We demonstrate its utility by analyzing three publicly available cancer datasets. Our approach highlights canonical oncogenes in chr11q13 that displayed the strongest associations between expression and copy number, including CCND1 and CTTN, genes not identified by copy number analysis in the primary reports. We demonstrate highly concordant usage of shared oncogenes on chr3q, yet strikingly diverse oncogene usage on chr11q as a function of HPV infection status. Regions of chr19 that display remarkable associations between methylation and gene expression were identified, as were previously unreported miRNA-gene expression associations that may contribute to the epithelial-to-mesenchymal transition. Significance: This study presents an important bioinformatics tool that will enable integrated analyses of multiple genomic datatypes. Cancer Res; 78(12); 3375-85. ©2018 American Association for Cancer Research.
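    The gene-level association idea in this abstract can be sketched as follows. This is not the MVisAGe API (MVisAGe is an R package); it is an illustrative Python sketch that computes a Pearson correlation between copy number and expression for each gene and orders the results by genomic position. The function names, the gene labels `GENE_A`/`GENE_B`, the positions, and all data values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant sample
    return sxy / math.sqrt(sxx * syy)


def gene_level_correlations(
    copy_number: Dict[str, Sequence[float]],
    expression: Dict[str, Sequence[float]],
    position: Dict[str, int],
) -> List[Tuple[str, int, float]]:
    """Return (gene, position, r) for genes present in all inputs, sorted by position."""
    shared = set(copy_number) & set(expression) & set(position)
    rows = [(g, position[g], pearson(copy_number[g], expression[g])) for g in shared]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical per-sample measurements (four samples per gene).
cn = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4]}
ex = {"GENE_A": [2, 4, 6, 8], "GENE_B": [8, 6, 4, 2]}
pos = {"GENE_A": 2000, "GENE_B": 1000}

for gene, start, r in gene_level_correlations(cn, ex, pos):
    print(gene, start, round(r, 3))
```

    Ordering by position is what makes the coefficients plottable along a chromosome, so that neighbouring genes with strong copy number to expression coupling stand out as a region.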

  2. Base-By-Base: Single nucleotide-level analysis of whole viral genome alignments

    Directory of Open Access Journals (Sweden)

    Tcherepanov Vasily

    2004-07-01

    Background: With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. Results: A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Conclusion: Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  3. Genomic Regions Affecting Cheese Making Properties Identified in Danish Holsteins

    DEFF Research Database (Denmark)

    Gregersen, Vivi Raundahl; Bertelsen, Henriette Pasgaard; Poulsen, Nina Aagaard

    The cheese renneting process is affected by a number of factors associated to milk composition and a number of Danish Holsteins has previously been identified to have poor milk coagulation ability. Therefore, the aim of this study was to identify genomic regions affecting the technological...

  4. Patterns of genomic variation in the poplar rust fungus Melampsora larici-populina identify pathogenesis-related factors

    Directory of Open Access Journals (Sweden)

    Antoine ePersoons

    2014-09-01

    Melampsora larici-populina is a fungal pathogen responsible for foliar rust disease on poplar trees, which causes damage to forest plantations worldwide, particularly in Northern Europe. The reference genome of the isolate 98AG31 was previously sequenced using a whole genome shotgun strategy, revealing a large genome of 101 megabases containing 16,399 predicted genes, which included secreted protein genes representing poplar rust candidate effectors. In the present study, the genomes of 15 isolates collected over the past 20 years throughout the French territory, representing distinct virulence profiles, were characterized by massively parallel sequencing to assess genetic variation in the poplar rust fungus. Comparison to the reference genome revealed striking structural variations. Analysis of coverage and sequencing depth identified large missing regions between isolates related to the mating type loci. More than 611,824 single-nucleotide polymorphism (SNP) positions were uncovered overall, indicating a remarkable level of polymorphism. Based on the accumulation of non-synonymous substitutions in coding sequences and the relative frequencies of synonymous and non-synonymous polymorphisms (i.e. PN/PS), we identify candidate genes that may be involved in fungal pathogenesis. Correlation between non-synonymous SNPs in genes encoding secreted proteins and pathotypes of the studied isolates revealed candidate genes potentially related to virulences 1, 6 and 8 of the poplar rust fungus.

  5. Genome-wide Association Study Identifies New Loci for Resistance to Leptosphaeria maculans in Canola

    Directory of Open Access Journals (Sweden)

    Harsh Raman

    2016-10-01

    Blackleg, caused by Leptosphaeria maculans, is a significant disease which affects the sustainable production of canola. This study reports a genome-wide association study based on 18,804 polymorphic SNPs to identify loci associated with qualitative and quantitative resistance to L. maculans. Genomic regions delimited with 503 significant SNP markers that are associated with resistance, evaluated using 12 single spore isolates and pathotypes from four canola stubble, were identified. Several significant associations were detected at known disease resistance loci including in the vicinity of recently cloned Rlm2/LepR3 genes, and at new loci on chromosomes A01/C01, A02/C02, A03/C03, A05/C05, A06, A08, and A09. In addition, we validated statistically significant associations on A01, A07 and A10 in four genetic mapping populations, demonstrating that GWAS marker loci are indeed associated with resistance to L. maculans. One of the novel loci identified for the first time, Rlm12, conveys adult plant resistance and mapped within 13.2 kb from Arabidopsis R gene of TIR-NBS class. We showed that resistance loci are located in the vicinity of R genes of A. thaliana and B. napus on the sequenced genome of B. napus cv. Darmor-bzh. Significantly associated SNP markers provide a valuable tool to enrich germplasm for favorable alleles in order to improve the level of resistance to L. maculans in canola.

  6. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

    Science.gov (United States)

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

    2018-05-01

    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
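    Single-linkage clustering at a SNP threshold, as used in this study, can be sketched in a few lines. This is an illustrative sketch, not the pipeline used by the authors: the function names, the isolate labels, and the toy sequences are hypothetical, and the sketch assumes each isolate is represented by a string of calls at the same variant positions. Two isolates are joined when they differ at no more than `threshold` positions, and clusters are the connected groups that result.

```python
from itertools import combinations
from typing import Dict, List


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length call strings differ."""
    if len(a) != len(b):
        raise ValueError("call strings must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def single_linkage_clusters(isolates: Dict[str, str], threshold: int) -> List[List[str]]:
    """Group isolates so that each member is within `threshold` SNPs of at least
    one other member of its cluster (single linkage), using union-find."""
    names = sorted(isolates)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if snp_distance(isolates[a], isolates[b]) <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters: Dict[str, List[str]] = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(clusters.values())


# Hypothetical isolates with calls at ten variant positions.
isolates = {
    "iso1": "AAAAAAAAAA",
    "iso2": "AAAAAAAAAT",
    "iso3": "AAAAAAAATT",
    "iso4": "TTTTTTTAAA",
}

for t in (10, 5, 0):
    print(t, single_linkage_clusters(isolates, t))
```

    Note that with single linkage, iso1 and iso3 end up in the same 1-SNP cluster even though they differ by 2 SNPs, because iso2 links them; this chaining is one reason the choice of threshold affects how clusters should be interpreted.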

  7. Comparative genomic analysis identified a mutation related to enhanced heterologous protein production in the filamentous fungus Aspergillus oryzae.

    Science.gov (United States)

    Jin, Feng-Jie; Katayama, Takuya; Maruyama, Jun-Ichi; Kitamoto, Katsuhiko

    2016-11-01

    Genomic mapping of mutations using next-generation sequencing technologies has facilitated the identification of genes contributing to fundamental biological processes, including human diseases. However, few studies have used this approach to identify mutations contributing to heterologous protein production in industrial strains of filamentous fungi, such as Aspergillus oryzae. In a screening of A. oryzae strains that hyper-produce human lysozyme (HLY), we previously isolated an AUT1 mutant that showed higher production of various heterologous proteins; however, the underlying factors contributing to the increased heterologous protein production remained unclear. Here, using a comparative genomic approach performed with whole-genome sequences, we attempted to identify the genes responsible for the high-level production of heterologous proteins in the AUT1 mutant. The comparative sequence analysis led to the detection of a gene (AO090120000003), designated autA, which was predicted to encode an unknown cytoplasmic protein containing an alpha/beta-hydrolase fold domain. Mutation or deletion of autA was associated with higher production levels of HLY. Specifically, the HLY yields of the autA mutant and deletion strains were twofold higher than that of the control strain during the early stages of cultivation. Taken together, these results indicate that combining classical mutagenesis approaches with comparative genomic analysis facilitates the identification of novel genes involved in heterologous protein production in filamentous fungi.

  8. A Drosophila Genome-Wide Screen Identifies Regulators of Steroid Hormone Production and Developmental Timing

    DEFF Research Database (Denmark)

    Thomas Danielsen, E.; E. Møller, Morten; Yamanaka, Naoki

    2016-01-01

    Steroid hormones control important developmental processes and are linked to many diseases. To systematically identify genes and pathways required for steroid production, we performed a Drosophila genome-wide in vivo RNAi screen and identified 1,906 genes with potential roles in steroidogenesis...... and developmental timing. Here, we use our screen as a resource to identify mechanisms regulating intracellular levels of cholesterol, a substrate for steroidogenesis. We identify a conserved fatty acid elongase that underlies a mechanism that adjusts cholesterol trafficking and steroidogenesis with nutrition...... and developmental programs. In addition, we demonstrate the existence of an autophagosomal cholesterol mobilization mechanism and show that activation of this system rescues Niemann-Pick type C1 deficiency that causes a disorder characterized by cholesterol accumulation. These cholesterol-trafficking mechanisms...

  9. Genome-wide methylation analysis identifies a core set of hypermethylated genes in CIMP-H colorectal cancer.

    Science.gov (United States)

    McInnes, Tyler; Zou, Donghui; Rao, Dasari S; Munro, Francesca M; Phillips, Vicky L; McCall, John L; Black, Michael A; Reeve, Anthony E; Guilford, Parry J

    2017-03-28

    Aberrant DNA methylation profiles are a characteristic of all known cancer types, epitomized by the CpG island methylator phenotype (CIMP) in colorectal cancer (CRC). Hypermethylation has been observed at CpG islands throughout the genome, but it is unclear which factors determine whether an individual island becomes methylated in cancer. DNA methylation in CRC was analysed using the Illumina HumanMethylation450K array. Differentially methylated loci were identified using Significance Analysis of Microarrays (SAM) and the Wilcoxon Signed Rank (WSR) test. Unsupervised hierarchical clustering was used to identify methylation subtypes in CRC. In this study we characterized the DNA methylation profiles of 94 CRC tissues and their matched normal counterparts. Consistent with previous studies, unsupervised hierarchical clustering of genome-wide methylation data identified three subtypes within the tumour samples, designated CIMP-H, CIMP-L and CIMP-N, that showed high, low and very low methylation levels, respectively. Differential methylation between normal and tumour samples was analysed at the individual CpG level, and at the gene level. The distribution of hypermethylation in CIMP-N tumours showed high inter-tumour variability and appeared to be highly stochastic in nature, whereas CIMP-H tumours exhibited consistent hypermethylation at a subset of genes, in addition to a highly variable background of hypermethylated genes. EYA4, TFPI2 and TLX1 were hypermethylated in more than 90% of all tumours examined. One-hundred thirty-two genes were hypermethylated in 100% of CIMP-H tumours studied and these were highly enriched for functions relating to skeletal system development (Bonferroni adjusted p value = 2.88E-15), segment specification (adjusted p value = 9.62E-11), embryonic development (adjusted p value = 1.52E-04), mesoderm development (adjusted p value = 1.14E-20), and ectoderm development (adjusted p value = 7.94E-16). Our genome-wide characterization of DNA

  10. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein...

  11. Identifying elemental genomic track types and representing them uniformly

    Directory of Open Access Journals (Sweden)

    Gundersen Sveinung

    2011-12-01

    Background: With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated. Results: We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0. Conclusions: The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience.

  12. Identifying artificial selection signals in the chicken genome.

    Directory of Open Access Journals (Sweden)

    Yunlong Ma

    Identifying the signals of artificial selection can contribute to further shaping economically important traits. Here, a chicken 600k SNP-array was employed to detect the signals of artificial selection using 331 individuals from 9 breeds, including Jingfen (JF), Jinghong (JH), Araucanas (AR), White Leghorn (WL), Pekin-Bantam (PB), Shamo (SH), Gallus-Gallus-Spadiceus (GA), Rheinlander (RH) and Vorwerkhuhn (VO). Per the population genetic structure, 9 breeds were combined into 5 breed-pools, and a 'two-step' strategy was used to reveal the signals of artificial selection. GA, which has little artificial selection, was defined as the reference population, and a total of 204, 155, 305 and 323 potential artificial selection signals were identified in AR_VO, PB, RH_WL and JH_JF, respectively. We also found signals derived from standing and de-novo genetic variations have contributed to adaptive evolution during artificial selection. Further enrichment analysis suggests that the genomic regions of artificial selection signals harbour genes, including THSR, PTHLH and PMCH, responsible for economic traits, such as fertility, growth and immunization. Overall, this study found a series of genes that contribute to the improvement of chicken breeds and revealed the genetic mechanisms of adaptive evolution, which can be used as fundamental information in future chicken functional genomics study.

  13. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    Directory of Open Access Journals (Sweden)

    Rachel A Mann

    The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea) and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  14. Genome-wide analysis identifies 12 loci influencing human reproductive behavior

    Science.gov (United States)

    Barban, Nicola; Jansen, Rick; de Vlaming, Ronald; Vaez, Ahmad; Mandemakers, Jornt J.; Tropf, Felix C.; Shen, Xia; Wilson, James F.; Chasman, Daniel I.; Nolte, Ilja M.; Tragante, Vinicius; van der Laan, Sander W.; Perry, John R. B.; Kong, Augustine; Ahluwalia, Tarunveer; Albrecht, Eva; Yerges-Armstrong, Laura; Atzmon, Gil; Auro, Kirsi; Ayers, Kristin; Bakshi, Andrew; Ben-Avraham, Danny; Berger, Klaus; Bergman, Aviv; Bertram, Lars; Bielak, Lawrence F.; Bjornsdottir, Gyda; Bonder, Marc Jan; Broer, Linda; Bui, Minh; Barbieri, Caterina; Cavadino, Alana; Chavarro, Jorge E; Turman, Constance; Concas, Maria Pina; Cordell, Heather J.; Davies, Gail; Eibich, Peter; Eriksson, Nicholas; Esko, Tõnu; Eriksson, Joel; Falahi, Fahimeh; Felix, Janine F.; Fontana, Mark Alan; Franke, Lude; Gandin, Ilaria; Gaskins, Audrey J.; Gieger, Christian; Gunderson, Erica P.; Guo, Xiuqing; Hayward, Caroline; He, Chunyan; Hofer, Edith; Huang, Hongyan; Joshi, Peter K.; Kanoni, Stavroula; Karlsson, Robert; Kiechl, Stefan; Kifley, Annette; Kluttig, Alexander; Kraft, Peter; Lagou, Vasiliki; Lecoeur, Cecile; Lahti, Jari; Li-Gao, Ruifang; Lind, Penelope A.; Liu, Tian; Makalic, Enes; Mamasoula, Crysovalanto; Matteson, Lindsay; Mbarek, Hamdi; McArdle, Patrick F.; McMahon, George; Meddens, S. Fleur W.; Mihailov, Evelin; Miller, Mike; Missmer, Stacey A.; Monnereau, Claire; van der Most, Peter J.; Myhre, Ronny; Nalls, Mike A.; Nutile, Teresa; Panagiota, Kalafati Ioanna; Porcu, Eleonora; Prokopenko, Inga; Rajan, Kumar B.; Rich-Edwards, Janet; Rietveld, Cornelius A.; Robino, Antonietta; Rose, Lynda M.; Rueedi, Rico; Ryan, Kathy; Saba, Yasaman; Schmidt, Daniel; Smith, Jennifer A.; Stolk, Lisette; Streeten, Elizabeth; Tonjes, Anke; Thorleifsson, Gudmar; Ulivi, Sheila; Wedenoja, Juho; Wellmann, Juergen; Willeit, Peter; Yao, Jie; Yengo, Loic; Zhao, Jing Hua; Zhao, Wei; Zhernakova, Daria V.; Amin, Najaf; Andrews, Howard; Balkau, Beverley; Barzilai, Nir; Bergmann, Sven; Biino, Ginevra; Bisgaard, Hans; Bønnelykke, Klaus; Boomsma, Dorret I.; Buring, Julie E.; Campbell, Harry; Cappellani, Stefania; Ciullo, Marina; Cox, Simon R.; Cucca, Francesco; Daniela, Toniolo; Davey-Smith, George; Deary, Ian J.; Dedoussis, George; Deloukas, Panos; van Duijn, Cornelia M.; de Geus, Eco JC.; Eriksson, Johan G.; Evans, Denis A.; Faul, Jessica D.; Felicita, Sala Cinzia; Froguel, Philippe; Gasparini, Paolo; Girotto, Giorgia; Grabe, Hans-Jörgen; Greiser, Karin Halina; Groenen, Patrick J.F.; de Haan, Hugoline G.; Haerting, Johannes; Harris, Tamara B.; Heath, Andrew C.; Heikkilä, Kauko; Hofman, Albert; Homuth, Georg; Holliday, Elizabeth G; Hopper, John; Hypponen, Elina; Jacobsson, Bo; Jaddoe, Vincent W. V.; Johannesson, Magnus; Jugessur, Astanand; Kähönen, Mika; Kajantie, Eero; Kardia, Sharon L.R.; Keavney, Bernard; Kolcic, Ivana; Koponen, Päivikki; Kovacs, Peter; Kronenberg, Florian; Kutalik, Zoltan; La Bianca, Martina; Lachance, Genevieve; Iacono, William; Lai, Sandra; Lehtimäki, Terho; Liewald, David C; Lindgren, Cecilia; Liu, Yongmei; Luben, Robert; Lucht, Michael; Luoto, Riitta; Magnus, Per; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; McQuillan, Ruth; Medland, Sarah E.; Meisinger, Christa; Mellström, Dan; Metspalu, Andres; Michela, Traglia; Milani, Lili; Mitchell, Paul; Montgomery, Grant W.; Mook-Kanamori, Dennis; de Mutsert, Renée; Nohr, Ellen A; Ohlsson, Claes; Olsen, Jørn; Ong, Ken K.; Paternoster, Lavinia; Pattie, Alison; Penninx, Brenda WJH; Perola, Markus; Peyser, Patricia A.; Pirastu, Mario; Polasek, Ozren; Power, Chris; Kaprio, Jaakko; Raffel, Leslie J.; Räikkönen, Katri; Raitakari, Olli; Ridker, Paul M.; Ring, Susan M.; Roll, Kathryn; Rudan, Igor; Ruggiero, Daniela; Rujescu, Dan; Salomaa, Veikko; Schlessinger, David; Schmidt, Helena; Schmidt, Reinhold; Schupf, Nicole; Smit, Johannes; Sorice, Rossella; Spector, Tim D.; Starr, John M.; Stöckl, Doris; Strauch, Konstantin; Stumvoll, Michael; Swertz, Morris A.; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tönjes, Anke; Tung, Joyce Y.; Uitterlinden, André G.; Vaccargiu, Simona; Viikari, Jorma; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Vuckovic, Dragana; Waage, Johannes; Wagner, Gert G.; Wang, Jie Jin; Wareham, Nicholas J.; Weir, David R.; Willemsen, Gonneke; Willeit, Johann; Wright, Alan F.; Zondervan, Krina T.; Stefansson, Kari; Krueger, Robert F.; Lee, James J.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D.; den Hoed, Marcel; Snieder, Harold; Mills, Melinda C.

    2017-01-01

    The genetic architecture of human reproductive behavior – age at first birth (AFB) and number of children ever born (NEB) – has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified and the underlying mechanisms of AFB and NEB are poorly understood. We report the largest genome-wide association study to date of both sexes including 251,151 individuals for AFB and 343,072 for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study, and four additional loci in a gene-based effort. These loci harbor genes that are likely to play a role – either directly or by affecting non-local gene expression – in human reproduction and infertility, thereby increasing our understanding of these complex traits. PMID:27798627

  15. A comparative genomics screen identifies a Sinorhizobium meliloti 1021 sodM-like gene strongly expressed within host plant nodules

    Directory of Open Access Journals (Sweden)

    Queiroux Clothilde

    2012-05-01

    Background: We have used the genomic data in the Integrated Microbial Genomes system of the Department of Energy’s Joint Genome Institute to make predictions about rhizobial open reading frames that play a role in nodulation of host plants. The genomic data was screened by searching for ORFs conserved in α-proteobacterial rhizobia, but not conserved in closely-related non-nitrogen-fixing α-proteobacteria. Results: Using this approach, we identified many genes known to be involved in nodulation or nitrogen fixation, as well as several new candidate genes. We knocked out selected new genes and assayed for the presence of nodulation phenotypes and/or nodule-specific expression. One of these genes, SMc00911, is strongly expressed by bacterial cells within host plant nodules, but is expressed minimally by free-living bacterial cells. A strain carrying an insertion mutation in SMc00911 is not defective in the symbiosis with host plants, but in contrast to expectations, this mutant strain is able to out-compete the S. meliloti 1021 wild type strain for nodule occupancy in co-inoculation experiments. The SMc00911 ORF is predicted to encode a “SodM-like” (superoxide dismutase-like) protein containing a rhodanese sulfurtransferase domain at the N-terminus and a chromate-resistance superfamily domain at the C-terminus. Several other ORFs (SMb20360, SMc01562, SMc01266, SMc03964, and the SMc01424-22 operon) identified in the screen are expressed at a moderate level by bacteria within nodules, but not by free-living bacteria. Conclusions: Based on the analysis of ORFs identified in this study, we conclude that this comparative genomics approach can identify rhizobial genes involved in the nitrogen-fixing symbiosis with host plants, although none of the newly identified genes were found to be essential for this process.

  16. Modifiers of notch transcriptional activity identified by genome-wide RNAi

    Directory of Open Access Journals (Sweden)

    Firnhaber Christopher B

    2010-10-01

    Full Text Available Abstract Background The Notch signaling pathway regulates a diverse array of developmental processes, and aberrant Notch signaling can lead to diseases, including cancer. To obtain a more comprehensive understanding of the genetic network that integrates into Notch signaling, we performed a genome-wide RNAi screen in Drosophila cell culture to identify genes that modify Notch-dependent transcription. Results Employing complementary data analyses, we found 399 putative modifiers: 189 promoting and 210 antagonizing Notch activated transcription. These modifiers included several known Notch interactors, validating the robustness of the assay. Many novel modifiers were also identified, covering a range of cellular localizations from the extracellular matrix to the nucleus, as well as a large number of proteins with unknown function. Chromatin-modifying proteins represent a major class of genes identified, including histone deacetylase and demethylase complex components and other chromatin modifying, remodeling and replacement factors. A protein-protein interaction map of the Notch-dependent transcription modifiers revealed that a large number of the identified proteins interact physically with these core chromatin components. Conclusions The genome-wide RNAi screen identified many genes that can modulate Notch transcriptional output. A protein interaction map of the identified genes highlighted a network of chromatin-modifying enzymes and remodelers that regulate Notch transcription. Our results open new avenues to explore the mechanisms of Notch signal regulation and the integration of this pathway into diverse cellular processes.

  17. Genome wide association study identifies KCNMA1 contributing to human obesity

    DEFF Research Database (Denmark)

    Jiao, Hong; Arner, Peter; Hoffstedt, Johan

    2011-01-01

    Recent genome-wide association (GWA) analyses have identified common single nucleotide polymorphisms (SNPs) that are associated with obesity. However, the reported genetic variation in obesity explains only a minor fraction of the total genetic variation expected to be present in the population....... Thus many genetic variants controlling obesity remain to be identified. The aim of this study was to use GWA followed by multiple stepwise validations to identify additional genes associated with obesity....

  18. Genome-wide association study identifies 74 loci associated with educational attainment

    Science.gov (United States)

    Okbay, Aysu; Beauchamp, Jonathan P.; Fontana, Mark A.; Lee, James J.; Pers, Tune H.; Rietveld, Cornelius A.; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S. Fleur W.; Oskarsson, Sven; Pickrell, Joseph K.; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S.; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H.; Concas, Maria Pina; Derringer, Jaime; Furlotte, Nicholas A.; Galesloot, Tessel E.; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M.; Harris, Sarah E.; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E.; Kaasik, Kadri; Kalafati, Ioanna P.; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J.; de Leeuw, Christiaan; Lind, Penelope A.; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B.; van der Most, Peter J.; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J.; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E.; Shi, Jianxin; Smith, Albert V.; Poot, Raymond A.; Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z.; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E.; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A.; Campbell, Harry; Cappuccio, Francesco P.; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans, David M.; Faul, Jessica D.; Feitosa, Mary F.; Forstner, Andreas J.; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V.; Harris, Tamara B.; Heath, Andrew C.; Hocking, Lynne J.; Holliday, Elizabeth G.; Homuth, Georg; Horan, Michael A.; Hottenga, Jouke-Jan; de Jager, Philip L.; Joshi, Peter K.; Jugessur, Astanand; Kaakinen, Marika A.; Kähönen, Mika; Kanoni, Stavroula; Keltikangas-Järvinen, Liisa; Kiemeney,
Lambertus A.L.M.; Kolcic, Ivana; Koskinen, Seppo; Kraja, Aldi T.; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J.; Lebreton, Maël P.; Levinson, Douglas F.; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C.M.; Loukola, Anu; Madden, Pamela A.; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E.; Marques-Vidal, Pedro; Meddens, Gerardus A.; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W.; Myhre, Ronny; Nelson, Christopher P.; Nyholt, Dale R.; Ollier, William E.R.; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L.; Petrovic, Katja E.; Porteous, David J.; Räikkönen, Katri; Ring, Susan M.; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R.; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J.; Smith, Blair H.; Smith, Jennifer A.; Staessen, Jan A.; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D.; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J.A.; Venturini, Cristina; Vinkhuyzen, Anna A.E.; Völker, Uwe; Völzke, Henry; Vonk, Judith M.; Vozzi, Diego; Waage, Johannes; Ware, Erin B.; Willemsen, Gonneke; Attia, John R.; Bennett, David A.; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I.; Borecki, Ingrid B.; Bultmann, Ute; Chabris, Christopher F.; Cucca, Francesco; Cusi, Daniele; Deary, Ian J.; Dedoussis, George V.; van Duijn, Cornelia M.; Eriksson, Johan G.; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V.; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J.F.; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A.; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G.; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L.R.; Lehtimäki, Terho; Lehrer, Steven F.; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; Metspalu, Andres; Pendleton, Neil; 
Penninx, Brenda W.J.H.; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A.; Samani, Nilesh J.; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I.A.; Spector, Tim D.; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tiemeier, Henning; Tung, Joyce Y.; Uitterlinden, André G.; Vitart, Veronique; Vollenweider, Peter; Weir, David R.; Wilson, James F.; Wright, Alan F.; Conley, Dalton C.; Krueger, Robert F.; Smith, George Davey; Hofman, Albert; Laibson, David I.; Medland, Sarah E.; Meyer, Michelle N.; Yang, Jian; Johannesson, Magnus; Visscher, Peter M.; Esko, Tõnu; Koellinger, Philipp D.; Cesarini, David; Benjamin, Daniel J.

    2016-01-01

    Summary Educational attainment (EA) is strongly influenced by social and other environmental factors, but genetic factors are also estimated to account for at least 20% of the variation across individuals [1]. We report the results of a genome-wide association study (GWAS) for EA that extends our earlier discovery sample [1,2] of 101,069 individuals to 293,723 individuals, and a replication in an independent sample of 111,349 individuals from the UK Biobank. We now identify 74 genome-wide significant loci associated with number of years of schooling completed. Single-nucleotide polymorphisms (SNPs) associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioral phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because EA is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric disease. PMID:27225129

  19. Genome-wide association study identifies novel breast cancer susceptibility loci

    Science.gov (United States)

    Easton, Douglas F.; Pooley, Karen A.; Dunning, Alison M.; Pharoah, Paul D. P.; Thompson, Deborah; Ballinger, Dennis G.; Struewing, Jeffery P.; Morrison, Jonathan; Field, Helen; Luben, Robert; Wareham, Nicholas; Ahmed, Shahana; Healey, Catherine S.; Bowman, Richard; Meyer, Kerstin B.; Haiman, Christopher A.; Kolonel, Laurence K.; Henderson, Brian E.; Marchand, Loic Le; Brennan, Paul; Sangrajrang, Suleeporn; Gaborieau, Valerie; Odefrey, Fabrice; Shen, Chen-Yang; Wu, Pei-Ei; Wang, Hui-Chun; Eccles, Diana; Evans, D. Gareth; Peto, Julian; Fletcher, Olivia; Johnson, Nichola; Seal, Sheila; Stratton, Michael R.; Rahman, Nazneen; Chenevix-Trench, Georgia; Bojesen, Stig E.; Nordestgaard, Børge G.; Axelsson, Christen K.; Garcia-Closas, Montserrat; Brinton, Louise; Chanock, Stephen; Lissowska, Jolanta; Peplonska, Beata; Nevanlinna, Heli; Fagerholm, Rainer; Eerola, Hannaleena; Kang, Daehee; Yoo, Keun-Young; Noh, Dong-Young; Ahn, Sei-Hyun; Hunter, David J.; Hankinson, Susan E.; Cox, David G.; Hall, Per; Wedren, Sara; Liu, Jianjun; Low, Yen-Ling; Bogdanova, Natalia; Schürmann, Peter; Dörk, Thilo; Tollenaar, Rob A. E. M.; Jacobi, Catharina E.; Devilee, Peter; Klijn, Jan G. M.; Sigurdson, Alice J.; Doody, Michele M.; Alexander, Bruce H.; Zhang, Jinghui; Cox, Angela; Brock, Ian W.; MacPherson, Gordon; Reed, Malcolm W. R.; Couch, Fergus J.; Goode, Ellen L.; Olson, Janet E.; Meijers-Heijboer, Hanne; van den Ouweland, Ans; Uitterlinden, André; Rivadeneira, Fernando; Milne, Roger L.; Ribas, Gloria; Gonzalez-Neira, Anna; Benitez, Javier; Hopper, John L.; McCredie, Margaret; Southey, Melissa; Giles, Graham G.; Schroen, Chris; Justenhoven, Christina; Brauch, Hiltrud; Hamann, Ute; Ko, Yon-Dschun; Spurdle, Amanda B.; Beesley, Jonathan; Chen, Xiaoqing; Mannermaa, Arto; Kosma, Veli-Matti; Kataja, Vesa; Hartikainen, Jaana; Day, Nicholas E.; Cox, David R.; Ponder, Bruce A. 
J.; Luccarini, Craig; Conroy, Don; Shah, Mitul; Munday, Hannah; Jordan, Clare; Perkins, Barbara; West, Judy; Redman, Karen; Driver, Kristy; Aghmesheh, Morteza; Amor, David; Andrews, Lesley; Antill, Yoland; Armes, Jane; Armitage, Shane; Arnold, Leanne; Balleine, Rosemary; Begley, Glenn; Beilby, John; Bennett, Ian; Bennett, Barbara; Berry, Geoffrey; Blackburn, Anneke; Brennan, Meagan; Brown, Melissa; Buckley, Michael; Burke, Jo; Butow, Phyllis; Byron, Keith; Callen, David; Campbell, Ian; Chenevix-Trench, Georgia; Clarke, Christine; Colley, Alison; Cotton, Dick; Cui, Jisheng; Culling, Bronwyn; Cummings, Margaret; Dawson, Sarah-Jane; Dixon, Joanne; Dobrovic, Alexander; Dudding, Tracy; Edkins, Ted; Eisenbruch, Maurice; Farshid, Gelareh; Fawcett, Susan; Field, Michael; Firgaira, Frank; Fleming, Jean; Forbes, John; Friedlander, Michael; Gaff, Clara; Gardner, Mac; Gattas, Mike; George, Peter; Giles, Graham; Gill, Grantley; Goldblatt, Jack; Greening, Sian; Grist, Scott; Haan, Eric; Harris, Marion; Hart, Stewart; Hayward, Nick; Hopper, John; Humphrey, Evelyn; Jenkins, Mark; Jones, Alison; Kefford, Rick; Kirk, Judy; Kollias, James; Kovalenko, Sergey; Lakhani, Sunil; Leary, Jennifer; Lim, Jacqueline; Lindeman, Geoff; Lipton, Lara; Lobb, Liz; Maclurcan, Mariette; Mann, Graham; Marsh, Deborah; McCredie, Margaret; McKay, Michael; McLachlan, Sue Anne; Meiser, Bettina; Milne, Roger; Mitchell, Gillian; Newman, Beth; O'Loughlin, Imelda; Osborne, Richard; Peters, Lester; Phillips, Kelly; Price, Melanie; Reeve, Jeanne; Reeve, Tony; Richards, Robert; Rinehart, Gina; Robinson, Bridget; Rudzki, Barney; Salisbury, Elizabeth; Sambrook, Joe; Saunders, Christobel; Scott, Clare; Scott, Elizabeth; Scott, Rodney; Seshadri, Ram; Shelling, Andrew; Southey, Melissa; Spurdle, Amanda; Suthers, Graeme; Taylor, Donna; Tennant, Christopher; Thorne, Heather; Townshend, Sharron; Tucker, Kathy; Tyler, Janet; Venter, Deon; Visvader, Jane; Walpole, Ian; Ward, Robin; Waring, Paul; Warner, Bev; Warren, Graham; 
Watson, Elizabeth; Williams, Rachael; Wilson, Judy; Winship, Ingrid; Young, Mary Ann; Bowtell, David; Green, Adele; deFazio, Anna; Chenevix-Trench, Georgia; Gertig, Dorota; Webb, Penny

    2009-01-01

    Breast cancer exhibits familial aggregation, consistent with variation in genetic susceptibility to the disease. Known susceptibility genes account for less than 25% of the familial risk of breast cancer, and the residual genetic variance is likely to be due to variants conferring more moderate risks. To identify further susceptibility alleles, we conducted a two-stage genome-wide association study in 4,398 breast cancer cases and 4,316 controls, followed by a third stage in which 30 single nucleotide polymorphisms (SNPs) were tested for confirmation in 21,860 cases and 22,578 controls from 22 studies. We used 227,876 SNPs that were estimated to correlate with 77% of known common SNPs in Europeans at r² > 0.5. SNPs in five novel independent loci exhibited strong and consistent evidence of association with breast cancer (P < 10⁻⁷). Four of these contain plausible causative genes (FGFR2, TNRC9, MAP3K1 and LSP1). At the second stage, 1,792 SNPs were significant at the P < 0.05 level compared with an estimated 1,343 that would be expected by chance, indicating that many additional common susceptibility alleles may be identifiable by this approach. PMID:17529967

  20. Urban landscape genomics identifies fine-scale gene flow patterns in an avian invasive.

    Science.gov (United States)

    Low, G W; Chattopadhyay, B; Garg, K M; Irestedt, M; Ericson, P G P; Yap, G; Tang, Q; Wu, S; Rheindt, F E

    2018-01-01

    Invasive species exert a serious impact on native fauna and flora and have been the target of many eradication and management efforts worldwide. However, a lack of data on population structure and history, exacerbated by the recency of many species introductions, limits the efficiency with which such species can be kept at bay. In this study we generated a novel genome of high assembly quality and genotyped 4735 genome-wide single nucleotide polymorphic (SNP) markers from 78 individuals of an invasive population of the Javan Myna Acridotheres javanicus across the island of Singapore. We inferred limited population subdivision at a micro-geographic level, a genetic patch size (~13-14 km) indicative of a pronounced dispersal ability, and barely an increase in effective population size since introduction despite an increase of four to five orders of magnitude in actual population size, suggesting that low population-genetic diversity following a bottleneck has not impeded establishment success. Landscape genomic analyses identified urban features, such as low-rise neighborhoods, that constitute pronounced barriers to gene flow. Based on our data, we consider an approach targeting the complete eradication of Javan Mynas across Singapore to be unfeasible. Instead, a mixed approach of localized mitigation measures taking into account urban geographic features and planning policy may be the most promising avenue to reducing the adverse impacts of this urban pest. Our study demonstrates how genomic methods can directly inform the management and control of invasive species, even in geographically limited datasets with high gene flow rates.

  1. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
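
    The "kmer sequence features" mentioned in this abstract are, in essence, counts of short subsequences. The following is a minimal sketch of turning one DNA sequence into such a fixed-length count vector; the function name is hypothetical, and details of the published kmer-SVM (for example how reverse complements are handled, or how the SVM is trained on the vectors) are not stated in the abstract and are not modelled here.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k):
    """Count overlapping k-mers in seq and return the counts in a fixed k-mer order."""
    seq = seq.upper()
    # Slide a window of width k across the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # All 4**k possible k-mers over A, C, G, T, in lexicographic order.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


features = kmer_features("ACGTAC", 2)
print(len(features), sum(features))
```

    Each sequence becomes one row of a feature matrix, with one column per possible k-mer; a classifier such as an SVM would then be trained on rows labelled as bound or unbound regions.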

  2. Incidental copy-number variants identified by routine genome testing in a clinical population

    Science.gov (United States)

    Boone, Philip M.; Soens, Zachry T.; Campbell, Ian M.; Stankiewicz, Pawel; Cheung, Sau Wai; Patel, Ankita; Beaudet, Arthur L.; Plon, Sharon E.; Shaw, Chad A.; McGuire, Amy L.; Lupski, James R.

    2013-01-01

    Purpose Mutational load of susceptibility variants has not been studied on a genomic scale in a clinical population, nor has the potential to identify these mutations as incidental findings during clinical testing been systematically ascertained. Methods Array comparative genomic hybridization, a method for genome-wide detection of DNA copy-number variants, was performed clinically on DNA from 9,005 individuals. Copy-number variants encompassing or disrupting single genes were identified and analyzed for their potential to confer predisposition to dominant, adult-onset disease. Multigene copy-number variants affecting dominant, adult-onset cancer syndrome genes were also assessed. Results In our cohort, 83 single-gene copy-number variants affecting 40 unique genes associated with dominant, adult-onset disorders and unrelated to the patients’ referring diagnoses (i.e., incidental) were found. Fourteen of these copy-number variants are likely disease-predisposing, 25 are likely benign, and 44 are of unknown clinical consequence. When incidental copy-number variants spanning up to 20 genes were considered, 27 copy-number variants affected 17 unique genes associated with dominant, adult-onset cancer predisposition. Conclusion Copy-number variants potentially conferring susceptibility to adult-onset disease can be identified as incidental findings during routine genome-wide testing. Some of these mutations may be medically actionable, enabling disease surveillance or prevention; however, most incidentally observed single-gene copy-number variants are currently of unclear significance to the patient. PMID:22878507

  3. Genomic profiling identifies GATA6 as a candidate oncogene amplified in pancreatobiliary cancer.

    Directory of Open Access Journals (Sweden)

    Kevin A Kwei

    2008-05-01

    Full Text Available Pancreatobiliary cancers have among the highest mortality rates of any cancer type. Discovering the full spectrum of molecular genetic alterations may suggest new avenues for therapy. To catalogue genomic alterations, we carried out array-based genomic profiling of 31 exocrine pancreatic cancers and 6 distal bile duct cancers, expanded as xenografts to enrich the tumor cell fraction. We identified numerous focal DNA amplifications and deletions, including in 19% of pancreatobiliary cases gain at cytoband 18q11.2, a locus uncommonly amplified in other tumor types. The smallest shared amplification at 18q11.2 included GATA6, a transcriptional regulator previously linked to normal pancreas development. When amplified, GATA6 was overexpressed at both the mRNA and protein levels, and strong immunostaining was observed in 25 of 54 (46%) primary pancreatic cancers compared to 0 of 33 normal pancreas specimens surveyed. GATA6 expression in xenografts was associated with specific microarray gene-expression patterns, enriched for GATA binding sites and mitochondrial oxidative phosphorylation activity. siRNA mediated knockdown of GATA6 in pancreatic cancer cell lines with amplification led to reduced cell proliferation, cell cycle progression, and colony formation. Our findings indicate that GATA6 amplification and overexpression contribute to the oncogenic phenotypes of pancreatic cancer cells, and identify GATA6 as a candidate lineage-specific oncogene in pancreatobiliary cancer, with implications for novel treatment strategies.

  4. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    Science.gov (United States)

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome

    Science.gov (United States)

    Olm, Matthew R.; Morowitz, Michael J.

    2018-01-01

    ABSTRACT Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism’s direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration. IMPORTANCE The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to

  6. Genomic ancestry and education level independently influence abdominal fat distributions in a Brazilian admixed population.

    Science.gov (United States)

    França, Giovanny Vinícius Araújo de; De Lucia Rolfe, Emanuella; Horta, Bernardo Lessa; Gigante, Denise Petrucci; Yudkin, John S; Ong, Ken K; Victora, Cesar Gomes

    2017-01-01

    We aimed to identify the independent associations of genomic ancestry and education level with abdominal fat distributions in the 1982 Pelotas birth cohort study, Brazil. In 2,890 participants (1,409 men and 1,481 women), genomic ancestry was assessed using genotype data on 370,539 genome-wide variants to quantify ancestral proportions in each individual. Years of completed education was used to indicate socio-economic position. Visceral fat depth and subcutaneous abdominal fat thickness were measured by ultrasound at age 29-31y; these measures were adjusted for BMI to indicate abdominal fat distributions. Linear regression models were performed, separately by sex. Admixture was observed between European (median proportion 85.3), African (6.6), and Native American (6.3) ancestries, with a strong inverse correlation between the African and European ancestry scores (ρ = -0.93; p […] abdominal fat distributions in men (both P = 0.001), and inversely associated with subcutaneous abdominal fat distribution in women (p = 0.009). Independent of genomic ancestry, higher education level was associated with lower visceral fat, but higher subcutaneous fat, in both men and women (all p […] abdominal fat distribution in adults. African ancestry appeared to lower abdominal fat distributions, particularly in men.

  7. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
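
    The "over 99%" figure in this abstract follows directly from the counts it reports. The following short check uses only numbers quoted above; the variable names are illustrative, and the 139 000 figure is the abstract's own approximation.

```python
total_snps = 22_342_915    # high-confidence SNPs identified
functional_snps = 139_000  # approx. loss-of-function, nonsynonymous or regulatory SNPs

functional_fraction = functional_snps / total_snps
ignorable_fraction = 1 - functional_fraction

chip_snps = 62_163                        # SNPs on the PorcineSNP60 BeadChip
chip_recovered = round(0.79 * chip_snps)  # the 79% of chip SNPs also found by sequencing

print(f"functional: {functional_fraction:.2%}, remainder: {ignorable_fraction:.2%}")
print(f"chip SNPs recovered: about {chip_recovered}")
```

    The functional class comes to roughly 0.6% of all detected SNPs, which is consistent with the statement that more than 99% of the variation could be set aside in a first pass.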

  8. Genomic, Epigenomic, and Transcriptomic Profiling towards Identifying Omics Features and Specific Biomarkers That Distinguish Uterine Leiomyosarcoma and Leiomyoma at Molecular Levels

    Directory of Open Access Journals (Sweden)

    Tomoko Miyata

    2015-01-01

    Full Text Available Uterine leiomyosarcoma (LMS) is the worst malignancy among the gynecologic cancers. Uterine leiomyoma (LM), a benign tumor of myometrial origin, is the most common among women of childbearing age. Because of their similar symptoms, it is difficult to preoperatively distinguish the two conditions only by ultrasound and pelvic MRI. While histopathological diagnosis is currently the main approach used to distinguish them postoperatively, unusual histologic variants of LM tend to be misdiagnosed as LMS. Therefore, development of molecular diagnosis as an alternative or confirmatory means will help to diagnose LMS more accurately. We adopted omics-based technologies to identify genome-wide features to distinguish LMS from LM and revealed that copy number, gene expression, and DNA methylation profiles successfully distinguished these tumors. LMS was found to possess features typically observed in malignant solid tumors, such as extensive chromosomal abnormalities, overexpression of cell cycle-related genes, hypomethylation spreading through large genomic regions, and frequent hypermethylation at the polycomb group target genes and protocadherin genes. We also identified candidate expression and DNA methylation markers, which will facilitate establishing postoperative molecular diagnostic tests based on conventional quantitative assays. Our results demonstrate the feasibility of establishing such tests and the possibility of developing preoperative and noninvasive methods.

  9. Genome-wide association identifies OBFC1 as a locus involved in human leukocyte telomere biology.

    Science.gov (United States)

    Levy, Daniel; Neuhausen, Susan L; Hunt, Steven C; Kimura, Masayuki; Hwang, Shih-Jen; Chen, Wei; Bis, Joshua C; Fitzpatrick, Annette L; Smith, Erin; Johnson, Andrew D; Gardner, Jeffrey P; Srinivasan, Sathanur R; Schork, Nicholas; Rotter, Jerome I; Herbig, Utz; Psaty, Bruce M; Sastrasinh, Malinee; Murray, Sarah S; Vasan, Ramachandran S; Province, Michael A; Glazer, Nicole L; Lu, Xiaobin; Cao, Xiaojian; Kronmal, Richard; Mangino, Massimo; Soranzo, Nicole; Spector, Tim D; Berenson, Gerald S; Aviv, Abraham

    2010-05-18

    Telomeres are engaged in a host of cellular functions, and their length is regulated by multiple genes. Telomere shortening, in the course of somatic cell replication, ultimately leads to replicative senescence. In humans, rare mutations in genes that regulate telomere length have been identified in monogenic diseases such as dyskeratosis congenita and idiopathic pulmonary fibrosis, which are associated with shortened leukocyte telomere length (LTL) and increased risk for aplastic anemia. Shortened LTL is observed in a host of aging-related complex genetic diseases and is associated with diminished survival in the elderly. We report results of a genome-wide association study of LTL in a consortium of four observational studies (n = 3,417 participants with LTL and genome-wide genotyping). SNPs in the regions of the oligonucleotide/oligosaccharide-binding folds containing one gene (OBFC1; rs4387287; P = 3.9 × 10⁻⁹) and chemokine (C-X-C motif) receptor 4 gene (CXCR4; rs4452212; P = 2.9 × 10⁻⁸) were associated with LTL at a genome-wide significance level (P […] a gene associated with LTL (P = 1.1 × 10⁻⁵). The identification of OBFC1 through genome-wide association as a locus for interindividual variation in LTL in the general population advances the understanding of telomere biology in humans and may provide insights into aging-related disorders linked to altered LTL dynamics.

  10. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Alexander C Outhred

    Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.

  11. Measuring the Levels of Ribonucleotides Embedded in Genomic DNA.

    Science.gov (United States)

    Meroni, Alice; Nava, Giulia M; Sertic, Sarah; Plevani, Paolo; Muzi-Falconi, Marco; Lazzaro, Federico

    2018-01-01

    Ribonucleotides (rNTPs) are incorporated into genomic DNA at a relatively high frequency during replication. They have beneficial effects but, if not removed from the chromosomes, increase genomic instability. Here, we describe a fast method to estimate the amount of ribonucleotides embedded in the genome. The protocol described is performed in Saccharomyces cerevisiae and allows us to quantify altered levels of rNMPs due to different mutations in the replicative polymerase ε. However, this protocol can be easily applied to cells derived from any organism.

  12. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

    Science.gov (United States)

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

    2015-01-01

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166
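    The average read depth approach described in this abstract can be illustrated with a minimal sketch. It assumes per-scaffold mean read depths are already available for one male and one female sample; the function name, the normalization by each sample's median depth, and both ratio thresholds are hypothetical choices for illustration and are not taken from the study.

```python
from statistics import median

def classify_scaffolds(male_depth, female_depth, y_max_ratio=0.3, x_min_ratio=1.5):
    """Label scaffolds by their normalized female:male read-depth ratio.

    male_depth / female_depth: dict mapping scaffold name -> mean read depth.
    The thresholds are illustrative assumptions, not published values.
    """
    # Dividing by each sample's median depth cancels overall coverage differences.
    male_norm = median(male_depth.values())
    female_norm = median(female_depth.values())
    calls = {}
    for scaffold, m in male_depth.items():
        f = female_depth.get(scaffold, 0.0)
        if m == 0:
            calls[scaffold] = "no_male_coverage"
            continue
        ratio = (f / female_norm) / (m / male_norm)
        if ratio <= y_max_ratio:
            calls[scaffold] = "Y-candidate"   # covered in the male, nearly absent in the female
        elif ratio >= x_min_ratio:
            calls[scaffold] = "X-candidate"   # roughly twice the relative depth in the female
        else:
            calls[scaffold] = "autosomal"
    return calls
```

    Scaffolds labeled as Y candidates by such a filter would still need the kind of male-specific in vitro amplification the authors describe before being treated as Y-linked.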

  13. Genome-wide association analysis identifies three new breast cancer susceptibility loci

    DEFF Research Database (Denmark)

    Ghoussaini, Maya; Fletcher, Olivia; Michailidou, Kyriaki

    2012-01-01

    Breast cancer is the most common cancer among women. To date, 22 common breast cancer susceptibility loci have been identified accounting for ∼8% of the heritability of the disease. We attempted to replicate 72 promising associations from two independent genome-wide association studies (GWAS...

  14. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild-type copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.
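    One way to picture the pooled-progeny analysis is as an allele-frequency filter: a causative variant should be near fixation in the pool of mutant progeny and near absent in the wild-type pool. The sketch below assumes per-variant read counts for both pools; the function name, the data layout, and all three cutoffs are hypothetical and do not reproduce the authors' pipeline.

```python
def candidate_mutations(mutant_pool, wildtype_pool,
                        min_mutant_freq=0.9, max_wt_freq=0.1, min_depth=10):
    """Return variants near fixation in the mutant pool and rare in the wild-type pool.

    Each pool maps a variant key, e.g. (contig, position, ref, alt),
    to a tuple (alt_reads, total_reads). All cutoffs are illustrative assumptions.
    """
    candidates = []
    for variant, (alt, depth) in mutant_pool.items():
        if depth < min_depth:
            continue  # too few reads to estimate a frequency
        wt_alt, wt_depth = wildtype_pool.get(variant, (0, 0))
        if wt_depth < min_depth:
            continue  # cannot rule the variant out without wild-type coverage
        if alt / depth >= min_mutant_freq and wt_alt / wt_depth <= max_wt_freq:
            candidates.append(variant)
    return sorted(candidates)
```

    Variants that pass such a filter are only candidates; the abstract reports confirming each one by transformation with a wild-type copy of the affected gene.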

  15. Genomes2Drugs: identifies target proteins and lead drugs from proteome data.

    LENUS (Irish Health Repository)

    Toomey, David

    2009-01-01

    BACKGROUND: Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography is impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. METHODOLOGY/PRINCIPAL FINDINGS: To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. CONCLUSIONS/SIGNIFICANCE: Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under 'change-of-application' patents.

  16. Genomes2Drugs: identifies target proteins and lead drugs from proteome data.

    Directory of Open Access Journals (Sweden)

    David Toomey

    BACKGROUND: Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography is impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. METHODOLOGY/PRINCIPAL FINDINGS: To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. CONCLUSIONS/SIGNIFICANCE: Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under 'change-of-application' patents.
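    The three ranking criteria listed in this abstract can be sketched as a simple filter and sort. The sketch assumes each pathogen protein already has a best-hit percent identity against crystallized proteins, known drug targets, and human proteins; the `Hits` record, the `rank_candidates` function, and both cutoffs are hypothetical and are not the scoring scheme used by Genomes2Drugs.

```python
from dataclasses import dataclass

@dataclass
class Hits:
    protein: str
    structure_identity: float    # best % identity to a crystallized protein
    drug_target_identity: float  # best % identity to a known drug target
    human_identity: float        # best % identity to any human protein

def rank_candidates(hits, min_support=40.0, max_human=30.0):
    """Keep proteins meeting criterion (i) or (ii) and criterion (iii), best first.

    Both cutoffs are illustrative assumptions.
    """
    kept = [
        h for h in hits
        if max(h.structure_identity, h.drug_target_identity) >= min_support
        and h.human_identity < max_human
    ]
    # Rank drug-target homology first, then structural homology.
    return sorted(
        kept,
        key=lambda h: (h.drug_target_identity, h.structure_identity),
        reverse=True,
    )
```

    A hard cutoff on human identity is the simplest reading of criterion (iii); a weighted score that penalizes human homology gradually would be an equally plausible design.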

  17. A universal genomic coordinate translator for comparative genomics.

    Science.gov (United States)

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases at N² with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
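    The idea of translating coordinates indirectly through a graph of pairwise alignments, rather than aligning every genome to every other, can be sketched as follows. This is not Kraken's implementation: the block format, the function names, and the use of a plain breadth-first search are assumptions made for illustration, and strand, chromosome names, and ambiguous mappings are ignored.

```python
from collections import deque

# Each pairwise map is a list of colinear blocks (src_start, src_end, dst_start),
# with 0-based, half-open coordinates on a single sequence.

def translate_once(blocks, pos):
    """Map one position through one pairwise alignment, or return None."""
    for src_start, src_end, dst_start in blocks:
        if src_start <= pos < src_end:
            return dst_start + (pos - src_start)
    return None

def find_path(maps, source, target):
    """Breadth-first search over genomes linked by pairwise maps."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        genome = queue.popleft()
        if genome == target:
            path = []
            while genome is not None:
                path.append(genome)
                genome = previous[genome]
            return path[::-1]
        for a, b in maps:
            if a == genome and b not in previous:
                previous[b] = genome
                queue.append(b)
    return None

def translate(maps, source, target, pos):
    """Translate pos from source to target by chaining pairwise maps."""
    path = find_path(maps, source, target)
    if path is None:
        return None
    for a, b in zip(path, path[1:]):
        pos = translate_once(maps[(a, b)], pos)
        if pos is None:
            return None
    return pos
```

    With N genomes connected in a chain or tree, only N - 1 pairwise maps are needed, which is the source of the linear scaling the abstract contrasts with all-to-all alignment.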

  18. The mitochondrial genome of Phallusia mammillata and Phallusia fumigata (Tunicata, Ascidiacea): high genome plasticity at intra-genus level

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2007-08-01

    Background: Within Chordata, the subphyla Vertebrata and Cephalochordata (lancelets) are characterized by a remarkable stability of the mitochondrial (mt) genome, with constancy of gene content and almost invariant gene order, whereas the limited mitochondrial data on the subphylum Tunicata suggest frequent and extensive gene rearrangements, observed also within ascidians of the same genus. Results: To confirm this evolutionary trend and to better understand the evolutionary dynamics of the mitochondrial genome in Tunicata Ascidiacea, we have sequenced and characterized the complete mt genome of two congeneric ascidian species, Phallusia mammillata and Phallusia fumigata (Phlebobranchiata, Ascidiidae). The two mtDNAs are surprisingly rearranged, both with respect to one another and relative to those of other tunicates and chordates, with gene rearrangements affecting both protein-coding and tRNA genes. The new data highlight the extraordinary variability of ascidian mt genome in base composition, tRNA secondary structure, tRNA gene content, and non-coding regions (number, size, sequence and location). Indeed, both Phallusia genomes lack the trnD gene, show loss/acquisition of DHU-arm in two tRNAs, and have a G+C content two-fold higher than other ascidians. Moreover, the mt genome of P. fumigata presents two identical copies of trnI, an extra tRNA gene with uncertain amino acid specificity, and four almost identical sequence regions. In addition, a truncated cytochrome b, lacking a C-terminal tail that commonly protrudes into the mt matrix, has been identified as a new mt feature probably shared by all tunicates. Conclusion: The frequent occurrence of major gene order rearrangements in ascidians both at high taxonomic level and within the same genus makes this taxon an excellent model to study the mechanisms of gene rearrangement, and renders the mt genome an invaluable phylogenetic marker to investigate molecular biodiversity and speciation

  19. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

    Science.gov (United States)

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

    2015-05-27

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle.

    Science.gov (United States)

    Doran, Anthony G; Berry, Donagh P; Creevey, Christopher J

    2014-10-01

    Four traits related to carcass performance have been identified as economically important in beef production: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Although Holstein-Friesian cattle are primarily utilized for milk production, they are also an important source of meat for beef production and export. Because of this, there is great interest in understanding the underlying genomic structure influencing these traits. Several genome-wide association studies have identified regions of the bovine genome associated with growth or carcass traits; however, little is known about the mechanisms or underlying biological pathways involved. This study aims to detect regions of the bovine genome associated with carcass performance traits (employing a panel of 54,001 SNPs) using measures of genetic merit (as predicted transmitting abilities) for 5,705 Irish Holstein-Friesian animals. Candidate genes and biological pathways were then identified for each trait under investigation. Following adjustment for false discovery (q-value [...] carcass traits using a single SNP regression approach. Using a Bayesian approach, 46 QTL were associated (posterior probability > 0.5) with at least one of the four traits. In total, 557 unique bovine genes, which mapped to 426 human orthologs, were within 500 kb of QTL found associated with a trait using the Bayesian approach. Using this information, 24 significantly over-represented pathways were identified across all traits. The most significantly over-represented biological pathway was the peroxisome proliferator-activated receptor (PPAR) signaling pathway. A large number of genomic regions putatively associated with bovine carcass traits were detected using two different statistical approaches. Notably, several significant associations were detected in close proximity to genes with a known role in animal growth such as glucagon and leptin. Several biological pathways, including PPAR signaling, were

  1. The compact Selaginella genome identifies changes in gene content associated with the evolution of vascular plants

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.; Banks, Jo Ann; Nishiyama, Tomoaki; Hasebe, Mitsuyasu; Bowman, John L.; Gribskov, Michael; dePamphilis, Claude; Albert, Victor A.; Aono, Naoki; Aoyama, Tsuyoshi; Ambrose, Barbara A.; Ashton, Neil W.; Axtell, Michael J.; Barker, Elizabeth; Barker, Michael S.; Bennetzen, Jeffrey L.; Bonawitz, Nicholas D.; Chapple, Clint; Cheng, Chaoyang; Correa, Luiz Gustavo Guedes; Dacre, Michael; DeBarry, Jeremy; Dreyer, Ingo; Elias, Marek; Engstrom, Eric M.; Estelle, Mark; Feng, Liang; Finet, Cedric; Floyd, Sandra K.; Frommer, Wolf B.; Fujita, Tomomichi; Gramzow, Lydia; Gutensohn, Michael; Harholt, Jesper; Hattori, Mitsuru; Heyl, Alexander; Hirai, Tadayoshi; Hiwatashi, Yuji; Ishikawa, Masaki; Iwata, Mineko; Karol, Kenneth G.; Koehler, Barbara; Kolukisaoglu, Uener; Kubo, Minoru; Kurata, Tetsuya; Lalonde, Sylvie; Li, Kejie; Li, Ying; Litt, Amy; Lyons, Eric; Manning, Gerard; Maruyama, Takeshi; Michael, Todd P.; Mikami, Koji; Miyazaki, Saori; Morinaga, Shin-ichi; Murata, Takashi; Mueller-Roeber, Bernd; Nelson, David R.; Obara, Mari; Oguri, Yasuko; Olmstead, Richard G.; Onodera, Naoko; Petersen, Bent Larsen; Pils, Birgit; Prigge, Michael; Rensing, Stefan A.; Riano-Pachon, Diego Mauricio; Roberts, Alison W.; Sato, Yoshikatsu; Scheller, Henrik Vibe; Schulz, Burkhard; Schulz, Christian; Shakirov, Eugene V.; Shibagaki, Nakako; Shinohara, Naoki; Shippen, Dorothy E.; Sorensen, Iben; Sotooka, Ryo; Sugimoto, Nagisa; Sugita, Mamoru; Sumikawa, Naomi; Tanurdzic, Milos; Theilsen, Gunter; Ulvskov, Peter; Wakazuki, Sachiko; Weng, Jing-Ke; Willats, William W.G.T.; Wipf, Daniel; Wolf, Paul G.; Yang, Lixing; Zimmer, Andreas D.; Zhu, Qihui; Mitros, Therese; Hellsten, Uffe; Loque, Dominique; Otillar, Robert; Salamov, Asaf; Schmutz, Jeremy; Shapiro, Harris; Lindquist, Erika; Lucas, Susan; Rokhsar, Daniel

    2011-04-28

    We report the genome sequence of the nonseed vascular plant, Selaginella moellendorffii, and by comparative genomics identify genes that likely played important roles in the early evolution of vascular plants and their subsequent evolution

  2. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    Science.gov (United States)

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known
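    For readers unfamiliar with the allele-frequency based neutrality test named in this abstract, the classical Tajima's D statistic can be computed as in the sketch below. It assumes individually sequenced, aligned haplotypes of equal length; pooled sequencing data such as those in the study require estimators adapted to pool-seq read counts, which this sketch does not attempt. The function name is hypothetical.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Return Tajima's D for aligned haplotypes, or None if no site varies."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences; the variance term is zero for n < 4")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

    Positive values, like those reported here for the antigen genes, indicate an excess of intermediate-frequency alleles, which is commonly read as a sign of balancing selection; negative values indicate an excess of rare alleles.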

  3. An object model for genome information at all levels of resolution

    Energy Technology Data Exchange (ETDEWEB)

    Honda, S.; Parrott, N.W.; Smith, R.; Lawrence, C.

    1993-12-31

    An object model for genome data at all levels of resolution is described. The model was derived by considering the requirements for representing genome related objects in three application domains: genome maps, large-scale DNA sequencing, and exploring functional information in gene and protein sequences. The methodology used for the object-oriented analysis is also described.

  4. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

    Science.gov (United States)

    Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

    2011-11-01

    Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

  5. Comparative Genomics and Disorder Prediction Identify Biologically Relevant SH3 Protein Interactions.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Protein interaction networks are an important part of the post-genomic effort to integrate a part-list view of the cell into system-level understanding. Using a set of 11 yeast genomes we show that combining comparative genomics and secondary structure information greatly increases consensus-based prediction of SH3 targets. Benchmarking of our method against positive and negative standards gave 83% accuracy with 26% coverage. The concept of an optimal divergence time for effective comparative genomics studies was analyzed, demonstrating that genomes of species that diverged very recently from Saccharomyces cerevisiae (S. mikatae, S. bayanus, and S. paradoxus), or a long time ago (Neurospora crassa and Schizosaccharomyces pombe), contain less information for accurate prediction of SH3 targets than species within the optimal divergence time proposed. We also show here that intrinsically disordered SH3 domain targets are more probable sites of interaction than equivalent sites within ordered regions. Our findings highlight several novel S. cerevisiae SH3 protein interactions, the value of selection of optimal divergence times in comparative genomics studies, and the importance of intrinsic disorder for protein interactions. Based on our results we propose novel roles for the S. cerevisiae proteins Abp1p in endocytosis and Hse1p in endosome protein sorting.

  6. Comparative genomics and disorder prediction identify biologically relevant SH3 protein interactions.

    Directory of Open Access Journals (Sweden)

    Pedro Beltrao

    2005-08-01

    Protein interaction networks are an important part of the post-genomic effort to integrate a part-list view of the cell into system-level understanding. Using a set of 11 yeast genomes we show that combining comparative genomics and secondary structure information greatly increases consensus-based prediction of SH3 targets. Benchmarking of our method against positive and negative standards gave 83% accuracy with 26% coverage. The concept of an optimal divergence time for effective comparative genomics studies was analyzed, demonstrating that genomes of species that diverged very recently from Saccharomyces cerevisiae (S. mikatae, S. bayanus, and S. paradoxus), or a long time ago (Neurospora crassa and Schizosaccharomyces pombe), contain less information for accurate prediction of SH3 targets than species within the optimal divergence time proposed. We also show here that intrinsically disordered SH3 domain targets are more probable sites of interaction than equivalent sites within ordered regions. Our findings highlight several novel S. cerevisiae SH3 protein interactions, the value of selection of optimal divergence times in comparative genomics studies, and the importance of intrinsic disorder for protein interactions. Based on our results we propose novel roles for the S. cerevisiae proteins Abp1p in endocytosis and Hse1p in endosome protein sorting.

  7. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  8. Genome-wide siRNA-based functional genomics of pigmentation identifies novel genes and pathways that impact melanogenesis in human cells.

    Directory of Open Access Journals (Sweden)

    Anand K Ganesan

    2008-12-01

    Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo), neurologic disorders (Parkinson's disease), auditory disorders (Waardenburg's syndrome), and ophthalmologic disorders (age-related macular degeneration). Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi) provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy. In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype.

  9. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Science.gov (United States)

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  10. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  11. Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes

    OpenAIRE

    Strawbridge, Rona; Dupuis, Josée; Prokopenko, Inga; Barker, Adam; Ahlqvist, Emma; Rybin, Denis; Petrie, John; Bouatia-Naji, Nabila; Dimas, Antigone; Wheeler, Eleanor; Chen, Han; Voight, Benjamin; Taneera, Jalal; Kanoni, Stavroula; Peden, John

    2011-01-01

    OBJECTIVE - Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired β-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS - We have conducted a meta-analysis of genome-wide association tests of ~2.5 million genotyped or imputed single nucleotide polymorphisms...

  12. Benchmark data for identifying N6-methyladenosine sites in the Saccharomyces cerevisiae genome

    Directory of Open Access Journals (Sweden)

    Wei Chen

    2015-12-01

    This data article contains the benchmark dataset for training and testing iRNA-Methyl, a web-server predictor for identifying N6-methyladenosine sites in RNA (Chen et al., 2015 [15]). It can also be used to develop other predictors for identifying N6-methyladenosine sites in the Saccharomyces cerevisiae genome.

  13. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    DEFF Research Database (Denmark)

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME... ...-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment.

  14. Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia

    OpenAIRE

    Berndt, S.I.; Skibola, C.F.; Joseph, V.; Camp, N.J.; Nieters, A.; Wang, Z.; Cozen, W.; Monnereau, A.; Wang, S.S.; Kelly, R.S.; Lan, Q.; Teras, L.R.; Chatterjee, N.; Chung, C.C.; Yeager, M.

    2013-01-01

    Genome-wide association studies (GWAS) have previously identified 13 loci associated with risk of chronic lymphocytic leukemia or small lymphocytic lymphoma (CLL). To identify additional CLL susceptibility loci, we conducted the largest meta-analysis for CLL thus far, including four GWAS with a total of 3,100 individuals with CLL (cases) and 7,667 controls. In the meta-analysis, we identified ten independent associated SNPs in nine new loci at 10q23.31 (ACTA2 or FAS (ACTA2/FAS), P = 1.22 × 10...

  15. A genome-wide siRNA screen to identify modulators of insulin sensitivity and gluconeogenesis.

    Directory of Open Access Journals (Sweden)

    Ruojing Yang

    BACKGROUND: Hepatic insulin resistance impairs insulin's ability to suppress hepatic glucose production (HGP) and contributes to the development of type 2 diabetes (T2D). Although there is strong interest in discovering novel genes that modulate insulin sensitivity and HGP, establishing a human cell-based system to identify such genes remains challenging. METHODOLOGY/PRINCIPAL FINDINGS: To identify genes that modulate hepatic insulin signaling and HGP, we generated a human cell line stably expressing beta-lactamase under the control of the human glucose-6-phosphatase (G6PC) promoter (AH-G6PC cells). Both beta-lactamase activity and endogenous G6PC mRNA were increased in AH-G6PC cells by a combination of dexamethasone and pCPT-cAMP, and reduced by insulin. A 4-gene High-Throughput-Genomics (HTG) assay was developed to concomitantly measure G6PC and pyruvate-dehydrogenase-kinase-4 (PDK4) mRNA levels. Using this assay, we screened an siRNA library containing pooled siRNA targeting 6650 druggable genes and identified 614 hits that lowered G6PC expression without increasing PDK4 mRNA levels. Pathway analysis indicated that siRNA-mediated knockdown (KD) of genes known to positively or negatively affect insulin signaling increased or decreased G6PC mRNA expression, respectively, thus validating our screening platform. A subset of 270 primary screen hits was selected and 149 hits were confirmed by target gene KD by pooled siRNA and 7 single siRNA for each gene to reduce G6PC expression in the 4-gene HTG assay. Subsequently, pooled siRNA KD of 113 genes decreased PEPCK and/or PGC1alpha mRNA expression, thereby demonstrating their role in regulating key gluconeogenic genes in addition to G6PC. Last, KD of 61 of the above 113 genes potentiated insulin-stimulated Akt phosphorylation, suggesting that they suppress gluconeogenic gene expression by enhancing insulin signaling.
CONCLUSIONS/SIGNIFICANCE: These results support the proposition that the proteins encoded by the genes identified in

  16. Genome-wide association study identifies 74 loci associated with educational attainment

    DEFF Research Database (Denmark)

    Okbay, Aysu; P. Beauchamp, Jonathan; Alan Fontana, Mark

    2016-01-01

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals [1]. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends... ...-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals...

  17. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  18. Deep sequencing of Brachypodium small RNAs at the global genome level identifies microRNAs involved in cold stress response

    Directory of Open Access Journals (Sweden)

    Chong Kang

    2009-09-01

    Background: MicroRNAs (miRNAs) are endogenous small RNAs having large-scale regulatory effects on plant development and stress responses. Extensive studies of miRNAs have only been performed in a few model plants. Although miRNAs are proved to be involved in plant cold stress responses, little is known for winter-habit monocots. Brachypodium distachyon, with close evolutionary relationship to cool-season cereals, has recently emerged as a novel model plant. There are few reports of Brachypodium miRNAs. Results: High-throughput sequencing and whole-genome-wide data mining led to the identification of 27 conserved miRNAs, as well as 129 predicted miRNAs in Brachypodium. For multiple-member conserved miRNA families, their sizes in Brachypodium were much smaller than those in rice and Populus. The genome organization of miR395 family in Brachypodium was quite different from that in rice. The expression of 3 conserved miRNAs and 25 predicted miRNAs showed significant changes in response to cold stress. Among these miRNAs, some were cold-induced and some were cold-suppressed, but all the conserved miRNAs were up-regulated under cold stress condition. Conclusion: Our results suggest that Brachypodium miRNAs are composed of a set of conserved miRNAs and a large proportion of non-conserved miRNAs with low expression levels. Both kinds of miRNAs were involved in cold stress response, but all the conserved miRNAs were up-regulated, implying an important role for cold-induced miRNAs. The different size and genome organization of miRNA families in Brachypodium and rice suggest that the frequency of duplication events or the selection pressure on duplicated miRNAs are different between these two closely related plant species.

  19. Using sheep genomes from diverse U.S. breeds to identify missense variants in genes affecting fecundity

    Science.gov (United States)

    Background: Access to sheep genome sequences significantly improves the chances of identifying genes that may influence the health, welfare, and productivity of these animals. Methods: A public, searchable DNA sequence resource for U.S. sheep was created with whole genome sequence (WGS) of 96 rams. ...

  20. Genome-wide association study identified CNP12587 region underlying height variation in Chinese females.

    Directory of Open Access Journals (Sweden)

    Yin-Ping Zhang

    Human height is a highly heritable trait considered as an important factor for health. There has been limited success in identifying the genetic factors underlying height variation. We aim to identify sequence variants associated with adult height by a genome-wide association study of copy number variants (CNVs) in Chinese. Genome-wide CNV association analyses were conducted in 1,625 unrelated Chinese adults and sex specific subgroup for height variation, respectively. Height was measured with a stadiometer. Affymetrix SNP6.0 genotyping platform was used to identify copy number polymorphisms (CNPs). We constructed a genomic map containing 1,009 CNPs in Chinese individuals and performed a genome-wide association study of CNPs with height. We detected 10 significant association signals for height (p<0.05) in the whole population, 9 and 11 association signals for Chinese female and male population, respectively. A copy number polymorphism (CNP12587, chr18:54081842-54086942, p = 2.41 × 10⁻⁴) was found to be significantly associated with height variation in Chinese females even after strict Bonferroni correction (p = 0.048). Confirmatory real time PCR experiments lent further support for CNV validation. Compared to female subjects with two copies of the CNP, carriers of three copies had an average of 8.1% decrease in height. An important candidate gene, ubiquitin-protein ligase NEDD4-like (NEDD4L), was detected at this region, which plays important roles in bone metabolism by binding to bone formation regulators. Our findings suggest the important genetic variants underlying height variation in Chinese.
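
    The Bonferroni step in this abstract can be checked with simple arithmetic. The sketch below assumes the standard Bonferroni rule (adjusted p = raw p times the number of tests, capped at 1). The abstract does not state how many tests the correction covered, so the test count is back-solved from the two reported p-values and is only an inference; the function name is hypothetical.

```python
def bonferroni_adjust(raw_p: float, n_tests: int) -> float:
    """Standard Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, raw_p * n_tests)

# Values reported in the abstract for CNP12587 in females.
raw_p = 2.41e-4
adjusted_p = 0.048

# Assumption: the reported adjusted value came from a plain Bonferroni
# correction, so the implied number of tests is adjusted / raw.
implied_tests = adjusted_p / raw_p

# For comparison only: correcting over all 1,009 CNPs in the genomic map.
adjusted_over_all_cnps = bonferroni_adjust(raw_p, 1009)

print(f"implied number of tests: {implied_tests:.0f}")
print(f"adjusted p over 1,009 CNPs: {adjusted_over_all_cnps:.3f}")
```

    Under these assumptions the reported values imply a correction over roughly 200 tests rather than all 1,009 CNPs; the abstract alone does not say which subset was tested.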

  1. Genome-wide scan identifies variant in TNFSF13 associated with serum IgM in a healthy Chinese male population.

    Directory of Open Access Journals (Sweden)

    Ming Yang

    IgM provides a first line of defense during microbial infections. Serum IgM levels are measured routinely in clinical practice, and IgM is a genetically complex trait. We conducted a two-stage genome-wide association study (GWAS) to identify genetic variants affecting serum IgM levels in a Chinese population of 3495, including 1999 unrelated subjects in the first stage and 1496 independent individuals in the second stage. Our data show that a common single nucleotide polymorphism (SNP), rs11552708, located in the TNFSF13 gene, was significantly associated with IgM levels (p = 5.00×10⁻⁷ in the first stage, p = 1.34×10⁻³ in the second stage, and p = 4.22×10⁻⁹ when combined). Besides, smoking was identified to be associated with IgM levels in both stages (P0.05). It is suggested that TNFSF13 may be a susceptibility gene affecting serum IgM levels in Chinese male population.
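
    To illustrate how two stage-wise p-values can yield a smaller combined p-value, the sketch below applies a sample-size-weighted Stouffer Z-score combination. This is an assumption: the abstract does not say which method the authors used to combine stages, and the sketch further assumes both stages show the effect in the same direction. The function names are hypothetical; only the p-values and sample sizes come from the abstract.

```python
import math
from statistics import NormalDist

def z_from_two_sided_p(p: float) -> float:
    """Convert a two-sided p-value into a positive standard-normal Z-score."""
    return -NormalDist().inv_cdf(p / 2.0)

def two_sided_p_from_z(z: float) -> float:
    """Two-sided tail probability of a Z-score, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def stouffer_weighted(p_values, sample_sizes) -> float:
    """Combine p-values with weights proportional to sqrt(sample size)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    z_scores = [z_from_two_sided_p(p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return two_sided_p_from_z(numerator / denominator)

# Stage p-values and sample sizes as reported in the abstract.
combined_p = stouffer_weighted([5.00e-7, 1.34e-3], [1999, 1496])
print(f"combined p (weighted Stouffer, illustrative): {combined_p:.2e}")
```

    Under these assumptions the combined value lands in the same order of magnitude as the reported 4.22×10⁻⁹, but it is not expected to reproduce it exactly, since the original analysis may have pooled genotype data or used a different meta-analysis model.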

  2. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene

    NARCIS (Netherlands)

    Nicolas, Aude; Kenna, Kevin P.; Renton, Alan E.; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A.; Kenna, Brendan J.; Nalls, Mike A.; Keagle, Pamela; Rivera, Alberto M.; van Rheenen, Wouter; Murphy, Natalie A.; van Vugt, Joke J.F.A.; Geiger, Joshua T.; van der Spek, Rick; Pliner, Hannah A.; Smith, Bradley N.; Marangi, Giuseppe; Topp, Simon D.; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D.; Kenna, Aoife; Logullo, Francesco O.; Simone, Isabella L.; Logroscino, Giancarlo; Salvi, Fabrizio; Bartolomei, Ilaria; Borghero, Giuseppe; Murru, Maria Rita; Costantino, Emanuela; Pani, Carla; Puddu, Roberta; Caredda, Carla; Piras, Valeria; Tranquilli, Stefania; Cuccu, Stefania; Corongiu, Daniela; Melis, Maurizio; Milia, Antonio; Marrosu, Francesco; Marrosu, Maria Giovanna; Floris, Gianluca; Cannas, Antonino; Capasso, Margherita; Caponnetto, Claudia; Mancardi, Gianluigi; Origone, Paola; Mandich, Paola; Conforti, Francesca L.; Cavallaro, Sebastiano; Mora, Gabriele; Marinou, Kalliopi; Sideri, Riccardo; Penco, Silvana; Mosca, Lorena; Lunetta, Christian; Pinter, Giuseppe Lauria; Corbo, Massimo; Riva, Nilo; Carrera, Paola; Volanti, Paolo; Mandrioli, Jessica; Fini, Nicola; Fasano, Antonio; Tremolizzo, Lucio; Arosio, Alessandro; Ferrarese, Carlo; Trojsi, Francesca; Tedeschi, Gioacchino; Monsurrò, Maria Rosaria; Piccirillo, Giovanni; Femiano, Cinzia; Ticca, Anna; Ortu, Enzo; La Bella, Vincenzo; Spataro, Rossella; Colletti, Tiziana; Sabatelli, Mario; Zollino, Marcella; Conte, Amelia; Luigetti, Marco; Lattante, Serena; Marangi, Giuseppe; Santarelli, Marialuisa; Petrucci, Antonio; Pugliatti, Maura; Pirisi, Angelo; Parish, Leslie D.; Occhineri, Patrizia; Giannini, Fabio; Battistini, Stefania; Ricci, Claudia; Benigni, Michele; Cau, Tea B.; Loi, Daniela; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Barberis, Marco; Restagno, Gabriella; Casale, Federico; Marrali, Giuseppe; Fuda, Giuseppe; Ossola, Irene; Cammarosano, Stefania; Canosa, Antonio; Ilardi, Antonio; 
Manera, Umberto; Grassano, Maurizio; Tanel, Raffaella; Pisano, Fabrizio; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L.; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L.; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O.; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Harms, Matthew B.; Goldstein, David B.; Shneider, Neil A.; Goutman, Stephen A.; Simmons, Zachary; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Manousakis, George; Appel, Stanley H.; Simpson, Ericka; Wang, Leo; Baloh, Robert H.; Gibson, Summer B.; Bedlack, Richard; Lacomis, David; Sareen, Dhruv; Sherman, Alexander; Bruijn, Lucie; Penny, Michelle; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B.; Allen, Andrew S.; Appel, Stanley; Baloh, Robert H.; Bedlack, Richard S.; Boone, Braden E.; Brown, Robert; Carulli, John P.; Chesi, Alessandra; Chung, Wendy K.; Cirulli, Elizabeth T.; Cooper, Gregory M.; Couthouis, Julien; Day-Williams, Aaron G.; Dion, Patrick A.; Gibson, Summer B.; Gitler, Aaron D.; Glass, Jonathan D.; Goldstein, David B.; Han, Yujun; Harms, Matthew B.; Harris, Tim; Hayes, Sebastian D.; Jones, Angela L.; Keebler, Jonathan; Krueger, Brian J.; Lasseigne, Brittany N.; Levy, Shawn E.; Lu, Yi Fan; Maniatis, Tom; McKenna-Yasek, Diane; Miller, Timothy M.; Myers, Richard M.; Petrovski, Slavé; Pulst, Stefan M.; Raphael, Alya R.; Ravits, John M.; Ren, Zhong; Rouleau, Guy A.; Sapp, Peter C.; Shneider, Neil A.; Simpson, Ericka; Sims, Katherine B.; Staropoli, John F.; Waite, Lindsay L.; Wang, Quanli; Wimbish, Jack R.; Xin, Winnie W.; Gitler, Aaron D.; Harris, Tim; Myers, Richard M.; Phatnani, Hemali; Kwan, Justin; Sareen, Dhruv; Broach, James R.; Simmons, Zachary; Arcila-Londono, Ximena; Lee, Edward B.; Van Deerlin, Vivianna M.; Shneider, Neil A.; Fraenkel, Ernest; Ostrow, Lyle W.; Baas, 
Frank; Zaitlen, Noah; Berry, James D.; Malaspina, Andrea; Fratta, Pietro; Cox, Gregory A.; Thompson, Leslie M.; Finkbeiner, Steve; Dardiotis, Efthimios; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Hornstein, Eran; MacGowan, Daniel J.L.; Heiman-Patterson, Terry D.; Hammell, Molly G.; Patsopoulos, Nikolaos A.; Dubnau, Joshua; Nath, Avindra; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C.; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; LeNail, Alexander; Lima, Leandro; Fraenkel, Ernest; Rothstein, Jeffrey D.; Svendsen, Clive N.; Thompson, Leslie M.; Van Eyk, Jenny; Maragakis, Nicholas J.; Berry, James D.; Glass, Jonathan D.; Miller, Timothy M.; Kolb, Stephen J.; Baloh, Robert H.; Cudkowicz, Merit; Baxi, Emily; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; Finkbeiner, Steven; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Fraenkel, Ernest; Svendsen, Clive N.; Svendsen, Clive N.; Thompson, Leslie M.; Thompson, Leslie M.; Van Eyk, Jennifer E.; Berry, James D.; Berry, James D.; Miller, Timothy M.; Kolb, Stephen J.; Cudkowicz, Merit; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J. Paul; Wu, Gang; Rampersaud, Evadnie; Wuu, Joanne; Rademakers, Rosa; Züchner, Stephan; Schule, Rebecca; McCauley, Jacob; Hussain, Sumaira; Cooley, Anne; Wallace, Marielle; Clayman, Christine; Barohn, Richard; Statland, Jeffrey; Ravits, John M.; Swenson, Andrea; Jackson, Carlayne; Trivedi, Jaya; Khan, Shaida; Katz, Jonathan; Jenkins, Liberty; Burns, Ted; Gwathmey, Kelly; Caress, James; McMillan, Corey; Elman, Lauren; Pioro, Erik P.; Heckmann, Jeannine; So, Yuen; Walk, David; Maiser, Samuel; Zhang, Jinghui; Benatar, Michael; Taylor, J. Paul; Taylor, J. 
Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Silani, Vincenzo; Ticozzi, Nicola; Gellera, Cinzia; Ratti, Antonia; Taroni, Franco; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; D'Alfonso, Sandra; Corrado, Lucia; De Marchi, Fabiola; Corti, Stefania; Ceroni, Mauro; Mazzini, Letizia; Siciliano, Gabriele; Filosto, Massimiliano; Inghilleri, Maurizio; Peverelli, Silvia; Colombrita, Claudia; Poletti, Barbara; Maderna, Luca; Del Bo, Roberto; Gagliardi, Stella; Querin, Giorgia; Bertolin, Cinzia; Pensato, Viviana; Castellotti, Barbara; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Fogh, Isabella; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; Camu, William; Mouzat, Kevin; Lumbroso, Serge; Corcia, Philippe; Meininger, Vincent; Besson, Gérard; Lagrange, Emmeline; Clavelou, Pierre; Guy, Nathalie; Couratier, Philippe; Vourch, Patrick; Danel, Véronique; Bernard, Emilien; Lemasson, Gwendal; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W.; Sidle, Katie C.; Malaspina, Andrea; Hardy, John; Singleton, Andrew B.; Johnson, Janel O.; Arepalli, Sampath; Sapp, Peter C.; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; ten Asbroek, Anneloor L.M.A.; Muñoz-Blanco, José Luis; Hernandez, Dena G.; Ding, Jinhui; Gibbs, J. 
Raphael; Scholz, Sonja W.; Scholz, Sonja W.; Floeter, Mary Kay; Campbell, Roy H.; Landi, Francesco; Bowser, Robert; Pulst, Stefan M.; Ravits, John M.; MacGowan, Daniel J.L.; Kirby, Janine; Pioro, Erik P.; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L.; Brady, Christopher B.; Brady, Christopher B.; Kowall, Neil W.; Troncoso, Juan C.; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D.; Heiman-Patterson, Terry D.; Kamel, Freya; Van Den Bosch, Ludo; Van Den Bosch, Ludo; Baloh, Robert H.; Strom, Tim M.; Meitinger, Thomas; Strom, Tim M.; Shatunov, Aleksey; Van Eijk, Kristel R.; de Carvalho, Mamede; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell; Van Es, Michael A.; Weber, Markus; Boylan, Kevin B.; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen; Basak, A. Nazli; Mora, Jesús S.; Drory, Vivian; Shaw, Pamela; Turner, Martin R.; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L.; Fifita, Jennifer A.; Nicholson, Garth A.; Blair, Ian P.; Nicholson, Garth A.; Rouleau, Guy A.; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Al Kheifat, Ahmad; Al-Chalabi, Ammar; Andersen, Peter M.; Basak, A. 
Nazli; Blair, Ian P.; Chio, Adriano; Cooper-Knock, Jonathan; Corcia, Philippe; Couratier, Philippe; de Carvalho, Mamede; Dekker, Annelot; Drory, Vivian; Redondo, Alberto Garcia; Gotkine, Marc; Hardiman, Orla; Hide, Winston; Iacoangeli, Alfredo; Glass, Jonathan D.; Kenna, Kevin P.; Kiernan, Matthew; Kooyman, Maarten; Landers, John E.; McLaughlin, Russell; Middelkoop, Bas; Mill, Jonathan; Neto, Miguel Mitne; Moisse, Matthieu; Pardina, Jesus Mora; Morrison, Karen; Newhouse, Stephen; Pinto, Susana; Pulit, Sara; Robberecht, Wim; Shatunov, Aleksey; Shaw, Pamela; Shaw, Chris; Silani, Vincenzo; Sproviero, William; Tazelaar, Gijs; Ticozzi, Nicola; Van Damme, Philip; van den Berg, Leonard; van der Spek, Rick; Van Eijk, Kristel R.; Van Es, Michael A.; van Rheenen, Wouter; van Vugt, Joke J.F.A.; Veldink, Jan H.; Weber, Markus; Williams, Kelly L.; Van Damme, Philip; Robberecht, Wim; Zatz, Mayana; Robberecht, Wim; Bauer, Denis C.; Twine, Natalie A.; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W.; Maragakis, Nicholas J.; Rothstein, Jeffrey D.; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A.; Feldman, Eva L.; Gibson, Summer B.; Taroni, Franco; Ratti, Antonia; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C.; Andersen, Peter M.; Weishaupt, Jochen H.; Camu, William; Trojanowski, John Q.; Van Deerlin, Vivianna M.; Brown, Robert H.; van den Berg, Leonard; Veldink, Jan H.; Harms, Matthew B.; Glass, Jonathan D.; Stone, David J.; Tienari, Pentti; Silani, Vincenzo; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E.; Chiò, Adriano; Traynor, Bryan J.; Landers, John E.; Traynor, Bryan J.

    2018-01-01

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494

  3. Polymorphic microsatellites in the human bloodfluke, Schistosoma japonicum, identified using a genomic resource

    Directory of Open Access Journals (Sweden)

    Spear Robert

    2011-02-01

    Re-emergence of schistosomiasis in regions of China where control programs have ceased requires development of molecular-genetic tools to track gene flow and assess genetic diversity of Schistosoma populations. We identified many microsatellite loci in the draft genome of Schistosoma japonicum using defined search criteria and selected a subset for further analysis. From an initial panel of 50 loci, 20 new microsatellites were selected for eventual optimization and application to a panel of worms from endemic areas. All but one of the selected microsatellites contain simple tri-nucleotide repeats. Moderate to high levels of polymorphism were detected. Numbers of alleles ranged from 6 to 14 and observed heterozygosity was always >0.6. The loci reported here will facilitate high resolution population-genetic studies on schistosomes in re-emergent foci.
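
    A search for simple tri-nucleotide repeats in a draft sequence can be sketched with a regular expression. The abstract does not give the authors' search criteria, so the minimum repeat count, the exclusion of single-base runs, and the function name below are all illustrative assumptions rather than the published method.

```python
import re

def find_trinucleotide_repeats(sequence: str, min_repeats: int = 6):
    """Return (start, motif, repeat_count) for perfect tandem 3-mer repeats.

    min_repeats is a hypothetical threshold, not a value from the study.
    """
    # One 3-base motif followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAA..., which are not 3-mer repeats.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // 3))
    return hits

# Toy sequence: seven CAG units flanked by non-repetitive bases.
example = "GGT" + "CAG" * 7 + "TTAC"
print(find_trinucleotide_repeats(example))
```

    A real pipeline would also need flanking sequence for primer design and a check that each locus is single-copy in the genome; those steps are outside this sketch.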

  4. Exceptionally high levels of recombination across the honey bee genome.

    Science.gov (United States)

    Beye, Martin; Gattermeier, Irene; Hasselmann, Martin; Gempe, Tanja; Schioett, Morten; Baines, John F; Schlipalius, David; Mougel, Florence; Emore, Christine; Rueppell, Olav; Sirviö, Anu; Guzmán-Novoa, Ernesto; Hunt, Greg; Solignac, Michel; Page, Robert E

    2006-11-01

    The first draft of the honey bee genome sequence and improved genetic maps are utilized to analyze a genome displaying 10 times higher levels of recombination (19 cM/Mb) than previously analyzed genomes of higher eukaryotes. The exceptionally high recombination rate is distributed genome-wide, but varies by two orders of magnitude. Analysis of chromosome, sequence, and gene parameters with respect to recombination showed that local recombination rate is associated with distance to the telomere, GC content, and the number of simple repeats as described for low-recombining genomes. Recombination rate does not decrease with chromosome size. On average 5.7 recombination events per chromosome pair per meiosis are found in the honey bee genome. This contrasts with a wide range of taxa that have a uniform recombination frequency of about 1.6 per chromosome pair. The excess of recombination activity does not support a mechanistic role of recombination in stabilizing pairs of homologous chromosomes during chromosome pairing. Recombination rate is associated with gene size, suggesting that introns are larger in regions of low recombination and may improve the efficacy of selection in these regions. Very few transposons and no retrotransposons are present in the high-recombining genome. We propose evolutionary explanations for the exceptionally high genome-wide recombination rate.
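The figures quoted in this abstract can be cross-checked with simple arithmetic. Under the standard convention that one crossover per meiosis contributes 50 cM of genetic map length, 5.7 events per chromosome pair correspond to about 285 cM per chromosome, and at 19 cM/Mb that implies an average chromosome of roughly 15 Mb. The sketch below is a back-of-envelope check, not a calculation taken from the paper; the function names are illustrative.

```python
# One crossover per meiosis corresponds to 50 cM of genetic map length.
CM_PER_CROSSOVER = 50.0

def map_length_cm(crossovers_per_meiosis):
    """Genetic map length implied by a mean crossover count per chromosome pair."""
    return crossovers_per_meiosis * CM_PER_CROSSOVER

def implied_physical_length_mb(map_cm, rate_cm_per_mb):
    """Physical length implied by a map length and a recombination rate."""
    return map_cm / rate_cm_per_mb

bee_cm = map_length_cm(5.7)                         # about 285 cM per chromosome
bee_mb = implied_physical_length_mb(bee_cm, 19.0)   # about 15 Mb per chromosome
typical_cm = map_length_cm(1.6)                     # about 80 cM for the other taxa cited

print(f"honey bee: {bee_cm:.0f} cM per chromosome, ~{bee_mb:.0f} Mb at 19 cM/Mb")
print(f"comparison taxa: {typical_cm:.0f} cM per chromosome")
```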

  5. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    Directory of Open Access Journals (Sweden)

    Michael Strong

    2009-12-01

    As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php.
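Of the methods named in this abstract, the Phylogenetic Profile idea is the easiest to sketch: genes whose presence/absence patterns across a set of genomes match closely are candidates for a functional link. The example below is a minimal illustration under assumed inputs; the gene names, the six-genome profiles, and the similarity threshold are hypothetical and are not taken from the tutorial.

```python
from itertools import combinations

def profile_similarity(p, q):
    """Fraction of genomes in which two genes agree on presence (1) or absence (0)."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(a == b for a, b in zip(p, q)) / len(p)

def linked_pairs(profiles, threshold=1.0):
    """Return gene pairs whose profiles agree in at least `threshold` of genomes."""
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        if profile_similarity(profiles[g1], profiles[g2]) >= threshold:
            pairs.append((g1, g2))
    return pairs

# Hypothetical presence/absence profiles across six genomes.
profiles = {
    "geneA": (1, 0, 1, 1, 0, 1),
    "geneB": (1, 0, 1, 1, 0, 1),
    "geneC": (0, 1, 1, 0, 1, 0),
    "geneD": (1, 0, 1, 1, 0, 0),
}

exact = linked_pairs(profiles)                    # identical profiles only
relaxed = linked_pairs(profiles, threshold=0.8)   # allow one mismatch in six
```

Exact matching is the strictest form of the method; relaxing the threshold trades specificity for sensitivity, which matters when genome annotations are incomplete.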

  6. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

    NARCIS (Netherlands)

    van Leeuwen, E.M.; Karssen, L.C.; Deelen, J.; Isaacs, A.; Medina-Gomez, C.; Mbarek, H.; Kanterakis, A.; Trompet, S.; Postmus, I.; Verweij, N.; van Enckevort, D.; Huffman, J.E.; White, C.C.; Feitosa, M.F.; Bartz, T.M.; Manichaikul, A.; Joshi, P.K.; Peloso, G.M.; Deelen, P.; Dijk, F.; Willemsen, G.; de Geus, E.J.C.; Milaneschi, Y.; Penninx, B.W.J.H.; Francioli, L.C.; Menelaou, A.; Pulit, S.L.; Rivadeneira, F.; Hofman, A.; Oostra, B.A.; Franco, O.H.; Mateo Leach, I.; Beekman, M.; de Craen, A.J.; Uh, H.W.; Trochet, H.; Hocking, L.J.; Porteous, D.J.; Sattar, N.; Packard, C.J.; Buckley, B.M.; Brody, J.A.; Bis, J.C.; Rotter, J.I.; Mychaleckyj, J.C.; Campbell, H.; Duan, Q.; Lange, L.A.; Wilson, J.F.; Hayward, C.; Polasek, O.; Vitart, V.; Rudan, I.; Wright, A.F.; Rich, S.S.; Psaty, B.M.; Borecki, I.B.; Kearney, P.M.; Stott, D.J.; Cupples, L.A.; Jukema, J.W.; van der Harst, P.; Sijbrands, E.J.; Hottenga, J.J.; Uitterlinden, A.G.; Swertz, M.A.; van Ommen, G.J.B; Bakker, P.I.W.; Slagboom, P.E.; Boomsma, D.I.; Wijmenga, C.; van Duijn, C.M.

    2015-01-01

    Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (∼35,000 samples) with the population-specific reference panel created

  7. Cross-Genome Comparisons of Newly Identified Domains in Mycoplasma gallisepticum and Domain Architectures with Other Mycoplasma species

    Directory of Open Access Journals (Sweden)

    Chandra Sekhar Reddy Chilamakuri

    2011-01-01

    Accurate functional annotation of protein sequences is hampered by important factors such as the failure of sequence search methods to identify relationships and the inherent diversity in function of proteins related at low sequence similarities. Earlier, we employed an intermediate sequence search approach to establish new domain relationships in the unassigned regions of gene products at the whole genome level, taking Mycoplasma gallisepticum as a specific example. In this paper, we report a detailed comparison of the conservation status of the domains and domain architectures of the gene products that bear our newly predicted domains amongst 14 other Mycoplasma genomes and report the probable implications for the organisms. Some of the domain associations, observed in Mycoplasma that afflict humans and other non-human primates, are involved in regulation of solute transport and DNA binding, suggesting specific modes of host-pathogen interactions.

  8. Pan-Genome Analysis of Human Gastric Pathogen H. pylori: Comparative Genomics and Pathogenomics Approaches to Identify Regions Associated with Pathogenicity and Prediction of Potential Core Therapeutic Targets

    DEFF Research Database (Denmark)

    Ali, Amjad; Naz, Anam; Soares, Siomar C.

    2015-01-01

    Pan-genome analyses of the global representative H. pylori isolates consisting of 39 complete genomes are presented in this paper. Phylogenetic analyses have revealed close relationships among geographically diverse strains of H. pylori. The conservation among these genomes was further analyzed by pan-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. Total of 28 nonhost...

  9. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.

    Science.gov (United States)

    Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N

    2013-06-03

    Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
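The contig and scaffold N50 values quoted in this abstract follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly. The sketch below shows that calculation on made-up lengths; it is a generic illustration, not the tool the authors used.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids rounding issues
            return length

# Hypothetical contig lengths (arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why the abstract also reports gap fraction and anchored sequence.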

  10. Genomic analyses identify molecular subtypes of pancreatic cancer.

    Science.gov (United States)

    Bailey, Peter; Chang, David K; Nones, Katia; Johns, Amber L; Patch, Ann-Marie; Gingras, Marie-Claude; Miller, David K; Christ, Angelika N; Bruxner, Tim J C; Quinn, Michael C; Nourse, Craig; Murtaugh, L Charles; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourbakhsh, Ehsan; Wani, Shivangi; Fink, Lynn; Holmes, Oliver; Chin, Venessa; Anderson, Matthew J; Kazakoff, Stephen; Leonard, Conrad; Newell, Felicity; Waddell, Nick; Wood, Scott; Xu, Qinying; Wilson, Peter J; Cloonan, Nicole; Kassahn, Karin S; Taylor, Darrin; Quek, Kelly; Robertson, Alan; Pantano, Lorena; Mincarelli, Laura; Sanchez, Luis N; Evers, Lisa; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Jones, Marc D; Colvin, Emily K; Nagrial, Adnan M; Humphrey, Emily S; Chantrill, Lorraine A; Mawson, Amanda; Humphris, Jeremy; Chou, Angela; Pajic, Marina; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Lovell, Jessica A; Merrett, Neil D; Toon, Christopher W; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Moran-Jones, Kim; Jamieson, Nigel B; Graham, Janet S; Duthie, Fraser; Oien, Karin; Hair, Jane; Grützmann, Robert; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Rusev, Borislav; Capelli, Paola; Salvia, Roberto; Tortora, Giampaolo; Mukhopadhyay, Debabrata; Petersen, Gloria M; Munzy, Donna M; Fisher, William E; Karim, Saadia A; Eshleman, James R; Hruban, Ralph H; Pilarsky, Christian; Morton, Jennifer P; Sansom, Owen J; Scarpa, Aldo; Musgrove, Elizabeth A; Bailey, Ulla-Maja Hagbo; Hofmann, Oliver; Sutherland, Robert L; Wheeler, David A; Gill, Anthony J; Gibbs, Richard A; Pearson, John V; Waddell, Nicola; Biankin, Andrew V; Grimmond, Sean M

    2016-03-03

    Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63∆N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.

  11. A meta-analysis of genome-wide association studies identifies novel variants associated with osteoarthritis of the hip

    DEFF Research Database (Denmark)

    Evangelou, Evangelos; Kerkhof, Hanneke J; Styrkarsdottir, Unnur

    2014-01-01

    Osteoarthritis (OA) is the most common form of arthritis with a clear genetic component. To identify novel loci associated with hip OA we performed a meta-analysis of genome-wide association studies (GWAS) on European subjects.

  12. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  13. Genome-wide association study identifies three novel loci for type 2 diabetes

    DEFF Research Database (Denmark)

    Hara, Kazuo; Fujita, Hayato; Johnson, Todd A

    2014-01-01

    Although over 60 loci for type 2 diabetes (T2D) have been identified, there still remains a large genetic component to be clarified. To explore unidentified loci for T2D, we performed a genome-wide association study (GWAS) of 6 209 637 single-nucleotide polymorphisms (SNPs), which were directly g...

  14. 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments.

    Science.gov (United States)

    Raviram, Ramya; Rocha, Pedro P; Müller, Christian L; Miraldi, Emily R; Badri, Sana; Fu, Yi; Swanzey, Emily; Proudhon, Charlotte; Snetkova, Valentina; Bonneau, Richard; Skok, Jane A

    2016-03-01

    4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.
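The adaptive-window idea mentioned in this abstract can be illustrated with a simple scheme: instead of fixed-width bins, each window spans a fixed number of observed fragments, so windows are narrow where signal is dense (near the bait) and wide where it is sparse (far-cis and trans). The sketch below is a generic illustration under that assumption; it is not the 4C-ker implementation, and the parameter `k` and the toy coordinates are hypothetical.

```python
def adaptive_windows(positions, k=3):
    """Group fragment positions into windows holding up to k fragments each.

    Returns a list of (start, end, fragment_count) tuples. Window width adapts
    to local fragment density: dense regions give narrow windows.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    pos = sorted(positions)
    windows = []
    for i in range(0, len(pos), k):
        chunk = pos[i:i + k]
        windows.append((chunk[0], chunk[-1], len(chunk)))
    return windows

# Toy fragment coordinates: dense near the bait, sparse further away.
toy_positions = [100, 110, 120, 500, 900, 1500, 4000, 9000]
toy_windows = adaptive_windows(toy_positions, k=3)
toy_widths = [end - start for start, end, _ in toy_windows]
```

In a full pipeline, read counts per window would then feed a downstream model (a Hidden Markov Model in the method described above) that labels windows as interacting or not.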

  15. 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments.

    Directory of Open Access Journals (Sweden)

    Ramya Raviram

    2016-03-01

    4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.

  16. Distinct high resolution genome profiles of early onset and late onset colorectal cancer integrated with gene expression data identify candidate susceptibility loci

    Directory of Open Access Journals (Sweden)

    Merok Marianne A

    2010-05-01

    Background: Estimates suggest that up to 30% of colorectal cancers (CRC) may develop due to an increased genetic risk. The mean age at diagnosis for CRC is about 70 years. Time of disease onset 20 years younger than the mean age is assumed to be indicative of genetic susceptibility. We have compared high resolution tumor genome copy number variation (CNV) (Roche NimbleGen, 385 000 oligo CGH array) in microsatellite stable (MSS) tumors from two age groups, including 23 young at onset patients without known hereditary syndromes and with a median age of 44 years (range: 28-53) and 17 elderly patients with median age 79 years (range: 69-87). Our aim was to identify differences in the tumor genomes between these groups and pinpoint potential susceptibility loci. Integration analysis of CNV and genome wide mRNA expression data, available for the same tumors, was performed to identify a restricted candidate gene list. Results: The total fraction of the genome with aberrant copy number, the overall genomic profile and the TP53 mutation spectrum were similar between the two age groups. However, both the number of chromosomal aberrations and the number of breakpoints differed significantly between the groups. Gains of 2q35, 10q21.3-22.1, 10q22.3 and 19q13.2-13.31 and losses from 1p31.3, 1q21.1, 2q21.2, 4p16.1-q28.3, 10p11.1 and 19p12, positions that in total contain more than 500 genes, were found significantly more often in the early onset group as compared to the late onset group. Integration analysis revealed a covariation of DNA copy number at these sites and mRNA expression for 107 of the genes. Seven of these genes, CLC, EIF4E, LTBP4, PLA2G12A, PPAT, RG9MTD2, and ZNF574, had significantly different mRNA expression comparing median expression levels across the transcriptome between the two groups. Conclusions: Ten genomic loci, containing more than 500 protein coding genes, are identified as more often altered in tumors from early onset versus late

  17. Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome.

    Science.gov (United States)

    Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O'Connor, Timothy D; Abecasis, Gonçalo R; Wojcik, Genevieve L; Gignoux, Christopher R; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E; Bustamante, Carlos; Beaty, Terri H; Mathias, Rasika A; Barnes, Kathleen C; Qin, Zhaohui S

    2017-04-21

    A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.

  18. Exploiting genomic data to identify proteins involved in abalone reproduction.

    Science.gov (United States)

    Mendoza-Porras, Omar; Botwright, Natasha A; McWilliam, Sean M; Cook, Mathew T; Harris, James O; Wijffels, Gene; Colgrave, Michelle L

    2014-08-28

    Aside from their critical role in reproduction, abalone gonads serve as an indicator of sexual maturity and energy balance, two key considerations for effective abalone culture. Temperate abalone farmers face issues with tank restocking with highly marketable abalone owing to inefficient spawning induction methods. The identification of key proteins in sexually mature abalone will serve as the foundation for a greater understanding of reproductive biology. Addressing this knowledge gap is the first step towards improving abalone aquaculture methods. Proteomic profiling of female and male gonads of greenlip abalone, Haliotis laevigata, was undertaken using liquid chromatography-mass spectrometry. Owing to the incomplete nature of abalone protein databases, in addition to searching against two publicly available databases, a custom database comprising genomic data was used. Overall, 162 and 110 proteins were identified in females and males respectively with 40 proteins common to both sexes. For proteins involved in sexual maturation, sperm and egg structure, motility, acrosomal reaction and fertilization, 23 were identified only in females, 18 only in males and 6 were common. Gene ontology analysis revealed clear differences between the female and male protein profiles reflecting a higher rate of protein synthesis in the ovary and higher metabolic activity in the testis. A comprehensive mass spectrometry-based analysis was performed to profile the abalone gonad proteome providing the foundation for future studies of reproduction in abalone. Key proteins involved in both reproduction and energy balance were identified. Genomic resources were utilised to build a database of molluscan proteins yielding >60% more protein identifications than in a standard workflow employing public protein databases. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Genome-wide association analyses identify 18 new loci associated with serum urate concentrations

    NARCIS (Netherlands)

    Kottgen, A.; Albrecht, E.; Teumer, A.; Vitart, V.; Krumsiek, J.; Hundertmark, C.; Pistis, G.; Ruggiero, D.; O'Seaghdha, C.M.; Haller, T.; Yang, Q.; Johnson, A.D.; Kutalik, Z.; Smith, A.V.; Shi, J.L.; Struchalin, M.; Middelberg, R.P.S.; Brown, M.J.; Gaffo, A.L.; Pirastu, N.; Li, G.; Hayward, C.; Zemunik, T.; Huffman, J.; Yengo, L.; Zhao, J.H.; Demirkan, A.; Feitosa, M.F.; Liu, X.; Malerba, G.; Lopez, L.M.; van der Harst, P.; Li, X.Z.; Kleber, M.E.; Hicks, A.A.; Nolte, I.M.; Johansson, A.; Murgia, F.; Wild, S.H.; Bakker, S.J.L.; Peden, J.F.; Dehghan, A.; Steri, M.; Tenesa, A.; Lagou, V.; Salo, P.; Mangino, M.; Rose, L.M.; Lehtimaki, T.; Woodward, O.M.; Okada, Y.; Tin, A.; Muller, C.; Oldmeadow, C.; Putku, M.; Czamara, D.; Kraft, P.; Frogheri, L.; Thun, G.A.; Grotevendt, A.; Gislason, G.K.; Harris, T.B.; Launer, L.J.; McArdle, P.; Shuldiner, A.R.; Boerwinkle, E.; Coresh, J.; Schmidt, H.; Schallert, M.; Martin, N.G.; Montgomery, G.W.; Kubo, M.; Nakamura, Y.; Tanaka, T.; Munroe, P.B.; Samani, N.J.; Jacobs, D.R.; Liu, K.; d'Adamo, P.; Ulivi, S.; Rotter, J.I.; Psaty, B.M.; Vollenweider, P.; Waeber, G.; Campbell, S.; Devuyst, O.; Navarro, P.; Kolcic, I.; Hastie, N.; Balkau, B.; Froguel, P.; Esko, T.; Salumets, A.; Khaw, K.T.; Langenberg, C.; Wareham, N.J.; Isaacs, A.; Kraja, A.; Zhang, Q.Y.; Penninx, B.W.J.H.; Smit, J.H.; Bochud, M.; Gieger, C.

    2013-01-01

    Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with

  20. Genome-wide association analyses identify 18 new loci associated with serum urate concentrations

    NARCIS (Netherlands)

    Köttgen, Anna; Albrecht, Eva; Teumer, Alexander; Vitart, Veronique; Krumsiek, Jan; Hundertmark, Claudia; Pistis, Giorgio; Ruggiero, Daniela; O'Seaghdha, Conall M; Haller, Toomas; Yang, Qiong; Tanaka, Toshiko; Johnson, Andrew D; Kutalik, Zoltán; Smith, Albert V; Shi, Julia; Struchalin, Maksim; Middelberg, Rita P S; Brown, Morris J; Gaffo, Angelo L; Pirastu, Nicola; Li, Guo; Hayward, Caroline; Zemunik, Tatijana; Huffman, Jennifer; Yengo, Loic; Zhao, Jing Hua; Demirkan, Ayse; Feitosa, Mary F; Liu, Xuan; Malerba, Giovanni; Lopez, Lorna M; van der Harst, Pim; Li, Xinzhong; Kleber, Marcus E; Hicks, Andrew A; Nolte, Ilja M; Johansson, Asa; Murgia, Federico; Bakker, Stephan J L; Lagou, Vasiliki; Bruinenberg, Marcel; Stolk, Ronald P; Penninx, Brenda W; Mateo Leach, Irene; van Gilst, Wiek H; Hillege, Hans L; Wolffenbuttel, Bruce H R; Snieder, Harold; Navis, Gerjan

    Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with

  1. Genome-Wide Mapping of 5mC and 5hmC Identified Differentially Modified Genomic Regions in Late-Onset Severe Preeclampsia: A Pilot Study.

    Directory of Open Access Journals (Sweden)

    Lisha Zhu

    Preeclampsia (PE) is a leading cause of perinatal morbidity and mortality. However, the etiology of late-onset PE, a common form of PE, is elusive. We analyzed 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) levels in the placentas of late-onset severe PE patients (n = 4) and normal controls (n = 4) using a (hydroxy)methylated DNA immunoprecipitation approach combined with deep sequencing ([h]MeDIP-seq), and the results were verified by (h)MeDIP-qPCR. The most significant differentially methylated regions (DMRs) were verified by MassARRAY EpiTYPER in an enlarged sample size (n = 20). Bioinformatics analysis identified 714 peaks of 5mC that were associated with 403 genes and 119 peaks of 5hmC that were associated with 61 genes, thus showing significant differences between the PE patients and the controls (>2-fold, p<0.05). Further, only one gene, PTPRN2, had both 5mC and 5hmC changes in patients. The ErbB signaling pathway was enriched in those 403 genes that had significantly different 5mC level between the groups. This genome-wide mapping of 5mC and 5hmC in late-onset severe PE and normal controls demonstrates that both 5mC and 5hmC play epigenetic roles in the regulation of the disease, but work independently. We reveal the genome-wide mapping of DNA methylation and DNA hydroxymethylation in late-onset PE placentas for the first time, and the identified ErbB signaling pathway and the gene PTPRN2 may be relevant to the epigenetic pathogenesis of late-onset PE.

  2. Genome-wide association study identifies variants associated with autoimmune hepatitis type 1.

    Science.gov (United States)

    de Boer, Ynto S; van Gerven, Nicole M F; Zwiers, Antonie; Verwer, Bart J; van Hoek, Bart; van Erpecum, Karel J; Beuers, Ulrich; van Buuren, Henk R; Drenth, Joost P H; den Ouden, Jannie W; Verdonk, Robert C; Koek, Ger H; Brouwer, Johannes T; Guichelaar, Maureen M J; Vrolijk, Jan M; Kraal, Georg; Mulder, Chris J J; van Nieuwkerk, Carin M J; Fischer, Janett; Berg, Thomas; Stickel, Felix; Sarrazin, Christoph; Schramm, Christoph; Lohse, Ansgar W; Weiler-Normann, Christina; Lerch, Markus M; Nauck, Matthias; Völzke, Henry; Homuth, Georg; Bloemena, Elisabeth; Verspaget, Hein W; Kumar, Vinod; Zhernakova, Alexandra; Wijmenga, Cisca; Franke, Lude; Bouma, Gerd

    2014-08-01

    Autoimmune hepatitis (AIH) is an uncommon autoimmune liver disease of unknown etiology. We used a genome-wide approach to identify genetic variants that predispose individuals to AIH. We performed a genome-wide association study of 649 adults in The Netherlands with AIH type 1 and 13,436 controls. Initial associations were further analyzed in an independent replication panel comprising 451 patients with AIH type 1 in Germany and 4103 controls. We also performed an association analysis in the discovery cohort using imputed genotypes of the major histocompatibility complex region. We associated AIH with a variant in the major histocompatibility complex region at rs2187668 (P = 1.5 × 10^-78). Analysis of this variant in the discovery cohort identified HLA-DRB1*0301 (P = 5.3 × 10^-49) as a primary susceptibility genotype and HLA-DRB1*0401 (P = 2.8 × 10^-18) as a secondary susceptibility genotype. We also associated AIH with variants of SH2B3 (rs3184504, 12q24; P = 7.7 × 10^-8) and CARD10 (rs6000782, 22q13.1; P = 3.0 × 10^-6). In addition, strong inflation of association signal was found with single-nucleotide polymorphisms associated with other immune-mediated diseases, including primary sclerosing cholangitis and primary biliary cirrhosis, but not with single-nucleotide polymorphisms associated with other genetic traits. In a genome-wide association study, we associated AIH type 1 with variants in the major histocompatibility complex region, and identified variants of SH2B3 and CARD10 as likely risk factors. These findings support a complex genetic basis for AIH pathogenesis and indicate that part of the genetic susceptibility overlaps with that for other immune-mediated liver diseases. Copyright © 2014 AGA Institute. Published by Elsevier Inc. All rights reserved.
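One way to read the P values in this abstract is against the conventional genome-wide significance threshold of 5 × 10^-8. The abstract does not state which threshold the authors applied, so the cutoffs in the sketch below are assumptions: the genome-wide value is the common convention, and the "suggestive" cutoff is one of several in use. On these assumptions the MHC variant is clearly significant, while the SH2B3 and CARD10 signals fall just short, which fits their description as likely risk factors.

```python
GENOME_WIDE = 5e-8   # conventional genome-wide significance threshold
SUGGESTIVE = 1e-5    # assumed cutoff for "suggestive" signals; conventions vary

def classify(p_value):
    """Label a GWAS P value relative to the two assumed thresholds."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"

# P values as reported in the abstract above.
reported = {
    "rs2187668 (MHC)": 1.5e-78,
    "rs3184504 (SH2B3)": 7.7e-8,
    "rs6000782 (CARD10)": 3.0e-6,
}

labels = {snp: classify(p) for snp, p in reported.items()}
for snp, label in labels.items():
    print(f"{snp}: {label}")
```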

  3. Genome-wide Association Study Identifies Five Susceptibility Loci for Follicular Lymphoma outside the HLA Region

    NARCIS (Netherlands)

    Skibola, Christine F.; Berndt, Sonja I.; Vijai, Joseph; Conde, Lucia; Wang, Zhaoming; Yeager, Meredith; de Bakker, Paul I. W.; Birmann, Brenda M.; Vajdic, Claire M.; Foo, Jia-Nee; Bracci, Paige M.; Vermeulen, Roel C. H.; Slager, Susan L.; de Sanjose, Silvia; Wang, Sophia S.; Linet, Martha S.; Salles, Gilles; Lan, Qing; Severi, Gianluca; Hjalgrim, Henrik; Lightfoot, Tracy; Melbye, Mads; Gu, Jian; Ghesquieres, Herve; Link, Brian K.; Morton, Lindsay M.; Holly, Elizabeth A.; Smith, Alex; Tinker, Lesley F.; Teras, Lauren R.; Kricker, Anne; Becker, Nikolaus; Purdue, Mark P.; Spinelli, John J.; Zhang, Yawei; Giles, Graham G.; Vineis, Paolo; Monnereau, Alain; Bertrand, Kimberly A.; Albanes, Demetrius; Zeleniuch-Jacquotte, Anne; Gabbas, Attilio; Chung, Charles C.; Burdett, Laurie; Hutchinson, Amy; Lawrence, Charles; Montalvan, Rebecca; Liang, Liming; Huang, Jinyan; Ma, Baoshan; Liu, Jianjun; Adami, Hans-Olov; Glimelius, Bengt; Ye, Yuanqing; Nowakowski, Grzegorz S.; Dogan, Ahmet; Thompson, Carrie A.; Habermann, Thomas M.; Novak, Anne J.; Liebow, Mark; Witzig, Thomas E.; Weiner, George J.; Schenk, Maryjean; Hartge, Patricia; De Roos, Anneclaire J.; Cozen, Wendy; Zhi, Degui; Akers, Nicholas K.; Riby, Jacques; Smith, Martyn T.; Lacher, Mortimer; Villano, Danylo J.; Maria, Ann; Roman, Eve; Kane, Eleanor; Jackson, Rebecca D.; North, Kari E.; Diver, W. Ryan; Turner, Jenny; Armstrong, Bruce K.; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; Staines, Anthony; McKay, James; Brooks-Wilson, Angela R.; Zheng, Tongzhang; Holford, Theodore R.; Chamosa, Saioa; Kaaks, Rudolph; Kelly, Rachel S.; Ohlsson, Bodil; Travis, Ruth C.; Weiderpass, Elisabete; Clave, Jacqueline; Giovannucci, Edward; Kraft, Peter; Virtamo, Jarmo; Mazza, Patrizio; Cocco, Pierluigi; Ennas, Maria Grazia; Chiu, Brian C. H.; Fraumeni, Joseph R.; Nieters, Alexandra; Offit, Kenneth; Wu, Xifeng; Cerhan, James R.; Smedby, Karin E.; Chanock, Stephen J.; Rothman, Nathaniel

    2014-01-01

    Genome-wide association studies (GWASs) of follicular lymphoma (FL) have previously identified human leukocyte antigen (HLA) gene variants. To identify additional FL susceptibility loci, we conducted a large-scale two-stage GWAS in 4,523 case subjects and 13,344 control subjects of European

  4. Meta-Analysis of Genome-Wide Association Studies Identifies Genetic Risk Factors for Stroke in African Americans.

    Science.gov (United States)

    Carty, Cara L; Keene, Keith L; Cheng, Yu-Ching; Meschia, James F; Chen, Wei-Min; Nalls, Mike; Bis, Joshua C; Kittner, Steven J; Rich, Stephen S; Tajuddin, Salman; Zonderman, Alan B; Evans, Michele K; Langefeld, Carl D; Gottesman, Rebecca; Mosley, Thomas H; Shahar, Eyal; Woo, Daniel; Yaffe, Kristine; Liu, Yongmei; Sale, Michèle M; Dichgans, Martin; Malik, Rainer; Longstreth, W T; Mitchell, Braxton D; Psaty, Bruce M; Kooperberg, Charles; Reiner, Alexander; Worrall, Bradford B; Fornage, Myriam

    2015-08-01

    The majority of genome-wide association studies (GWAS) of stroke have focused on European-ancestry populations; however, none has been conducted in African Americans, despite the disproportionately high burden of stroke in this population. The Consortium of Minority Population Genome-Wide Association Studies of Stroke (COMPASS) was established to identify stroke susceptibility loci in minority populations. Using METAL, we conducted meta-analyses of GWAS in 14 746 African Americans (1365 ischemic and 1592 total stroke cases) from COMPASS, and tested genetic variants with Pstroke genetic studies in European-ancestry populations. We also evaluated stroke loci previously identified in European-ancestry populations. The 15q21.3 locus linked with lipid levels and hypertension was associated with total stroke (rs4471613; P=3.9×10(-8)) in African Americans. Nominal associations (Pstroke were observed for 18 variants in or near genes implicated in cell cycle/mRNA presplicing (PTPRG, CDC5L), platelet function (HPS4), blood-brain barrier permeability (CLDN17), immune response (ELTD1, WDFY4, and IL1F10-IL1RN), and histone modification (HDAC9). Two of these loci achieved nominal significance in METASTROKE: 5q35.2 (P=0.03), and 1p31.1 (P=0.018). Four of 7 previously reported ischemic stroke loci (PITX2, HDAC9, CDKN2A/CDKN2B, and ZFHX3) were nominally associated (Pstroke in COMPASS. We identified a novel genetic variant associated with total stroke in African Americans and found that ischemic stroke loci identified in European-ancestry populations may also be relevant for African Americans. Our findings support investigation of diverse populations to identify and characterize genetic risk factors, and the importance of shared genetic risk across populations. © 2015 American Heart Association, Inc.
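    The abstract above names METAL as the meta-analysis tool but does not say which weighting scheme was used. As a minimal sketch, assuming a fixed-effect inverse-variance scheme (one common choice, not necessarily the one the authors applied), per-cohort effect estimates for a single variant could be pooled as follows. The function name and the toy numbers are hypothetical.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Pool per-cohort effect sizes with fixed-effect inverse-variance weights.

    Assumption: every cohort reports an effect size (beta) and its standard
    error for the same variant and the same effect allele.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, p_value


# Hypothetical effect estimates from three cohorts for one variant.
beta, se, z, p = inverse_variance_meta([0.21, 0.18, 0.25], [0.08, 0.10, 0.09])
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.2e}")
```

    Cohorts with smaller standard errors get larger weights, so the largest or most precise studies dominate the pooled estimate.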

  5. snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Kaas, Rolf Sommer; Thomsen, Martin Christen Frølund

    2012-01-01

    identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on reference genome and a tree is constructed...... to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic...... skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic tree from WGS data. Results Here we introduce snpTree, a server for online-automatic SNPs analysis. This tool is composed of different SNPs analysis suites, perl and python scripts. snpTree can...
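    The record above says SNPs are concatenated by their position on the reference genome before a tree is built. The sketch below illustrates only that concatenation step on toy data. It is not snpTree's code: the function names, the data layout, and the pairwise-difference count are assumptions made for illustration.

```python
def concatenate_snps(reference, calls):
    """Build one pseudo-sequence per isolate from its SNP calls.

    reference: dict mapping reference position -> reference base
    calls:     dict mapping isolate name -> {position: alternative base}
    Positions with no call for an isolate fall back to the reference base.
    """
    # Union of all variable positions, ordered along the reference genome.
    positions = sorted({pos for snps in calls.values() for pos in snps})
    alignment = {
        name: "".join(snps.get(pos, reference[pos]) for pos in positions)
        for name, snps in calls.items()
    }
    return positions, alignment


def count_differences(seq_a, seq_b):
    """Number of SNP columns at which two pseudo-sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical reference bases and SNP calls for three isolates.
reference = {10: "A", 25: "C", 40: "G"}
calls = {
    "iso1": {10: "T"},
    "iso2": {10: "T", 40: "A"},
    "iso3": {25: "G"},
}
positions, alignment = concatenate_snps(reference, calls)
print(positions, alignment)
print(count_differences(alignment["iso1"], alignment["iso2"]))
```

    A tree-building program would take the resulting equal-length pseudo-sequences as its input alignment.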

  6. A multi-sample based method for identifying common CNVs in normal human genomic structure using high-resolution aCGH data.

    Directory of Open Access Journals (Sweden)

    Chihyun Park

    Full Text Available BACKGROUND: It is difficult to identify copy number variations (CNV) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample. METHODOLOGY AND PRINCIPAL FINDINGS: We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); a CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR). CONCLUSIONS AND SIGNIFICANCE: We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime compared to the other algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was developed with standard C++ and is available in Linux and MS Windows format in the STL library. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php.
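    The abstract above mentions a k-means-based clustering strategy without giving details. As a rough illustration of the general idea only (not MGVD's algorithm), the sketch below clusters hypothetical per-segment mean log2 ratios into loss, neutral, and gain groups with a plain one-dimensional k-means. The starting centers and the segment values are assumptions.

```python
def kmeans_1d(values, initial_centers, iterations=20):
    """Plain 1-D k-means: assign each value to its nearest center, then
    move each center to the mean of its members, and repeat."""
    centers = list(initial_centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical mean log2 ratios for seven segments.
segment_means = [-0.9, -1.1, 0.05, -0.02, 0.0, 0.6, 0.55]
labels, centers = kmeans_1d(segment_means, initial_centers=(-1.0, 0.0, 1.0))
states = ["loss", "neutral", "gain"]
print([states[lab] for lab in labels])
print([round(c, 3) for c in centers])
```

    Fixed starting centers keep the toy example deterministic; a real analysis would need a principled initialization and a way to choose the number of clusters.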

  7. Comparative genomics identifies distinct lineages of S. Enteritidis from Queensland, Australia.

    Science.gov (United States)

    Graham, Rikki M A; Hiley, Lester; Rathnayake, Irani U; Jennison, Amy V

    2018-01-01

    Salmonella enterica is a major cause of gastroenteritis and foodborne illness in Australia where notification rates in the state of Queensland are the highest in the country. S. Enteritidis is among the five most common serotypes reported in Queensland and it is a priority for epidemiological surveillance due to concerns regarding its emergence in Australia. Using whole genome sequencing, we have analysed the genomic epidemiology of 217 S. Enteritidis isolates from Queensland, and observed that they fall into three distinct clades, which we have differentiated as Clades A, B and C. Phage types and MLST sequence types differed between the clades and comparative genomic analysis has shown that each has a unique profile of prophage and genomic islands. Several of the phage regions present in the S. Enteritidis reference strain P125109 were absent in Clades A and C, and these clades also differed in the presence of pathogenicity islands, containing complete SPI-6 and SPI-19 regions, while P125109 does not. Antimicrobial resistance markers were found in 39 isolates, all but one of which belonged to Clade B. Phylogenetic analysis of the Queensland isolates in the context of 170 international strains showed that Queensland Clade B isolates group together with the previously identified global clade, while the other two clades are distinct and appear largely restricted to Australia. Locally sourced environmental isolates included in this analysis all belonged to Clades A and C, which is consistent with the theory that these clades are a source of locally acquired infection, while Clade B isolates are mostly travel related.

  8. Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution.

    Science.gov (United States)

    Kumar, Narender; Mariappan, Vanitha; Baddam, Ramani; Lankapalli, Aditya K; Shaik, Sabiha; Goh, Khean-Lee; Loke, Mun Fai; Perkins, Tim; Benghezal, Mohammed; Hasnain, Seyed E; Vadivelu, Jamuna; Marshall, Barry J; Ahmed, Niyaz

    2015-01-01

    The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host-pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  10. Genome size evolution at the speciation level: the cryptic species complex Brachionus plicatilis (Rotifera).

    Science.gov (United States)

    Stelzer, Claus-Peter; Riss, Simone; Stadler, Peter

    2011-04-07

    Studies on genome size variation in animals are rarely done at lower taxonomic levels, e.g., slightly above/below the species level. Yet, such variation might provide important clues on the tempo and mode of genome size evolution. In this study we used the flow-cytometry method to study the evolution of genome size in the rotifer Brachionus plicatilis, a cryptic species complex consisting of at least 14 closely related species. We found an unexpectedly high variation in this species complex, with genome sizes ranging approximately seven-fold (haploid '1C' genome sizes: 0.056-0.416 pg). Most of this variation (67%) could be ascribed to the major clades of the species complex, i.e. clades that are well separated according to most species definitions. However, we also found substantial variation (32%) at lower taxonomic levels (within and among genealogical species) and, interestingly, among species pairs that are not completely reproductively isolated. In one genealogical species, called B. 'Austria', we found greatly enlarged genome sizes that could roughly be approximated as multiples of the genomes of its closest relatives, which suggests that whole-genome duplications have occurred early during separation of this lineage. Overall, genome size was significantly correlated to egg size and body size, even though the latter became non-significant after controlling for phylogenetic non-independence. Our study suggests that substantial genome size variation can build up early during speciation, potentially even among isolated populations. An alternative, but not mutually exclusive interpretation might be that reproductive isolation tends to build up unusually slowly in this species complex.

  11. Genome-wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism

    DEFF Research Database (Denmark)

    Hu, Zheng; Zhu, Da; Wang, Wei

    2015-01-01

    Human papillomavirus (HPV) integration is a key genetic event in cervical carcinogenesis. By conducting whole-genome sequencing and high-throughput viral integration detection, we identified 3,667 HPV integration breakpoints in 26 cervical intraepithelial neoplasias, 104 cervical carcinomas and ...

  12. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson's disease

    NARCIS (Netherlands)

    Nalls, Mike A.; Pankratz, Nathan; Lill, Christina M.; Do, Chuong B.; Hernandez, Dena G.; Saad, Mohamad; DeStefano, Anita L.; Kara, Eleanna; Bras, Jose; Sharma, Manu; Schulte, Claudia; Keller, Margaux F.; Arepalli, Sampath; Letson, Christopher; Edsall, Connor; Stefansson, Hreinn; Liu, Xinmin; Pliner, Hannah; Lee, Joseph H.; Cheng, Rong; Ikram, M. Arfan; Ioannidis, John P. A.; Hadjigeorgiou, Georgios M.; Bis, Joshua C.; Martinez, Maria; Perlmutter, Joel S.; Goate, Alison; Marder, Karen; Fiske, Brian; Sutherland, Margaret; Xiromerisiou, Georgia; Myers, Richard H.; Clark, Lorraine N.; Stefansson, Kari; Hardy, John A.; Heutink, Peter; Chen, Honglei; Wood, Nicholas W.; Houlden, Henry; Payami, Haydeh; Brice, Alexis; Scott, William K.; Gasser, Thomas; Bertram, Lars; Eriksson, Nicholas; Foroud, Tatiana; Singleton, Andrew B.; Plagnol, Vincent; Sheerin, Una-Marie; Simón-Sánchez, Javier; Lesage, Suzanne; Sveinbjörnsdóttir, Sigurlaug; Barker, Roger; Ben-Shlomo, Yoav; Berendse, Henk W.; Berg, Daniela; Bhatia, Kailash; de Bie, Rob M. A.; Biffi, Alessandro; Bloem, Bas; Bochdanovits, Zoltan; Bonin, Michael; Bras, Jose M.; Brockmann, Kathrin; Brooks, Janet; Burn, David J.; Charlesworth, Gavin; Chinnery, Patrick F.; Chong, Sean; Clarke, Carl E.; Cookson, Mark R.; Cooper, J. Mark; Corvol, Jean Christophe; Counsell, Carl; Damier, Philippe; Dartigues, Jean-François; Deloukas, Panos; Deuschl, Günther; Dexter, David T.; van Dijk, Karin D.; Dillman, Allissa; Durif, Frank; Dürr, Alexandra; Edkins, Sarah; Evans, Jonathan R.; Foltynie, Thomas; Dong, Jing; Gardner, Michelle; Gibbs, J. 
Raphael; Gray, Emma; Guerreiro, Rita; Harris, Clare; van Hilten, Jacobus J.; Hofman, Albert; Hollenbeck, Albert; Holton, Janice; Hu, Michele; Huang, Xuemei; Wurster, Isabel; Mätzler, Walter; Hudson, Gavin; Hunt, Sarah E.; Huttenlocher, Johanna; Illig, Thomas; Jónsson, Pálmi V.; Lambert, Jean-Charles; Langford, Cordelia; Lees, Andrew; Lichtner, Peter; Limousin, Patricia; Lopez, Grisel; Lorenz, Delia; McNeill, Alisdair; Moorby, Catriona; Moore, Matthew; Morris, Huw R.; Morrison, Karen E.; Mudanohwo, Ese; O'Sullivan, Sean S.; Pearson, Justin; Pétursson, Hjörvar; Pollak, Pierre; Post, Bart; Potter, Simon; Ravina, Bernard; Revesz, Tamas; Riess, Olaf; Rivadeneira, Fernando; Rizzu, Patrizia; Ryten, Mina; Sawcer, Stephen; Schapira, Anthony; Scheffer, Hans; Shaw, Karen; Shoulson, Ira; Sidransky, Ellen; Smith, Colin; Spencer, Chris C. A.; Stefánsson, Hreinn; Bettella, Francesco; Stockton, Joanna D.; Strange, Amy; Talbot, Kevin; Tanner, Carlie M.; Tashakkori-Ghanbaria, Avazeh; Tison, François; Trabzuni, Daniah; Traynor, Bryan J.; Uitterlinden, André G.; Velseboer, Daan; Vidailhet, Marie; Walker, Robert; van de Warrenburg, Bart; Wickremaratchi, Mirdhu; Williams, Nigel; Williams-Gray, Caroline H.; Winder-Rhodes, Sophie; Stefánsson, Kári; Hardy, John; Factor, S.; Higgins, D.; Evans, S.; Shill, H.; Stacy, M.; Danielson, J.; Marlor, L.; Williamson, K.; Jankovic, J.; Hunter, C.; Simon, D.; Ryan, P.; Scollins, L.; Saunders-Pullman, R.; Boyar, K.; Costan-Toth, C.; Ohmann, E.; Sudarsky, L.; Joubert, C.; Friedman, J.; Chou, K.; Fernandez, H.; Lannon, M.; Galvez-Jimenez, N.; Podichetty, A.; Thompson, K.; Lewitt, P.; Deangelis, M.; O'Brien, C.; Seeberger, L.; Dingmann, C.; Judd, D.; Marder, K.; Fraser, J.; Harris, J.; Bertoni, J.; Peterson, C.; Rezak, M.; Medalle, G.; Chouinard, S.; Panisset, M.; Hall, J.; Poiffaut, H.; Calabrese, V.; Roberge, P.; Wojcieszek, J.; Belden, J.; Jennings, D.; Marek, K.; Mendick, S.; Reich, S.; Dunlop, B.; Jog, M.; Horn, C.; Uitti, R.; Turk, M.; Ajax, T.; 
Mannetter, J.; Sethi, K.; Carpenter, J.; Dill, B.; Hatch, L.; Ligon, K.; Narayan, S.; Blindauer, K.; Abou-Samra, K.; Petit, J.; Elmer, L.; Aiken, E.; Davis, K.; Schell, C.; Wilson, S.; Velickovic, M.; Koller, W.; Phipps, S.; Feigin, A.; Gordon, M.; Hamann, J.; Licari, E.; Marotta-Kollarus, M.; Shannon, B.; Winnick, R.; Simuni, T.; Videnovic, A.; Kaczmarek, A.; Williams, K.; Wolff, M.; Rao, J.; Cook, M.; Fernandez, M.; Kostyk, S.; Hubble, J.; Campbell, A.; Reider, C.; Seward, A.; Camicioli, R.; Carter, J.; Nutt, J.; Andrews, P.; Morehouse, S.; Stone, C.; Mendis, T.; Grimes, D.; Alcorn-Costa, C.; Gray, P.; Haas, K.; Vendette, J.; Sutton, J.; Hutchinson, B.; Young, J.; Rajput, A.; Klassen, L.; Shirley, T.; Manyam, B.; Simpson, P.; Whetteckey, J.; Wulbrecht, B.; Truong, D.; Pathak, M.; Frei, K.; Luong, N.; Tra, T.; Tran, A.; Vo, J.; Lang, A.; Kleiner- Fisman, G.; Nieves, A.; Johnston, L.; So, J.; Podskalny, G.; Giffin, L.; Atchison, P.; Allen, C.; Martin, W.; Wieler, M.; Suchowersky, O.; Furtado, S.; Klimek, M.; Hermanowicz, N.; Niswonger, S.; Shults, C.; Fontaine, D.; Aminoff, M.; Christine, C.; Diminno, M.; Hevezi, J.; Dalvi, A.; Kang, U.; Richman, J.; Uy, S.; Sahay, A.; Gartner, M.; Schwieterman, D.; Hall, D.; Leehey, M.; Culver, S.; Derian, T.; Demarcaida, T.; Thurlow, S.; Rodnitzky, R.; Dobson, J.; Lyons, K.; Pahwa, R.; Gales, T.; Thomas, S.; Shulman, L.; Weiner, W.; Dustin, K.; Singer, C.; Zelaya, L.; Tuite, P.; Hagen, V.; Rolandelli, S.; Schacherer, R.; Kosowicz, J.; Gordon, P.; Werner, J.; Serrano, C.; Roque, S.; Kurlan, R.; Berry, D.; Gardiner, I.; Hauser, R.; Sanchez-Ramos, J.; Zesiewicz, T.; Delgado, H.; Price, K.; Rodriguez, P.; Wolfrath, S.; Pfeiffer, R.; Davis, L.; Pfeiffer, B.; Dewey, R.; Hayward, B.; Johnson, A.; Meacham, M.; Estes, B.; Walker, F.; Hunt, V.; O'Neill, C.; Racette, B.; Swisher, L.; Dijamco, Cheri; Conley, Emily Drabant; Dorfman, Elizabeth; Tung, Joyce Y.; Hinds, David A.; Mountain, Joanna L.; Wojcicki, Anne; Lew, M.; Klein, C.; Golbe, L.; 
Growdon, J.; Wooten, G. F.; Watts, R.; Guttman, M.; Goldwurm, S.; Saint-Hilaire, M. H.; Baker, K.; Litvan, I.; Nicholson, G.; Nance, M.; Drasby, E.; Isaacson, S.; Burn, D.; Pramstaller, P.; Al-hinti, J.; Moller, A.; Sherman, S.; Roxburgh, R.; Slevin, J.; Perlmutter, J.; Mark, M. H.; Huggins, N.; Pezzoli, G.; Massood, T.; Itin, I.; Corbett, A.; Chinnery, P.; Ostergaard, K.; Snow, B.; Cambi, F.; Kay, D.; Samii, A.; Agarwal, P.; Roberts, J. W.; Higgins, D. S.; Molho, Eric; Rosen, Ami; Montimurro, J.; Martinez, E.; Griffith, A.; Kusel, V.; Yearout, D.; Zabetian, C.; Clark, L. N.; Liu, X.; Lee, J. H.; Taub, R. Cheng; Louis, E. D.; Cote, L. J.; Waters, C.; Ford, B.; Fahn, S.; Vance, Jeffery M.; Beecham, Gary W.; Martin, Eden R.; Nuytemans, Karen; Pericak-Vance, Margaret A.; Haines, Jonathan L.; DeStefano, Anita; Seshadri, Sudha; Choi, Seung Hoan; Frank, Samuel; Psaty, Bruce M.; Rice, Kenneth; Longstreth, W. T.; Ton, Thanh G. N.; Jain, Samay; van Duijn, Cornelia M.; Verlinden, Vincent J.; Koudstaal, Peter J.; Singleton, Andrew; Cookson, Mark; Hernandez, Dena; Nalls, Michael; Zonderman, Alan; Ferrucci, Luigi; Johnson, Robert; Longo, Dan; O'Brien, Richard; Traynor, Bryan; Troncoso, Juan; van der Brug, Marcel; Zielke, Ronald; Weale, Michael; Ramasamy, Adaikalavan; Dardiotis, Efthimios; Tsimourtou, Vana; Spanaki, Cleanthe; Plaitakis, Andreas; Bozi, Maria; Stefanis, Leonidas; Vassilatis, Dimitris; Koutsis, Georgios; Panas, Marios; Lunnon, Katie; Lupton, Michelle; Powell, John; Parkkinen, Laura; Ansorge, Olaf

    2014-01-01

    We conducted a meta-analysis of Parkinson's disease genome-wide association studies using a common set of 7,893,274 variants across 13,708 cases and 95,282 controls. Twenty-six loci were identified as having genome-wide significant association; these and 6 additional previously reported loci were

  13. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

    Science.gov (United States)

    2013-01-01

    Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions: We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509
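    The abstract above reports contig and scaffold N50 values without defining the statistic. N50 is the length L such that sequences of length L or longer together contain at least half of the total assembled bases. The short sketch below computes it on toy lengths; the numbers are illustrative and are not taken from the cacao assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and return the length at which
    the running total first reaches half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units), total 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

    Because the statistic depends only on the length distribution, a higher N50 indicates a more contiguous assembly but says nothing about its correctness.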

  14. Genome-wide association study identifies TF as a significant modifier gene of iron metabolism in HFE hemochromatosis.

    Science.gov (United States)

    de Tayrac, Marie; Roth, Marie-Paule; Jouanolle, Anne-Marie; Coppin, Hélène; le Gac, Gérald; Piperno, Alberto; Férec, Claude; Pelucchi, Sara; Scotet, Virginie; Bardou-Jacquet, Edouard; Ropert, Martine; Bouvet, Régis; Génin, Emmanuelle; Mosser, Jean; Deugnier, Yves

    2015-03-01

    Hereditary hemochromatosis (HH) is the most common form of genetic iron loading disease. It is mainly related to the homozygous C282Y/C282Y mutation in the HFE gene that is, however, a necessary but not a sufficient condition to develop clinical and even biochemical HH. This suggests that modifier genes are likely involved in the expressivity of the disease. Our aim was to identify such modifier genes. We performed a genome-wide association study (GWAS) using DNA collected from 474 unrelated C282Y homozygotes. Associations were examined for both quantitative iron burden indices and clinical outcomes with 534,213 single nucleotide polymorphisms (SNP) genotypes, with replication analyses in an independent sample of 748 C282Y homozygotes from four different European centres. One SNP met genome-wide statistical significance for association with transferrin concentration (rs3811647, GWAS p value of 7×10(-9) and replication p value of 5×10(-13)). This SNP, located within intron 11 of the TF gene, had a pleiotropic effect on serum iron (GWAS p value of 4.9×10(-6) and replication p value of 3.2×10(-6)). Both serum transferrin and iron levels were associated with serum ferritin levels, amount of iron removed and global clinical stage (pHFE-associated HH (HFE-HH) patients, identified the rs3811647 polymorphism in the TF gene as the only SNP significantly associated with iron metabolism through serum transferrin and iron levels. Because these two outcomes were clearly associated with the biochemical and clinical expression of the disease, an indirect link between the rs3811647 polymorphism and the phenotypic presentation of HFE-HH is likely. Copyright © 2014 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.

  15. Complete genome sequence of Clostridium estertheticum DSM 8809, a microbe identified in spoiled vacuum packed beef

    Directory of Open Access Journals (Sweden)

    Zhongyi Yu

    2016-11-01

    Full Text Available Blown pack spoilage (BPS) is a major issue for the beef industry. Aetiological agents of BPS involve members of a group of Clostridium species, including Clostridium estertheticum which has the ability to produce gas, mostly carbon dioxide, under anaerobic psychrotrophic growth conditions. This spore-forming bacterium grows slowly under laboratory conditions, and it can take up to 3 months to produce a workable culture. These characteristics have limited the study of this commercially challenging bacterium. Consequently information on this bacterium is limited and no effective controls are currently available to confidently detect and manage this production risk. In this study the complete genome of Clostridium estertheticum DSM 8809 was determined by SMRT® sequencing. The genome consists of a circular chromosome of 4.7 Mbp along with a single plasmid carrying a potential tellurite resistance gene tehB and a Tn3-like resolvase-encoding gene tnpR. The genome sequence was searched for central metabolic pathways that would support its biochemical profile and several enzymes contributing to this phenotype were identified. Several putative antibiotic/biocide/metal resistance-encoding genes and virulence factors were also identified in the genome, a feature that requires further research. The availability of the genome sequence will provide a basic blueprint from which to develop valuable biomarkers that could support and improve the detection and control of this bacterium along the beef production chain.

  16. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Full Text Available Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  17. Low levels of LTR retrotransposon deletion by ectopic recombination in the gigantic genomes of salamanders.

    Science.gov (United States)

    Frahry, Matthew Blake; Sun, Cheng; Chong, Rebecca A; Mueller, Rachel Lockridge

    2015-02-01

    Across the tree of life, species vary dramatically in nuclear genome size. Mutations that add or remove sequences from genomes (insertions or deletions, or indels) are the ultimate source of this variation. Differences in the tempo and mode of insertion and deletion across taxa have been proposed to contribute to evolutionary diversity in genome size. Among vertebrates, most of the largest genomes are found within the salamanders, an amphibian clade with genome sizes ranging from ~14 to ~120 Gb. Salamander genomes have been shown to experience slower rates of DNA loss through small (i.e., genomes. However, no studies have addressed DNA loss from salamander genomes resulting from larger deletions. Here, we focus on one type of large deletion: ectopic-recombination-mediated removal of LTR retrotransposon sequences. In ectopic recombination, double-strand breaks are repaired using a "wrong" (i.e., ectopic, or non-allelic) template sequence, typically another locus of similar sequence. When breaks occur within the LTR portions of LTR retrotransposons, ectopic-recombination-mediated repair can produce deletions that remove the internal transposon sequence and the equivalent of one of the two LTR sequences. These deletions leave a signature in the genome: a solo LTR sequence. We compared levels of solo LTRs in the genomes of four salamander species with levels present in five vertebrates with smaller genomes. Our results demonstrate that salamanders have low levels of solo LTRs, suggesting that ectopic-recombination-mediated deletion of LTR retrotransposons occurs more slowly than in other vertebrates with smaller genomes.

  18. TSSer: an automated method to identify transcription start sites in prokaryotic genomes from differential RNA sequencing data.

    Science.gov (United States)

    Jorjani, Hadi; Zavolan, Mihaela

    2014-04-01

    Accurate identification of transcription start sites (TSSs) is an essential step in the analysis of transcription regulatory networks. In higher eukaryotes, the capped analysis of gene expression technology enabled comprehensive annotation of TSSs in genomes such as those of mice and humans. In bacteria, an equivalent approach, termed differential RNA sequencing (dRNA-seq), has recently been proposed, but the application of this approach to a large number of genomes is hindered by the paucity of computational analysis methods. With few exceptions, when the method has been used, annotation of TSSs has been largely done manually. In this work, we present a computational method called 'TSSer' that enables the automatic inference of TSSs from dRNA-seq data. The method rests on a probabilistic framework for identifying both genomic positions that are preferentially enriched in the dRNA-seq data as well as preferentially captured relative to neighboring genomic regions. Evaluating our approach for TSS calling on several publicly available datasets, we find that TSSer achieves high consistency with the curated lists of annotated TSSs, but identifies many additional TSSs. Therefore, TSSer can accelerate genome-wide identification of TSSs in bacterial genomes and can aid in further characterization of bacterial transcription regulatory networks. TSSer is freely available under GPL license at http://www.clipz.unibas.ch/TSSer/index.php

  19. Genome size evolution at the speciation level: The cryptic species complex Brachionus plicatilis (Rotifera)

    Directory of Open Access Journals (Sweden)

    Riss Simone

    2011-04-01

    Full Text Available Abstract Background: Studies on genome size variation in animals are rarely done at lower taxonomic levels, e.g., slightly above/below the species level. Yet, such variation might provide important clues on the tempo and mode of genome size evolution. In this study we used the flow-cytometry method to study the evolution of genome size in the rotifer Brachionus plicatilis, a cryptic species complex consisting of at least 14 closely related species. Results: We found an unexpectedly high variation in this species complex, with genome sizes ranging approximately seven-fold (haploid '1C' genome sizes: 0.056-0.416 pg). Most of this variation (67%) could be ascribed to the major clades of the species complex, i.e. clades that are well separated according to most species definitions. However, we also found substantial variation (32%) at lower taxonomic levels, within and among genealogical species, and, interestingly, among species pairs that are not completely reproductively isolated. In one genealogical species, called B. 'Austria', we found greatly enlarged genome sizes that could roughly be approximated as multiples of the genomes of its closest relatives, which suggests that whole-genome duplications have occurred early during separation of this lineage. Overall, genome size was significantly correlated to egg size and body size, even though the latter became non-significant after controlling for phylogenetic non-independence. Conclusions: Our study suggests that substantial genome size variation can build up early during speciation, potentially even among isolated populations. An alternative, but not mutually exclusive, interpretation might be that reproductive isolation tends to build up unusually slowly in this species complex.

  20. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    Science.gov (United States)

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.
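
    The core-genome and pan-genome figures in the record above come down to set operations over per-genome gene families: the pan-genome is the union and the core genome is the intersection. The sketch below illustrates only that idea on made-up data. The strain names, gene family labels and the function name `pan_and_core` are hypothetical, and the study's actual clustering pipeline is not described in the abstract.

```python
from typing import Dict, Iterable, Set, Tuple


def pan_and_core(genomes: Dict[str, Iterable[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) for a mapping of genome name -> gene families.

    pan-genome  = families present in at least one genome (union)
    core genome = families present in every genome (intersection)
    """
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Toy data (hypothetical): two existing strains plus one newly added strain.
existing = {
    "strainA": ["geneA", "geneB", "geneC", "geneD"],
    "strainB": ["geneA", "geneB", "geneC", "geneE"],
}
with_new = dict(existing, newStrain=["geneA", "geneB", "geneF"])

pan_before, core_before = pan_and_core(existing)
pan_after, core_after = pan_and_core(with_new)

# Adding a divergent genome can only grow the pan-genome and shrink the core.
print(len(pan_before), len(core_before))  # 5 3
print(len(pan_after), len(core_after))    # 6 2
```

    The same direction of change is what the abstract reports: the pan-genome grows by 2,238 - 2,070 = 168 gene families while the core genome shrinks.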

  1. Genome-wide association study identifies 74 loci associated with educational attainment

    OpenAIRE

    Okbay, Aysu; Beauchamp, Jonathan; Fontana, M.A. (Mark Alan); Lee, James J.; Pers, Tune; Rietveld, C.A. (Cornelius A.); Turley, Patrick; Chen, G.-B. (Guo-Bo); Emilsson, Valur; Meddens, S.F.W. (S. Fleur W.); Oskarsson, S. (Sven); Pickrell, J.K. (Joseph K.); Thom, K. (Kevin); Timshel, P. (Pascal); Vlaming, Ronald

    2016-01-01

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 geno...

  2. Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.

    Directory of Open Access Journals (Sweden)

    Blanca E Himes

    Full Text Available Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at a gene- and SNP-level. In EVE, the smallest KCNIP4 association was at rs6833065 (P-value 2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.

  3. Evolution of a Pathogen: A Comparative Genomics Analysis Identifies a Genetic Pathway to Pathogenesis in Acinetobacter

    Science.gov (United States)

    Sahl, Jason W.; Gillece, John D.; Schupp, James M.; Waddell, Victor G.; Driebe, Elizabeth M.; Engelthaler, David M.; Keim, Paul

    2013-01-01

    Acinetobacter baumannii is an emergent and global nosocomial pathogen. In addition to A. baumannii, other Acinetobacter species, especially those in the Acinetobacter calcoaceticus-baumannii (Acb) complex, have also been associated with serious human infection. Although mechanisms of attachment, persistence on abiotic surfaces, and pathogenesis in A. baumannii have been identified, the genetic mechanisms that explain the emergence of A. baumannii as the most widespread and virulent Acinetobacter species are not fully understood. Recent whole genome sequencing has provided insight into the phylogenetic structure of the genus Acinetobacter. However, a global comparison of genomic features between Acinetobacter spp. has not been described in the literature. In this study, 136 Acinetobacter genomes, including 67 sequenced in this study, were compared to identify the acquisition and loss of genes in the expansion of the Acinetobacter genus. A whole genome phylogeny confirmed that A. baumannii is a monophyletic clade and that the larger Acb complex is also a well-supported monophyletic group. The whole genome phylogeny provided the framework for a global genomic comparison based on a blast score ratio (BSR) analysis. The BSR analysis demonstrated that specific genes have been both lost and acquired in the evolution of A. baumannii. In addition, several genes associated with A. baumannii pathogenesis were found to be more conserved in the Acb complex, and especially in A. baumannii, than in other Acinetobacter genomes; until recently, a global analysis of the distribution and conservation of virulence factors across the genus was not possible. The results demonstrate that the acquisition of specific virulence factors has likely contributed to the widespread persistence and virulence of A. baumannii. The identification of novel features associated with transcriptional regulation and acquired by clades in the Acb complex presents targets for better understanding the
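
    The blast score ratio (BSR) mentioned in the record above is commonly defined as the BLAST bit score of a reference gene against a query genome divided by the bit score of that gene aligned against itself, giving a value between 0 (absent) and 1 (identical). The sketch below assumes that common definition; the abstract does not spell out the study's exact procedure, and the gene names, bit scores and function names are hypothetical.

```python
from typing import Dict


def blast_score_ratio(hit_bits: float, self_bits: float) -> float:
    """BSR = bit score of the gene's best hit in a genome / its self-alignment bit score."""
    if self_bits <= 0:
        raise ValueError("self-alignment bit score must be positive")
    # Cap at 1.0 in case rounding makes a hit marginally exceed the self score.
    return min(hit_bits / self_bits, 1.0)


def bsr_profile(self_scores: Dict[str, float], hits: Dict[str, float]) -> Dict[str, float]:
    """BSR per gene for one genome; genes with no BLAST hit get 0.0."""
    return {
        gene: blast_score_ratio(hits.get(gene, 0.0), self_bits)
        for gene, self_bits in self_scores.items()
    }


# Hypothetical bit scores for three reference genes.
self_scores = {"geneX": 500.0, "geneY": 800.0, "geneZ": 300.0}
hits_in_genome = {"geneX": 450.0, "geneY": 800.0}  # geneZ has no hit

profile = bsr_profile(self_scores, hits_in_genome)
print(profile)  # {'geneX': 0.9, 'geneY': 1.0, 'geneZ': 0.0}
```

    Comparing such profiles across genomes shows which genes are conserved (BSR near 1), divergent (intermediate) or missing (near 0) in each clade.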

  4. A system-level model for the microbial regulatory genome.

    Science.gov (United States)

    Brooks, Aaron N; Reiss, David J; Allard, Antoine; Wu, Wei-Ju; Salvanha, Diego M; Plaisier, Christopher L; Chandrasekaran, Sriram; Pan, Min; Kaur, Amardeep; Baliga, Nitin S

    2014-07-15

    Microbes can tailor transcriptional responses to diverse environmental challenges despite having streamlined genomes and a limited number of regulators. Here, we present data-driven models that capture the dynamic interplay of the environment and genome-encoded regulatory programs of two types of prokaryotes: Escherichia coli (a bacterium) and Halobacterium salinarum (an archaeon). The models reveal how the genome-wide distributions of cis-acting gene regulatory elements and the conditional influences of transcription factors at each of those elements encode programs for eliciting a wide array of environment-specific responses. We demonstrate how these programs partition transcriptional regulation of genes within regulons and operons to re-organize gene-gene functional associations in each environment. The models capture fitness-relevant co-regulation by different transcriptional control mechanisms acting across the entire genome, to define a generalized, system-level organizing principle for prokaryotic gene regulatory networks that goes well beyond existing paradigms of gene regulation. An online resource (http://egrin2.systemsbiology.net) has been developed to facilitate multiscale exploration of conditional gene regulation in the two prokaryotes. © 2014 The Authors. Published under the terms of the CC BY 4.0 license.

  5. Genome-wide screening identifies a KCNIP1 copy number variant as a genetic predictor for atrial fibrillation

    Science.gov (United States)

    Tsai, Chia-Ti; Hsieh, Chia-Shan; Chang, Sheng-Nan; Chuang, Eric Y.; Ueng, Kwo-Chang; Tsai, Chin-Feng; Lin, Tsung-Hsien; Wu, Cho-Kai; Lee, Jen-Kuang; Lin, Lian-Yu; Wang, Yi-Chih; Yu, Chih-Chieh; Lai, Ling-Ping; Tseng, Chuen-Den; Hwang, Juey-Jen; Chiang, Fu-Tien; Lin, Jiunn-Lee

    2016-01-01

    Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Previous genome-wide association studies have identified single-nucleotide polymorphisms in several genomic regions to be associated with AF. In the human genome, copy number variations (CNVs) are known to contribute to disease susceptibility. Using a genome-wide multistage approach to identify AF susceptibility CNVs, we here show that a common 4,470-bp diallelic CNV in the first intron of potassium interacting channel 1 gene (KCNIP1) is strongly associated with AF in Taiwanese populations (odds ratio=2.27 for insertion allele; P=6.23 × 10^-24). KCNIP1 insertion is associated with higher KCNIP1 mRNA expression. KCNIP1-encoded protein potassium interacting channel 1 (KCHIP1) is physically associated with potassium Kv channels and modulates atrial transient outward current in cardiac myocytes. Overexpression of KCNIP1 results in inducible AF in zebrafish. In conclusion, a common CNV in the KCNIP1 gene is a genetic predictor of AF risk, possibly pointing to a functional pathway. PMID:26831368

  6. Genome-wide association scan in HIV-1-infected individuals identifying variants influencing disease course.

    Directory of Open Access Journals (Sweden)

    Daniëlle van Manen

    Full Text Available BACKGROUND: AIDS develops typically after 7-11 years of untreated HIV-1 infection, with extremes of very rapid disease progression (15 years. To reveal additional host genetic factors that may impact on the clinical course of HIV-1 infection, we designed a genome-wide association study (GWAS in 404 participants of the Amsterdam Cohort Studies on HIV-1 infection and AIDS. METHODS: The association of SNP genotypes with the clinical course of HIV-1 infection was tested in Cox regression survival analyses using AIDS-diagnosis and AIDS-related death as endpoints. RESULTS: Multiple, not previously identified SNPs, were identified to be strongly associated with disease progression after HIV-1 infection, albeit not genome-wide significant. However, three independent SNPs in the top ten associations between SNP genotypes and time between seroconversion and AIDS-diagnosis, and one from the top ten associations between SNP genotypes and time between seroconversion and AIDS-related death, had P-values smaller than 0.05 in the French Genomics of Resistance to Immunodeficiency Virus cohort on disease progression. CONCLUSIONS: Our study emphasizes that the use of different phenotypes in GWAS may be useful to unravel the full spectrum of host genetic factors that may be associated with the clinical course of HIV-1 infection.

  7. Genome-Wide Association Scan in HIV-1-Infected Individuals Identifying Variants Influencing Disease Course

    Science.gov (United States)

    van Manen, Daniëlle; Delaneau, Olivier; Kootstra, Neeltje A.; Boeser-Nunnink, Brigitte D.; Limou, Sophie; Bol, Sebastiaan M.; Burger, Judith A.; Zwinderman, Aeilko H.; Moerland, Perry D.; van 't Slot, Ruben; Zagury, Jean-François; van 't Wout, Angélique B.; Schuitemaker, Hanneke

    2011-01-01

    Background AIDS develops typically after 7–11 years of untreated HIV-1 infection, with extremes of very rapid disease progression (15 years). To reveal additional host genetic factors that may impact on the clinical course of HIV-1 infection, we designed a genome-wide association study (GWAS) in 404 participants of the Amsterdam Cohort Studies on HIV-1 infection and AIDS. Methods The association of SNP genotypes with the clinical course of HIV-1 infection was tested in Cox regression survival analyses using AIDS-diagnosis and AIDS-related death as endpoints. Results Multiple, not previously identified SNPs, were identified to be strongly associated with disease progression after HIV-1 infection, albeit not genome-wide significant. However, three independent SNPs in the top ten associations between SNP genotypes and time between seroconversion and AIDS-diagnosis, and one from the top ten associations between SNP genotypes and time between seroconversion and AIDS-related death, had P-values smaller than 0.05 in the French Genomics of Resistance to Immunodeficiency Virus cohort on disease progression. Conclusions Our study emphasizes that the use of different phenotypes in GWAS may be useful to unravel the full spectrum of host genetic factors that may be associated with the clinical course of HIV-1 infection. PMID:21811574

  8. A high-density genetic map for anchoring genome sequences and identifying QTLs associated with dwarf vine in pumpkin (Cucurbita maxima Duch.).

    Science.gov (United States)

    Zhang, Guoyu; Ren, Yi; Sun, Honghe; Guo, Shaogui; Zhang, Fan; Zhang, Jie; Zhang, Haiying; Jia, Zhangcai; Fei, Zhangjun; Xu, Yong; Li, Haizhen

    2015-12-24

    Pumpkin (Cucurbita maxima Duch.) is an economically important crop belonging to the Cucurbitaceae family. However, very few genomic and genetic resources are available for this species. As part of our ongoing efforts to sequence the pumpkin genome, a high-density genetic map is essential for anchoring and orienting the assembled scaffolds. In addition, a saturated genetic map can facilitate quantitative trait locus (QTL) mapping. A set of 186 F2 plants derived from the cross of pumpkin inbred lines Rimu and SQ026 were genotyped using the genotyping-by-sequencing approach. Using the SNPs we identified, a high-density genetic map containing 458 bin-markers was constructed, spanning a total genetic distance of 2,566.8 cM across the 20 linkage groups of C. maxima with a mean marker density of 5.60 cM. Using this map we were able to anchor 58 assembled scaffolds that covered about 194.5 Mb (71.7%) of the 271.4 Mb assembled pumpkin genome, of which 44 (183.0 Mb; 67.4%) were oriented. Furthermore, the high-density genetic map was used to identify genomic regions highly associated with an important agronomic trait, dwarf vine. Three QTLs on linkage groups (LGs) 1, 3 and 4, respectively, were recovered. One QTL, qCmB2, which was located in an interval of 0.42 Mb on LG 3, explained 21.4% of the phenotypic variation. Within qCmB2, one gene, Cma_004516, encoding the gibberellin (GA) 20-oxidase in the GA biosynthesis pathway, had a 1249-bp deletion in its promoter in bush type lines, and its expression level was significantly increased during the vine growth and higher in vine type lines than bush type lines, supporting Cma_004516 as a possible candidate gene controlling vine growth in pumpkin. A high-density pumpkin genetic map was constructed, which was used to successfully anchor and orient the assembled genome scaffolds, and to identify QTLs highly associated with pumpkin vine length. The map provided a valuable resource for gene cloning and marker assisted breeding in pumpkin and
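
    The summary statistics quoted in the record above can be re-derived from the raw figures it reports. The short check below uses only numbers stated in the abstract; the variable and function names are illustrative, and it assumes that "mean marker density" means total map length divided by the number of bin-markers.

```python
# Figures quoted in the abstract.
total_map_cm = 2566.8   # total genetic distance (cM)
bin_markers = 458       # number of bin-markers
assembly_mb = 271.4     # assembled genome size (Mb)
anchored_mb = 194.5     # sequence in anchored scaffolds (Mb)
oriented_mb = 183.0     # sequence in oriented scaffolds (Mb)


def percent(part: float, whole: float) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Assumed definition: average spacing = map length / marker count.
mean_spacing_cm = round(total_map_cm / bin_markers, 2)
anchored_pct = percent(anchored_mb, assembly_mb)
oriented_pct = percent(oriented_mb, assembly_mb)

print(mean_spacing_cm)  # 5.6  (reported as 5.60 cM)
print(anchored_pct)     # 71.7
print(oriented_pct)     # 67.4
```

    All three values agree with the abstract, so the reported density and coverage percentages are internally consistent.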

  9. Genome scan identifies a locus affecting gamma-globin expression in human beta-cluster YAC transgenic mice

    Energy Technology Data Exchange (ETDEWEB)

    Lin, S.D.; Cooper, P.; Fung, J.; Weier, H.U.G.; Rubin, E.M.

    2000-03-01

    Genetic factors affecting post-natal γ-globin expression, a major modifier of the severity of both β-thalassemia and sickle cell anemia, have been difficult to study. This is especially so in mice, an organism lacking a globin gene with an expression pattern equivalent to that of human γ-globin. To model the human β-cluster in mice, with the goal of screening for loci affecting human γ-globin expression in vivo, we introduced a human β-globin cluster YAC transgene into the genome of FVB mice. The β-cluster contained a Greek hereditary persistence of fetal hemoglobin (HPFH) γ allele resulting in postnatal expression of human γ-globin in transgenic mice. The level of human γ-globin for various F1 hybrids derived from crosses between the FVB transgenics and other inbred mouse strains was assessed. The γ-globin level of the C3HeB/FVB transgenic mice was noted to be significantly elevated. To map genes affecting postnatal γ-globin expression, a 20 centiMorgan (cM) genome scan of a C3HeB/FVB transgenics × FVB backcross was performed, followed by high-resolution marker analysis of promising loci. From this analysis we mapped a locus within a 2.2 cM interval of mouse chromosome 1 at a LOD score of 4.2 that contributes 10.4% of variation in γ-globin expression level. Combining transgenic modeling of the human β-globin gene cluster with quantitative trait analysis, we have identified and mapped a murine locus that impacts on human γ-globin expression in vivo.

  10. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    Science.gov (United States)

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depths. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified to be H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the mitochondrial NADH dehydrogenase subunit 4 gene ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  11. Resolution effects in reconstructing ancestral genomes.

    Science.gov (United States)

    Zheng, Chunfang; Jeong, Yuji; Turcotte, Madisyn Gabrielle; Sankoff, David

    2018-05-09

    The reconstruction of ancestral genomes must deal with the problem of resolution, necessarily involving a trade-off between trying to identify genomic details and being overwhelmed by noise at higher resolutions. We use the median reconstruction at the synteny block level, of the ancestral genome of the order Gentianales, based on coffee, Rhazya stricta and grape, to exemplify the effects of resolution (granularity) on comparative genomic analyses. We show how decreased resolution blurs the differences between evolving genomes, with respect to rate, mutational process and other characteristics.

  12. Genome-wide association study of PR interval in Hispanics/Latinos identifies novel locus at ID2.

    Science.gov (United States)

    Seyerle, Amanda A; Lin, Henry J; Gogarten, Stephanie M; Stilp, Adrienne; Méndez Giráldez, Raul; Soliman, Elsayed; Baldassari, Antoine; Graff, Mariaelisa; Heckbert, Susan; Kerr, Kathleen F; Kooperberg, Charles; Rodriguez, Carlos; Guo, Xiuqing; Yao, Jie; Sotoodehnia, Nona; Taylor, Kent D; Whitsel, Eric A; Rotter, Jerome I; Laurie, Cathy C; Avery, Christy L

    2017-11-10

    PR interval (PR) is a heritable electrocardiographic measure of atrial and atrioventricular nodal conduction. Changes in PR duration may be associated with atrial fibrillation, heart failure and all-cause mortality. Hispanic/Latino populations have high burdens of cardiovascular morbidity and mortality, are highly admixed and represent exceptional opportunities for novel locus identification. However, they remain chronically understudied. We present the first genome-wide association study (GWAS) of PR in 14 756 participants of Hispanic/Latino ancestry from three studies. Study-specific summary results of the association between 1000 Genomes Phase 1 imputed single-nucleotide polymorphisms (SNPs) and PR assumed an additive genetic model and were adjusted for global ancestry, study centre/region and clinical covariates. Results were combined using fixed-effects, inverse variance weighted meta-analysis. Sequential conditional analyses were used to identify independent signals. Replication of novel loci was performed in populations of Asian, African and European descent. ENCODE and RoadMap data were used to annotate results. We identified a novel genome-wide association (PPR at ID2 (rs6730558), which replicated in Asian and European populations (PPR loci to Hispanics/Latinos. Bioinformatics annotation provided evidence for regulatory function in cardiac tissue. Further, for six loci that generalised, the Hispanic/Latino index SNP was genome-wide significant and identical to (or in high linkage disequilibrium with) the previously identified GWAS lead SNP. Our results suggest that genetic determinants of PR are consistent across race/ethnicity, but extending studies to admixed populations can identify novel associations, underscoring the importance of conducting genetic studies in diverse populations. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise
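
    The record above combines study-specific results with a fixed-effects, inverse variance weighted meta-analysis. For a single SNP, each study's effect estimate is weighted by the reciprocal of its squared standard error. The sketch below shows that standard textbook calculation. The effect sizes and standard errors are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_fixed_effects(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Fixed-effects inverse-variance-weighted meta-analysis for one SNP.

    Returns (pooled_beta, pooled_se, z_score), where
      w_i         = 1 / se_i**2
      pooled_beta = sum(w_i * beta_i) / sum(w_i)
      pooled_se   = sqrt(1 / sum(w_i))
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Made-up per-study estimates for one SNP (e.g. ms of PR interval per allele).
beta, se, z = ivw_fixed_effects(betas=[0.10, 0.20], ses=[0.10, 0.10])
print(round(beta, 4), round(se, 4), round(z, 4))
```

    With equal standard errors the pooled estimate is the simple mean of the study estimates; more precise studies pull the pooled estimate toward their own value as their weights grow.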

  13. Genome Wide Association Mapping in Arabidopsis thaliana Identifies Novel Genes Involved in Linking Allyl Glucosinolate to Altered Biomass and Defense.

    Science.gov (United States)

    Francisco, Marta; Joseph, Bindu; Caligagan, Hart; Li, Baohua; Corwin, Jason A; Lin, Catherine; Kerwin, Rachel E; Burow, Meike; Kliebenstein, Daniel J

    2016-01-01

    A key limitation in modern biology is the ability to rapidly identify genes underlying newly identified complex phenotypes. Genome wide association studies (GWAS) have become an increasingly important approach for dissecting natural variation by associating phenotypes with genotypes at a genome wide level. Recent work is showing that the Arabidopsis thaliana defense metabolite, allyl glucosinolate (GSL), may provide direct feedback regulation, linking defense metabolism outputs to the growth and defense responses of the plant. However, there is still a need to identify genes that underlie this process. To start developing a deeper understanding of the mechanism(s) that modulate the ability of exogenous allyl GSL to alter growth and defense, we measured changes in plant biomass and defense metabolites in a collection of 96 natural A. thaliana accessions fed with 50 μM of allyl GSL. Exogenous allyl GSL was introduced exclusively to the roots and the compound was transported to the leaf, leading to a wide range of heritable effects upon plant biomass and endogenous GSL accumulation. Using natural variation we conducted GWAS to identify a number of new genes which potentially control allyl responses in various plant processes. This is one of the first instances in which this approach has been successfully utilized to begin dissecting a novel phenotype to the underlying molecular/polygenic basis.

  14. Genome wide association mapping in Arabidopsis thaliana identifies novel genes involved in linking allyl glucosinolate to altered biomass and defense

    Directory of Open Access Journals (Sweden)

    Marta Francisco

    2016-07-01

    Full Text Available A key limitation in modern biology is the ability to rapidly identify genes underlying newly identified complex phenotypes. Genome wide association studies (GWAS) have become an increasingly important approach for dissecting natural variation by associating phenotypes with genotypes at a genome wide level. Recent work is showing that the Arabidopsis thaliana defense metabolite, allyl glucosinolate (GSL), may provide direct feedback regulation, linking defense metabolism outputs to the growth and defense responses of the plant. However, there is still a need to identify genes that underlie this process. To start developing a deeper understanding of the mechanism(s) that modulate the ability of exogenous allyl GSL to alter growth and defense, we measured changes in plant biomass and defense metabolites in a collection of 96 natural A. thaliana accessions fed with 50 µM of allyl GSL. Exogenous allyl GSL was introduced exclusively to the roots and the compound was transported to the leaf, leading to a wide range of heritable effects upon plant biomass and endogenous GSL accumulation. Using natural variation we conducted GWAS to identify a number of new genes which potentially control allyl responses in various plant processes. This is one of the first instances in which this approach has been successfully utilized to begin dissecting a novel phenotype to the underlying molecular/polygenic basis.

  15. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses

    OpenAIRE

    Okbay, Aysu; Baselmans, B.M.L. (Bart M.L.); Neve, Jan-Emmanuel; Turley, Patrick; Nivard, Michel; Fontana, M.A. (Mark Alan); Meddens, S.F.W. (S. Fleur W.); Linnér, R.K. (Richard Karlsson); Rietveld, C.A. (Cornelius A); Derringer, J.; Gratten, Jacob; Lee, James J.; Liu, J.Z. (Jimmy Z); Vlaming, Ronald; SAhluwalia, T. (Tarunveer)

    2016-01-01

    Very few genetic variants have been associated with depression and neuroticism, likely because of limitations on sample size in previous studies. Subjective well-being, a phenotype that is genetically correlated with both of these traits, has not yet been studied with genome-wide data. We conducted genome-wide association studies of three phenotypes: subjective well-being (n = 298,420), depressive symptoms (n = 161,460), and neuroticism (n = 170,911). We identify 3 variants associ...

  16. Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes

    DEFF Research Database (Denmark)

    Imamura, Minako; Takahashi, Atsushi; Yamauchi, Toshimasa

    2016-01-01

    Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery...... and subsequent validation analyses (23,399 T2D cases and 31,722 controls) identify 7 new loci with genome-wide significance (P2, rs7107784 near MIR4686 and rs67839313 near INAFM2....... Of these, the association of 4 loci with T2D is replicated in multi-ethnic populations other than Japanese (up to 65,936 T2Ds and 158,030 controls, P

  17. Genome-wide meta-analysis identifies new susceptibility loci for migraine.

    Science.gov (United States)

    Anttila, Verneri; Winsvold, Bendik S; Gormley, Padhraig; Kurth, Tobias; Bettella, Francesco; McMahon, George; Kallela, Mikko; Malik, Rainer; de Vries, Boukje; Terwindt, Gisela; Medland, Sarah E; Todt, Unda; McArdle, Wendy L; Quaye, Lydia; Koiranen, Markku; Ikram, M Arfan; Lehtimäki, Terho; Stam, Anine H; Ligthart, Lannie; Wedenoja, Juho; Dunham, Ian; Neale, Benjamin M; Palta, Priit; Hamalainen, Eija; Schürks, Markus; Rose, Lynda M; Buring, Julie E; Ridker, Paul M; Steinberg, Stacy; Stefansson, Hreinn; Jakobsson, Finnbogi; Lawlor, Debbie A; Evans, David M; Ring, Susan M; Färkkilä, Markus; Artto, Ville; Kaunisto, Mari A; Freilinger, Tobias; Schoenen, Jean; Frants, Rune R; Pelzer, Nadine; Weller, Claudia M; Zielman, Ronald; Heath, Andrew C; Madden, Pamela A F; Montgomery, Grant W; Martin, Nicholas G; Borck, Guntram; Göbel, Hartmut; Heinze, Axel; Heinze-Kuhn, Katja; Williams, Frances M K; Hartikainen, Anna-Liisa; Pouta, Anneli; van den Ende, Joyce; Uitterlinden, Andre G; Hofman, Albert; Amin, Najaf; Hottenga, Jouke-Jan; Vink, Jacqueline M; Heikkilä, Kauko; Alexander, Michael; Muller-Myhsok, Bertram; Schreiber, Stefan; Meitinger, Thomas; Wichmann, Heinz Erich; Aromaa, Arpo; Eriksson, Johan G; Traynor, Bryan; Trabzuni, Daniah; Rossin, Elizabeth; Lage, Kasper; Jacobs, Suzanne B R; Gibbs, J Raphael; Birney, Ewan; Kaprio, Jaakko; Penninx, Brenda W; Boomsma, Dorret I; van Duijn, Cornelia; Raitakari, Olli; Jarvelin, Marjo-Riitta; Zwart, John-Anker; Cherkas, Lynn; Strachan, David P; Kubisch, Christian; Ferrari, Michel D; van den Maagdenberg, Arn M J M; Dichgans, Martin; Wessman, Maija; Smith, George Davey; Stefansson, Kari; Daly, Mark J; Nyholt, Dale R; Chasman, Daniel; Palotie, Aarno

    2013-08-01

    Migraine is the most common brain disorder, affecting approximately 14% of the adult population, but its molecular mechanisms are poorly understood. We report the results of a meta-analysis across 29 genome-wide association studies, including a total of 23,285 individuals with migraine (cases) and 95,425 population-matched controls. We identified 12 loci associated with migraine susceptibility (P < 5 × 10⁻⁸). Five loci are new: near AJAP1 at 1p36, near TSPAN2 at 1p13, within FHL5 at 6q16, within C7orf10 at 7p14 and near MMP16 at 8q21. Three of these loci were identified in disease subgroup analyses. Brain tissue expression quantitative trait locus analysis suggests potential functional candidate genes at four loci: APOA1BP, TBC1D7, FUT9, STAT6 and ATP5B.

  18. Genome-wide association study identifies genetic loci associated with iron deficiency.

    Directory of Open Access Journals (Sweden)

    Christine E McLaren

    2011-03-01

    Full Text Available The existence of multiple inherited disorders of iron metabolism in man, rodents and other vertebrates suggests genetic contributions to iron deficiency. To identify new genomic locations associated with iron deficiency, a genome-wide association study (GWAS) was performed using DNA collected from white men aged ≥25 y and women aged ≥50 y in the Hemochromatosis and Iron Overload Screening (HEIRS) Study with serum ferritin (SF) ≤12 µg/L (cases) and iron replete controls (SF >100 µg/L in men, SF >50 µg/L in women). Regression analysis was used to examine the association between case-control status (336 cases, 343 controls) and quantitative serum iron measures and 331,060 single nucleotide polymorphism (SNP) genotypes, with replication analyses performed in a sample of 71 cases and 161 controls from a population of white male and female veterans screened at a US Veterans Affairs (VA) medical center. Five SNPs identified in the GWAS met genome-wide statistical significance for association with at least one iron measure: rs2698530 on chr. 2p14; rs3811647 on chr. 3q22, a known SNP in the transferrin (TF) gene region; rs1800562 on chr. 6p22, the C282Y mutation in the HFE gene; rs7787204 on chr. 7p21; and rs987710 on chr. 22q11 (GWAS observed P < 1.51 × 10⁻⁷ for all). An association between total iron binding capacity and SNP rs3811647 in the TF gene (GWAS observed P = 7.0 × 10⁻⁹, corrected P = 0.012) was replicated within the VA samples (observed P = 0.012). Associations with the C282Y mutation in the HFE gene also were replicated. The joint analysis of the HEIRS and VA samples revealed strong associations between rs2698530 on chr. 2p14 and iron status outcomes. These results confirm a previously described TF polymorphism and implicate one potential new locus as a target for gene identification.
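The significance threshold quoted in this abstract can be reproduced with a short calculation. The sketch below assumes the threshold is a Bonferroni correction of a 0.05 family-wise error rate across the 331,060 genotyped SNPs; the abstract does not name the correction method, so that reading is an assumption.

```python
# Assumption: the genome-wide threshold is a Bonferroni correction
# (family-wise alpha divided by the number of SNPs tested).
alpha = 0.05          # assumed family-wise error rate, not stated in the abstract
n_snps = 331_060      # number of SNP genotypes, from the abstract

threshold = alpha / n_snps
print(f"{threshold:.2e}")  # prints 1.51e-07
```

Under that assumption the result agrees with the reported cutoff of 1.51 × 10⁻⁷.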

  19. A novel common large genomic deletion and two new missense mutations identified in the Romanian phenylketonuria population.

    Science.gov (United States)

    Gemperle-Britschgi, Corinne; Iorgulescu, Daniela; Mager, Monica Alina; Anton-Paduraru, Dana; Vulturar, Romana; Thöny, Beat

    2016-01-15

    The mutation spectrum for the phenylalanine hydroxylase (PAH) gene was investigated in a cohort of 84 hyperphenylalaninemia (HPA) patients from Romania identified through newborn screening or neurometabolic investigations. Differential diagnosis identified 81 patients with classic PAH deficiency while 3 had tetrahydropterin-cofactor deficiency and/or remained uncertain due to insufficient specimen. PAH-genetic analysis included a combination of Sanger sequencing of exons and exon–intron boundaries, MLPA and NGS with genomic DNA, and cDNA analysis from immortalized lymphoblasts. A diagnostic efficiency of 99.4% was achieved, as for one allele (out of a total of 162 alleles) no mutation could be identified. The most prevalent mutation was p.Arg408Trp which was found in ~ 38% of all PKU alleles. Three novel mutations were identified, including the two missense mutations p.Gln226Lys and p.Tyr268Cys that were both disease causing by prediction algorithms, and the large genomic deletion EX6del7831 (c.509 + 4140_706 + 510del7831) that resulted in skipping of exon 6 based on PAH-cDNA analysis in immortalized lymphocytes. The genomic deletion was present in a heterozygous state in 12 patients, i.e. in ~ 8% of all the analyzed PKU alleles, and might have originated from a Romanian founder.
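The percentages in this abstract follow from allele counts, which the sketch below works through. It assumes that the 162 alleles are the two alleles of each of the 81 patients with classic PAH deficiency and that each of the 12 heterozygous carriers contributes exactly one deletion allele; both readings are inferred from the abstract rather than stated in it.

```python
# Counts taken from the abstract; the allele bookkeeping is an assumption.
patients_classic_pah = 81
total_alleles = 2 * patients_classic_pah   # 162, matching the abstract
unresolved_alleles = 1                     # one allele with no mutation found

diagnostic_efficiency = (total_alleles - unresolved_alleles) / total_alleles
print(f"{diagnostic_efficiency:.1%}")      # prints 99.4%

deletion_alleles = 12                      # one per heterozygous carrier (assumed)
deletion_frequency = deletion_alleles / total_alleles
print(f"{deletion_frequency:.1%}")         # prints 7.4%
```

The second figure, 7.4%, is consistent with the abstract's rounded "~ 8%".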

  20. Genome-wide association scan meta-analysis identifies three loci influencing adiposity and fat distribution

    NARCIS (Netherlands)

    C.M. Lindgren (Cecilia); I.M. Heid (Iris); J.C. Randall (Joshua); C. Lamina (Claudia); V. Steinthorsdottir (Valgerdur); L. Qi (Lu); E.K. Speliotes (Elizabeth); G. Thorleifsson (Gudmar); C.J. Willer (Cristen); B.M. Herrera (Blanca); A.U. Jackson (Anne); N. Lim (Noha); P. Scheet (Paul); N. Soranzo (Nicole); N. Amin (Najaf); Y.S. Aulchenko (Yurii); J.C. Chambers (John); A. Drong (Alexander); J. Luan; H.N. Lyon (Helen); F. Rivadeneira Ramirez (Fernando); S. Sanna (Serena); N.J. Timpson (Nicholas); M.C. Zillikens (Carola); H.Z. Jing; P. Almgren (Peter); S. Bandinelli (Stefania); A.J. Bennett (Amanda); R.N. Bergman (Richard); L.L. Bonnycastle (Lori); S. Bumpstead (Suzannah); S.J. Chanock (Stephen); L. Cherkas (Lynn); P.S. Chines (Peter); L. Coin (Lachlan); C. Cooper (Charles); G. Crawford (Gabe); A. Doering (Angela); A. Dominiczak (Anna); A.S.F. Doney (Alex); S. Ebrahim (Shanil); P. Elliott (Paul); M.R. Erdos (Michael); K. Estrada Gil (Karol); L. Ferrucci (Luigi); G. Fischer (Guido); N.G. Forouhi (Nita); C. Gieger (Christian); H. Grallert (Harald); C.J. Groves (Christopher); S.M. Grundy (Scott); C. Guiducci (Candace); D. Hadley (David); A. Hamsten (Anders); A.S. Havulinna (Aki); A. Hofman (Albert); R. Holle (Rolf); J.W. Holloway (John); T. Illig (Thomas); B. Isomaa (Bo); L.C. Jacobs (Leonie); K. Jameson (Karen); P. Jousilahti (Pekka); F. Karpe (Fredrik); J. Kuusisto (Johanna); J. Laitinen (Jaana); G.M. Lathrop (Mark); D.A. Lawlor (Debbie); M. Mangino (Massimo); W.L. McArdle (Wendy); T. Meitinger (Thomas); M.A. Morken (Mario); A.P. Morris (Andrew); P. Munroe (Patricia); N. Narisu (Narisu); A. Nordström (Anna); B.A. Oostra (Ben); C.N.A. Palmer (Colin); F. Payne (Felicity); J. Peden (John); I. Prokopenko (Inga); F. Renström (Frida); A. Ruokonen (Aimo); V. Salomaa (Veikko); M.S. Sandhu (Manjinder); L.J. Scott (Laura); A. Scuteri (Angelo); K. Silander (Kaisa); K. Song (Kijoung); X. Yuan (Xin); H.M. Stringham (Heather); A.J. Swift (Amy); T. Tuomi (Tiinamaija); M. Uda (Manuela); P. Vollenweider (Peter); G. Waeber (Gérard); C. Wallace (Chris); G.B. Walters (Bragi); M.N. Weedon (Michael); J.C.M. Witteman (Jacqueline); C. Zhang (Cuilin); M. Caulfield (Mark); F.S. Collins (Francis); G.D. Smith; I.N.M. Day (Ian); P.W. Franks (Paul); A.T. Hattersley (Andrew); F.B. Hu (Frank); M.-R. Jarvelin (Marjo-Riitta); A. Kong (Augustine); J.S. Kooner (Jaspal); M. Laakso (Markku); E. Lakatta (Edward); V. Mooser (Vincent); L. Peltonen (Leena Johanna); N.J. Samani (Nilesh); T.D. Spector (Timothy); D.P. Strachan (David); T. Tanaka (Toshiko); J. Tuomilehto (Jaakko); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); N.J. Wareham (Nick); H. Watkins (Hugh); D. Waterworth (Dawn); M. Boehnke (Michael); P. Deloukas (Panagiotis); L. Groop (Leif); D.J. Hunter (David); U. Thorsteinsdottir (Unnur); D. Schlessinger (David); H.E. Wichmann (Erich); T.M. Frayling (Timothy); G.R. Abecasis (Gonçalo); J.N. Hirschhorn (Joel); R.J.F. Loos (Ruth); J-A. Zwart (John-Anker); K.L. Mohlke (Karen); I.E. Barroso (Inês); M.I. McCarthy (Mark)

    2009-01-01

    To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the

  1. Synonymous Codon Usage Analysis of Thirty Two Mycobacteriophage Genomes

    Directory of Open Access Journals (Sweden)

    Sameer Hassan

    2009-01-01

    Full Text Available Synonymous codon usage of protein-coding genes of thirty-two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage was identified as compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C-ending codons are strongly preferred over G-ending codons. A strong negative correlation between the effective number of codons (Nc) and GC3s content was also observed, showing that codon usage was affected by gene nucleotide composition. Translational selection was also identified as playing a role in shaping codon usage, operating at the level of translational accuracy. A high level of heterogeneity is seen among and between the genomes. Gene length was also identified as influencing codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper was identified as the most highly biased genome, with better translation efficiency that compares well with the host-specific tRNA genes.
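GC3s, one of the two quantities correlated in this abstract, can be illustrated with a short sketch. It assumes the common definition of GC3s as the fraction of G or C at the third position of synonymous codons, which leaves out Met (ATG), Trp (TGG) and the three standard stop codons; the abstract does not spell out the definition it used, and the function name `gc3s` and the toy sequence are hypothetical.

```python
# Codons skipped under the assumed GC3s definition: Met and Trp have no
# synonymous alternatives, and stop codons do not encode an amino acid.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """Fraction of G/C at the third position of synonymous codons."""
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    # Keep only unambiguous codons that have synonymous alternatives.
    usable = [c for c in codons if c not in EXCLUDED and set(c) <= set("ACGT")]
    if not usable:
        raise ValueError("no synonymous codons found")
    return sum(c[2] in "GC" for c in usable) / len(usable)


# Toy coding sequence: ATG GCC AAG TTT GGC TAA
print(gc3s("ATGGCCAAGTTTGGCTAA"))  # prints 0.75
```

In the toy sequence, ATG and TAA are skipped, and three of the four remaining codons end in G or C.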

  2. Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants that contribute to lipid levels and coronary artery disease

    DEFF Research Database (Denmark)

    Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J

    2017-01-01

    Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used a...

  3. A genome-wide association study of COPD identifies a susceptibility locus on chromosome 19q13

    DEFF Research Database (Denmark)

    Cho, Michael H; Castaldi, Peter J; Wan, Emily S

    2012-01-01

    The genetic risk factors for chronic obstructive pulmonary disease (COPD) are still largely unknown. To date, genome-wide association studies (GWASs) of limited size have identified several novel risk loci for COPD at CHRNA3/CHRNA5/IREB2, HHIP and FAM13A; additional loci may be identified through...

  4. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Science.gov (United States)

    De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.
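A conservation law in this sense is a non-negative integer vector l with lᵀS = 0, where S is the stoichiometric matrix. The sketch below finds such vectors by brute force on a four-metabolite toy network. It is not the efficient strategy the abstract describes: the network, the restriction to 0/1 coefficients and the function names are all assumptions made for illustration, and the exhaustive search grows exponentially with the number of metabolites.

```python
from itertools import product

# Hypothetical toy network (rows: metabolites, columns: reactions).
# R1: ATP + G -> ADP + G6P,  R2: G6P -> G,  R3: ADP -> ATP
METABOLITES = ["ATP", "ADP", "G", "G6P"]
S = [
    [-1, 0, 1],   # ATP
    [1, 0, -1],   # ADP
    [-1, 1, 0],   # G
    [1, -1, 0],   # G6P
]


def is_conserved(law, S):
    """True if the weighted metabolite sum is unchanged by every reaction."""
    n_reactions = len(S[0])
    return all(
        sum(law[i] * S[i][j] for i in range(len(S))) == 0
        for j in range(n_reactions)
    )


def minimal_binary_laws(S):
    """Enumerate 0/1 conservation laws and keep those with minimal support."""
    laws = [
        law for law in product((0, 1), repeat=len(S))
        if any(law) and is_conserved(law, S)
    ]
    supports = [{i for i, x in enumerate(law) if x} for law in laws]
    # Drop any law whose support strictly contains another law's support.
    return [
        law for law, s in zip(laws, supports)
        if not any(t < s for t in supports)
    ]


laws = minimal_binary_laws(S)
for law in laws:
    pool = " + ".join(m for m, x in zip(METABOLITES, law) if x)
    print(f"{pool} = constant")
```

For this toy network the search recovers two conserved pools, the adenylate pool (ATP + ADP) and the glucose pool (G + G6P).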

  5. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Andrea De Martino

    Full Text Available The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.

  6. The Human Genome Project and Eugenics: Identifying the Impact on Individuals with Mental Retardation.

    Science.gov (United States)

    Kuna, Jason

    2001-01-01

    This article explores the impact of the mapping work of the Human Genome Project on individuals with mental retardation and the negative effects of genetic testing. The potential to identify disabilities and the concept of eugenics are discussed, along with ethical issues surrounding potential genetic therapies. (Contains references.) (CR)

  7. Genome-wide association study identifies multiple susceptibility loci for diffuse large B cell lymphoma

    NARCIS (Netherlands)

    Cerhan, James R.; Berndt, Sonja I.; Vijai, Joseph; Ghesquières, Hervé; McKay, James; Wang, Sophia S.; Wang, Zhaoming; Yeager, Meredith; Conde, Lucia; De Bakker, Paul I W; Nieters, Alexandra; Cox, David; Burdett, Laurie; Monnereau, Alain; Flowers, Christopher R.; De Roos, Anneclaire J.; Brooks-Wilson, Angela R.; Lan, Qing; Severi, Gianluca; Melbye, Mads; Gu, Jian; Jackson, Rebecca D.; Kane, Eleanor; Teras, Lauren R.; Purdue, Mark P.; Vajdic, Claire M.; Spinelli, John J.; Giles, Graham G.; Albanes, Demetrius; Kelly, Rachel S.; Zucca, Mariagrazia; Bertrand, Kimberly A.; Zeleniuch-Jacquotte, Anne; Lawrence, Charles; Hutchinson, Amy; Zhi, Degui; Habermann, Thomas M.; Link, Brian K.; Novak, Anne J.; Dogan, Ahmet; Asmann, Yan W.; Liebow, Mark; Thompson, Carrie A.; Ansell, Stephen M.; Witzig, Thomas E.; Weiner, George J.; Veron, Amelie S.; Zelenika, Diana; Tilly, Hervé; Haioun, Corinne; Molina, Thierry Jo; Hjalgrim, Henrik; Glimelius, Bengt; Adami, Hans Olov; Bracci, Paige M.; Riby, Jacques; Smith, Martyn T.; Holly, Elizabeth A.; Cozen, Wendy; Hartge, Patricia; Morton, Lindsay M.; Severson, Richard K.; Tinker, Lesley F.; North, Kari E.; Becker, Nikolaus; Benavente, Yolanda; Boffetta, Paolo; Brennan, Paul; Foretova, Lenka; Maynadie, Marc; Staines, Anthony; Lightfoot, Tracy; Crouch, Simon; Smith, Alex; Roman, Eve; Diver, W. Ryan; Offit, Kenneth; Zelenetz, Andrew; Klein, Robert J.; Villano, Danylo J.; Zheng, Tongzhang; Zhang, Yawei; Holford, Theodore R.; Kricker, Anne; Turner, Jenny; Southey, Melissa C.; Clavel, Jacqueline; Virtamo, Jarmo; Weinstein, Stephanie; Riboli, Elio; Vineis, Paolo; Kaaks, Rudolph; Trichopoulos, Dimitrios; Vermeulen, Roel C H; Boeing, Heiner; Tjonneland, Anne; Angelucci, Emanuele; Di Lollo, Simonetta; Rais, Marco; Birmann, Brenda M.; Laden, Francine; Giovannucci, Edward; Kraft, Peter; Huang, Jinyan; Ma, Baoshan; Ye, Yuanqing; Chiu, Brian C H; Sampson, Joshua; Liang, Liming; Park, Ju Hyun; Chung, Charles C.; Weisenburger, Dennis D.; Chatterjee, Nilanjan; Fraumeni, Joseph F.; Slager, Susan L.; Wu, Xifeng; De Sanjose, Silvia; Smedby, Karin E.; Salles, Gilles; Skibola, Christine F.; Rothman, Nathaniel; Chanock, Stephen J.

    2014-01-01

    Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma subtype and is clinically aggressive. To identify genetic susceptibility loci for DLBCL, we conducted a meta-analysis of 3 new genome-wide association studies (GWAS) and 1 previous scan, totaling 3,857 cases and 7,666 controls of

  8. A Large-Scale Multi-ancestry Genome-wide Study Accounting for Smoking Behavior Identifies Multiple Significant Loci for Blood Pressure

    DEFF Research Database (Denmark)

    Sung, Yun J; Winkler, Thomas W; de Las Fuentes, Lisa

    2018-01-01

    Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genom...

  9. Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

    Science.gov (United States)

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

    2018-02-15

    Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.

  10. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Full Text Available Abstract. Background: The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results: The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion: Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.

  11. Genome-Wide Association Study Identifies Loci for Salt Tolerance during Germination in Autotetraploid Alfalfa (Medicago sativa L.) Using Genotyping-by-Sequencing

    Science.gov (United States)

    Yu, Long-Xi; Liu, Xinchun; Boge, William; Liu, Xiang-Ping

    2016-01-01

    Salinity is one of the major abiotic stresses limiting alfalfa (Medicago sativa L.) production in the arid and semi-arid regions of the US and other countries. In this study, we used a diverse panel of alfalfa accessions previously described by Zhang et al. (2015) to identify molecular markers associated with salt tolerance during germination using a genome-wide association study (GWAS) and genotyping-by-sequencing (GBS). Phenotyping was done by germinating alfalfa seeds under different levels of salt stress. Phenotypic data of adjusted germination rates and SNP markers generated by GBS were used for marker-trait association. Thirty-six markers were significantly associated with salt tolerance in at least one level of salt treatment. Alignment of sequence tags to the Medicago truncatula genome revealed genetic locations of the markers on all chromosomes except chromosome 3. The most significant markers were found on chromosomes 1, 2, and 4. A BLAST search using the flanking sequences of significant markers identified 14 putative candidate genes linked to 23 significant markers. Most of them were repeatedly identified in two or three salt treatments. Several loci identified in the present study had similar genetic locations to the reported QTL associated with salt tolerance in M. truncatula. A locus identified on chromosome 6 in this study overlapped with one identified for drought in our previous study. To our knowledge, this is the first report on mapping loci associated with salt tolerance during germination in autotetraploid alfalfa. Further investigation of these loci and their linked genes would provide insight into the molecular mechanisms by which salt and drought stresses affect alfalfa growth. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance to drought and salt stresses. PMID:27446182

  12. Seven prostate cancer susceptibility loci identified by a multi-stage genome-wide association study

    DEFF Research Database (Denmark)

    Kote-Jarai, Zsofia; Olama, Ali Amin Al; Giles, Graham G

    2011-01-01

    Prostate cancer (PrCa) is the most frequently diagnosed male cancer in developed countries. We conducted a multi-stage genome-wide association study for PrCa and previously reported the results of the first two stages, which identified 16 PrCa susceptibility loci. We report here the results of st...

  13. Meta-analysis of genome-wide studies identifies WNT16 and ESR1 SNPs associated with bone mineral density in premenopausal women.

    Science.gov (United States)

    Koller, Daniel L; Zheng, Hou-Feng; Karasik, David; Yerges-Armstrong, Laura; Liu, Ching-Ti; McGuigan, Fiona; Kemp, John P; Giroux, Sylvie; Lai, Dongbing; Edenberg, Howard J; Peacock, Munro; Czerwinski, Stefan A; Choh, Audrey C; McMahon, George; St Pourcain, Beate; Timpson, Nicholas J; Lawlor, Debbie A; Evans, David M; Towne, Bradford; Blangero, John; Carless, Melanie A; Kammerer, Candace; Goltzman, David; Kovacs, Christopher S; Prior, Jerilynn C; Spector, Tim D; Rousseau, Francois; Tobias, Jon H; Akesson, Kristina; Econs, Michael J; Mitchell, Braxton D; Richards, J Brent; Kiel, Douglas P; Foroud, Tatiana

    2013-03-01

    Previous genome-wide association studies (GWAS) have identified common variants in genes associated with variation in bone mineral density (BMD), although most have been carried out in combined samples of older women and men. Meta-analyses of these results have identified numerous single-nucleotide polymorphisms (SNPs) of modest effect at genome-wide significance levels in genes involved in both bone formation and resorption, as well as other pathways. We performed a meta-analysis restricted to premenopausal white women from four cohorts (n = 4061 women, aged 20 to 45 years) to identify genes influencing peak bone mass at the lumbar spine and femoral neck. After imputation, age- and weight-adjusted bone-mineral density (BMD) values were tested for association with each SNP. Association of an SNP in the WNT16 gene (rs3801387; p = 1.7 × 10⁻⁹) and multiple SNPs in the ESR1/C6orf97 region (rs4870044; p = 1.3 × 10⁻⁸) achieved genome-wide significance levels for lumbar spine BMD. These SNPs, along with others demonstrating suggestive evidence of association, were then tested for association in seven replication cohorts that included premenopausal women of European, Hispanic-American, and African-American descent (combined n = 5597 for femoral neck; n = 4744 for lumbar spine). When the data from the discovery and replication cohorts were analyzed jointly, the evidence was more significant (WNT16 joint p = 1.3 × 10⁻¹¹; ESR1/C6orf97 joint p = 1.4 × 10⁻¹⁰). Multiple independent association signals were observed with spine BMD at the ESR1 region after conditioning on the primary signal. Analyses of femoral neck BMD also supported association with SNPs in WNT16 and ESR1/C6orf97 (p women. These data support the hypothesis that variants in these genes of known skeletal function also affect BMD during the premenopausal period. Copyright © 2013 American Society for Bone and Mineral Research.

  14. A genome-wide association analysis of a broad psychosis phenotype identifies three loci for further investigation

    OpenAIRE

    Bramon, Elvira; Pirinen, Matti; Strange, Amy; Lin, Kuang; Freeman, Colin; Bellenguez, Céline; Su, Zhan; Band, Gavin; Pearson, Richard; Vukcevic, Damjan; Langford, Cordelia; Deloukas, Panos; Hunt, Sarah; Gray, Emma; Dronov, Serge

    2014-01-01

    Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories.

  15. Genome-wide gene expression dataset used to identify potential therapeutic targets in androgenetic alopecia

    Directory of Open Access Journals (Sweden)

    R. Dey-Rao

    2017-08-01

    Full Text Available The microarray dataset attached to this report is related to the research article with the title: “A genomic approach to susceptibility and pathogenesis leads to identifying potential novel therapeutic targets in androgenetic alopecia” (Dey-Rao and Sinha, 2017) [1]. Male-pattern hair loss that is induced by androgens (testosterone) in genetically predisposed individuals is known as androgenetic alopecia (AGA). The raw dataset is being made publicly available to enable critical and/or extended analyses. Our related research paper utilizes the attached raw dataset for genome-wide gene-expression associated investigations. Combined with several in silico bioinformatics-based analyses, we were able to delineate five strategic molecular elements as potential novel targets towards future AGA-therapy.

  16. Genome-wide meta-analysis identifies novel determinants of circulating serum progranulin.

    Science.gov (United States)

    Tönjes, Anke; Scholz, Markus; Krüger, Jacqueline; Krause, Kerstin; Schleinitz, Dorit; Kirsten, Holger; Gebhardt, Claudia; Marzi, Carola; Grallert, Harald; Ladenvall, Claes; Heyne, Henrike; Laurila, Esa; Kriebel, Jennifer; Meisinger, Christa; Rathmann, Wolfgang; Gieger, Christian; Groop, Leif; Prokopenko, Inga; Isomaa, Bo; Beutner, Frank; Kratzsch, Jürgen; Fischer-Rosinsky, Antje; Pfeiffer, Andreas; Krohn, Knut; Spranger, Joachim; Thiery, Joachim; Blüher, Matthias; Stumvoll, Michael; Kovacs, Peter

    2018-02-01

    Progranulin is a secreted protein with important functions in processes including immune and inflammatory response, metabolism and embryonic development. The present study aimed at identification of genetic factors determining progranulin concentrations. We conducted a genome-wide association meta-analysis for serum progranulin in three independent cohorts from Europe: Sorbs (N = 848) and KORA (N = 1628) from Germany and PPP-Botnia (N = 335) from Finland (total N = 2811). Single nucleotide polymorphisms (SNPs) associated with progranulin levels were replicated in two additional German cohorts: LIFE-Heart Study (Leipzig; N = 967) and Metabolic Syndrome Berlin Potsdam (Berlin cohort; N = 833). We measured mRNA expression of genes in peripheral blood mononuclear cells (PBMC) by micro-arrays and performed mRNA expression quantitative trait and expression-progranulin association studies to functionally substantiate identified loci. Finally, we conducted siRNA silencing experiments in vitro to validate potential candidate genes within the associated loci. Heritability of circulating progranulin levels was estimated at 31.8% and 26.1% in the Sorbs and LIFE-Heart cohort, respectively. SNPs at three loci reached study-wide significance (rs660240 in CELSR2-PSRC1-MYBPHL-SORT1, rs4747197 in CDH23-PSAP and rs5848 in GRN) explaining 19.4%/15.0% of the variance and 61%/57% of total heritability in the Sorbs/LIFE-Heart Study. The strongest evidence for association was at rs660240 (P = 5.75 × 10⁻⁵⁰), which was also associated with mRNA expression of PSRC1 in PBMC (P = 1.51 × 10⁻²¹). Psrc1 knockdown in murine preadipocytes led to a consecutive 30% reduction in progranulin secretion. In conclusion, the present meta-GWAS combined with mRNA expression identified three loci associated with progranulin and supports the role of PSRC1 in the regulation of progranulin secretion. © The Author(s) 2017. Published by Oxford University Press. All rights
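The share of heritability attributed to the three loci can be recovered from the other figures in this abstract. The sketch below assumes that the share is simply the variance explained by the SNPs divided by the heritability estimate for the same cohort; the abstract does not state how its 61%/57% figures were derived.

```python
# (variance explained by the three loci, heritability estimate), in percent,
# both taken from the abstract.
cohorts = {
    "Sorbs": (19.4, 31.8),
    "LIFE-Heart": (15.0, 26.1),
}

# Assumed definition: share of heritability = variance explained / heritability.
share = {name: explained / h2 for name, (explained, h2) in cohorts.items()}

for name, value in share.items():
    print(f"{name}: {value:.0%}")
# prints:
# Sorbs: 61%
# LIFE-Heart: 57%
```

Both ratios match the rounded percentages reported for the two cohorts.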

  17. Genome-wide meta-analyses identify multiple loci associated with smoking behavior.

    LENUS (Irish Health Repository)

    2010-05-01

    Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology (ENGAGE) and Oxford-GlaxoSmithKline (Ox-GSK) consortia to follow up the 15 most significant regions (n > 140,000). We identified three loci associated with number of cigarettes smoked per day. The strongest association was a synonymous 15q25 SNP in the nicotinic receptor gene CHRNA3 (rs1051730[A], beta = 1.03, standard error (s.e.) = 0.053, P = 2.8 x 10(-73)). Two 10q25 SNPs (rs1329650[G], beta = 0.367, s.e. = 0.059, P = 5.7 x 10(-10); and rs1028936[A], beta = 0.446, s.e. = 0.074, P = 1.3 x 10(-9)) and one 19q13 SNP in EGLN2 (rs3733829[G], beta = 0.333, s.e. = 0.058, P = 1.0 x 10(-8)) also exceeded genome-wide significance for cigarettes per day. For smoking initiation, eight SNPs exceeded genome-wide significance, with the strongest association at a nonsynonymous SNP in BDNF on chromosome 11 (rs6265[C], odds ratio (OR) = 1.06, 95% confidence interval (CI) 1.04-1.08, P = 1.8 x 10(-8)). One SNP located near DBH on chromosome 9 (rs3025343[G], OR = 1.12, 95% CI 1.08-1.18, P = 3.6 x 10(-8)) was significantly associated with smoking cessation.

  18. Genome wide association identifies common variants at the SERPINA6/SERPINA1 locus influencing plasma cortisol and corticosteroid binding globulin.

    Directory of Open Access Journals (Sweden)

    Jennifer L Bolton

    2014-07-01

    Variation in plasma levels of cortisol, an essential hormone in the stress response, is associated in population-based studies with cardio-metabolic, inflammatory and neuro-cognitive traits and diseases. Heritability of plasma cortisol is estimated at 30-60% but no common genetic contribution has been identified. The CORtisol NETwork (CORNET) consortium undertook genome wide association meta-analysis for plasma cortisol in 12,597 Caucasian participants, replicated in 2,795 participants. The results indicate that <1% of variance in plasma cortisol is accounted for by genetic variation in a single region of chromosome 14. This locus spans SERPINA6, encoding corticosteroid binding globulin (CBG), the major cortisol-binding protein in plasma, and SERPINA1, encoding α1-antitrypsin (which inhibits cleavage of the reactive centre loop that releases cortisol from CBG). Three partially independent signals were identified within the region, represented by common SNPs; detailed biochemical investigation in a nested sub-cohort showed all these SNPs were associated with variation in total cortisol binding activity in plasma, but some variants influenced total CBG concentrations while the top hit (rs12589136) influenced the immunoreactivity of the reactive centre loop of CBG. Exome chip and 1000 Genomes imputation analysis of this locus in the CROATIA-Korcula cohort identified missense mutations in SERPINA6 and SERPINA1 that did not account for the effects of common variants. These findings reveal a novel common genetic source of variation in binding of cortisol by CBG, and reinforce the key role of CBG in determining plasma cortisol levels. In turn this genetic variation may contribute to cortisol-associated degenerative diseases.

  19. Genomic locus modulating corneal thickness in the mouse identifies POU6F2 as a potential risk of developing glaucoma.

    Directory of Open Access Journals (Sweden)

    Rebecca King

    2018-01-01

    Central corneal thickness (CCT) is one of the most heritable ocular traits and it is also a phenotypic risk factor for primary open angle glaucoma (POAG). The present study uses the BXD Recombinant Inbred (RI) strains to identify novel quantitative trait loci (QTLs) modulating CCT in the mouse with the potential of identifying a molecular link between CCT and risk of developing POAG. The BXD RI strain set was used to define mammalian genomic loci modulating CCT, with a total of 818 corneas measured from 61 BXD RI strains (between 60-100 days of age). The mice were anesthetized and the eyes were positioned in front of the lens of the Phoenix Micron IV Image-Guided OCT system or the Bioptigen OCT system. CCT data for each strain was averaged and used to identify QTLs modulating this phenotype using the bioinformatics tools on GeneNetwork (www.genenetwork.org). The candidate genes and genomic loci identified in the mouse were then directly compared with the summary data from a human POAG genome wide association study (NEIGHBORHOOD) to determine if any genomic elements modulating mouse CCT are also risk factors for POAG. This analysis revealed one significant QTL on Chr 13 and a suggestive QTL on Chr 7. The significant locus on Chr 13 (13 to 19 Mb) was examined further to define candidate genes modulating this eye phenotype. For the Chr 13 QTL in the mouse, only one gene in the region (Pou6f2) contained nonsynonymous SNPs. Of these five nonsynonymous SNPs in Pou6f2, two resulted in changes in the amino acid proline which could result in altered secondary structure affecting protein function. The 7 Mb region under the mouse Chr 13 peak distributes over 2 chromosomes in the human: Chr 1 and Chr 7. These genomic loci were examined in the NEIGHBORHOOD database to determine if they are potential risk factors for human glaucoma identified using meta-data from human GWAS. The top 50 hits all resided within one gene (POU6F2), with the highest significance level of p = 10(-6) for

  20. Polygenic analysis of genome-wide SNP data identifies common variants on allergic rhinitis

    DEFF Research Database (Denmark)

    Mohammadnejad, Afsaneh; Brasch-Andersen, Charlotte; Haagerup, Annette

    Background: Allergic Rhinitis (AR) is a complex disorder that affects many people around the world. There is a high genetic contribution to the development of AR, as twin and family studies have estimated heritability of more than 33%. Due to the complex nature of the disease, single SNP analysis has limited power in identifying the genetic variations for AR. We combined genome-wide association analysis (GWAS) with polygenic risk score (PRS) in exploring the genetic basis underlying the disease. Methods: We collected clinical data on 631 Danish subjects with AR cases consisting of 434 sibling pairs and unrelated individuals and control subjects of 197 unrelated individuals. SNP genotyping was done by Affymetrix Genome-Wide Human SNP Array 5.0. SNP imputation was performed using "IMPUTE2". Using an additive effect model, GWAS was conducted in the discovery sample, the genotypes...

  1. Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

    Science.gov (United States)

    Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

    2018-03-01

    Functional disruptions of susceptibility genes by large genomic structure variant (SV) deletions in germlines are known to be associated with cancer risk. However, few studies have been conducted to systematically search for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated in blood samples from 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early cancer onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of BRCA1 (with approximately 80 kb in two patients), and TP53 genes (with ∼1.6 kb in two patients), and of intronic regions (∼1 kb) of the PALB2 (one patient), PTEN (three patients) and RAD51C genes (one patient). We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes and the identification of such SV deletions may improve clinical testing.
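
    The "detected by at least three of these tools" rule in the abstract above can be illustrated with a small voting sketch. This is only an illustration under stated assumptions: the function name `consensus_genes`, the gene-level matching, and the per-tool call sets are all hypothetical and are not taken from the study, which would more plausibly compare breakpoint coordinates between callers rather than gene names.

```python
from collections import defaultdict


def consensus_genes(calls_by_tool, min_tools=3):
    """Return genes with a deletion call from at least `min_tools` callers.

    `calls_by_tool` maps a caller name to the set of genes in which that
    caller reported a deletion (a gene-level simplification).
    """
    support = defaultdict(set)
    for tool, genes in calls_by_tool.items():
        for gene in genes:
            support[gene].add(tool)
    return sorted(g for g, tools in support.items() if len(tools) >= min_tools)


# Hypothetical per-tool output, invented purely for illustration.
example_calls = {
    "Genome STRiP": {"BRCA1", "TP53"},
    "Delly": {"BRCA1", "TP53", "ATM"},
    "Manta": {"BRCA1", "PALB2"},
    "BreakDancer": {"TP53", "PALB2"},
    "Pindel": {"PALB2", "ATM"},
}

print(consensus_genes(example_calls))  # ['BRCA1', 'PALB2', 'TP53']
```

    Requiring agreement between several callers trades sensitivity for specificity, which is presumably why the authors still confirmed the retained deletions by qPCR.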

  2. Cytoplasmic male sterility-associated chimeric open reading frames identified by mitochondrial genome sequencing of four Cajanus genotypes.

    Science.gov (United States)

    Tuteja, Reetu; Saxena, Rachit K; Davila, Jaime; Shah, Trushar; Chen, Wenbin; Xiao, Yong-Li; Fan, Guangyi; Saxena, K B; Alverson, Andrew J; Spillane, Charles; Town, Christopher; Varshney, Rajeev K

    2013-10-01

    The hybrid pigeonpea (Cajanus cajan) breeding technology based on cytoplasmic male sterility (CMS) is currently unique among legumes and displays major potential for yield increase. CMS is defined as a condition in which a plant is unable to produce functional pollen grains. The novel chimeric open reading frames (ORFs) produced as a result of mitochondrial genome rearrangements are considered to be the main cause of CMS. To identify these CMS-related ORFs in pigeonpea, we sequenced the mitochondrial genomes of three C. cajan lines (the male-sterile line ICPA 2039, the maintainer line ICPB 2039, and the hybrid line ICPH 2433) and of the wild relative (Cajanus cajanifolius ICPW 29). A single, circular-mapping molecule of length 545.7 kb was assembled and annotated for the ICPA 2039 line. Sequence annotation predicted 51 genes, including 34 protein-coding and 17 RNA genes. Comparison of the mitochondrial genomes from different Cajanus genotypes identified 31 ORFs, which differ between lines within which CMS is present or absent. Among these chimeric ORFs, 13 were identified by comparison of the related male-sterile and maintainer lines. These ORFs display features that are known to trigger CMS in other plant species and represent the most promising candidates for CMS-related mitochondrial rearrangements in pigeonpea.

  3. Genomics in Public Health: Perspective from the Office of Public Health Genomics at the Centers for Disease Control and Prevention (CDC)

    Directory of Open Access Journals (Sweden)

    Ridgely Fisk Green

    2015-09-01

    The national effort to use genomic knowledge to save lives is gaining momentum, as illustrated by the inclusion of genomics in key public health initiatives, including Healthy People 2020, and the recent launch of the precision medicine initiative. The Office of Public Health Genomics (OPHG) at the Centers for Disease Control and Prevention (CDC) partners with state public health departments and others to advance the translation of genome-based discoveries into disease prevention and population health. To do this, OPHG has adopted an “identify, inform, and integrate” model: identify evidence-based genomic applications ready for implementation, inform stakeholders about these applications, and integrate these applications into public health at the local, state, and national level. This paper addresses current and future work at OPHG for integrating genomics into public health programs.

  4. Genomics in Public Health: Perspective from the Office of Public Health Genomics at the Centers for Disease Control and Prevention (CDC).

    Science.gov (United States)

    Green, Ridgely Fisk; Dotson, W David; Bowen, Scott; Kolor, Katherine; Khoury, Muin J

    2015-01-01

    The national effort to use genomic knowledge to save lives is gaining momentum, as illustrated by the inclusion of genomics in key public health initiatives, including Healthy People 2020, and the recent launch of the precision medicine initiative. The Office of Public Health Genomics (OPHG) at the Centers for Disease Control and Prevention (CDC) partners with state public health departments and others to advance the translation of genome-based discoveries into disease prevention and population health. To do this, OPHG has adopted an "identify, inform, and integrate" model: identify evidence-based genomic applications ready for implementation, inform stakeholders about these applications, and integrate these applications into public health at the local, state, and national level. This paper addresses current and future work at OPHG for integrating genomics into public health programs.

  5. RNAi-Based Functional Genomics Identifies New Virulence Determinants in Mucormycosis.

    Directory of Open Access Journals (Sweden)

    Trung Anh Trieu

    2017-01-01

    Mucorales are an emerging group of human pathogens that are responsible for the lethal disease mucormycosis. Unfortunately, functional studies on the genetic factors behind the virulence of these organisms are hampered by their limited genetic tractability, since they are resistant to classical genetic tools like transposable elements or gene mapping. Here, we describe an RNAi-based functional genomic platform that allows the identification of new virulence factors through a forward genetic approach described for the first time in Mucorales. This platform contains a whole-genome collection of Mucor circinelloides silenced transformants that presented a broad assortment of phenotypes related to the main physiological processes in fungi, including virulence, hyphae morphology, mycelial and yeast growth, carotenogenesis and asexual sporulation. Selection of transformants with reduced virulence allowed the identification of mcplD, which encodes a Phospholipase D, and mcmyo5, encoding a probably essential cargo transporter of the Myosin V family, as required for a fully virulent phenotype of M. circinelloides. Knock-out mutants for those genes showed reduced virulence in both Galleria mellonella and Mus musculus models, probably due to a delayed germination and polarized growth within macrophages. This study provides a robust approach to study virulence in Mucorales and as a proof of concept identified new virulence determinants in M. circinelloides that could represent promising targets for future antifungal therapies.

  6. Utilization of genomic signatures to identify phenotype-specific drugs.

    Directory of Open Access Journals (Sweden)

    Seiichi Mori

    2009-08-01

    Genetic and genomic studies highlight the substantial complexity and heterogeneity of human cancers and emphasize the general lack of therapeutics that can match this complexity. With the goal of expanding opportunities for drug discovery, we describe an approach that makes use of a phenotype-based screen combined with the use of multiple cancer cell lines. In particular, we have used the NCI-60 cancer cell line panel that includes drug sensitivity measures for over 40,000 compounds assayed on 59 independent cell lines. Targets are cancer-relevant phenotypes represented as gene expression signatures that are used to identify cells within the NCI-60 panel reflecting the signature phenotype and then connect to compounds that are selectively active against those cells. As a proof-of-concept, we show that this strategy effectively identifies compounds with selectivity to the RAS or PI3K pathways. We have then extended this strategy to identify compounds that have activity towards cells exhibiting the basal phenotype of breast cancer, a clinically-important breast cancer characterized as ER-, PR-, and Her2- that lacks viable therapeutic options. One of these compounds, Simvastatin, has previously been shown to inhibit breast cancer cell growth in vitro and importantly, has been associated with a reduction in ER-, PR- breast cancer in a clinical study. We suggest that this approach provides a novel strategy towards identification of therapeutic agents based on clinically relevant phenotypes that can augment the conventional strategies of target-based screens.

  7. Genome-wide association identifies multiple genomic regions associated with susceptibility to and control of ovine lentivirus.

    Directory of Open Access Journals (Sweden)

    Stephen N White

    BACKGROUND: Like human immunodeficiency virus (HIV), ovine lentivirus (OvLV) is macrophage-tropic and causes lifelong infection. OvLV infects one quarter of U.S. sheep and induces pneumonia and body condition wasting. There is no vaccine to prevent OvLV infection and no cost-effective treatment for infected animals. However, breed differences in prevalence and proviral concentration have indicated a genetic basis for susceptibility to OvLV. A recent study identified TMEM154 variants in OvLV susceptibility. The objective here was to identify additional loci associated with odds and/or control of OvLV infection. METHODOLOGY/PRINCIPAL FINDINGS: This genome-wide association study (GWAS) included 964 sheep from Rambouillet, Polypay, and Columbia breeds with serological status and proviral concentration phenotypes. Analytic models accounted for breed and age, as well as genotype. This approach identified TMEM154 (nominal P=9.2×10(-7); empirical P=0.13), provided 12 additional genomic regions associated with odds of infection, and provided 13 regions associated with control of infection (all nominal P<1 × 10(-5)). Rapid decline of linkage disequilibrium with distance suggested many regions included few genes each. Genes in regions associated with odds of infection included DPPA2/DPPA4 (empirical P=0.006), and SYTL3 (P=0.051). Genes in regions associated with control of infection included a zinc finger cluster (ZNF192, ZSCAN16, ZNF389, and ZNF165; P=0.001), C19orf42/TMEM38A (P=0.047), and DLGAP1 (P=0.092). CONCLUSIONS/SIGNIFICANCE: These associations provide targets for mutation discovery in sheep susceptibility to OvLV. Aside from TMEM154, these genes have not been associated previously with lentiviral infection in any species, to our knowledge. Further, data from other species suggest functional hypotheses for future testing of these genes in OvLV and other lentiviral infections. Specifically, SYTL3 binds and may regulate RAB27A, which is required for enveloped

  8. Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content.

    Science.gov (United States)

    Tran, Hue T M; Ramaraj, Thiruvarangan; Furtado, Agnelo; Lee, Leonard Slade; Henry, Robert J

    2018-03-07

    Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  9. Genomic analysis identifies masqueraders of full-term cerebral palsy.

    Science.gov (United States)

    Takezawa, Yusuke; Kikuchi, Atsuo; Haginoya, Kazuhiro; Niihori, Tetsuya; Numata-Uematsu, Yurika; Inui, Takehiko; Yamamura-Suzuki, Saeko; Miyabayashi, Takuya; Anzai, Mai; Suzuki-Muromoto, Sato; Okubo, Yukimune; Endo, Wakaba; Togashi, Noriko; Kobayashi, Yasuko; Onuma, Akira; Funayama, Ryo; Shirota, Matsuyuki; Nakayama, Keiko; Aoki, Yoko; Kure, Shigeo

    2018-05-01

    Cerebral palsy is a common, heterogeneous neurodevelopmental disorder that causes movement and postural disabilities. Recent studies have suggested genetic diseases can be misdiagnosed as cerebral palsy. We hypothesized that two simple criteria, that is, full-term births and nonspecific brain MRI findings, are keys to extracting masqueraders among cerebral palsy cases due to the following: (1) preterm infants are susceptible to multiple environmental factors and therefore demonstrate an increased risk of cerebral palsy and (2) brain MRI assessment is essential for excluding environmental causes and other particular disorders. A total of 107 patients, all full-term births, without specific findings on brain MRI were identified among 897 patients diagnosed with cerebral palsy who were followed at our center. DNA samples were available for 17 of the 107 cases for trio whole-exome sequencing and array comparative genomic hybridization. We prioritized variants in genes known to be relevant in neurodevelopmental diseases and evaluated their pathogenicity according to the American College of Medical Genetics guidelines. Pathogenic/likely pathogenic candidate variants were identified in 9 of 17 cases (52.9%) within eight genes: CTNNB1, CYP2U1, SPAST, GNAO1, CACNA1A, AMPD2, STXBP1, and SCN2A. Five identified variants had previously been reported. No pathogenic copy number variations were identified. The AMPD2 missense variant and the splice-site variants in CTNNB1 and AMPD2 were validated by in vitro functional experiments. The high rate of detecting causative genetic variants (52.9%) suggests that patients diagnosed with cerebral palsy in full-term births without specific MRI findings may include genetic diseases masquerading as cerebral palsy.

  10. Genomic Diversity of Lactobacillus salivarius

    Science.gov (United States)

    Raftis, Emma J.; Salvetti, Elisa; Torriani, Sandra; Felis, Giovanna E.; O'Toole, Paul W.

    2011-01-01

    Strains of Lactobacillus salivarius are increasingly employed as probiotic agents for humans or animals. Despite the diversity of environmental sources from which they have been isolated, the genomic diversity of L. salivarius has been poorly characterized, and the implications of this diversity for strain selection have not been examined. To tackle this, we applied comparative genomic hybridization (CGH) and multilocus sequence typing (MLST) to 33 strains derived from humans, animals, or food. The CGH, based on total genome content, including small plasmids, identified 18 major regions of genomic variation, or hot spots for variation. Three major divisions were thus identified, with only a subset of the human isolates constituting an ecologically discernible group. Omission of the small plasmids from the CGH or analysis by MLST provided broadly concordant fine divisions and separated human-derived and animal-derived strains more clearly. The two gene clusters for exopolysaccharide (EPS) biosynthesis corresponded to regions of significant genomic diversity. The CGH-based groupings of these regions did not correlate with levels of production of bound or released EPS. Furthermore, EPS production was significantly modulated by available carbohydrate. In addition to proving difficult to predict from the gene content, EPS production levels correlated inversely with production of biofilms, a trait considered desirable in probiotic commensals. L. salivarius displays a high level of genomic diversity, and while selection of L. salivarius strains for probiotic use can be informed by CGH or MLST, it also requires pragmatic experimental validation of desired phenotypic traits. PMID:21131523

  11. Novel immune-modulator identified by a rapid, functional screen of the parapoxvirus ovis (Orf virus) genome

    Directory of Open Access Journals (Sweden)

    McGuire Michael J

    2012-01-01

    Background: The success of new sequencing technologies and informatic methods for identifying genes has made establishing gene product function a critical rate-limiting step in progressing the molecular sciences. We present a method to functionally mine genomes for useful activities in vivo, using an unusual property of a member of the poxvirus family to demonstrate this screening approach. Results: The genome of Parapoxvirus ovis (Orf virus) was sequenced, annotated, and then used to PCR-amplify its open-reading-frames. Employing a cloning-independent protocol, a viral expression-library was rapidly built and arrayed into sub-library pools. These were directly delivered into mice as expressible cassettes and assayed for an immune-modulating activity associated with parapoxvirus infection. The product of the B2L gene, a homolog of vaccinia F13L, was identified as the factor eliciting immune cell accumulation at sites of skin inoculation. Administration of purified B2 protein also elicited immune cell accumulation activity, and additionally was found to serve as an adjuvant for antigen-specific responses. Co-delivery of the B2L gene with an influenza gene-vaccine significantly improved protection in mice. Furthermore, delivery of the B2L expression construct, without antigen, non-specifically reduced tumor growth in murine models of cancer. Conclusion: A streamlined, functional approach to genome-wide screening of a biological activity in vivo is presented. Its application to screening in mice for an immune activity elicited by the pathogen genome of Parapoxvirus ovis yielded a novel immunomodulator. In this inverted discovery method, it was possible to identify the adjuvant responsible for a function of interest prior to a mechanistic study of the adjuvant. The non-specific immune activity of this modulator, B2, is similar to that associated with administration of inactivated particles to a host or to a live viral infection. Administration

  12. The challenges of genome-wide interaction studies: lessons to learn from the analysis of HDL blood levels.

    Directory of Open Access Journals (Sweden)

    Elisabeth M van Leeuwen

    Genome-wide association studies (GWAS) have revealed 74 single nucleotide polymorphisms (SNPs) associated with high-density lipoprotein cholesterol (HDL) blood levels. This study is, to our knowledge, the first genome-wide interaction study (GWIS) to identify SNP×SNP interactions associated with HDL levels. We performed a GWIS in the Rotterdam Study (RS) cohort I (RS-I) using the GLIDE tool, which leverages the massively parallel computing power of Graphics Processing Units (GPUs) to perform linear regression on all genome-wide pairs of SNPs. By performing a meta-analysis together with Rotterdam Study cohorts II and III (RS-II and RS-III), we were able to filter 181 interaction terms with a p-value < 1 × 10(-8) that replicated in the two independent cohorts. We were not able to replicate any of these interaction terms in the AGES, ARIC, CHS, ERF, FHS and NFBC-66 cohorts (Ntotal = 30,011) when adjusting for multiple testing. Our GWIS resulted in the consistent finding of a possible interaction between rs774801 in ARMC8 (ENSG00000114098) and rs12442098 in SPATA8 (ENSG00000185594) being associated with HDL levels. However, the p-values do not reach the preset Bonferroni-corrected threshold. Our study suggests that even for highly genetically determined traits such as HDL the sample sizes needed to detect SNP×SNP interactions are large and the 2-step filtering approaches do not yield a solution. Here we present our analysis plan and our reservations concerning GWIS.
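
    To give a sense of why a Bonferroni correction is so demanding when every pair of SNPs is tested, the following sketch computes the number of pairwise tests and the resulting per-test threshold. The SNP count of 500,000 and the family-wise alpha of 0.05 are assumptions chosen for illustration; the abstract does not state the number of SNPs analysed, and `pairwise_bonferroni` is a hypothetical helper name.

```python
from math import comb


def pairwise_bonferroni(n_snps, alpha=0.05):
    """Return (number of SNP pairs, Bonferroni per-test threshold)."""
    n_tests = comb(n_snps, 2)  # each unordered pair of SNPs is one test
    return n_tests, alpha / n_tests


# Assumed SNP count, for illustration only.
n_tests, threshold = pairwise_bonferroni(500_000)
print(f"{n_tests:,} pairwise tests")         # 124,999,750,000 pairwise tests
print(f"per-test threshold: {threshold:.2e}")  # per-test threshold: 4.00e-13
```

    Under these assumptions the per-test threshold is several orders of magnitude stricter than the conventional single-SNP genome-wide level of 5 × 10(-8), which is consistent with the authors' point that very large samples are needed to detect SNP×SNP interactions.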

  13. Integrative analysis of functional genomic annotations and sequencing data to identify rare causal variants via hierarchical modeling

    Directory of Open Access Journals (Sweden)

    Marinela Capanu

    2015-05-01

    Identifying the small number of rare causal variants contributing to disease has been a major focus of investigation in recent years, but represents a formidable statistical challenge due to the rare frequencies with which these variants are observed. In this commentary we draw attention to a formal statistical framework, namely hierarchical modeling, to combine functional genomic annotations with sequencing data with the objective of enhancing our ability to identify rare causal variants. Using simulations we show that in all configurations studied, the hierarchical modeling approach has superior discriminatory ability compared to a recently proposed aggregate measure of deleteriousness, the Combined Annotation-Dependent Depletion (CADD) score, supporting our premise that aggregate functional genomic measures can more accurately identify causal variants when used in conjunction with sequencing data through a hierarchical modeling approach.

  14. A Genome-wide Association Analysis of a Broad Psychosis Phenotype Identifies Three Loci for Further Investigation

    NARCIS (Netherlands)

    Bramon, Elvira; Pirinen, Matti; Strange, Amy; Lin, Kuang; Freeman, Colin; Bellenguez, Celine; Su, Zhan; Band, Gavin; Pearson, Richard; Vukcevic, Damjan; Langford, Cordelia; Deloukas, Panos; Hunt, Sarah; Gray, Emma; Dronov, Serge; Potter, Simon C.; Tashakkori-Ghanbaria, Avazeh; Edkins, Sarah; Bumpstead, Suzannah J.; Arranz, Maria J.; Bakker, Steven; Bender, Stephan; Bruggeman, Richard; Cahn, Wiepke; Chandler, David; Collier, David A.; Crespo-Facorro, Benedicto; Dazzan, Paola; de Haan, Lieuwe; di Forti, Marta; Dragovic, Milan; Giegling, Ina; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, Rene S.; Kalaydjieva, Luba; Kravariti, Eugenia; Lawrie, Stephen; Lins-Zen, Don H.; Mata, Ignacio; McDonald, Colm; McIntosh, Andrew; Myin-Germeys, Inez; Ophoff, Roel A.; Pariante, Carmine M.; Paunio, Tiina; Picchioni, Marco; Ripke, Stephan; Wiersma, Durk

    2014-01-01

    Background: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. Methods: 1239 cases with schizophrenia, schizoaffective

  15. Genome-wide association analysis identifies variants associated with nonalcoholic fatty liver disease that have distinct effects on metabolic traits

    DEFF Research Database (Denmark)

    Speliotes, Elizabeth K; Yerges-Armstrong, Laura M; Wu, Jun

    2011-01-01

    Nonalcoholic fatty liver disease (NAFLD) clusters in families, but the only known common genetic variants influencing risk are near PNPLA3. We sought to identify additional genetic variants influencing NAFLD using genome-wide association (GWA) analysis of computed tomography (CT) measured hepatic steatosis, a non-invasive measure of NAFLD, in large population based samples. Using variance components methods, we show that CT hepatic steatosis is heritable (~26%-27%) in family-based Amish, Family Heart, and Framingham Heart Studies (n = 880 to 3,070). By carrying out a fixed-effects meta-analysis of genome-wide association (GWA) results between CT hepatic steatosis and ~2.4 million imputed or genotyped SNPs in 7,176 individuals from the Old Order Amish, Age, Gene/Environment Susceptibility-Reykjavik study (AGES), Family Heart, and Framingham Heart Studies, we identify variants associated at genome...
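
    The fixed-effects meta-analysis mentioned in this abstract is commonly carried out with inverse-variance weighting of per-cohort effect estimates. The sketch below shows that standard calculation for a single SNP; it is not the authors' pipeline, the function name `fixed_effects_meta` is hypothetical, and the per-cohort betas and standard errors are invented for illustration.

```python
from math import erfc, sqrt


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis for one SNP.

    Returns the pooled effect, its standard error, and a two-sided
    p-value from the normal approximation.
    """
    weights = [1.0 / se**2 for se in ses]  # weight = 1 / variance
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled_beta, pooled_se, p_value


# Invented per-cohort estimates for a single SNP (illustration only).
beta, se, p = fixed_effects_meta(
    betas=[0.12, 0.09, 0.15, 0.10],
    ses=[0.04, 0.05, 0.06, 0.03],
)
print(f"pooled beta = {beta:.3f}, s.e. = {se:.3f}, P = {p:.2e}")
```

    Cohorts with smaller standard errors receive larger weights, so the pooled estimate is dominated by the most precisely measured studies; a fixed-effects model also assumes the true effect is the same in every cohort.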

  16. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer

    NARCIS (Netherlands)

    K. Michailidou (Kyriaki); J. Beesley (Jonathan); S. Lindstrom (Stephen); S. Canisius (Sander); J. Dennis (Joe); M. Lush (Michael); M. Maranian (Melanie); M.K. Bolla (Manjeet); Q. Wang (Qing); M. Shah (Mitul); B. Perkins (Barbara); K. Czene (Kamila); M. Eriksson (Mikael); H. Darabi (Hatef); J.S. Brand (Judith S.); S.E. Bojesen (Stig); B.G. Nordestgaard (Børge); H. Flyger (Henrik); S.F. Nielsen (Sune); N. Rahman (Nazneen); C. Turnbull (Clare); O. Fletcher (Olivia); J. Peto (Julian); L.J. Gibson (Lorna); I. dos Santos Silva (Isabel); J. Chang-Claude (Jenny); D. Flesch-Janys (Dieter); A. Rudolph (Anja); U. Eilber (Ursula); T.W. Behrens (Timothy); H. Nevanlinna (Heli); T.A. Muranen (Taru); K. Aittomäki (Kristiina); C. Blomqvist (Carl); S. Khan (Sofia); K. Aaltonen (Kirsimari); H. Ahsan (Habibul); M.G. Kibriya (Muhammad); A.S. Whittemore (Alice S.); E.M. John (Esther M.); K.E. Malone (Kathleen E.); M.D. Gammon (Marilie); R.M. Santella (Regina M.); G. Ursin (Giske); E. Makalic (Enes); D.F. Schmidt (Daniel); G. Casey (Graham); D.J. Hunter (David J.); S.M. Gapstur (Susan M.); M.M. Gaudet (Mia); W.R. Diver (Ryan); C.A. Haiman (Christopher A.); F.R. Schumacher (Fredrick); B.E. Henderson (Brian); L. Le Marchand (Loic); C.D. Berg (Christine); S.J. Chanock (Stephen); J.D. Figueroa (Jonine); R.N. Hoover (Robert N.); D. Lambrechts (Diether); P. Neven (Patrick); H. Wildiers (Hans); E. van Limbergen (Erik); M.K. Schmidt (Marjanka); A. Broeks (Annegien); S. Verhoef; S. Cornelissen (Sten); F.J. Couch (Fergus); J.E. Olson (Janet); B. Hallberg (Boubou); C. Vachon (Celine); Q. Waisfisz (Quinten); E.J. Meijers-Heijboer (Hanne); M.A. Adank (Muriel); R.B. van der Luijt (Rob); J. Li (Jingmei); J. Liu (Jianjun); M.K. Humphreys (Manjeet); D. Kang (Daehee); J.-Y. Choi (Ji-Yeob); S.K. Park (Sue K.); K.Y. Yoo; K. Matsuo (Keitaro); H. Ito (Hidemi); H. Iwata (Hiroji); K. Tajima (Kazuo); P. Guénel (Pascal); T. Truong (Thérèse); C. Mulot (Claire); M. Sanchez (Marie); B. Burwinkel (Barbara); F. 
Marme (Federick); H. Surowy (Harald); C. Sohn (Christof); A.H. Wu (Anna H); C.-C. Tseng (Chiu-chen); D. Van Den Berg (David); D.O. Stram (Daniel O.); A. González-Neira (Anna); J. Benítez (Javier); M.P. Zamora (Pilar); J.I.A. Perez (Jose Ignacio Arias); X.-O. Shu (Xiao-Ou); W. Lu (Wei); Y. Gao; H. Cai (Hui); A. Cox (Angela); S.S. Cross (Simon); M.W.R. Reed (Malcolm); I.L. Andrulis (Irene); J.A. Knight (Julia); G. Glendon (Gord); A.-M. Mulligan (Anna-Marie); E.J. Sawyer (Elinor); I.P. Tomlinson (Ian); M. Kerin (Michael); N. Miller (Nicola); A. Lindblom (Annika); S. Margolin (Sara); S.H. Teo (Soo Hwang); C.H. Yip (Cheng Har); N.A.M. Taib (Nur Aishah Mohd); G.-H. Tan (Gie-Hooi); M.J. Hooning (Maartje); A. Hollestelle (Antoinette); J.W.M. Martens (John); J.M. Collée (Margriet); W.J. Blot (William); L.B. Signorello (Lisa B.); Q. Cai (Qiuyin); J. Hopper (John); M.C. Southey (Melissa); H. Tsimiklis (Helen); C. Apicella (Carmel); C-Y. Shen (Chen-Yang); C.-N. Hsiung (Chia-Ni); P.-E. Wu (Pei-Ei); M.-F. Hou (Ming-Feng); V. Kristensen (Vessela); S. Nord (Silje); G.G. Alnæs (Grethe); G.G. Giles (Graham G.); R.L. Milne (Roger); C.A. McLean (Catriona Ann); F. Canzian (Federico); D. Trichopoulos (Dimitrios); P.H.M. Peeters; E. Lund (Eiliv); R. Sund (Reijo); K.T. Khaw; M.J. Gunter (Marc J.); D. Palli (Domenico); L.M. Mortensen (Lotte Maxild); L. Dossus (Laure); J.-M. Huerta (Jose-Maria); A. Meindl (Alfons); R.K. Schmutzler (Rita); C. Sutter (Christian); R. Yang (Rongxi); K. Muir (Kenneth); A. Lophatananon (Artitaya); S. Stewart-Brown (Sarah); P. Siriwanarangsan (Pornthep); J.M. Hartman (Joost); X. Miao; K.S. Chia (Kee Seng); C.W. Chan (Ching Wan); P.A. Fasching (Peter); R. Hein (Rebecca); M.W. Beckmann (Matthias); L. Haeberle (Lothar); H. Brenner (Hermann); A.K. Dieffenbach (Aida Karina); V. Arndt (Volker); C. Stegmaier (Christa); A. Ashworth (Alan); N. Orr (Nick); M. Schoemaker (Minouk); A.J. Swerdlow (Anthony ); L.A. Brinton (Louise); M. García-Closas (Montserrat); W. 
Zheng (Wei); S.L. Halverson (Sandra L.); M. Shrubsole (Martha); J. Long (Jirong); M.S. Goldberg (Mark); F. Labrèche (France); M. Dumont (Martine); R. Winqvist (Robert); K. Pykäs (Katri); A. Jukkola-Vuorinen (Arja); M. Grip (Mervi); H. Brauch (Hiltrud); U. Hamann (Ute); T. Brüning (Thomas); P. Radice (Paolo); P. Peterlongo (Paolo); S. Manoukian (Siranoush); L. Bernard (Loris); N.V. Bogdanova (Natalia); T. Dörk (Thilo); A. Mannermaa (Arto); V. Kataja (Vesa); V-M. Kosma (Veli-Matti); J.M. Hartikainen (J.); P. Devilee (Peter); R.A.E.M. Tollenaar (Rob); C.M. Seynaeve (Caroline); C.J. van Asperen (Christi); A. Jakubowska (Anna); J. Lubinski (Jan); K. Jaworska (Katarzyna); T. Huzarski (Tomasz); S. Sangrajrang (Suleeporn); V. Gaborieau (Valerie); P. Brennan (Paul); J.D. McKay (James); S. Slager (Susan); A.E. Toland (Amanda); C.B. Ambrosone (Christine); D. Yannoukakos (Drakoulis); M. Kabisch (Maria); D. Torres (Diana); S.L. Neuhausen (Susan); H. Anton-Culver (Hoda); C. Luccarini (Craig); C. Baynes (Caroline); S. Ahmed (Shahana); S. Healey (Sue); D.C. Tessier (Daniel C.); D. Vincent (Daniel); F. Bacot (Francois); G. Pita (Guillermo); M.R. Alonso (Rosario); N. Álvarez (Nuria); D. Herrero (Daniel); J. Simard (Jacques); P.P.D.P. Pharoah (Paul P.D.P.); P. Kraft (Peter); A.M. Dunning (Alison); G. Chenevix-Trench (Georgia); P. Hall (Per); D.F. Easton (Douglas)

    2015-01-01

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS,

  17. Genome-wide association study identifies novel loci associated with circulating phospho- and sphingolipid concentrations

    DEFF Research Database (Denmark)

    Demirkan, Ayşe; van Duijn, Cornelia M; Ugocsai, Peter

    2012-01-01

    , and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57...

  18. A genome-wide association study reveals variants in ARL15 that influence adiponectin levels.

    Directory of Open Access Journals (Sweden)

    J Brent Richards

    2009-12-01

    The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D) and coronary heart disease (CHD). We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531) and sought validation of the lead single nucleotide polymorphisms (SNPs) in 5 additional cohorts (n = 6,202). Five SNPs were genome-wide significant in their relationship with adiponectin (P ≤ 5×10^-8). We then tested whether these 5 SNPs were associated with risk of T2D and CHD using a Bonferroni-corrected threshold of P ≤ 0.011 to declare statistical significance for these disease associations. SNPs at the adiponectin-encoding ADIPOQ locus demonstrated the strongest associations with adiponectin levels (P-combined = 9.2×10^-19 for lead SNP, rs266717, n = 14,733). A novel variant in the ARL15 (ADP-ribosylation factor-like 15) gene was associated with lower circulating levels of adiponectin (rs4311394-G, P-combined = 2.9×10^-8, n = 14,733). This same risk allele at ARL15 was also associated with a higher risk of CHD (odds ratio [OR] = 1.12, P = 8.5×10^-6, n = 22,421) and, more nominally, an increased risk of T2D (OR = 1.11, P = 3.2×10^-3, n = 10,128), and several metabolic traits. Expression studies in humans indicated that ARL15 is well-expressed in skeletal muscle. These findings identify a novel protein, ARL15, which influences circulating adiponectin levels and may impact upon CHD risk.
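    A Bonferroni correction divides the desired family-wise error rate by the number of tests. The conventional genome-wide threshold of 5×10^-8 corresponds to α = 0.05 spread over roughly one million independent tests. The sketch below illustrates the general rule only; the abstract does not say how its 0.011 threshold was derived, so the test counts, SNP labels, and p-values here are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_hits(p_values, alpha=0.05):
    """Keep only the entries whose p-value passes the Bonferroni threshold.

    p_values: dict mapping a label (hypothetical SNP name) to its p-value.
    The number of tests is taken to be the number of entries supplied.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p for name, p in p_values.items() if p <= threshold}


if __name__ == "__main__":
    # About one million independent tests gives the familiar GWAS cutoff.
    print(math.isclose(bonferroni_threshold(0.05, 1_000_000), 5e-8))
    # Three made-up follow-up tests: the per-test threshold is 0.05 / 3.
    print(significant_hits({"snp_a": 0.001, "snp_b": 0.04, "snp_c": 0.2}))
```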

  19. A genome-wide association analysis of a broad psychosis phenotype identifies three loci for further investigation

    NARCIS (Netherlands)

    Bramon, Elvira; Pirinen, Matti; Strange, Amy; Lin, Kuang; Freeman, Colin; Bellenguez, Céline; Su, Zhan; Band, Gavin; Pearson, Richard; Vukcevic, Damjan; Langford, Cordelia; Deloukas, Panos; Hunt, Sarah; Gray, Emma; Dronov, Serge; Potter, Simon C.; Tashakkori-Ghanbaria, Avazeh; Edkins, Sarah; Bumpstead, Suzannah J.; Arranz, Maria J.; Bakker, Steven; Bender, Stephan; Bruggeman, Richard; Cahn, Wiepke; Chandler, David; Collier, David A.; Crespo-Facorro, Benedicto; Dazzan, Paola; de Haan, Lieuwe; Di Forti, Marta; Dragović, Milan; Giegling, Ina; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, René S.; Kalaydjieva, Luba; Kravariti, Eugenia; Lawrie, Stephen; Linszen, Don H.; Mata, Ignacio; McDonald, Colm; McIntosh, Andrew; Myin-Germeys, Inez; Ophoff, Roel A.; Pariante, Carmine M.; Paunio, Tiina; Picchioni, Marco; Ripke, Stephan; Rujescu, Dan

    2014-01-01

    Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. 1239 cases with schizophrenia, schizoaffective disorder, or psychotic

  20. Structural and functional insights of β-glucosidases identified from the genome of Aspergillus fumigatus

    Science.gov (United States)

    Dodda, Subba Reddy; Aich, Aparajita; Sarkar, Nibedita; Jain, Piyush; Jain, Sneha; Mondal, Sudipa; Aikat, Kaustav; Mukhopadhyay, Sudit S.

    2018-03-01

    Thermostable, glucose-tolerant β-glucosidase from Aspergillus species has attracted worldwide interest for its potential in industrial applications and bioethanol production. A strain of Aspergillus fumigatus (AfNITDGPKA3) identified by our laboratory from straw retting ground showed higher cellulase activity, specifically β-glucosidase activity, compared to other contemporary strains. Though A. fumigatus has been known for high cellulase activity, detailed identification and characterization of the cellulase genes from its genome is yet to be done. In this work we analyzed the cellulase genes from the genome sequence database of Aspergillus fumigatus (Af293). Genome analysis suggests that two cellobiohydrolase, eleven endoglucanase and seventeen β-glucosidase genes are present. β-Glucosidase genes belong to either the Glycohydro1 (GH1 or Bgl1) or the Glycohydro3 (GH3 or Bgl3) family. The sequence similarity suggests that Bgl1 and Bgl3 of A. fumigatus are phylogenetically close to those of A. fischeri and A. oryzae. The modelled structure of Bgl1 predicts a (β/α)8 barrel type structure with a deep and narrow active site, whereas Bgl3 shows an (α/β)8 barrel and (α/β)6 sandwich structure with a shallow and open active site. Docking results suggest that amino acids Glu544, Glu466, Trp408, Trp567, Tyr44, Tyr222, Tyr770, Asp844, Asp537, Asn212, Asn217 of Bgl3 and Asp224, Asn242, Glu440, Glu445, Tyr367, Tyr365, Thr994, Trp435, Trp446 of Bgl1 are involved in the hydrolysis. Binding affinity analyses suggest that the Bgl3 and Bgl1 enzymes are more active on substrates like 4-methylumbelliferyl glycoside (MUG) and p-nitrophenyl-β-D-1,4-glucopyranoside (pNPG) than on cellobiose. Further docking with glucose suggests that Bgl1 is more glucose tolerant than Bgl3. Analysis of the Aspergillus fumigatus genome may help to identify a β-glucosidase enzyme with better properties, and the structural information may help to develop an engineered recombinant enzyme.

  1. Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster.

    Science.gov (United States)

    Battlay, Paul; Schmidt, Joshua M; Fournier-Level, Alexandre; Robin, Charles

    2016-08-09

    Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. Copyright © 2016 Battlay et al.

  2. Comparative genomics defines the core genome of the growing N4-like phage genus and identifies N4-like Roseophage specific genes

    Directory of Open Access Journals (Sweden)

    Jacqueline Zoe-Munn Chan

    2014-10-01

    Two bacteriophages, RPP1 and RLP1, infecting members of the marine Roseobacter clade were isolated from seawater. Their linear genomes are 74.7 and 74.6 kb and encode 91 and 92 coding DNA sequences, respectively. Around 30% of these are homologous to genes found in Enterobacter phage N4. Comparative genomics of these two new Roseobacter phages and twenty-three other sequenced N4-like phages (three infecting members of the Roseobacter lineage and twenty infecting other Gammaproteobacteria) revealed that N4-like phages share a core genome of 14 genes responsible for control of gene expression, replication and virion proteins. Phylogenetic analysis of these genes placed the five N4-like roseophages (RN4) into a distinct subclade. Analysis of the RN4 phage genomes revealed they share a further 19 genes, of which nine are found exclusively in RN4 phages and four appear to have been acquired from their bacterial hosts. Proteomic analysis of the RPP1 and RLP1 virions identified a second structural module present in the RN4 phages similar to that found in the Pseudomonas N4-like phage LIT1. Searches of various metagenomic databases, including the GOS database, using CDS sequences from RPP1 suggest these phages are widely distributed in marine environments, in particular in the open ocean environment.

  3. Omics Approaches for Identifying Physiological Adaptations to Genome Instability in Aging.

    Science.gov (United States)

    Edifizi, Diletta; Schumacher, Björn

    2017-11-04

    DNA damage causally contributes to aging and age-related diseases. The declining functioning of tissues and organs during aging can lead to the increased risk of succumbing to aging-associated diseases. Congenital syndromes that are caused by heritable mutations in DNA repair pathways lead to cancer susceptibility and accelerated aging, thus underlining the importance of genome maintenance for withstanding aging. High-throughput mass-spectrometry-based approaches have recently contributed to identifying signalling response networks and gaining a more comprehensive understanding of the physiological adaptations occurring upon unrepaired DNA damage. The insulin-like signalling pathway has been implicated in a DNA damage response (DDR) network that includes epidermal growth factor (EGF)-, AMP-activated protein kinases (AMPK)- and the target of rapamycin (TOR)-like signalling pathways, which are known regulators of growth, metabolism, and stress responses. The same pathways, together with the autophagy-mediated proteostatic response and the decline in energy metabolism have also been found to be similarly regulated during natural aging, suggesting striking parallels in the physiological adaptation upon persistent DNA damage due to DNA repair defects and long-term low-level DNA damage accumulation occurring during natural aging. These insights will be an important starting point to study the interplay between signalling networks involved in progeroid syndromes that are caused by DNA repair deficiencies and to gain new understanding of the consequences of DNA damage in the aging process.

  4. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer

    NARCIS (Netherlands)

    Michailidou, Kyriaki; Beesley, Jonathan; Lindstrom, Sara; Canisius, Sander; Dennis, Joe; Lush, Michael J.; Maranian, Mel J.; Bolla, Manjeet K.; Wang, Qin; Shah, Mitul; Perkins, Barbara J.; Czene, Kamila; Eriksson, Mikael; Darabi, Hatef; Brand, Judith S.; Bojesen, Stig E.; Nordestgaard, Borge G.; Flyger, Henrik; Nielsen, Sune F.; Rahman, Nazneen; Turnbull, Clare; Fletcher, Olivia; Peto, Julian; Gibson, Lorna; dos-Santos-Silva, Isabel; Chang-Claude, Jenny; Flesch-Janys, Dieter; Rudolph, Anja; Eilber, Ursula; Behrens, Sabine; Nevanlinna, Heli; Muranen, Taru A.; Aittomaki, Kristiina; Blomqvist, Carl; Khan, Sofia; Aaltonen, Kirsimari; Ahsan, Habibul; Kibriya, Muhammad G.; Whittemore, Alice S.; John, Esther M.; Malone, Kathleen E.; Gammon, Marilie D.; Santella, Regina M.; Ursin, Giske; Makalic, Enes; Schmidt, Daniel F.; Casey, Graham; Hunter, David J.; Gapstur, Susan M.; Gaudet, Mia M.; Diver, W. Ryan; Haiman, Christopher A.; Schumacher, Fredrick; Henderson, Brian E.; Le Marchand, Loic; Berg, Christine D.; Chanock, Stephen J.; Figueroa, Jonine; Hoover, Robert N.; Lambrechts, Diether; Neven, Patrick; Wildiers, Hans; van Limbergen, Erik; Schmidt, Marjanka K.; Broeks, Annegien; Verhoef, Senno; Cornelissen, Sten; Couch, Fergus J.; Olson, Janet E.; Hallberg, Emily; Vachon, Celine; Waisfisz, Quinten; Meijers-Heijboer, Hanne; Adank, Muriel A.; van der Luijt, Rob B.; Li, Jingmei; Liu, Jianjun; Humphreys, Keith; Kang, Daehee; Choi, Ji-Yeob; Park, Sue K.; Yoo, Keun-Young; Matsuo, Keitaro; Ito, Hidemi; Iwata, Hiroji; Tajima, Kazuo; Guenel, Pascal; Truong, Therese; Mulot, Claire; Sanchez, Marie; Burwinkel, Barbara; Marme, Frederik; Surowy, Harald; Sohn, Christof; Wu, Anna H.; Tseng, Chiu-chen; Van den Berg, David; Stram, Daniel O.; Gonzalez-Neira, Anna; Benitez, Javier; Zamora, M. Pilar; Arias Perez, Jose Ignacio; Shu, Xiao-Ou; Lu, Wei; Gao, Yu-Tang; Cai, Hui; Cox, Angela; Cross, Simon S.; Reed, Malcolm W. 
R.; Andrulis, Irene L.; Knight, Julia A.; Glendon, Gord; Mulligan, Anna Marie; Sawyer, Elinor J.; Tomlinson, Ian; Kerin, Michael J.; Miller, Nicola; Lindblom, Annika; Margolin, Sara; Teo, Soo Hwang; Yip, Cheng Har; Taib, Nur Aishah Mohd; Tan, Gie-Hooi; Hooning, Maartje J.; Hollestelle, Antoinette; Martens, John W. M.; Collee, J. Margriet; Blot, William; Signorello, Lisa B.; Cai, Qiuyin; Hopper, John L.; Southey, Melissa C.; Tsimiklis, Helen; Apicella, Carmel; Shen, Chen-Yang; Hsiung, Chia-Ni; Wu, Pei-Ei; Hou, Ming-Feng; Kristensen, Vessela N.; Nord, Silje; Alnaes, Grethe I. Grenaker; Giles, Graham G.; Milne, Roger L.; McLean, Catriona; Canzian, Federico; Trichopoulos, Dimitrios; Peeters, Petra; Lund, Eiliv; Sund, Malin; Khaw, Kay-Tee; Gunter, Marc J.; Palli, Domenico; Mortensen, Lotte Maxild; Dossus, Laure; Huerta, Jose-Maria; Meindl, Alfons; Schmutzler, Rita K.; Sutter, Christian; Yang, Rongxi; Muir, Kenneth; Lophatananon, Artitaya; Stewart-Brown, Sarah; Siriwanarangsan, Pornthep; Hartman, Mikael; Miao, Hui; Chia, Kee Seng; Chan, Ching Wan; Fasching, Peter A.; Hein, Alexander; Beckmann, Matthias W.; Haeberle, Lothar; Brenner, Hermann; Dieffenbach, Aida Karina; Arndt, Volker; Stegmaier, Christa; Ashworth, Alan; Orr, Nick; Schoemaker, Minouk J.; Swerdlow, Anthony J.; Brinton, Louise; Garcia-Closas, Montserrat; Zheng, Wei; Halverson, Sandra L.; Shrubsole, Martha; Long, Jirong; Goldberg, Mark S.; Labreche, France; Dumont, Martine; Winqvist, Robert; Pylkas, Katri; Jukkola-Vuorinen, Arja; Grip, Mervi; Brauch, Hiltrud; Hamann, Ute; Bruening, Thomas; Radice, Paolo; Peterlongo, Paolo; Manoukian, Siranoush; Bernard, Loris; Bogdanova, Natalia V.; Doerk, Thilo; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli-Matti; Hartikainen, Jaana M.; Devilee, Peter; Tollenaar, Robert A. E. 
M.; Seynaeve, Caroline; Van Asperen, Christi J.; Jakubowska, Anna; Lubinski, Jan; Jaworska, Katarzyna; Huzarski, Tomasz; Sangrajrang, Suleeporn; Gaborieau, Valerie; Brennan, Paul; Mckay, James; Slager, Susan; Toland, Amanda E.; Ambrosone, Christine B.; Yannoukakos, Drakoulis; Kabisch, Maria; Torres, Diana; Neuhausen, Susan L.; Anton-Culver, Hoda; Luccarini, Craig; Baynes, Caroline; Ahmed, Shahana; Healey, Catherine S.; Tessier, Daniel C.; Vincent, Daniel; Bacot, Francois; Pita, Guillermo; Rosario Alonso, M.; Alvarez, Nuria; Herrero, Daniel; Simard, Jacques; Pharoah, Paul P. D. P.; Kraft, Peter; Dunning, Alison M.; Chenevix-Trench, Georgia; Hall, Per; Easton, Douglas F.

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising

  5. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer

    DEFF Research Database (Denmark)

    Michailidou, Kyriaki; Beesley, Jonathan; Lindstrom, Sara

    2015-01-01

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748...

  6. The First Endogenous Herpesvirus, Identified in the Tarsier Genome, and Novel Sequences from Primate Rhadinoviruses and Lymphocryptoviruses

    Science.gov (United States)

    Aswad, Amr; Katzourakis, Aris

    2014-01-01

    Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology. PMID:24945689

  7. Genome-wide methylation analysis identified sexually dimorphic methylated regions in hybrid tilapia

    Science.gov (United States)

    Wan, Zi Yi; Xia, Jun Hong; Lin, Grace; Wang, Le; Lin, Valerie C. L.; Yue, Gen Hua

    2016-01-01

    Sexual dimorphism is an interesting biological phenomenon. Previous studies showed that DNA methylation might play a role in sexual dimorphism. However, the overall picture of the genome-wide methylation landscape in sexually dimorphic species remains unclear. We analyzed the DNA methylation landscape and transcriptome in hybrid tilapia (Oreochromis spp.) using whole genome bisulfite sequencing (WGBS) and RNA-sequencing (RNA-seq). We found 4,757 sexually dimorphic differentially methylated regions (DMRs), with significant clusters of DMRs located on chromosomal regions associated with sex determination. CpG methylation in promoter regions was negatively correlated with the gene expression level. The MAPK/ERK pathway was upregulated in male tilapia. We also inferred active cis-regulatory regions (ACRs) in skeletal muscle tissues from WGBS datasets, revealing sexually dimorphic cis-regulatory regions. These results suggest that DNA methylation contributes to sex-specific phenotypes and serve as resources for further investigation to analyze the functions of these regions and their contributions towards sexual dimorphisms. PMID:27782217

  8. Whole Genome Analysis of Injectional Anthrax Identifies Two Disease Clusters Spanning More Than 13 Years

    Directory of Open Access Journals (Sweden)

    Paul Keim

    2015-11-01

    Lay Person Interpretation: Injectional anthrax has been plaguing heroin drug users across Europe for more than 10 years. In order to better understand this outbreak, we assessed genomic relationships of all available injectional anthrax strains from four countries spanning a >12 year period. Very few differences were identified using genome-based analysis, but these differentiated the isolates into two distinct clusters. This strongly supports a hypothesis of at least two separate anthrax spore contamination events perhaps during the drug production processes. Identification of two events would not have been possible from standard epidemiological analysis. These comprehensive data will be invaluable for classifying future injectional anthrax isolates and for future geographic attribution.

  9. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

    Science.gov (United States)

    Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

    2015-01-01

    Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.

  10. Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci.

    Science.gov (United States)

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-08-31

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10,845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.
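    Topic models such as latent Dirichlet allocation usually take a document-by-term count matrix as input; here each patient plays the role of a document and each diagnostic code the role of a term. The abstract does not describe the authors' data preparation, so the sketch below is only one plausible way to build that matrix. The function name, patient identifiers, and ICD-10-style code strings are placeholders.

```python
from collections import Counter


def build_count_matrix(patient_codes):
    """Turn per-patient diagnostic code lists into a patient-by-code count matrix.

    patient_codes: dict mapping a patient id to a list of code strings
                   (repeats allowed; all values here are hypothetical).
    Returns (patients, vocabulary, matrix), where matrix[i][j] is how often
    vocabulary[j] appears in the record of patients[i].
    """
    vocabulary = sorted({code for codes in patient_codes.values() for code in codes})
    column = {code: j for j, code in enumerate(vocabulary)}
    patients = sorted(patient_codes)
    matrix = []
    for patient in patients:
        row = [0] * len(vocabulary)
        for code, count in Counter(patient_codes[patient]).items():
            row[column[code]] = count
        matrix.append(row)
    return patients, vocabulary, matrix


if __name__ == "__main__":
    records = {
        "p2": ["E11", "I10", "E11"],
        "p1": ["I10"],
    }
    patients, vocabulary, matrix = build_count_matrix(records)
    print(patients, vocabulary, matrix)
```

    A matrix of this shape could then be passed to any LDA implementation, and the resulting per-patient topic loadings used as the phenotype in each association scan.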

  11. META-ANALYSIS OF GENOME-WIDE STUDIES IDENTIFIES WNT16 AND ESR1 SNPS ASSOCIATED WITH BONE MINERAL DENSITY IN PREMENOPAUSAL WOMEN

    Science.gov (United States)

    Koller, Daniel L.; Zheng, Hou-Feng; Karasik, David; Yerges-Armstrong, Laura; Liu, Ching-Ti; McGuigan, Fiona; Kemp, John P.; Giroux, Sylvie; Lai, Dongbing; Edenberg, Howard J.; Peacock, Munro; Czerwinski, Stefan A.; Choh, Audrey C.; McMahon, George; St Pourcain, Beate; Timpson, Nicholas J.; Lawlor, Debbie A; Evans, David M; Towne, Bradford; Blangero, John; Carless, Melanie A.; Kammerer, Candace; Goltzman, David; Kovacs, Christopher S.; Prior, Jerilynn C.; Spector, Tim D.; Rousseau, Francois; Tobias, Jon H.; Akesson, Kristina; Econs, Michael J.; Mitchell, Braxton D.; Richards, J. Brent; Kiel, Douglas P.; Foroud, Tatiana

    2013-01-01

    Previous genome-wide association studies (GWAS) have identified common variants in genes associated with variation in bone mineral density (BMD), although most have been carried out in combined samples of older women and men. Meta-analyses of these results have identified numerous SNPs of modest effect at genome-wide significance levels in genes involved in both bone formation and resorption, as well as other pathways. We performed a meta-analysis restricted to premenopausal white women from four cohorts (n = 4,061 women, ages 20 to 45) to identify genes influencing peak bone mass at the lumbar spine and femoral neck. Following imputation, age- and weight-adjusted BMD values were tested for association with each SNP. Association of a SNP in the WNT16 gene (rs3801387; p = 1.7 × 10^-9) and multiple SNPs in the ESR1/C6orf97 region (rs4870044; p = 1.3 × 10^-8) achieved genome-wide significance levels for lumbar spine BMD. These SNPs, along with others demonstrating suggestive evidence of association, were then tested for association in seven replication cohorts that included premenopausal women of European, Hispanic-American, and African-American descent (combined n = 5,597 for femoral neck; 4,744 for lumbar spine). When the data from the discovery and replication cohorts were analyzed jointly, the evidence was more significant (WNT16 joint p = 1.3 × 10^-11; ESR1/C6orf97 joint p = 1.4 × 10^-10). Multiple independent association signals were observed with spine BMD at the ESR1 region after conditioning on the primary signal. Analyses of femoral neck BMD also supported association with SNPs in WNT16 and ESR1/C6orf97 (p < 1 × 10^-5). Our results confirm that several of the genes contributing to BMD variation across a broad age range in both sexes have effects of similar magnitude on BMD of the spine in premenopausal women. These data support the hypothesis that variants in these genes of known skeletal function also affect BMD during the premenopausal period. PMID:23074152

  12. Genome-wide local ancestry approach identifies genes and variants associated with chemotherapeutic susceptibility in African Americans.

    Directory of Open Access Journals (Sweden)

    Heather E Wheeler

    Chemotherapeutic agents are used in the treatment of many cancers, yet variable resistance and toxicities among individuals limit successful outcomes. Several studies have indicated outcome differences associated with ancestry among patients with various cancer types. Using both traditional SNP-based and newly developed gene-based genome-wide approaches, we investigated the genetics of chemotherapeutic susceptibility in lymphoblastoid cell lines derived from 83 African Americans, a population for which there is a disparity in the number of genome-wide studies performed. To account for population structure in this admixed population, we incorporated local ancestry information into our association model. We tested over 2 million SNPs and identified 325, 176, 240, and 190 SNPs that were suggestively associated with cytarabine-, 5'-deoxyfluorouridine (5'-DFUR)-, carboplatin-, and cisplatin-induced cytotoxicity, respectively (p≤10(-4)). Importantly, some of these variants are found only in populations of African descent. We also show that cisplatin-susceptibility SNPs are enriched for carboplatin-susceptibility SNPs. Using a gene-based genome-wide association approach, we identified 26, 11, 20, and 41 suggestive candidate genes for association with cytarabine-, 5'-DFUR-, carboplatin-, and cisplatin-induced cytotoxicity, respectively (p≤10(-3)). Fourteen of these genes showed evidence of association with their respective chemotherapeutic phenotypes in the Yoruba from Ibadan, Nigeria (p<0.05), including TP53I11, COPS5 and GAS8, which are known to be involved in tumorigenesis. Although our results require further study, we have identified variants and genes associated with chemotherapeutic susceptibility in African Americans by using an approach that incorporates local ancestry information.

  13. Genome-wide association analysis identifies 11 risk variants associated with the asthma with hay fever phenotype.

    Science.gov (United States)

    Ferreira, Manuel A R; Matheson, Melanie C; Tang, Clara S; Granell, Raquel; Ang, Wei; Hui, Jennie; Kiefer, Amy K; Duffy, David L; Baltic, Svetlana; Danoy, Patrick; Bui, Minh; Price, Loren; Sly, Peter D; Eriksson, Nicholas; Madden, Pamela A; Abramson, Michael J; Holt, Patrick G; Heath, Andrew C; Hunter, Michael; Musk, Bill; Robertson, Colin F; Le Souëf, Peter; Montgomery, Grant W; Henderson, A John; Tung, Joyce Y; Dharmage, Shyamali C; Brown, Matthew A; James, Alan; Thompson, Philip J; Pennell, Craig; Martin, Nicholas G; Evans, David M; Hinds, David A; Hopper, John L

    2014-06-01

    To date, no genome-wide association study (GWAS) has considered the combined phenotype of asthma with hay fever. Previous analyses of family data from the Tasmanian Longitudinal Health Study provide evidence that this phenotype has a stronger genetic cause than asthma without hay fever. We sought to perform a GWAS of asthma with hay fever to identify variants associated with having both diseases. We performed a meta-analysis of GWASs comparing persons with both physician-diagnosed asthma and hay fever (n = 6,685) with persons with neither disease (n = 14,091). At genome-wide significance, we identified 11 independent variants associated with the risk of having asthma with hay fever, including 2 associations reaching this level of significance with allergic disease for the first time: ZBTB10 (rs7009110; odds ratio [OR], 1.14; P = 4 × 10(-9)) and CLEC16A (rs62026376; OR, 1.17; P = 1 × 10(-8)). The rs62026376:C allele associated with increased asthma with hay fever risk has been found to be associated also with decreased expression of the nearby DEXI gene in monocytes. The 11 variants were associated with the risk of asthma and hay fever separately, but the estimated associations with the individual phenotypes were weaker than with the combined asthma with hay fever phenotype. A variant near LRRC32 was a stronger risk factor for hay fever than for asthma, whereas the reverse was observed for variants in/near GSDMA and TSLP. Single nucleotide polymorphisms with suggestive evidence for association with asthma with hay fever risk included rs41295115 near IL2RA (OR, 1.28; P = 5 × 10(-7)) and rs76043829 in TNS1 (OR, 1.23; P = 2 × 10(-6)). By focusing on the combined phenotype of asthma with hay fever, variants associated with the risk of allergic disease can be identified with greater efficiency. Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.

  14. Evaluating genome-wide association study-identified breast cancer risk variants in African-American women.

    Directory of Open Access Journals (Sweden)

    Jirong Long

    Genome-wide association studies (GWAS), conducted mostly in European or Asian descendants, have identified approximately 67 genetic susceptibility loci for breast cancer. Given the large differences in genetic architecture between the African-ancestry genome and genomes of Asians and Europeans, it is important to investigate these loci in African-ancestry populations. We evaluated index SNPs in all 67 breast cancer susceptibility loci identified to date in our study including up to 3,300 African-American women (1,231 cases and 2,069 controls), recruited in the Southern Community Cohort Study (SCCS) and the Nashville Breast Health Study (NBHS). Seven SNPs were statistically significant (P ≤ 0.05) with the risk of overall breast cancer in the same direction as previously reported: rs10069690 (5p15/TERT), rs999737 (14q24/RAD51L1), rs13387042 (2q35/TNP1), rs1219648 (10q26/FGFR2), rs8170 (19p13/BABAM1), rs17817449 (16q12/FTO), and rs13329835 (16q23/DYL2). A marginally significant association (P<0.10) was found for three additional SNPs: rs1045485 (2q33/CASP8), rs4849887 (2q14/INHBB), and rs4808801 (19p13/ELL). Three additional SNPs, including rs1011970 (9p21/CDKN2A/2B), rs941764 (14q32/CCDC88C), and rs17529111 (6q14/FAM46A), showed a significant association in analyses conducted by breast cancer subtype. The risk of breast cancer was elevated with an increasing number of risk variants, as measured by quintile of the genetic risk score, from 1.00 (reference), to 1.75 (1.30-2.37), 1.56 (1.15-2.11), 2.02 (1.50-2.74) and 2.63 (1.96-3.52), respectively (P = 7.8 × 10(-10)). Results from this study highlight the need for large genetic studies in AAs to identify risk variants impacting this population.
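The abstract above reports risk by quintile of a genetic risk score. As a minimal sketch of that idea, assuming an unweighted score that simply counts risk alleles and cut points taken from whatever set of scores is supplied, the binning could look like the following. The function names and the example scores are hypothetical, and the study's actual scoring and reference group may differ.

```python
def risk_score(genotypes):
    """Unweighted score: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(genotypes)

def quintile_cutoffs(scores):
    """Return the four cut points that split the sorted scores into fifths."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[(n * k) // 5 - 1] for k in range(1, 5)]

def assign_quintile(score, cutoffs):
    """Quintile index 1-5; a score equal to a cut point stays in the lower bin."""
    return 1 + sum(score > c for c in cutoffs)

# Hypothetical scores for ten individuals (not data from the study).
scores = list(range(1, 11))
cutoffs = quintile_cutoffs(scores)
quintiles = [assign_quintile(s, cutoffs) for s in scores]
```

Odds ratios such as those quoted above would then be estimated per quintile against the lowest one as the reference, a step this sketch leaves out.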

  15. Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene

    NARCIS (Netherlands)

    Himes, Blanca E.; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S.; Myers, Rachel A.; Gignoux, Christopher R.; Levin, Albert M.; Gauderman, W. James; Yang, James J.; Mathias, Rasika A.; Romieu, Isabelle; Torgerson, Dara G.; Roth, Lindsey A.; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J.; Lemanske, Robert F.; Zeiger, Robert S.; Strunk, Robert C.; Martinez, Fernando D.; Boushey, Homer; Chinchilli, Vernon M.; Israel, Elliot; Mauger, David; Koppelman, Gerard H.; Postma, Dirkje S.; Nieuwenhuis, Maartje A. E.; Vonk, Judith M.; Lima, John J.; Irvin, Charles G.; Peters, Stephen P.; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A.; Tantisira, Kelan G.; Raby, Benjamin A.; Bleecker, Eugene R.; Meyers, Deborah A.; London, Stephanie J.; Barnes, Kathleen C.; Gilliland, Frank D.; Williams, L. Keoki; Burchard, Esteban G.; Nicolae, Dan L.; Ober, Carole; DeMeo, Dawn L.; Silverman, Edwin K.; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D.; Weiss, Scott

    2013-01-01

    Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify

  16. Genome-wide association study identifies five new schizophrenia loci.

    LENUS (Irish Health Repository)

    Ripke, Stephan

    2011-10-01

    We examined the role of common genetic variation in schizophrenia in a genome-wide association study of substantial size: a stage 1 discovery sample of 21,856 individuals of European ancestry and a stage 2 replication sample of 29,839 independent subjects. The combined stage 1 and 2 analysis yielded genome-wide significant associations with schizophrenia for seven loci, five of which are new (1p21.3, 2q32.3, 8p23.2, 8q21.3 and 10q24.32-q24.33) and two of which have been previously implicated (6p21.32-p22.1 and 18q21.2). The strongest new finding (P = 1.6 × 10(-11)) was with rs1625579 within an intron of a putative primary transcript for MIR137 (microRNA 137), a known regulator of neuronal development. Four other schizophrenia loci achieving genome-wide significance contain predicted targets of MIR137, suggesting MIR137-mediated dysregulation as a previously unknown etiologic mechanism in schizophrenia. In a joint analysis with a bipolar disorder sample (16,374 affected individuals and 14,044 controls), three loci reached genome-wide significance: CACNA1C (rs4765905, P = 7.0 × 10(-9)), ANK3 (rs10994359, P = 2.5 × 10(-8)) and the ITIH3-ITIH4 region (rs2239547, P = 7.8 × 10(-9)).

  17. A genome-wide association study identifies risk loci for spirometric measures among smokers of European and African ancestry.

    Science.gov (United States)

    Lutz, Sharon M; Cho, Michael H; Young, Kendra; Hersh, Craig P; Castaldi, Peter J; McDonald, Merry-Lynn; Regan, Elizabeth; Mattheisen, Manuel; DeMeo, Dawn L; Parker, Margaret; Foreman, Marilyn; Make, Barry J; Jensen, Robert L; Casaburi, Richard; Lomas, David A; Bhatt, Surya P; Bakke, Per; Gulsvik, Amund; Crapo, James D; Beaty, Terri H; Laird, Nan M; Lange, Christoph; Hokanson, John E; Silverman, Edwin K

    2015-12-03

    Pulmonary function decline is a major contributor to morbidity and mortality among smokers. Post bronchodilator FEV1 and FEV1/FVC ratio are considered the standard assessment of airflow obstruction. We performed a genome-wide association study (GWAS) in 9919 current and former smokers in the COPDGene study (6659 non-Hispanic Whites [NHW] and 3260 African Americans [AA]) to identify associations with spirometric measures (post-bronchodilator FEV1 and FEV1/FVC). We also conducted meta-analysis of FEV1 and FEV1/FVC GWAS in the COPDGene, ECLIPSE, and GenKOLS cohorts (total n = 13,532). Among NHW in the COPDGene cohort, both measures of pulmonary function were significantly associated with SNPs at the 15q25 locus [containing CHRNA3/5, AGPHD1, IREB2, CHRNB4] (lowest p-value = 2.17 × 10(-11)), and FEV1/FVC was associated with a genomic region on chromosome 4 [upstream of HHIP] (lowest p-value = 5.94 × 10(-10)); both regions have been previously associated with COPD. For the meta-analysis, in addition to confirming associations to the regions near CHRNA3/5 and HHIP, genome-wide significant associations were identified for FEV1 on chromosome 1 [TGFB2] (p-value = 8.99 × 10(-9)), 9 [DBH] (p-value = 9.69 × 10(-9)) and 19 [CYP2A6/7] (p-value = 3.49 × 10(-8)) and for FEV1/FVC on chromosome 1 [TGFB2] (p-value = 8.99 × 10(-9)), 4 [FAM13A] (p-value = 3.88 × 10(-12)), 11 [MMP3/12] (p-value = 3.29 × 10(-10)) and 14 [RIN3] (p-value = 5.64 × 10(-9)). In a large genome-wide association study of lung function in smokers, we found genome-wide significant associations at several previously described loci with lung function or COPD. We additionally identified a novel genome-wide significant locus with FEV1 on chromosome 9 [DBH] in a meta-analysis of three study populations.

  18. IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

    Science.gov (United States)

    Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben

    2017-09-15

    Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohn's Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS. Contact: zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  19. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    Science.gov (United States)

    2012-01-01

    Background: Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results: A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions: This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding
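The counts quoted in the abstract above can be cross-checked with simple arithmetic. The short sketch below, whose variable names are illustrative, derives the amount of sequence implied by the SSR count and density, and recomputes the polymorphic fraction of the tested markers.

```python
# Figures quoted in the abstract.
ssr_count = 28_538        # SSRs identified in the apple genome
density_per_mb = 40.8     # SSRs per Mb
tested = 310              # SSRs selected for amplification
polymorphic = 245         # SSRs found to be polymorphic

# Sequence length implied by count / density, in Mb (about 699.5 Mb).
implied_sequence_mb = ssr_count / density_per_mb

# Share of tested SSRs that were polymorphic, in percent.
polymorphic_pct = 100 * polymorphic / tested
```

The recomputed percentage rounds to the 79.0% stated in the abstract, and the implied sequence length of roughly 700 Mb gives a sense of how much assembled contig sequence was surveyed.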

  20. Genome-wide association study identifies the SERPINB gene cluster as a susceptibility locus for food allergy.

    Science.gov (United States)

    Marenholz, Ingo; Grosche, Sarah; Kalb, Birgit; Rüschendorf, Franz; Blümchen, Katharina; Schlags, Rupert; Harandi, Neda; Price, Mareike; Hansen, Gesine; Seidenberg, Jürgen; Röblitz, Holger; Yürek, Songül; Tschirner, Sebastian; Hong, Xiumei; Wang, Xiaobin; Homuth, Georg; Schmidt, Carsten O; Nöthen, Markus M; Hübner, Norbert; Niggemann, Bodo; Beyer, Kirsten; Lee, Young-Ae

    2017-10-20

    Genetic factors and mechanisms underlying food allergy are largely unknown. Due to heterogeneity of symptoms a reliable diagnosis is often difficult to make. Here, we report a genome-wide association study on food allergy diagnosed by oral food challenge in 497 cases and 2387 controls. We identify five loci at genome-wide significance, the clade B serpin (SERPINB) gene cluster at 18q21.3, the cytokine gene cluster at 5q31.1, the filaggrin gene, the C11orf30/LRRC32 locus, and the human leukocyte antigen (HLA) region. Stratifying the results for the causative food demonstrates that association of the HLA locus is peanut allergy-specific whereas the other four loci increase the risk for any food allergy. Variants in the SERPINB gene cluster are associated with SERPINB10 expression in leukocytes. Moreover, SERPINB genes are highly expressed in the esophagus. All identified loci are involved in immunological regulation or epithelial barrier function, emphasizing the role of both mechanisms in food allergy.

  1. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations.

    Directory of Open Access Journals (Sweden)

    Jingjing Liang

    2017-05-01

    Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10(-8)) for either systolic and diastolic blood pressure, hypertension, or for combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4) and multiple-trait analyses identified one novel locus (FRMD3) for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension.

  2. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Science.gov (United States)

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  3. Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east Asians

    DEFF Research Database (Denmark)

    Cho, Yoon Shin; Chen, Chien-Hsiun; Hu, Cheng

    2012-01-01

    We conducted a three-stage genetic study to identify susceptibility loci for type 2 diabetes (T2D) in east Asian populations. We followed our stage 1 meta-analysis of eight T2D genome-wide association studies (6,952 cases with T2D and 11,865 controls) with a stage 2 in silico replication analysis...... (5,843 cases and 4,574 controls) and a stage 3 de novo replication analysis (12,284 cases and 13,172 controls). The combined analysis identified eight new T2D loci reaching genome-wide significance, which mapped in or near GLIS3, PEPD, FITM2-R3HDML-HNF4A, KCNK16, MAEA, GCC1-PAX4, PSMD6 and ZFAND3...

  4. Genome-wide association study identifies a single major locus contributing to survival into old age; the APOE locus revisited

    DEFF Research Database (Denmark)

    Deelen, Joris; Beekman, Marian; Uh, Hae-Won

    2011-01-01

    By studying the loci which contribute to human longevity, we aim to identify mechanisms that contribute to healthy aging. To identify such loci, we performed a genome-wide association study (GWAS) comparing 403 unrelated nonagenarians from long-living families included in the Leiden Longevity Stu...

  5. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  6. Genome-wide association analysis identifies new lung cancer susceptibility loci in never-smoking women in Asia.

    NARCIS (Netherlands)

    Lan, Q.; Hsiung, C.A.; Matsuo, K.; Hong, Y.C.; Seow, A.; Wang, Z.; Hosgood, H.D.; Chen, K.; Wang, J.C.; Chatterjee, N.; Hu, W.; Wong, M.P.; Zheng, W.; Caporaso, N.; Park, J.Y.; Chen, C.J.; Kim, Y.H.; Kim, Y.T.; Landi, M.T.; Shen, H.; Lawrence, C.; Burdett, L.; Yeager, M.; Yuenger, J.; Jacobs, K.B.; Chang, I.S.; Mitsudomi, T.; Kim, H.N.; Chang, G.C.; Bassig, B.A.; Tucker, M.; Wei, F.; Yin, Y.; Wu, C.; An, S.J.; Qian, B.; Lee, V.H.; Lu, D.; Liu, J.; Jeon, H.S.; Hsiao, C.F.; Sung, J.S.; Kim, J.H.; Gao, Y.T.; Tsai, Y.H.; Jung, Y.J.; Guo, H.; Hu, Z.; Hutchinson, A.; Wang, W.C.; Klein, R.; Chung, C.C.; Oh, I.J.; Chen, K.Y.; Berndt, S.I.; He, X.; Wu, W.; Chang, J.; Zhang, X.C.; Huang, M.S.; Zheng, H.; Wang, J.; Zhao, X.|info:eu-repo/dai/nl/413577805; Li, Y.; Choi, J.E.; Su, W.C.; Park, K.H.; Sung, S.W.; Shu, X.O.; Chen, Y.M.; Liu, L.; Kang, C.H.; Hu, L.; Chen, C.H.; Pao, W.; Kim, Y.C.; Yang, T.Y.; Xu, J.; Guan, P.; Tan, W.; Su, J.; Wang, C.L.; Li, H.; Sihoe, A.D.; Zhao, Z.|info:eu-repo/dai/nl/304120995; Chen, Y.; Choi, Y.Y.; Hung, J.Y.; Kim, J.S.; Yoon, H.I.; Cai, Q.; Lin, C.C.; Park, I.K.; Xu, P.; Dong, J.; Kim, C.; He, Q; Perng, R.P.; Kohno, T.; Kweon, S.S.; Chen, C.Y.; Vermeulen, R.|info:eu-repo/dai/nl/216532620; Wu, J.; Lim, W.Y.; Chen, K.C.; Chow, W.H.; Ji, B.T.; Chan, J.K.; Chu, M.; Li, Y.J.; Yokota, J.; Li, J.; Chen, H.; Xiang, Y.B.; Yu, C.J.; Kunitoh, H.; Wu, G.; Jin, L.; Lo, Y.L.; Shiraishi, K.; Chen, Y.H.; Lin, H.C.; Wu, T.; WU, Y.; Yang, P.C.; Zhou, B.; Shin, M.H.; Fraumeni, J.F.; Lin, D.; Chanock, S.J.; Rothman, N.

    2012-01-01

    To identify common genetic variants that contribute to lung cancer susceptibility, we conducted a multistage genome-wide association study of lung cancer in Asian women who never smoked. We scanned 5,510 never-smoking female lung cancer cases and 4,544 controls drawn from 14 studies from mainland

  7. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations

    OpenAIRE

    Liang, Jingjing; Le, Thu H.; Edwards, Digna R. Velez; Tayo, Bamidele O.; Gaulton, Kyle J.; Smith, Jennifer A.; Lu, Yingchang; Jensen, Richard A.; Chen, Guanjie; Yanek, Lisa R.; Schwander, Karen; Tajuddin, Salman M.; Sofer, Tamar; Kim, Wonji; Kayima, James

    2017-01-01

    © 2017 Public Library of Science. All Rights Reserved. Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genom...

  8. Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies

    NARCIS (Netherlands)

    C.E. Elks (Cathy); J.R.B. Perry (John); P. Sulem (Patrick); D.I. Chasman (Daniel); N. Franceschini (Nora); C. He (Chunyan); K.L. Lunetta (Kathryn); J.A. Visser (Jenny); E.M. Byrne (Enda); D.L. Cousminer (Diana); D.F. Gudbjartsson (Daniel); T. Esko (Tõnu); B. Feenstra (Bjarke); J.J. Hottenga (Jouke Jan); D.L. Koller (Daniel); Z. Kutalik (Zoltán); P. Lin (Peng); M. Mangino (Massimo); M. Marongiu (Mara); P.F. McArdle (Patrick); A.V. Smith (Albert Vernon); L. Stolk (Lisette); S. van Wingerden (Sophie); J.H. Zhao (Jing Hua); E. Albrecht (Eva); T. Corre (Tanguy); E. Ingelsson (Erik); C. Hayward (Caroline); P.K. Magnusson (Patrik); S. Ulivi (Shelia); N.M. Warrington (Nicole); L. Zgaga (Lina); H. Alavere (Helene); N. Amin (Najaf); T. Aspelund (Thor); S. Bandinelli (Stefania); I.E. Barroso (Inês); G. Berenson (Gerald); S.M. Bergmann (Sven); H. Blackburn (Hannah); E.A. Boerwinkle (Eric); J.E. Buring (Julie); F. Busonero; H. Campbell (Harry); S.J. Chanock (Stephen); W. Chen (Wei); M. Cornelis (Marilyn); D.J. Couper (David); A.D. Coviello (Andrea); P. d' Adamo (Pio); U. de Faire (Ulf); E.J.C. de Geus (Eco); P. Deloukas (Panagiotis); A. Döring (Angela); D.F. Easton (Douglas); G. Eiriksdottir (Gudny); V. Emilsson (Valur); J.G. Eriksson (Johan); L. Ferrucci (Luigi); A.R. Folsom (Aaron); T. Foroud (Tatiana); M. Garcia (Melissa); P. Gasparini (Paolo); F. Geller (Frank); C. Gieger (Christian); V. Gudnason (Vilmundur); A.S. Hall (Alistair); S.E. Hankinson (Susan); L. Ferreli (Liana); A.C. Heath (Andrew); D.G. Hernandez (Dena); A. Hofman (Albert); F.B. Hu (Frank); T. Illig (Thomas); M.R. Järvelin; A.D. Johnson (Andrew); D. Karasik (David); K-T. Khaw (Kay-Tee); D.P. Kiel (Douglas); T.O. Kilpelänen (Tuomas); I. Kolcic (Ivana); P. Kraft (Peter); L.J. Launer (Lenore); J.S.E. Laven (Joop); S. Li (Shengxu); J. Liu (Jianjun); D. Levy (Daniel); N.G. Martin (Nicholas); M. Melbye (Mads); V. Mooser (Vincent); J.C. Murray (Jeffrey); M.A. Nalls (Michael); P. Navarro (Pau); M. 
Nelis (Mari); A.R. Ness (Andrew); K. Northstone (Kate); B.A. Oostra (Ben); M. Peacock (Munro); C. Palmer (Cameron); A. Palotie (Aarno); G. Paré (Guillaume); A.N. Parker (Alex); N.L. Pedersen (Nancy); L. Peltonen (Leena Johanna); C.E. Pennell (Craig); P.D.P. Pharoah (Paul); O. Polasek (Ozren); A.S. Plump (Andrew); A. Pouta (Anneli); E. Porcu (Eleonora); T. Rafnar (Thorunn); J.P. Rice (John); S.M. Ring (Susan); F. Rivadeneira Ramirez (Fernando); I. Rudan (Igor); C. Sala (Cinzia); V. Salomaa (Veikko); S. Sanna (Serena); D. Schlessinger; N.J. Schork (Nicholas); A. Scuteri (Angelo); A.V. Segrè (Ayellet); A.R. Shuldiner (Alan); N. Soranzo (Nicole); U. Sovio (Ulla); S.R. Srinivasan (Sathanur); D.P. Strachan (David); M.L. Tammesoo; E. Tikkanen (Emmi); D. Toniolo (Daniela); K. Tsui (Kim); L. Tryggvadottir (Laufey); J.P. Tyrer (Jonathan); M. Uda (Manuela); R.M. van Dam (Rob); J.B.J. van Meurs (Joyce); P. Vollenweider (Peter); G. Waeber (Gérard); N.J. Wareham (Nick); D. Waterworth (Dawn); H.E. Wichmann (Heinz Erich); G.A.H.M. Willemsen (Gonneke); J.F. Wilson (James); A.F. Wright (Alan); L. Young (Lauren); G. Zhai (Guangju); W.V. Zhuang; L.J. Bierut (Laura); D.I. Boomsma (Dorret); H.A. Boyd (Heather); L. Crisponi (Laura); E.W. Demerath (Ellen); P. Tikka-Kleemola (Päivi); M.J. Econs (Michael); T.B. Harris (Tamara); D. Hunter (David); R.J.F. Loos (Ruth); A. Metspalu (Andres); G.W. Montgomery (Grant); P.M. Ridker (Paul); T.D. Spector (Tim); E.A. Streeten (Elizabeth); K. Stefansson (Kari); U. Thorsteinsdottir (Unnur); A.G. Uitterlinden (André); E. Widen (Elisabeth); J. Murabito (Joanne); K. Ong (Ken); M.N. Weedon (Michael)

    2010-01-01

    To identify loci for age at menarche, we performed a meta-analysis of 32 genome-wide association studies in 87,802 women of European descent, with replication in up to 14,731 women. In addition to the known loci at LIN28B (P = 5.4 × 10(-60)) and 9q31.2 (P = 2.2 × 10(-33)), we identified 30

  9. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  10. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  11. A genomic approach to identify regulatory nodes in the transcriptional network of systemic acquired resistance in plants.

    Directory of Open Access Journals (Sweden)

    Dong Wang

    2006-11-01

    Many biological processes are controlled by intricate networks of transcriptional regulators. With the development of microarray technology, transcriptional changes can be examined at the whole-genome level. However, such analysis often lacks information on the hierarchical relationship between components of a given system. Systemic acquired resistance (SAR) is an inducible plant defense response involving a cascade of transcriptional events induced by salicylic acid through the transcription cofactor NPR1. To identify additional regulatory nodes in the SAR network, we performed microarray analysis on Arabidopsis plants expressing the NPR1-GR (glucocorticoid receptor) fusion protein. Since nuclear translocation of NPR1-GR requires dexamethasone, we were able to control NPR1-dependent transcription and identify direct transcriptional targets of NPR1. We show that NPR1 directly upregulates the expression of eight WRKY transcription factor genes. This large family of 74 transcription factors has been implicated in various defense responses, but no specific WRKY factor has been placed in the SAR network. Identification of NPR1-regulated WRKY factors allowed us to perform in-depth genetic analysis on a small number of WRKY factors and test well-defined phenotypes of single and double mutants associated with NPR1. Among these WRKY factors we found both positive and negative regulators of SAR. This genomics-directed approach unambiguously positioned five WRKY factors in the complex transcriptional regulatory network of SAR. Our work not only discovered new transcription regulatory components in the signaling network of SAR but also demonstrated that functional studies of large gene families have to take into consideration sequence similarity as well as the expression patterns of the candidates.

  12. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer.

    Science.gov (United States)

    Michailidou, Kyriaki; Beesley, Jonathan; Lindstrom, Sara; Canisius, Sander; Dennis, Joe; Lush, Michael J; Maranian, Mel J; Bolla, Manjeet K; Wang, Qin; Shah, Mitul; Perkins, Barbara J; Czene, Kamila; Eriksson, Mikael; Darabi, Hatef; Brand, Judith S; Bojesen, Stig E; Nordestgaard, Børge G; Flyger, Henrik; Nielsen, Sune F; Rahman, Nazneen; Turnbull, Clare; Fletcher, Olivia; Peto, Julian; Gibson, Lorna; dos-Santos-Silva, Isabel; Chang-Claude, Jenny; Flesch-Janys, Dieter; Rudolph, Anja; Eilber, Ursula; Behrens, Sabine; Nevanlinna, Heli; Muranen, Taru A; Aittomäki, Kristiina; Blomqvist, Carl; Khan, Sofia; Aaltonen, Kirsimari; Ahsan, Habibul; Kibriya, Muhammad G; Whittemore, Alice S; John, Esther M; Malone, Kathleen E; Gammon, Marilie D; Santella, Regina M; Ursin, Giske; Makalic, Enes; Schmidt, Daniel F; Casey, Graham; Hunter, David J; Gapstur, Susan M; Gaudet, Mia M; Diver, W Ryan; Haiman, Christopher A; Schumacher, Fredrick; Henderson, Brian E; Le Marchand, Loic; Berg, Christine D; Chanock, Stephen J; Figueroa, Jonine; Hoover, Robert N; Lambrechts, Diether; Neven, Patrick; Wildiers, Hans; van Limbergen, Erik; Schmidt, Marjanka K; Broeks, Annegien; Verhoef, Senno; Cornelissen, Sten; Couch, Fergus J; Olson, Janet E; Hallberg, Emily; Vachon, Celine; Waisfisz, Quinten; Meijers-Heijboer, Hanne; Adank, Muriel A; van der Luijt, Rob B; Li, Jingmei; Liu, Jianjun; Humphreys, Keith; Kang, Daehee; Choi, Ji-Yeob; Park, Sue K; Yoo, Keun-Young; Matsuo, Keitaro; Ito, Hidemi; Iwata, Hiroji; Tajima, Kazuo; Guénel, Pascal; Truong, Thérèse; Mulot, Claire; Sanchez, Marie; Burwinkel, Barbara; Marme, Frederik; Surowy, Harald; Sohn, Christof; Wu, Anna H; Tseng, Chiu-chen; Van Den Berg, David; Stram, Daniel O; González-Neira, Anna; Benitez, Javier; Zamora, M Pilar; Perez, Jose Ignacio Arias; Shu, Xiao-Ou; Lu, Wei; Gao, Yu-Tang; Cai, Hui; Cox, Angela; Cross, Simon S; Reed, Malcolm W R; Andrulis, Irene L; Knight, Julia A; Glendon, Gord; Mulligan, Anna Marie; Sawyer, Elinor J; Tomlinson, Ian; 
Kerin, Michael J; Miller, Nicola; Lindblom, Annika; Margolin, Sara; Teo, Soo Hwang; Yip, Cheng Har; Taib, Nur Aishah Mohd; Tan, Gie-Hooi; Hooning, Maartje J; Hollestelle, Antoinette; Martens, John W M; Collée, J Margriet; Blot, William; Signorello, Lisa B; Cai, Qiuyin; Hopper, John L; Southey, Melissa C; Tsimiklis, Helen; Apicella, Carmel; Shen, Chen-Yang; Hsiung, Chia-Ni; Wu, Pei-Ei; Hou, Ming-Feng; Kristensen, Vessela N; Nord, Silje; Alnaes, Grethe I Grenaker; Giles, Graham G; Milne, Roger L; McLean, Catriona; Canzian, Federico; Trichopoulos, Dimitrios; Peeters, Petra; Lund, Eiliv; Sund, Malin; Khaw, Kay-Tee; Gunter, Marc J; Palli, Domenico; Mortensen, Lotte Maxild; Dossus, Laure; Huerta, Jose-Maria; Meindl, Alfons; Schmutzler, Rita K; Sutter, Christian; Yang, Rongxi; Muir, Kenneth; Lophatananon, Artitaya; Stewart-Brown, Sarah; Siriwanarangsan, Pornthep; Hartman, Mikael; Miao, Hui; Chia, Kee Seng; Chan, Ching Wan; Fasching, Peter A; Hein, Alexander; Beckmann, Matthias W; Haeberle, Lothar; Brenner, Hermann; Dieffenbach, Aida Karina; Arndt, Volker; Stegmaier, Christa; Ashworth, Alan; Orr, Nick; Schoemaker, Minouk J; Swerdlow, Anthony J; Brinton, Louise; Garcia-Closas, Montserrat; Zheng, Wei; Halverson, Sandra L; Shrubsole, Martha; Long, Jirong; Goldberg, Mark S; Labrèche, France; Dumont, Martine; Winqvist, Robert; Pylkäs, Katri; Jukkola-Vuorinen, Arja; Grip, Mervi; Brauch, Hiltrud; Hamann, Ute; Brüning, Thomas; Radice, Paolo; Peterlongo, Paolo; Manoukian, Siranoush; Bernard, Loris; Bogdanova, Natalia V; Dörk, Thilo; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli-Matti; Hartikainen, Jaana M; Devilee, Peter; Tollenaar, Robert A E M; Seynaeve, Caroline; Van Asperen, Christi J; Jakubowska, Anna; Lubinski, Jan; Jaworska, Katarzyna; Huzarski, Tomasz; Sangrajrang, Suleeporn; Gaborieau, Valerie; Brennan, Paul; McKay, James; Slager, Susan; Toland, Amanda E; Ambrosone, Christine B; Yannoukakos, Drakoulis; Kabisch, Maria; Torres, Diana; Neuhausen, Susan L; Anton-Culver, Hoda; 
Luccarini, Craig; Baynes, Caroline; Ahmed, Shahana; Healey, Catherine S; Tessier, Daniel C; Vincent, Daniel; Bacot, Francois; Pita, Guillermo; Alonso, M Rosario; Álvarez, Nuria; Herrero, Daniel; Simard, Jacques; Pharoah, Paul P D P; Kraft, Peter; Dunning, Alison M; Chenevix-Trench, Georgia; Hall, Per; Easton, Douglas F

    2015-04-01

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748 breast cancer cases and 18,084 controls together with 46,785 cases and 42,892 controls from 41 studies genotyped on a 211,155-marker custom array (iCOGS). Analyses were restricted to women of European ancestry. We generated genotypes for more than 11 million SNPs by imputation using the 1000 Genomes Project reference panel, and we identified 15 new loci associated with breast cancer at P association analysis with ChIP-seq chromatin binding data in mammary cell lines and ChIA-PET chromatin interaction data from ENCODE, we identified likely target genes in two regions: SETBP1 at 18q12.3 and RNF115 and PDZK1 at 1q21.1. One association appears to be driven by an amino acid substitution encoded in EXO1.
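
    The title of this record refers to "more than 120,000 individuals", a figure that can be checked against the case and control counts given in the abstract. The short sketch below simply adds the four reported numbers; the variable names are illustrative, and it assumes the GWAS and iCOGS sample sets do not overlap, which the abstract does not state explicitly.

```python
# Counts as reported in the abstract.
gwas_cases, gwas_controls = 15_748, 18_084      # meta-analysis of 11 GWAS
icogs_cases, icogs_controls = 46_785, 42_892    # 41 studies on the iCOGS array

total_cases = gwas_cases + icogs_cases
total_controls = gwas_controls + icogs_controls
total_individuals = total_cases + total_controls

print(total_cases, total_controls, total_individuals)  # 62533 60976 123509
```

    Under that assumption the combined sample is 123,509 individuals, consistent with the title.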

  13. QTL-seq approach identified genomic regions and diagnostic markers for rust and late leaf spot resistance in groundnut (Arachis hypogaea L.).

    Science.gov (United States)

    Pandey, Manish K; Khan, Aamir W; Singh, Vikas K; Vishwakarma, Manish K; Shasidhar, Yaduru; Kumar, Vinay; Garg, Vanika; Bhat, Ramesh S; Chitikineni, Annapurna; Janila, Pasupuleti; Guo, Baozhu; Varshney, Rajeev K

    2017-08-01

    Rust and late leaf spot (LLS) are the two major foliar fungal diseases in groundnut, and their co-occurrence leads to significant yield loss in addition to the deterioration of fodder quality. To identify candidate genomic regions controlling resistance to rust and LLS, whole-genome resequencing (WGRS)-based approach referred as 'QTL-seq' was deployed. A total of 231.67 Gb raw and 192.10 Gb of clean sequence data were generated through WGRS of resistant parent and the resistant and susceptible bulks for rust and LLS. Sequence analysis of bulks for rust and LLS with reference-guided resistant parent assembly identified 3136 single-nucleotide polymorphisms (SNPs) for rust and 66 SNPs for LLS with the read depth of ≥7 in the identified genomic region on pseudomolecule A03. Detailed analysis identified 30 nonsynonymous SNPs affecting 25 candidate genes for rust resistance, while 14 intronic and three synonymous SNPs affecting nine candidate genes for LLS resistance. Subsequently, allele-specific diagnostic markers were identified for three SNPs for rust resistance and one SNP for LLS resistance. Genotyping of one RIL population (TAG 24 × GPBD 4) with these four diagnostic markers revealed higher phenotypic variation for these two diseases. These results suggest usefulness of QTL-seq approach in precise and rapid identification of candidate genomic regions and development of diagnostic markers for breeding applications. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  14. Omics Approaches for Identifying Physiological Adaptations to Genome Instability in Aging

    Directory of Open Access Journals (Sweden)

    Diletta Edifizi

    2017-11-01

    DNA damage causally contributes to aging and age-related diseases. The declining functioning of tissues and organs during aging can lead to the increased risk of succumbing to aging-associated diseases. Congenital syndromes that are caused by heritable mutations in DNA repair pathways lead to cancer susceptibility and accelerated aging, thus underlining the importance of genome maintenance for withstanding aging. High-throughput mass-spectrometry-based approaches have recently contributed to identifying signalling response networks and gaining a more comprehensive understanding of the physiological adaptations occurring upon unrepaired DNA damage. The insulin-like signalling pathway has been implicated in a DNA damage response (DDR) network that includes epidermal growth factor (EGF)-, AMP-activated protein kinases (AMPK)- and the target of rapamycin (TOR)-like signalling pathways, which are known regulators of growth, metabolism, and stress responses. The same pathways, together with the autophagy-mediated proteostatic response and the decline in energy metabolism have also been found to be similarly regulated during natural aging, suggesting striking parallels in the physiological adaptation upon persistent DNA damage due to DNA repair defects and long-term low-level DNA damage accumulation occurring during natural aging. These insights will be an important starting point to study the interplay between signalling networks involved in progeroid syndromes that are caused by DNA repair deficiencies and to gain new understanding of the consequences of DNA damage in the aging process.

  15. Genome-wide association study of glioma subtypes identifies specific differences in genetic susceptibility to glioblastoma and non-glioblastoma tumors

    DEFF Research Database (Denmark)

    Melin, Beatrice S; Barnholtz-Sloan, Jill S; Wrensch, Margaret R

    2017-01-01

    Genome-wide association studies (GWAS) have transformed our understanding of glioma susceptibility, but individual studies have had limited power to identify risk loci. We performed a meta-analysis of existing GWAS and two new GWAS, which totaled 12,496 cases and 18,190 controls. We identified fi...

  16. Arthropod phylogenetics in light of three novel millipede (myriapoda: diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Directory of Open Access Journals (Sweden)

    Michael S Brewer

    BACKGROUND: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. RESULTS: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. CONCLUSIONS: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic

  17. Arthropod phylogenetics in light of three novel millipede (myriapoda: diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Science.gov (United States)

    Brewer, Michael S; Swafford, Lynn; Spruill, Chad L; Bond, Jason E

    2013-01-01

    Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect

  18. Secondary uses and the governance of de-identified data: Lessons from the human genome diversity panel

    Directory of Open Access Journals (Sweden)

    Lee Sandra S-J

    2011-09-01

    Background: Recent changes to regulatory guidance in the US and Europe have complicated oversight of secondary research by rendering most uses of de-identified data exempt from human subjects oversight. To identify the implications of such guidelines for harms to participants and communities, this paper explores the secondary uses of one de-identified DNA sample collection with limited oversight: the Human Genome Diversity Project (HGDP)-Centre d'Etude du Polymorphisme Humain, Fondation Jean Dausset (CEPH) Human Genome Diversity Panel. Methods: Using a combination of keyword and cited reference search, we identified English-language scientific articles published between 2002 and 2009 that reported analysis of HGDP Diversity Panel samples and/or data. We then reviewed each article to identify the specific research use to which the samples and/or data was applied. Secondary uses were categorized according to the type and kind of research supported by the collection. Results: A wide variety of secondary uses were identified from 148 peer-reviewed articles. While the vast majority of these uses were consistent with the original intent of the collection, a minority of published reports described research whose primary findings could be regarded as controversial, objectionable, or potentially stigmatizing in their interpretation. Conclusions: We conclude that potential risks to participants and communities cannot be wholly eliminated by anonymization of individual data and suggest that explicit review of proposed secondary uses, by a Data Access Committee or similar internal oversight body with suitable stakeholder representation, should be a required component of the trustworthy governance of any repository of data or specimens.

  19. Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels

    DEFF Research Database (Denmark)

    Kilpeläinen, Tuomas O; Carli, Jayne F Martin; Skowronski, Alicja A

    2016-01-01

    . Therefore, we performed a genome-wide association study (GWAS) of circulating leptin levels from 32,161 individuals and followed up loci reaching PFTO....... Although the association of the FTO obesity locus with leptin levels is abolished by adjustment for BMI, associations of the four other loci are independent of adiposity. The GCKR locus was found associated with multiple metabolic traits in previous GWAS and the CCNL1 locus with birth weight. Knockdown...

  20. Genome-wide association study identifies variants in HORMAD2 associated with tonsillectomy

    DEFF Research Database (Denmark)

    Feenstra, Bjarke; Bager, Peter; Liu, Xueping

    2017-01-01

    BACKGROUND: Inflammation of the tonsils is a normal response to infection, but some individuals experience recurrent, severe tonsillitis and massive hypertrophy of the tonsils in which case surgical removal of the tonsils may be considered. OBJECTIVE: To identify common genetic variants associated with tonsillectomy. METHODS: We used tonsillectomy information from Danish health registers and carried out a genome-wide association study comprising 1464 patients and 12 019 controls of Northwestern European ancestry, with replication in an independent sample set of 1575 patients and 1367 controls. RESULTS... the molecular mechanisms underlying the genetic association involve general lymphoid hyper-reaction throughout the mucosa-associated lymphoid tissue system.

  1. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Recently there was great expansion of genomics resources about grapevine genome, thus providing increasing efforts for molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) on plant genomes, few data are available on copy number variation (CNV). Furthermore association between structural variations and phenotypes has been described in only a few cases. We combined high throughput biotechnologies and bioinformatics tools, to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected roughly 8% of the grapevine genome affected by genomic variations. Taken into account phenotypic differences existing among the studied varieties we performed comparison of SVs among them and the reference and next we performed an in-depth analysis of gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  2. Genome-wide association analysis of young onset stroke identifies a locus on chromosome 10q25 near HABP2

    Science.gov (United States)

    Cheng, Yu-Ching; Stanne, Tara M.; Giese, Anne-Katrin; Ho, Weang Kee; Traylor, Matthew; Amouyel, Philippe; Holliday, Elizabeth G.; Malik, Rainer; Xu, Huichun; Kittner, Steven J.; Cole, John W.; O’Connell, Jeffrey R.; Danesh, John; Rasheed, Asif; Zhao, Wei; Engelter, Stefan; Grond-Ginsbach, Caspar; Kamatani, Yoichiro; Lathrop, Mark; Leys, Didier; Thijs, Vincent; Metso, Tiina M.; Tatlisumak, Turgut; Pezzini, Alessandro; Parati, Eugenio A.; Norrving, Bo; Bevan, Steve; Rothwell, Peter M; Sudlow, Cathie; Slowik, Agnieszka; Lindgren, Arne; Walters, Matthew R; Jannes, Jim; Shen, Jess; Crosslin, David; Doheny, Kimberly; Laurie, Cathy C.; Kanse, Sandip M.; Bis, Joshua C.; Fornage, Myriam; Mosley, Thomas H.; Hopewell, Jemma C.; Strauch, Konstantin; Müller-Nurasyid, Martina; Gieger, Christian; Waldenberger, Melanie; Peters, Annette; Meisinger, Christine; Ikram, M. Arfan; Longstreth, WT; Meschia, James F.; Seshadri, Sudha; Sharma, Pankaj; Worrall, Bradford; Jern, Christina; Levi, Christopher; Dichgans, Martin; Boncoraglio, Giorgio B.; Markus, Hugh S.; Debette, Stephanie; Rolfs, Arndt; Saleheen, Danish; Mitchell, Braxton D.

    2015-01-01

    Background and Purpose Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a two-stage meta-analysis of genome-wide association studies (GWAS), focusing on stroke cases with an age of onset genetic variants at loci with association Pstroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the Discovery and Follow-up Stages (rs11196288, OR=1.41, P=9.5×10−9). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that two SNPs in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2. Conclusions HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke. PMID:26732560

  3. Genome-Wide Association Analysis of Young-Onset Stroke Identifies a Locus on Chromosome 10q25 Near HABP2.

    Science.gov (United States)

    Cheng, Yu-Ching; Stanne, Tara M; Giese, Anne-Katrin; Ho, Weang Kee; Traylor, Matthew; Amouyel, Philippe; Holliday, Elizabeth G; Malik, Rainer; Xu, Huichun; Kittner, Steven J; Cole, John W; O'Connell, Jeffrey R; Danesh, John; Rasheed, Asif; Zhao, Wei; Engelter, Stefan; Grond-Ginsbach, Caspar; Kamatani, Yoichiro; Lathrop, Mark; Leys, Didier; Thijs, Vincent; Metso, Tiina M; Tatlisumak, Turgut; Pezzini, Alessandro; Parati, Eugenio A; Norrving, Bo; Bevan, Steve; Rothwell, Peter M; Sudlow, Cathie; Slowik, Agnieszka; Lindgren, Arne; Walters, Matthew R; Jannes, Jim; Shen, Jess; Crosslin, David; Doheny, Kimberly; Laurie, Cathy C; Kanse, Sandip M; Bis, Joshua C; Fornage, Myriam; Mosley, Thomas H; Hopewell, Jemma C; Strauch, Konstantin; Müller-Nurasyid, Martina; Gieger, Christian; Waldenberger, Melanie; Peters, Annette; Meisinger, Christine; Ikram, M Arfan; Longstreth, W T; Meschia, James F; Seshadri, Sudha; Sharma, Pankaj; Worrall, Bradford; Jern, Christina; Levi, Christopher; Dichgans, Martin; Boncoraglio, Giorgio B; Markus, Hugh S; Debette, Stephanie; Rolfs, Arndt; Saleheen, Danish; Mitchell, Braxton D

    2016-02-01

    Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a 2-stage meta-analysis of genome-wide association studies, focusing on stroke cases with an age of onset genetic variants at loci with association Pstroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the discovery and follow-up stages (rs11196288; odds ratio =1.41; P=9.5×10(-9)). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that 2 single nucleotide polymorphisms in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2. HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke. © 2016 American Heart Association, Inc.

  4. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Science.gov (United States)

    Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

    2015-01-01

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402
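
    The abstract describes a four level project classification and reports counts for four kinds of entity: studies, Biosamples, sequencing projects and analysis projects. A rough way to picture such a hierarchy is a set of nested records, sketched below. This is an illustration only: the class names, fields and strict parent-child nesting are assumptions made for the example and are not taken from the GOLD schema, which may relate these entities differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnalysisProject:          # hypothetical level 4
    name: str


@dataclass
class SequencingProject:        # hypothetical level 3
    name: str
    analyses: List[AnalysisProject] = field(default_factory=list)


@dataclass
class Biosample:                # hypothetical level 2
    name: str
    sequencing_projects: List[SequencingProject] = field(default_factory=list)


@dataclass
class Study:                    # hypothetical level 1
    name: str
    biosamples: List[Biosample] = field(default_factory=list)


def count_levels(studies: List[Study]) -> Dict[str, int]:
    """Count the entities at each of the four assumed levels."""
    counts = {"studies": 0, "biosamples": 0,
              "sequencing_projects": 0, "analysis_projects": 0}
    for study in studies:
        counts["studies"] += 1
        for sample in study.biosamples:
            counts["biosamples"] += 1
            for project in sample.sequencing_projects:
                counts["sequencing_projects"] += 1
                counts["analysis_projects"] += len(project.analyses)
    return counts


# Made-up example: one study, one sample, two sequencing runs, one analysis.
demo = Study("soil survey", [
    Biosample("plot A", [
        SequencingProject("run 1", [AnalysisProject("assembly")]),
        SequencingProject("run 2"),
    ]),
])

print(count_levels([demo]))
```

    Summaries such as the per-level totals quoted in the abstract would correspond to a traversal like `count_levels` applied to the whole catalog.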

  5. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, Tatiparthi B. K. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Thomas, Alex D. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Stamatis, Dimitri [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Bertsch, Jon [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Isbandi, Michelle [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Jansson, Jakob [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Mallajosyula, Jyothi [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Pagani, Ioanna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lobos, Elizabeth A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); King Abdulaziz Univ., Jeddah (Saudi Arabia)

    2014-10-27

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.

  6. Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes

    DEFF Research Database (Denmark)

    Imamura, Minako; Takahashi, Atsushi; Yamauchi, Toshimasa

    2016-01-01

    Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery and ...

  7. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  8. Identifying neuropeptide and protein hormone receptors in Drosophila melanogaster by exploiting genomic data

    DEFF Research Database (Denmark)

    Hauser, Frank; Williamson, Michael; Cazzamali, Giuseppe

    2006-01-01

    insect genome, that of the fruitfly Drosophila melanogaster, was sequenced in 2000, and about 200 GPCRs have been annotated in this model insect. About 50 of these receptors were predicted to have neuropeptides or protein hormones as their ligands. Since 2000, the cDNAs of most of these candidate...... receptors have been cloned and for many receptors the endogenous ligand has been identified. In this review, we will give an update about the current knowledge of all Drosophila neuropeptide and protein hormone receptors, and discuss their phylogenetic relationships. Publication date: 2006-Feb...

  9. A human genome-wide loss-of-function screen identifies effective chikungunya antiviral drugs.

    Science.gov (United States)

    Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F; Lecuit, Marc

    2016-05-12

    Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents.

  10. Enriched pathways for major depressive disorder identified from a genome-wide association study.

    Science.gov (United States)

    Kao, Chung-Feng; Jia, Peilin; Zhao, Zhongming; Kuo, Po-Hsiu

    2012-11-01

    Major depressive disorder (MDD) has caused a substantial burden of disease worldwide with moderate heritability. Despite efforts through conducting numerous association studies and now, genome-wide association (GWA) studies, the success of identifying susceptibility loci for MDD has been limited, which is partially attributed to the complex nature of depression pathogenesis. A pathway-based analytic strategy to investigate the joint effects of various genes within specific biological pathways has emerged as a powerful tool for complex traits. The present study aimed to identify enriched pathways for depression using a GWA dataset for MDD. For each gene, we estimated its gene-wise p value using combined and minimum p value, separately. Canonical pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and BioCarta were used. We employed four pathway-based analytic approaches (gene set enrichment analysis, hypergeometric test, sum-square statistic, sum-statistic). We adjusted for multiple testing using Benjamini & Hochberg's method to report significant pathways. We found 17 significantly enriched pathways for depression, which presented low-to-intermediate crosstalk. The top four pathways were long-term depression (p ≤ 1×10^-5), calcium signalling (p ≤ 6×10^-5), arrhythmogenic right ventricular cardiomyopathy (p ≤ 1.6×10^-4) and cell adhesion molecules (p ≤ 2.2×10^-4). In conclusion, our comprehensive pathway analyses identified promising pathways for depression that are related to neurotransmitter and neuronal systems, immune system and inflammatory response, which may be involved in the pathophysiological mechanisms underlying depression. We demonstrated that pathway enrichment analysis is promising to facilitate our understanding of complex traits through a deeper interpretation of GWA data. Application of this comprehensive analytic strategy in upcoming GWA data for depression could validate the findings reported in this study.
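
    The multiple-testing step named in this abstract (Benjamini & Hochberg's method) can be illustrated with a minimal sketch. The pathway names and p-values below are hypothetical placeholders, not data from the study, and the function is a generic textbook version of the procedure rather than the authors' own pipeline.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical pathway-level p-values (illustration only).
pathway_p = {
    "pathway_A": 0.001,
    "pathway_B": 0.008,
    "pathway_C": 0.039,
    "pathway_D": 0.041,
    "pathway_E": 0.60,
}
names = list(pathway_p)
q_values = benjamini_hochberg([pathway_p[n] for n in names])
# Pathways kept at a false discovery rate of 5%.
significant = [n for n, q in zip(names, q_values) if q <= 0.05]
print(significant)
```

    Adjusting from the largest p-value downward keeps the adjusted values non-decreasing in rank, which is what lets a single cutoff on the adjusted values reproduce the usual step-up rule.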

  11. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    Directory of Open Access Journals (Sweden)

    Allen Eric E

    2008-10-01

    Full Text Available Abstract Background: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not. Description: The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results organized into a web-searchable relational database, called the DarkHorse HGT Candidate Resource (http://darkhorse.ucsd.edu). Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether or not sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence. Conclusion: The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and

  12. Novel candidate genes and regions for childhood apraxia of speech identified by array comparative genomic hybridization.

    Science.gov (United States)

    Laffin, Jennifer J S; Raca, Gordana; Jackson, Craig A; Strand, Edythe A; Jakielski, Kathy J; Shriberg, Lawrence D

    2012-11-01

    The goal of this study was to identify new candidate genes and genomic copy-number variations associated with a rare, severe, and persistent speech disorder termed childhood apraxia of speech. Childhood apraxia of speech is the speech disorder segregating with a mutation in FOXP2 in a multigenerational London pedigree widely studied for its role in the development of speech-language in humans. A total of 24 participants who were suspected to have childhood apraxia of speech were assessed using a comprehensive protocol that samples speech in challenging contexts. All participants met clinical-research criteria for childhood apraxia of speech. Array comparative genomic hybridization analyses were completed using a customized 385K Nimblegen array (Roche Nimblegen, Madison, WI) with increased coverage of genes and regions previously associated with childhood apraxia of speech. A total of 16 copy-number variations with potential consequences for speech-language development were detected in 12 or half of the 24 participants. The copy-number variations occurred on 10 chromosomes, 3 of which had two to four candidate regions. Several participants were identified with copy-number variations in two to three regions. In addition, one participant had a heterozygous FOXP2 mutation and a copy-number variation on chromosome 2, and one participant had a 16p11.2 microdeletion and copy-number variations on chromosomes 13 and 14. Findings support the likelihood of heterogeneous genomic pathways associated with childhood apraxia of speech.

  13. Evaluation of multiple approaches to identify genome-wide polymorphisms in closely related genotypes of sweet cherry (Prunus avium L.)

    Directory of Open Access Journals (Sweden)

    Seanna Hewitt

    Full Text Available Identification of genetic polymorphisms and subsequent development of molecular markers is important for marker-assisted breeding of superior cultivars of economically important species. Sweet cherry (Prunus avium L.) is an economically important non-climacteric tree fruit crop in the Rosaceae family and has undergone a genetic bottleneck due to breeding, resulting in limited genetic diversity in the germplasm that is utilized for breeding new cultivars. Therefore, it is critical to recognize the best platforms for identifying genome-wide polymorphisms that can help identify, and consequently preserve, the diversity in a genetically constrained species. To identify polymorphisms in five closely related genotypes of sweet cherry, four approaches were evaluated: a gel-based approach (TRAP), reduced representation sequencing (TRAPseq), a 6k cherry SNParray, and whole genome sequencing (WGS). All platforms facilitated detection of polymorphisms among the genotypes with variable efficiency. In assessing multiple SNP detection platforms, this study has demonstrated that a combination of appropriate approaches is necessary for efficient polymorphism identification, especially between closely related cultivars of a species. The information generated in this study provides a valuable resource for future genetic and genomic studies in sweet cherry, and the insights gained from the evaluation of multiple approaches can be utilized for other closely related species with limited genetic diversity in the breeding germplasm. Keywords: Polymorphisms, Prunus avium, Next-generation sequencing, Target region amplification polymorphism (TRAP), Genetic diversity, SNParray, Reduced representation sequencing, Whole genome sequencing (WGS)

  14. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins

    Science.gov (United States)

    Champeimont, Raphaël; Laine, Elodie; Hu, Shuang-Wei; Penin, Francois; Carbone, Alessandra

    2016-05-01

    A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.

  15. The genome sequence of the emerging common midwife toad virus identifies an evolutionary intermediate within ranaviruses.

    Science.gov (United States)

    Mavian, Carla; López-Bueno, Alberto; Balseiro, Ana; Casais, Rosa; Alcamí, Antonio; Alejo, Alí

    2012-04-01

    Worldwide amphibian population declines have been ascribed to global warming, increasing pollution levels, and other factors directly related to human activities. These factors may additionally be favoring the emergence of novel pathogens. In this report, we have determined the complete genome sequence of the emerging common midwife toad ranavirus (CMTV), which has caused fatal disease in several amphibian species across Europe. Phylogenetic and gene content analyses of the first complete genomic sequence from a ranavirus isolated in Europe show that CMTV is an amphibian-like ranavirus (ALRV). However, the CMTV genome structure is novel and represents an intermediate evolutionary stage between the two previously described ALRV groups. We find that CMTV clusters with several other ranaviruses isolated from different hosts and locations which might also be included in this novel ranavirus group. This work sheds light on the phylogenetic relationships within this complex group of emerging, disease-causing viruses.

  16. Genome-wide association study identifies a maternal copy-number deletion in PSG11 enriched among preeclampsia patients

    Directory of Open Access Journals (Sweden)

    Zhao Linlu

    2012-06-01

    Full Text Available Abstract Background: Specific genetic contributions for preeclampsia (PE) are currently unknown. This genome-wide association study (GWAS) aims to identify maternal single nucleotide polymorphisms (SNPs) and copy-number variants (CNVs) involved in the etiology of PE. Methods: A genome-wide scan was performed on 177 PE cases (diagnosed according to National Heart, Lung and Blood Institute guidelines) and 116 normotensive controls. White female study subjects from Iowa were genotyped on Affymetrix SNP 6.0 microarrays. CNV calls made using a combination of four detection algorithms (Birdseye, Canary, PennCNV, and QuantiSNP) were merged using CNVision and screened with stringent prioritization criteria. Due to limited DNA quantities and the deleterious nature of copy-number deletions, it was decided a priori that only deletions would be selected for assay on the entire case-control dataset using quantitative real-time PCR. Results: The top four SNP candidates had an allelic or genotypic p-value between 10^-5 and 10^-6; however, none surpassed the Bonferroni-corrected significance threshold. Three recurrent rare deletions meeting prioritization criteria detected in multiple cases were selected for targeted genotyping. A locus of particular interest was found showing an enrichment of case deletions in 19q13.31 (5/169 cases and 1/114 controls), which encompasses the PSG11 gene contiguous to a highly plastic genomic region. All algorithm calls for these regions were assay confirmed. Conclusions: CNVs may confer risk for PE and represent interesting regions that warrant further investigation. Top SNP candidates identified from the GWAS, although not genome-wide significant, may be useful to inform future studies in PE genetics.
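
    The comparison against a Bonferroni-corrected threshold mentioned in the Results can be sketched as follows. The number of SNPs tested is not given in this abstract, so `N_SNPS_TESTED` is a hypothetical placeholder, and the p-value is simply the lower end of the range quoted above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical number of SNPs tested; the abstract does not state it.
N_SNPS_TESTED = 900_000
threshold = bonferroni_threshold(0.05, N_SNPS_TESTED)

# Smallest p-value in the range quoted in the abstract (10^-6).
best_p = 1e-6
passes = best_p < threshold
print(f"threshold={threshold:.2e}, passes={passes}")
```

    With a test count of this order the threshold falls well below 10^-6, which is consistent with the statement that none of the top SNP candidates reached corrected significance.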

  17. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.
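
    For readers unfamiliar with the summary statistic discussed here, the following minimal sketch computes an empirical SFS and its folded version from a toy haplotype matrix. The paper itself works with the expected SFS under a demographic model; the data layout (rows are sampled sequences, 0 = ancestral, 1 = derived) and the function names are assumptions made for illustration.

```python
from collections import Counter


def unfolded_sfs(haplotypes):
    """Empirical SFS: entry i-1 counts sites where i of n sequences carry the derived allele."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Derived-allele count at every site.
    counts = Counter(sum(h[s] for h in haplotypes) for s in range(n_sites))
    # Only segregating sites (1 <= i <= n-1) enter the spectrum.
    return [counts.get(i, 0) for i in range(1, n)]


def folded_sfs(sfs):
    """Fold the spectrum by minor-allele count, for use when the ancestral state is unknown."""
    n = len(sfs) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i
        # Classes i and n-i are indistinguishable without polarisation.
        folded.append(sfs[i - 1] if i == j else sfs[i - 1] + sfs[j - 1])
    return folded


# Toy sample: 4 sequences, 6 sites (hypothetical data).
sample = [
    [1, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 1],
]
sfs = unfolded_sfs(sample)
print(sfs, folded_sfs(sfs))
```

    The last site is fixed for the derived allele in this toy sample and therefore does not contribute to either spectrum.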

  18. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Twenty-six Salmonella enterica serovar Eko isolates from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed, and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  19. Genes Important for Schizosaccharomyces pombe Meiosis Identified Through a Functional Genomics Screen

    Science.gov (United States)

    Blyth, Julie; Makrantoni, Vasso; Barton, Rachael E.; Spanos, Christos; Rappsilber, Juri; Marston, Adele L.

    2018-01-01

    Meiosis is a specialized cell division that generates gametes, such as eggs and sperm. Errors in meiosis result in miscarriages and are the leading cause of birth defects; however, the molecular origins of these defects remain unknown. Studies in model organisms are beginning to identify the genes and pathways important for meiosis, but the parts list is still poorly defined. Here we present a comprehensive catalog of genes important for meiosis in the fission yeast, Schizosaccharomyces pombe. Our genome-wide functional screen surveyed all nonessential genes for roles in chromosome segregation and spore formation. Novel genes important at distinct stages of the meiotic chromosome segregation and differentiation program were identified. Preliminary characterization implicated three of these genes in centrosome/spindle pole body, centromere, and cohesion function. Our findings represent a near-complete parts list of genes important for meiosis in fission yeast, providing a valuable resource to advance our molecular understanding of meiosis. PMID:29259000

  20. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture

    NARCIS (Netherlands)

    S.I. Berndt (Sonja); S. Gustafsson (Stefan); R. Mägi (Reedik); A. Ganna (Andrea); E. Wheeler (Eleanor); M.F. Feitosa (Mary Furlan); A.E. Justice (Anne); K.L. Monda (Keri); D.C. Croteau-Chonka (Damien); F.R. Day (Felix); T. Esko (Tõnu); M. Fall (Magnus); T. Ferreira (Teresa); D. Gentilini (Davide); A.U. Jackson (Anne); J. Luan; J.C. Randall (Joshua); S. Vedantam (Sailaja); C.J. Willer (Cristen); T.W. Winkler (Thomas); A.R. Wood (Andrew); T. Workalemahu (Tsegaselassie); Y.-J. Hu (Yi-Juan); S.H. Lee (Sang Hong); L. Liang (Liming); D.Y. Lin (Dan); J. Min (Josine); B.M. Neale (Benjamin); G. Thorleifsson (Gudmar); J. Yang (Jian); E. Albrecht (Eva); N. Amin (Najaf); J.L. Bragg-Gresham (Jennifer L.); G. Cadby (Gemma); M. den Heijer (Martin); N. Eklund (Niina); K. Fischer (Krista); A. Goel (Anuj); J.J. Hottenga (Jouke Jan); J.E. Huffman (Jennifer); I. Jarick (Ivonne); A. Johansson (Åsa); T. Johnson (Toby); S. Kanoni (Stavroula); M.E. Kleber (Marcus); I.R. König (Inke); K. Kristiansson (Kati); Z. Kutalik (Zoltán); C. Lamina (Claudia); C. Lecoeur (Cécile); G. Li (Guo); M. Mangino (Massimo); W.L. McArdle (Wendy); M.C. Medina-Gomez (Carolina); M. Müller-Nurasyid (Martina); J.S. Ngwa; I.M. Nolte (Ilja); L. Paternoster (Lavinia); S. Pechlivanis (Sonali); M. Perola (Markus); M.J. Peters (Marjolein); M. Preuss (Michael); L.M. Rose (Lynda); J. Shi (Jianxin); D. Shungin (Dmitry); G.D. Smith; R.J. Strawbridge (Rona); I. Surakka (Ida); A. Teumer (Alexander); M.D. Trip (Mieke); J.P. Tyrer (Jonathan); J.V. van Vliet-Ostaptchouk (Jana); L. Vandenput (Liesbeth); L. Waite (Lindsay); J.H. Zhao (Jing Hua); D. Absher (Devin); F.W. Asselbergs (Folkert); M. Atalay (Mustafa); A.P. Attwood (Antony); A.J. Balmforth (Anthony); D.C.G. Basart (Dick); J.P. Beilby (John); L.L. Bonnycastle (Lori); P. Brambilla (Paolo); M. Bruinenberg (M.); H. Campbell (Harry); D.I. Chasman (Daniel); P.S. Chines (Peter); F.S. Collins (Francis); J. Connell (John); W. O Cookson (William); U. de Faire (Ulf); F. 
de Vegt (Femmie); M. Dei (Mariano); M. Dimitriou (Maria); T. Edkins (Ted); K. Estrada Gil (Karol); D.M. Evans (David); M. Farrall (Martin); F. Ferrario (Franco); J. Ferrières (Jean); L. Franke (Lude); F. Frau (Francesca); P.V. Gejman (Pablo); H. Grallert (Harald); H. Grönberg (Henrik); V. Gudnason (Vilmundur); A. Hall (Anne); A.S. Hall (Alistair); A.L. Hartikainen; C. Hayward (Caroline); N.L. Heard-Costa (Nancy); A.C. Heath (Andrew); J. Hebebrand (Johannes); G. Homuth (Georg); F.B. Hu (Frank); S.E. Hunt (Sarah); E. Hyppönen (Elina); C. Iribarren (Carlos); K.B. Jacobs (Kevin); J.-O. Jansson (John-Olov); A. Jula (Antti); M. Kähönen (Mika); S. Kathiresan (Sekar); F. Kee (F.); K-T. Khaw (Kay-Tee); M. Kivimaki (Mika); W. Koenig (Wolfgang); A. Kraja (Aldi); M. Kumari (Meena); K. Kuulasmaa (Kari); J. Kuusisto (Johanna); J. Laitinen (Jaana); T.A. Lakka (Timo); C. Langenberg (Claudia); L.J. Launer (Lenore); L. Lind (Lars); J. Lindstrom (Jaana); J. Liu (Jianjun); A. Liuzzi (Antonio); M.L. Lokki; M. Lorentzon (Mattias); P.A. Madden (Pamela); P.K. Magnusson (Patrik); P. Manunta (Paolo); D. Marek (Diana); W. März (Winfried); I.M. Leach (Irene Mateo); B. McKnight (Barbara); S.E. Medland (Sarah Elizabeth); E. Mihailov (Evelin); L. Milani (Lili); G.W. Montgomery (Grant); V. Mooser (Vincent); T.W. Mühleisen (Thomas); P. Munroe (Patricia); A.W. Musk (Arthur); N. Narisu (Narisu); G. Navis (Gerjan); G. Nicholson (Ggeorge); C. Nohr (Christian); K. Ong (Ken); B.A. Oostra (Ben); C.N.A. Palmer (Colin); A. Palotie (Aarno); J. Peden (John); N. Pedersen; A. Peters (Annette); O. Polasek (Ozren); A. Pouta (Anneli); P.P. Pramstaller (Peter Paul); I. Prokopenko (Inga); C. Pütter (Carolin); A. Radhakrishnan (Aparna); O. Raitakari (Olli); A. Rendon (Augusto); F. Rivadeneira Ramirez (Fernando); I. Rudan (Igor); T. Saaristo (Timo); J.G. Sambrook (Jennifer); A.R. Sanders (Alan); S. Sanna (Serena); J. Saramies (Jouko); S. Schipf (Sabine); S. Schreiber (Stefan); H. Schunkert (Heribert); S.-Y. Shin; S. 
Signorini (Stefano); J. Sinisalo (Juha); B. Skrobek (Boris); N. Soranzo (Nicole); A. Stancáková (Alena); K. Stark (Klaus); J. Stephens (Jonathan); K. Stirrups (Kathy); R.P. Stolk (Ronald); M. Stumvoll (Michael); A.J. Swift (Amy); E.V. Theodoraki (Eirini); B. Thorand (Barbara); D.-A. Tregouet (David-Alexandre); E. Tremoli (Elena); M.M. van der Klauw (Melanie); J.B.J. van Meurs (Joyce); S.H.H.M. Vermeulen (Sita); J. Viikari (Jorma); J. Virtamo (Jarmo); V. Vitart (Veronique); G. Waeber (Gérard); Z. Wang (Zhaoming); E. Widen (Elisabeth); S.H. Wild (Sarah); G.A.H.M. Willemsen (Gonneke); B. Winkelmann; J.C.M. Witteman (Jacqueline); B.H.R. Wolffenbuttel (Bruce); A. Wong (Andrew); A.F. Wright (Alan); M.C. Zillikens (Carola); P. Amouyel (Philippe); B.O. Boehm (Bernhard); E.A. Boerwinkle (Eric); D.I. Boomsma (Dorret); M. Caulfield (Mark); S.J. Chanock (Stephen); L.A. Cupples (Adrienne); D. Cusi (Daniele); G.V. Dedoussis (George); J. Erdmann (Jeanette); J.G. Eriksson (Johan); P.W. Franks (Paul); P. Froguel (Philippe); C. Gieger (Christian); U. Gyllensten (Ulf); A. Hamsten (Anders); T.B. Harris (Tamara); C. Hengstenberg (Christian); A.A. Hicks (Andrew); A. Hingorani (Aroon); A. Hinney (Anke); A. Hofman (Albert); G.K. Hovingh (Kees); K. Hveem (Kristian); T. Illig (Thomas); M.-R. Jarvelin (Marjo-Riitta); K.-H. Jöckel (Karl-Heinz); S. Keinanen-Kiukaanniemi (Sirkka); L.A.L.M. Kiemeney (Bart); D. Kuh (Diana); M. Laakso (Markku); T. Lehtimäki (Terho); D.F. Levinson (Douglas); N.G. Martin (Nicholas); A. Metspalu (Andres); A.D. Morris (Andrew); M.S. Nieminen (Markku); I. Njølstad (Inger); C. Ohlsson (Claes); A.J. Oldehinkel (Albertine); W.H. Ouwehand (Willem); C. Palmer (Cameron); B.W.J.H. Penninx (Brenda); C. Power (Christopher); M.A. Province (Mike); B.M. Psaty (Bruce); L. Qi (Lu); R. Rauramaa (Rainer); P.M. Ridker (Paul); S. Ripatti (Samuli); V. Salomaa (Veikko); N.J. Samani (Nilesh); H. Snieder (Harold); H.G. Sorensen; T.D. Spector (Timothy); J-A. Zwart (John-Anker); A. 
Tönjes (Anke); J. Tuomilehto (Jaakko); A.G. Uitterlinden (André); M. Uusitupa (Matti); P. van der Harst (Pim); P. Vollenweider (Peter); H. Wallaschofski (Henri); N.J. Wareham (Nick); H. Watkins (Hugh); H.E. Wichmann (Heinz Erich); J.F. Wilson (James F); G.R. Abecasis (Gonçalo); T.L. Assimes (Themistocles); I.E. Barroso (Inês); M. Boehnke (Michael); I.B. Borecki (Ingrid); P. Deloukas (Panagiotis); C. Fox (Craig); T.M. Frayling (Timothy); L. Groop (Leif); T. Haritunian (Talin); I.M. Heid (Iris); D. Hunter (David); R.C. Kaplan (Robert); F. Karpe (Fredrik); M.F. Moffatt (Miriam); K.L. Mohlke (Karen); J.R. O´Connell; Y. Pawitan (Yudi); E.E. Schadt (Eric); D. Schlessinger (David); V. Steinthorsdottir (Valgerdur); D.P. Strachan (David); U. Thorsteinsdottir (Unnur); C.M. van Duijn (Cornelia); P.M. Visscher (Peter); A.M. Di Blasio (Anna Maria); J.N. Hirschhorn (Joel); C.M. Lindgren (Cecilia); A.D. Morris (Andrew); D. Meyre (David); A. Scherag (Andre); M.I. McCarthy (Mark); E.K. Speliotes (Elizabeth); K.E. North (Kari); R.J.F. Loos (Ruth); E. Ingelsson (Erik)

    2013-01-01

    Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of

  1. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    Science.gov (United States)

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

  2. Whole-genome transcription and DNA methylation analysis of peripheral blood mononuclear cells identified aberrant gene regulation pathways in systemic lupus erythematosus.

    Science.gov (United States)

    Zhu, Honglin; Mi, Wentao; Luo, Hui; Chen, Tao; Liu, Shengxi; Raman, Indu; Zuo, Xiaoxia; Li, Quan-Zhen

    2016-07-13

    Recent achievements in genetics and epigenetics have led to the exploration of the pathogenesis of systemic lupus erythematosus (SLE). Identification of differentially expressed genes and their regulatory mechanism(s) at whole-genome level will provide a comprehensive understanding of the development of SLE and its devastating complication, lupus nephritis (LN). We performed whole-genome transcription and DNA methylation analysis in PBMC of 30 SLE patients, including 15 with LN (SLE LN(+)) and 15 without LN (SLE LN(-)), and 25 normal controls (NC) using HumanHT-12 Beadchips and Illumina Human Methy450 chips. The serum proinflammatory cytokines were quantified using Bio-plex Human Cytokine 27-plex assay. Differentially expressed genes and differentially methylated CpG were analyzed with GenomeStudio, R, and SAM software. The association between DNA methylation and gene expression was tested. Gene interaction pathways of the differentially expressed genes were analyzed by IPA software. We identified 552 upregulated genes and 550 downregulated genes in PBMC of SLE. Integration of DNA methylation and gene expression profiling showed that 334 upregulated genes were hypomethylated, and 479 downregulated genes were hypermethylated. Pathway analysis on the differential genes in SLE revealed significant enrichment in interferon (IFN) signaling and toll-like receptor (TLR) signaling pathways. Nine IFN- and seven TLR-related genes were identified and displayed step-wise increase in SLE LN(-) and SLE LN(+). Hypomethylated CpG sites were detected on these genes. The gene expressions for MX1, GPR84, and E2F2 were increased in SLE LN(+) as compared to SLE LN(-) patients. The serum levels of inflammatory cytokines, including IL17A, IP-10, bFGF, TNF-α, IL-6, IL-15, GM-CSF, IL-1RA, IL-5, and IL-12p70, were significantly elevated in SLE compared with NC. The levels of IL-15 and IL-1RA correlated with their mRNA expression. 
The upregulation of IL-15 may be regulated by hypomethylated

  3. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...

  4. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture

    NARCIS (Netherlands)

    Berndt, Sonja I; Gustafsson, Stefan; Mägi, Reedik; Ganna, Andrea; Wheeler, Eleanor; Feitosa, Mary F; Justice, Anne E; Monda, Keri L; Croteau-Chonka, Damien C; Day, Felix R; Esko, Tõnu; Fall, Tove; Ferreira, Teresa; Gentilini, Davide; Jackson, Anne U; Luan, Jian'an; Randall, Joshua C; Vedantam, Sailaja; Willer, Cristen J; Winkler, Thomas W; Wood, Andrew R; Workalemahu, Tsegaselassie; Hu, Yi-Juan; Lee, Sang Hong; Liang, Liming; Lin, Dan-Yu; Min, Josine L; Neale, Benjamin M; Thorleifsson, Gudmar; Yang, Jian; Albrecht, Eva; Amin, Najaf; Bragg-Gresham, Jennifer L; Cadby, Gemma; den Heijer, Martin; Eklund, Niina; Fischer, Krista; Goel, Anuj; Hottenga, Jouke-Jan; Huffman, Jennifer E; Jarick, Ivonne; Johansson, Åsa; Johnson, Toby; Kanoni, Stavroula; Kleber, Marcus E; König, Inke R; Kristiansson, Kati; Kutalik, Zoltán; Lamina, Claudia; Lecoeur, Cecile; Li, Guo; Mangino, Massimo; McArdle, Wendy L; Medina-Gomez, Carolina; Müller-Nurasyid, Martina; Ngwa, Julius S; Nolte, Ilja M; Paternoster, Lavinia; Pechlivanis, Sonali; Perola, Markus; Peters, Marjolein J; Preuss, Michael; Rose, Lynda M; Shi, Jianxin; Shungin, Dmitry; Smith, Albert Vernon; Strawbridge, Rona J; Surakka, Ida; Teumer, Alexander; Trip, Mieke D; Tyrer, Jonathan; Van Vliet-Ostaptchouk, Jana V; Vandenput, Liesbeth; Waite, Lindsay L; Zhao, Jing Hua; Absher, Devin; Asselbergs, Folkert W; Atalay, Mustafa; Attwood, Antony P; Balmforth, Anthony J; Basart, Hanneke; Beilby, John; Bonnycastle, Lori L; Brambilla, Paolo; Bruinenberg, Marcel; Campbell, Harry; Chasman, Daniel I; Chines, Peter S; Collins, Francis S; Connell, John M; Cookson, William O; de Faire, Ulf; de Vegt, Femmie; Dei, Mariano; Dimitriou, Maria; Edkins, Sarah; Estrada, Karol; Evans, David M; Farrall, Martin; Ferrario, Marco M; Ferrières, Jean; Franke, Lude; Frau, Francesca; Gejman, Pablo V; Grallert, Harald; Grönberg, Henrik; Gudnason, Vilmundur; Hall, Alistair S; Hall, Per; Hartikainen, Anna-Liisa; Hayward, Caroline; Heard-Costa, Nancy L; Heath, Andrew 
C; Hebebrand, Johannes; Homuth, Georg; Hu, Frank B; Hunt, Sarah E; Hyppönen, Elina; Iribarren, Carlos; Jacobs, Kevin B; Jansson, John-Olov; Jula, Antti; Kähönen, Mika; Kathiresan, Sekar; Kee, Frank; Khaw, Kay-Tee; Kivimäki, Mika; Koenig, Wolfgang; Kraja, Aldi T; Kumari, Meena; Kuulasmaa, Kari; Kuusisto, Johanna; Laitinen, Jaana H; Lakka, Timo A; Langenberg, Claudia; Launer, Lenore J; Lind, Lars; Lindström, Jaana; Liu, Jianjun; Liuzzi, Antonio; Lokki, Marja-Liisa; Lorentzon, Mattias; Madden, Pamela A; Magnusson, Patrik K; Manunta, Paolo; Marek, Diana; März, Winfried; Mateo Leach, Irene; McKnight, Barbara; Medland, Sarah E; Mihailov, Evelin; Milani, Lili; Montgomery, Grant W; Mooser, Vincent; Mühleisen, Thomas W; Munroe, Patricia B; Musk, Arthur W; Narisu, Narisu; Navis, Gerjan; Nicholson, George; Nohr, Ellen A; Ong, Ken K; Oostra, Ben A; Palmer, Colin N A; Palotie, Aarno; Peden, John F; Pedersen, Nancy; Peters, Annette; Polasek, Ozren; Pouta, Anneli; Pramstaller, Peter P; Prokopenko, Inga; Pütter, Carolin; Radhakrishnan, Aparna; Raitakari, Olli; Rendon, Augusto; Rivadeneira, Fernando; Rudan, Igor; Saaristo, Timo E; Sambrook, Jennifer G; Sanders, Alan R; Sanna, Serena; Saramies, Jouko; Schipf, Sabine; Schreiber, Stefan; Schunkert, Heribert; Shin, So-Youn; Signorini, Stefano; Sinisalo, Juha; Skrobek, Boris; Soranzo, Nicole; Stančáková, Alena; Stark, Klaus; Stephens, Jonathan C; Stirrups, Kathleen; Stolk, Ronald P; Stumvoll, Michael; Swift, Amy J; Theodoraki, Eirini V; Thorand, Barbara; Tregouet, David-Alexandre; Tremoli, Elena; Van der Klauw, Melanie M; van Meurs, Joyce B J; Vermeulen, Sita H; Viikari, Jorma; Virtamo, Jarmo; Vitart, Veronique; Waeber, Gérard; Wang, Zhaoming; Widén, Elisabeth; Wild, Sarah H; Willemsen, Gonneke; Winkelmann, Bernhard R; Witteman, Jacqueline C M; Wolffenbuttel, Bruce H R; Wong, Andrew; Wright, Alan F; Zillikens, M Carola; Amouyel, Philippe; Boehm, Bernhard O; Boerwinkle, Eric; Boomsma, Dorret I; Caulfield, Mark J; Chanock, Stephen J; 
Cupples, L Adrienne; Cusi, Daniele; Dedoussis, George V; Erdmann, Jeanette; Eriksson, Johan G; Franks, Paul W; Froguel, Philippe; Gieger, Christian; Gyllensten, Ulf; Hamsten, Anders; Harris, Tamara B; Hengstenberg, Christian; Hicks, Andrew A; Hingorani, Aroon; Hinney, Anke; Hofman, Albert; Hovingh, Kees G; Hveem, Kristian; Illig, Thomas; Jarvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kuh, Diana; Laakso, Markku; Lehtimäki, Terho; Levinson, Douglas F; Martin, Nicholas G; Metspalu, Andres; Morris, Andrew D; Nieminen, Markku S; Njølstad, Inger; Ohlsson, Claes; Oldehinkel, Albertine J; Ouwehand, Willem H; Palmer, Lyle J; Penninx, Brenda; Power, Chris; Province, Michael A; Psaty, Bruce M; Qi, Lu; Rauramaa, Rainer; Ridker, Paul M; Ripatti, Samuli; Salomaa, Veikko; Samani, Nilesh J; Snieder, Harold; Sørensen, Thorkild I A; Spector, Timothy D; Stefansson, Kari; Tönjes, Anke; Tuomilehto, Jaakko; Uitterlinden, André G; Uusitupa, Matti; van der Harst, Pim; Vollenweider, Peter; Wallaschofski, Henri; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Abecasis, Goncalo R; Assimes, Themistocles L; Barroso, Inês; Boehnke, Michael; Borecki, Ingrid B; Deloukas, Panos; Fox, Caroline S; Frayling, Timothy; Groop, Leif C; Haritunian, Talin; Heid, Iris M; Hunter, David; Kaplan, Robert C; Karpe, Fredrik; Moffatt, Miriam F; Mohlke, Karen L; O'Connell, Jeffrey R; Pawitan, Yudi; Schadt, Eric E; Schlessinger, David; Steinthorsdottir, Valgerdur; Strachan, David P; Thorsteinsdottir, Unnur; van Duijn, Cornelia M; Visscher, Peter M; Di Blasio, Anna Maria; Hirschhorn, Joel N; Lindgren, Cecilia M; Morris, Andrew P; Meyre, David; Scherag, André; McCarthy, Mark I; Speliotes, Elizabeth K; North, Kari E; Loos, Ruth J F; Ingelsson, Erik

    Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass

  5. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture

    NARCIS (Netherlands)

    Berndt, Sonja I.; Gustafsson, Stefan; Mägi, Reedik; Ganna, Andrea; Wheeler, Eleanor; Feitosa, Mary F.; Justice, Anne E.; Monda, Keri L.; Croteau-Chonka, Damien C.; Day, Felix R.; Esko, Tõnu; Fall, Tove; Ferreira, Teresa; Gentilini, Davide; Jackson, Anne U.; Luan, Jian'an; Randall, Joshua C.; Vedantam, Sailaja; Willer, Cristen J.; Winkler, Thomas W.; Wood, Andrew R.; Workalemahu, Tsegaselassie; Hu, Yi-Juan; Lee, Sang Hong; Liang, Liming; Lin, Dan-Yu; Min, Josine L.; Neale, Benjamin M.; Thorleifsson, Gudmar; Yang, Jian; Albrecht, Eva; Amin, Najaf; Bragg-Gresham, Jennifer L.; Cadby, Gemma; den Heijer, Martin; Eklund, Niina; Fischer, Krista; Goel, Anuj; Hottenga, Jouke-Jan; Huffman, Jennifer E.; Jarick, Ivonne; Johansson, Asa; Johnson, Toby; Kanoni, Stavroula; Kleber, Marcus E.; König, Inke R.; Kristiansson, Kati; Kutalik, Zoltán; Lamina, Claudia; Lecoeur, Cecile; Li, Guo; Mangino, Massimo; McArdle, Wendy L.; Medina-Gomez, Carolina; Müller-Nurasyid, Martina; Ngwa, Julius S.; Nolte, Ilja M.; Paternoster, Lavinia; Pechlivanis, Sonali; Perola, Markus; Peters, Marjolein J.; Preuss, Michael; Rose, Lynda M.; Shi, Jianxin; Shungin, Dmitry; Smith, Albert Vernon; Strawbridge, Rona J.; Surakka, Ida; Teumer, Alexander; Trip, Mieke D.; Tyrer, Jonathan; van Vliet-Ostaptchouk, Jana V.; Vandenput, Liesbeth; Waite, Lindsay L.; Zhao, Jing Hua; Absher, Devin; Asselbergs, Folkert W.; Atalay, Mustafa; Attwood, Antony P.; Balmforth, Anthony J.; Basart, Hanneke; Beilby, John; Bonnycastle, Lori L.; Brambilla, Paolo; Bruinenberg, Marcel; Campbell, Harry; Chasman, Daniel I.; Chines, Peter S.; Collins, Francis S.; Connell, John M.; Cookson, William O.; de Faire, Ulf; de Vegt, Femmie; dei, Mariano; Dimitriou, Maria; Edkins, Sarah; Estrada, Karol; Evans, David M.; Farrall, Martin; Ferrario, Marco M.; Ferrières, Jean; Franke, Lude; Frau, Francesca; Gejman, Pablo V.; Grallert, Harald; Grönberg, Henrik; Gudnason, Vilmundur; Hall, Alistair S.; Hall, Per; Hartikainen, Anna-Liisa; Hayward, 
Caroline; Heard-Costa, Nancy L.; Heath, Andrew C.; Hebebrand, Johannes; Homuth, Georg; Hu, Frank B.; Hunt, Sarah E.; Hyppönen, Elina; Iribarren, Carlos; Jacobs, Kevin B.; Jansson, John-Olov; Jula, Antti; Kähönen, Mika; Kathiresan, Sekar; Kee, Frank; Khaw, Kay-Tee; Kivimäki, Mika; Koenig, Wolfgang; Kraja, Aldi T.; Kumari, Meena; Kuulasmaa, Kari; Kuusisto, Johanna; Laitinen, Jaana H.; Lakka, Timo A.; Langenberg, Claudia; Launer, Lenore J.; Lind, Lars; Lindström, Jaana; Liu, Jianjun; Liuzzi, Antonio; Lokki, Marja-Liisa; Lorentzon, Mattias; Madden, Pamela A.; Magnusson, Patrik K.; Manunta, Paolo; Marek, Diana; März, Winfried; Mateo Leach, Irene; McKnight, Barbara; Medland, Sarah E.; Mihailov, Evelin; Milani, Lili; Montgomery, Grant W.; Mooser, Vincent; Mühleisen, Thomas W.; Munroe, Patricia B.; Musk, Arthur W.; Narisu, Narisu; Navis, Gerjan; Nicholson, George; Nohr, Ellen A.; Ong, Ken K.; Oostra, Ben A.; Palmer, Colin N. A.; Palotie, Aarno; Peden, John F.; Pedersen, Nancy; Peters, Annette; Polasek, Ozren; Pouta, Anneli; Pramstaller, Peter P.; Prokopenko, Inga; Pütter, Carolin; Radhakrishnan, Aparna; Raitakari, Olli; Rendon, Augusto; Rivadeneira, Fernando; Rudan, Igor; Saaristo, Timo E.; Sambrook, Jennifer G.; Sanders, Alan R.; Sanna, Serena; Saramies, Jouko; Schipf, Sabine; Schreiber, Stefan; Schunkert, Heribert; Shin, So-Youn; Signorini, Stefano; Sinisalo, Juha; Skrobek, Boris; Soranzo, Nicole; Stančáková, Alena; Stark, Klaus; Stephens, Jonathan C.; Stirrups, Kathleen; Stolk, Ronald P.; Stumvoll, Michael; Swift, Amy J.; Theodoraki, Eirini V.; Thorand, Barbara; Tregouet, David-Alexandre; Tremoli, Elena; van der Klauw, Melanie M.; van Meurs, Joyce B. J.; Vermeulen, Sita H.; Viikari, Jorma; Virtamo, Jarmo; Vitart, Veronique; Waeber, Gérard; Wang, Zhaoming; Widén, Elisabeth; Wild, Sarah H.; Willemsen, Gonneke; Winkelmann, Bernhard R.; Witteman, Jacqueline C. M.; Wolffenbuttel, Bruce H. R.; Wong, Andrew; Wright, Alan F.; Zillikens, M. 
Carola; Amouyel, Philippe; Boehm, Bernhard O.; Boerwinkle, Eric; Boomsma, Dorret I.; Caulfield, Mark J.; Chanock, Stephen J.; Cupples, L. Adrienne; Cusi, Daniele; Dedoussis, George V.; Erdmann, Jeanette; Eriksson, Johan G.; Franks, Paul W.; Froguel, Philippe; Gieger, Christian; Gyllensten, Ulf; Hamsten, Anders; Harris, Tamara B.; Hengstenberg, Christian; Hicks, Andrew A.; Hingorani, Aroon; Hinney, Anke; Hofman, Albert; Hovingh, Kees G.; Hveem, Kristian; Illig, Thomas; Jarvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Keinanen-Kiukaanniemi, Sirkka M.; Kiemeney, Lambertus A.; Kuh, Diana; Laakso, Markku; Lehtimäki, Terho; Levinson, Douglas F.; Martin, Nicholas G.; Metspalu, Andres; Morris, Andrew D.; Nieminen, Markku S.; Njølstad, Inger; Ohlsson, Claes; Oldehinkel, Albertine J.; Ouwehand, Willem H.; Palmer, Lyle J.; Penninx, Brenda; Power, Chris; Province, Michael A.; Psaty, Bruce M.; Qi, Lu; Rauramaa, Rainer; Ridker, Paul M.; Ripatti, Samuli; Salomaa, Veikko; Samani, Nilesh J.; Snieder, Harold; Sørensen, Thorkild I. A.; Spector, Timothy D.; Stefansson, Kari; Tönjes, Anke; Tuomilehto, Jaakko; Uitterlinden, André G.; Uusitupa, Matti; van der Harst, Pim; Vollenweider, Peter; Wallaschofski, Henri; Wareham, Nicholas J.; Watkins, Hugh; Wichmann, H.-Erich; Wilson, James F.; Abecasis, Goncalo R.; Assimes, Themistocles L.; Barroso, Inês; Boehnke, Michael; Borecki, Ingrid B.; Deloukas, Panos; Fox, Caroline S.; Frayling, Timothy; Groop, Leif C.; Haritunian, Talin; Heid, Iris M.; Hunter, David; Kaplan, Robert C.; Karpe, Fredrik; Moffatt, Miriam F.; Mohlke, Karen L.; O'Connell, Jeffrey R.; Pawitan, Yudi; Schadt, Eric E.; Schlessinger, David; Steinthorsdottir, Valgerdur; Strachan, David P.; Thorsteinsdottir, Unnur; van Duijn, Cornelia M.; Visscher, Peter M.; Di Blasio, Anna Maria; Hirschhorn, Joel N.; Lindgren, Cecilia M.; Morris, Andrew P.; Meyre, David; Scherag, André; McCarthy, Mark I.; Speliotes, Elizabeth K.; North, Kari E.; Loos, Ruth J. F.; Ingelsson, Erik

    2013-01-01

    Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass

  6. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture

    DEFF Research Database (Denmark)

    Berndt, Sonja I; Gustafsson, Stefan; Mägi, Reedik

    2013-01-01

    Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass ...

  7. [Analysis of genomic DNA methylation level in radish under cadmium stress by methylation-sensitive amplified polymorphism technique].

    Science.gov (United States)

    Yang, Jin-Lan; Liu, Li-Wang; Gong, Yi-Qin; Huang, Dan-Qiong; Wang, Feng; He, Ling-Li

    2007-06-01

    The level of cytosine methylation induced by cadmium in the radish (Raphanus sativus L.) genome was analysed using the methylation-sensitive amplified polymorphism (MSAP) technique. The MSAP ratios in radish seedlings exposed to cadmium chloride at concentrations of 50, 250 and 500 mg/L were 37%, 43% and 51%, respectively, compared with 34% in the control; the full methylation levels (C(m)CGG in double strands) were 23%, 25% and 27%, respectively, compared with 22% in the control. The increase in MSAP ratio and full methylation level indicated that de novo methylation occurred at some 5'-CCGG sites under Cd stress. There was a significant positive correlation between the increase in total DNA methylation level and CdCl(2) concentration. Four types of MSAP patterns (de novo methylation, de-methylation, atypical pattern and no change of methylation pattern) were identified among the CdCl(2) treatments and the control. DNA methylation alteration in plants treated with CdCl(2) occurred mainly through de novo methylation.
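
    The dose-response relationship reported in this abstract can be checked roughly from the four published data points. The sketch below computes a plain Pearson correlation coefficient; it assumes the control corresponds to 0 mg/L CdCl(2), and it is not necessarily the statistical procedure the authors used, which the abstract does not state.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Values quoted in the abstract; the control is ASSUMED to be 0 mg/L.
cdcl2_mg_per_l = [0, 50, 250, 500]
msap_ratio_pct = [34, 37, 43, 51]
full_methylation_pct = [22, 23, 25, 27]

r_msap = pearson_r(cdcl2_mg_per_l, msap_ratio_pct)
r_full = pearson_r(cdcl2_mg_per_l, full_methylation_pct)
print(f"MSAP ratio vs. dose:       r = {r_msap:.3f}")
print(f"Full methylation vs. dose: r = {r_full:.3f}")
```

    With only four points per series, the coefficient illustrates the trend but says little about significance on its own.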

  8. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants

    Science.gov (United States)

    Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; van der Veen, JP Wietze; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W.; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R.; Santorico, Stephanie A; Spritz, Richard A

    2016-01-01

    Vitiligo is an autoimmune disease in which depigmented skin results from destruction of melanocytes, with epidemiologic association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1, GWAS2), we identified 27 vitiligo susceptibility loci in patients of European (EUR) ancestry. We carried out a third GWAS (GWAS3) in EUR subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new loci and 7 suggestive loci, most encoding immune and apoptotic regulators, some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some corresponding to eQTL at these loci. Together, the identified genes provide a framework for vitiligo genetic architecture and pathobiology, highlight relationships to other autoimmune diseases and melanoma, and offer potential targets for treatment. PMID:27723757

  9. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    Directory of Open Access Journals (Sweden)

    Nedenia Bonvino Stafuzza

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  10. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Amaranth (Amaranthus hypochondriacus L.) is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2n = 32), and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N50 of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which Copia-like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (A. hypochondriacus, A. caudatus, and A. cruentus) and their putative progenitor (A. hybridus) were resequenced. A single nucleotide polymorphism (SNP) phylogeny supported the classification of A. hybridus as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for A. hypochondriacus using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N50 of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet (Beta vulgaris L.) and estimated, using Ks analysis, the age of the most recent polyploidization event in amaranth.
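
    The scaffold contiguity figures quoted in this abstract (371 kb, rising to 697 kb for the hybrid assembly) are N50 values in the published version. As a minimal sketch of how that statistic is usually defined, the function below returns the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The scaffold lengths in the example are hypothetical and are not taken from the amaranth assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one length")

# Hypothetical scaffold lengths (in kb), for illustration only.
example_scaffolds_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_scaffolds_kb))
```

    Because N50 depends only on the length distribution, joining scaffolds with a physical map can raise it substantially without adding any new sequence.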

  11. Genome Evolution of Plant-Parasitic Nematodes.

    Science.gov (United States)

    Kikuchi, Taisei; Eves-van den Akker, Sebastian; Jones, John T

    2017-08-04

    Plant parasitism has evolved independently on at least four separate occasions in the phylum Nematoda. The application of next-generation sequencing (NGS) to plant-parasitic nematodes has allowed a wide range of genome- or transcriptome-level comparisons, and these have identified genome adaptations that enable parasitism of plants. Current genome data suggest that horizontal gene transfer, gene family expansions, evolution of new genes that mediate interactions with the host, and parasitism-specific gene regulation are important adaptations that allow nematodes to parasitize plants. Sequencing of a larger number of nematode genomes, including plant parasites that show different modes of parasitism or that have evolved in currently unsampled clades, and using free-living taxa as comparators would allow more detailed analysis and a better understanding of the organization of key genes within the genomes. This would facilitate a more complete understanding of the way in which parasitism has shaped the genomes of plant-parasitic nematodes.

  12. Gene interactions in the DNA damage-response pathway identified by genome-wide RNA-interference analysis of synthetic lethality

    NARCIS (Netherlands)

    van Haaften, Gijs; Vastenhouw, Nadine L; Nollen, Ellen A A; Plasterk, Ronald H A; Tijsterman, Marcel

    2004-01-01

    Here, we describe a systematic search for synthetic gene interactions in a multicellular organism, the nematode Caenorhabditis elegans. We established a high-throughput method to determine synthetic gene interactions by genome-wide RNA interference and identified genes that are required to protect

  13. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare genetic variants...... of developing SZ. However, these studies are designed to examine only “the common variant” proportion of the genomic landscape of SZ. Due to increased genetic drift during founding and potential bottlenecks, followed by population expansion, isolated populations may be particularly useful in identifying rare...... disease variants that may appear at higher frequencies and/or within a more clearly distinct haplotype structure compared to outbred populations. Small isolated populations also typically show reduced phenotypic, genetic and environmental heterogeneity, thus making them advantageous in studies aiming...

  14. Genome-wide association study identifies shared risk loci common to two malignancies in golden retrievers.

    Directory of Open Access Journals (Sweden)

    Noriko Tonomura

    2015-02-01

    Dogs, with their breed-determined limited genetic background, are great models of human disease including cancer. Canine B-cell lymphoma and hemangiosarcoma are both malignancies of the hematologic system that are clinically and histologically similar to human B-cell non-Hodgkin lymphoma and angiosarcoma, respectively. Golden retrievers in the US show significantly elevated lifetime risk for both B-cell lymphoma (6%) and hemangiosarcoma (20%). We conducted genome-wide association studies for hemangiosarcoma and B-cell lymphoma, identifying two shared predisposing loci. The two associated loci are located on chromosome 5, and together contribute ~20% of the risk of developing these cancers. Genome-wide p-values for the top SNP of each locus are 4.6×10(-7) and 2.7×10(-6), respectively. Whole genome resequencing of nine cases and controls followed by genotyping and detailed analysis identified three shared and one B-cell lymphoma specific risk haplotypes within the two loci, but no coding changes were associated with the risk haplotypes. Gene expression analysis of B-cell lymphoma tumors revealed that carrying the risk haplotypes at the first locus is associated with down-regulation of several nearby genes including the proximal gene TRPC6, a transient receptor Ca2+-channel involved in T-cell activation, among other functions. The shared risk haplotype in the second locus overlaps the vesicle transport and release gene STX8. Carrying the shared risk haplotype is associated with gene expression changes of 100 genes enriched for pathways involved in immune cell activation. Thus, the predisposing germ-line mutations in B-cell lymphoma and hemangiosarcoma appear to be regulatory, and affect pathways involved in T-cell mediated immune response in the tumor. This suggests that the interaction between the immune system and malignant cells plays a common role in the tumorigenesis of these relatively different cancers.

  15. High-Resolution Genome-Wide Linkage Mapping Identifies Susceptibility Loci for BMI in the Chinese Population

    DEFF Research Database (Denmark)

    Zhang, Dong Feng; Pang, Zengchang; Li, Shuxia

    2012-01-01

    The genetic loci affecting the commonly used BMI have been intensively investigated using linkage approaches in multiple populations. This study aims to perform the first genome-wide linkage scan on BMI in the Chinese population in mainland China with the hypothesis that heterogeneity in genetic...... linkage could exist in different ethnic populations. BMI was measured from 126 dizygotic twins in Qingdao municipality who were genotyped using high-resolution Affymetrix Genome-Wide Human SNP arrays containing about 1 million single-nucleotide polymorphisms (SNPs). Nonparametric linkage analysis...... in western countries. Multiple loci showing suggestive linkage were found on chromosome 1 (LOD score 2.38 at 242 cM), chromosome 8 (2.48 at 95 cM), and chromosome 14 (2.2 at 89.4 cM). The strong linkage identified in the Chinese subjects that is consistent with that found in populations of European origin...

  16. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single data set, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analyzing each of the three data sets separately. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  17. Identifying and Fostering Higher Levels of Geometric Thinking

    Science.gov (United States)

    Škrbec, Maja; Cadež, Tatjana Hodnik

    2015-01-01

    Pierre M. Van Hiele created five levels of geometric thinking. We decided to identify the level of geometric thinking in the students in Slovenia, aged 9 to 11 years. The majority of students (60.7%) are at the transition between the zero (visual) level and the first (descriptive) level of geometric thinking. Nearly a third (31.7%) of students is…

  18. Genomic profiling in Down syndrome acute lymphoblastic leukemia identifies histone gene deletions associated with altered methylation profiles

    Science.gov (United States)

    Loudin, Michael G.; Wang, Jinhua; Leung, Hon-Chiu Eastwood; Gurusiddappa, Sivashankarappa; Meyer, Julia; Condos, Gregory; Morrison, Debra; Tsimelzon, Anna; Devidas, Meenakshi; Heerema, Nyla A.; Carroll, Andrew J.; Plon, Sharon E.; Hunger, Stephen P.; Basso, Giuseppe; Pession, Andrea; Bhojwani, Deepa; Carroll, William L.; Rabin, Karen R.

    2014-01-01

    Patients with Down syndrome (DS) and acute lymphoblastic leukemia (ALL) have distinct clinical and biological features. Whereas most DS-ALL cases lack the sentinel cytogenetic lesions that guide risk assignment in childhood ALL, JAK2 mutations and CRLF2 overexpression are highly enriched. To further characterize the unique biology of DS-ALL, we performed genome-wide profiling of 58 DS-ALL and 68 non-Down syndrome (NDS) ALL cases by DNA copy number, loss of heterozygosity, gene expression, and methylation analyses. We report a novel deletion within the 6p22 histone gene cluster as significantly more frequent in DS-ALL, occurring in 11 DS (22%) and only two NDS cases (3.1%) (Fisher’s exact p = 0.002). Homozygous deletions yielded significantly lower histone expression levels, and were associated with higher methylation levels, distinct spatial localization of methylated promoters, and enrichment of highly methylated genes for specific pathways and transcription factor binding motifs. Gene expression profiling demonstrated heterogeneity of DS-ALL cases overall, with supervised analysis defining a 45-transcript signature associated with CRLF2 overexpression. Further characterization of pathways associated with histone deletions may identify opportunities for novel targeted interventions. PMID:21647151
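
    The Fisher's exact p-value quoted in this abstract can be approximately reproduced with a short standard-library sketch. The abstract gives counts of 11 and 2 deletions but not the denominators behind the 22% and 3.1% figures; the values 50 and 64 used below are assumptions inferred from those percentages (11/50 = 22%, 2/64 = 3.1%) and may not match the authors' actual sample sizes.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in row 1, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    probs = [prob(k) for k in range(k_min, k_max + 1)]
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

# ASSUMED denominators: 50 DS-ALL and 64 NDS-ALL cases with usable data.
ds_deleted, ds_total = 11, 50
nds_deleted, nds_total = 2, 64

p_value = fisher_exact_two_sided(
    ds_deleted, ds_total - ds_deleted,
    nds_deleted, nds_total - nds_deleted,
)
print(f"two-sided p = {p_value:.4f}")
```

    Under these assumed denominators the result rounds to the reported p = 0.002, which suggests, but does not confirm, that the percentages were computed on subsets of the 58 and 68 profiled cases.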

  19. Genome-wide association study of clinically defined gout identifies multiple risk loci and its association with clinical subtypes.

    Science.gov (United States)

    Matsuo, Hirotaka; Yamamoto, Ken; Nakaoka, Hirofumi; Nakayama, Akiyoshi; Sakiyama, Masayuki; Chiba, Toshinori; Takahashi, Atsushi; Nakamura, Takahiro; Nakashima, Hiroshi; Takada, Yuzo; Danjoh, Inaho; Shimizu, Seiko; Abe, Junko; Kawamura, Yusuke; Terashige, Sho; Ogata, Hiraku; Tatsukawa, Seishiro; Yin, Guang; Okada, Rieko; Morita, Emi; Naito, Mariko; Tokumasu, Atsumi; Onoue, Hiroyuki; Iwaya, Keiichi; Ito, Toshimitsu; Takada, Tappei; Inoue, Katsuhisa; Kato, Yukio; Nakamura, Yukio; Sakurai, Yutaka; Suzuki, Hiroshi; Kanai, Yoshikatsu; Hosoya, Tatsuo; Hamajima, Nobuyuki; Inoue, Ituro; Kubo, Michiaki; Ichida, Kimiyoshi; Ooyama, Hiroshi; Shimizu, Toru; Shinomiya, Nariyoshi

    2016-04-01

    Gout, caused by hyperuricaemia, is a multifactorial disease. Although genome-wide association studies (GWASs) of gout have been reported, they included self-reported gout cases in which clinical information was insufficient. Therefore, the relationship between genetic variation and clinical subtypes of gout remains unclear. Here, we first performed a GWAS of clinically defined gout cases only. A GWAS was conducted with 945 patients with clinically defined gout and 1213 controls in a Japanese male population, followed by a replication study of 1048 clinically defined cases and 1334 controls. Five gout susceptibility loci were identified at the genome-wide significance level, comprising the genes ABCG2 and SLC2A9 and additional genes: rs1260326 (p=1.9×10(-12); OR=1.36) of GCKR (a gene for glucose and lipid metabolism), rs2188380 (p=1.6×10(-23); OR=1.75) of MYL2-CUX2 (genes associated with cholesterol and diabetes mellitus) and rs4073582 (p=6.4×10(-9); OR=1.66) of CNIH-2 (a gene for regulation of glutamate signalling). The latter two are identified as novel gout loci. Furthermore, among the identified single-nucleotide polymorphisms (SNPs), we demonstrated that the SNPs of ABCG2 and SLC2A9 were differentially associated with types of gout and clinical parameters underlying specific subtypes (renal underexcretion type and renal overload type). The effect of the risk allele of each SNP on clinical parameters showed significant linear relationships with the ratio of the case-control ORs for two distinct types of gout (r=0.96 [p=4.8×10(-4)] for urate clearance and r=0.96 [p=5.0×10(-4)] for urinary urate excretion). Our findings provide clues to better understand the pathogenesis of gout and will be useful for development of companion diagnostics. Published by the BMJ Publishing Group Limited.

  20. A genome-wide association study identified AFF1 as a susceptibility locus for systemic lupus erythematosus in Japanese.

    Directory of Open Access Journals (Sweden)

    Yukinori Okada

    2012-01-01

    Systemic lupus erythematosus (SLE) is an autoimmune disease that causes multiple organ damage. Although recent genome-wide association studies (GWAS) have contributed to discovery of SLE susceptibility genes, few studies have been performed in Asian populations. Here, we report a GWAS for SLE examining 891 SLE cases and 3,384 controls and multi-stage replication studies examining 1,387 SLE cases and 28,564 controls in Japanese subjects. Considering that expression quantitative trait loci (eQTLs) have been implicated in genetic risks for autoimmune diseases, we integrated an eQTL study into the results of the GWAS. We observed enrichments of cis-eQTL positive loci among the known SLE susceptibility loci (30.8%) compared to the genome-wide SNPs (6.9%). In addition, we identified a novel association of a variant in the AF4/FMR2 family, member 1 (AFF1) gene at 4q21 with SLE susceptibility (rs340630; P = 8.3×10(-9), odds ratio = 1.21). The risk A allele of rs340630 demonstrated a cis-eQTL effect on the AFF1 transcript with enhanced expression levels (P<0.05). As AFF1 transcripts were prominently expressed in CD4(+) and CD19(+) peripheral blood lymphocytes, up-regulation of AFF1 may cause the abnormality in these lymphocytes, leading to disease onset.

  1. Inversion variants in human and primate genomes.

    Science.gov (United States)

    Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro; Bitonto, Miriana; Capozzi, Oronzo; Signorile, Martina Lepore; Miroballo, Mattia; Archidiacono, Nicoletta; Eichler, Evan E; Ventura, Mario; Antonacci, Francesca

    2018-05-18

    For many years, inversions have been proposed to be a direct driving force in speciation since they suppress recombination when heterozygous. Inversions are the most common large-scale differences among humans and great apes. Nevertheless, they represent large events easily distinguishable by classical cytogenetics, whose resolution, however, is limited. Here, we performed a genome-wide comparison between human, great ape, and macaque genomes using the net alignments for the most recent releases of genome assemblies. We identified a total of 156 putative inversions, between 103 kb and 91 Mb, corresponding to 136 human loci. Combining literature, sequence, and experimental analyses, we analyzed 109 of these loci and found 67 regions inverted in one or multiple primates, including 28 newly identified inversions. These events overlap with 81 human genes at their breakpoints, and seven correspond to sites of recurrent rearrangements associated with human disease. This work doubles the number of validated primate inversions larger than 100 kb, beyond what was previously documented. We identified 74 sites of errors, where the sequence has been assembled in the wrong orientation, in the reference genomes analyzed. Our data serve two purposes: First, we generated a map of evolutionary inversions in these genomes representing a resource for interrogating differences among these species at a functional level; second, we provide a list of misassembled regions in these primate genomes, involving over 300 Mb of DNA and 1978 human genes. Accurately annotating these regions in the genome references has immediate applications for evolutionary and biomedical studies on primates. © 2018 Catacchio et al.; Published by Cold Spring Harbor Laboratory Press.

  2. Genome-wide association study identifies multiple loci associated with both mammographic density and breast cancer risk

    Science.gov (United States)

    Lindström, Sara; Thompson, Deborah J.; Paterson, Andrew D.; Li, Jingmei; Gierach, Gretchen L.; Scott, Christopher; Stone, Jennifer; Douglas, Julie A.; dos-Santos-Silva, Isabel; Fernandez-Navarro, Pablo; Verghase, Jajini; Smith, Paula; Brown, Judith; Luben, Robert; Wareham, Nicholas J.; Loos, Ruth J.F.; Heit, John A.; Pankratz, V. Shane; Norman, Aaron; Goode, Ellen L.; Cunningham, Julie M.; deAndrade, Mariza; Vierkant, Robert A.; Czene, Kamila; Fasching, Peter A.; Baglietto, Laura; Southey, Melissa C.; Giles, Graham G.; Shah, Kaanan P.; Chan, Heang-Ping; Helvie, Mark A.; Beck, Andrew H.; Knoblauch, Nicholas W.; Hazra, Aditi; Hunter, David J.; Kraft, Peter; Pollan, Marina; Figueroa, Jonine D.; Couch, Fergus J.; Hopper, John L.; Hall, Per; Easton, Douglas F.; Boyd, Norman F.; Vachon, Celine M.; Tamimi, Rulla M.

    2015-01-01

    Mammographic density reflects the amount of stromal and epithelial tissues in relation to adipose tissue in the breast and is a strong risk factor for breast cancer. Here we report the results from meta-analysis of genome-wide association studies (GWAS) of three mammographic density phenotypes: dense area, non-dense area and percent density in up to 7,916 women in stage 1 and an additional 10,379 women in stage 2. We identify genome-wide significant (P<5×10−8) loci for dense area (AREG, ESR1, ZNF365, LSP1/TNNT3, IGF1, TMEM184B, SGSM3/MKL1), non-dense area (8p11.23) and percent density (PRDM6, 8p11.23, TMEM184B). Four of these regions are known breast cancer susceptibility loci, and four additional regions were found to be associated with breast cancer (P<0.05) in a large meta-analysis. These results provide further evidence of a shared genetic basis between mammographic density and breast cancer and illustrate the power of studying intermediate quantitative phenotypes to identify putative disease susceptibility loci. PMID:25342443

  3. Meta-analysis of five genome-wide association studies identifies multiple new loci associated with testicular germ cell tumor

    DEFF Research Database (Denmark)

    Wang, Zhaoming; McGlynn, Katherine A.; Rajpert-De Meyts, Ewa

    2017-01-01

    The international Testicular Cancer Consortium (TECAC) combined five published genome-wide association studies of testicular germ cell tumor (TGCT; 3,558 cases and 13,970 controls) to identify new susceptibility loci. We conducted a fixed-effects meta-analysis, including, to our knowledge, the fi...
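
    The fixed-effects meta-analysis mentioned above is commonly done by inverse-variance weighting of per-study effect estimates. The sketch below shows that standard calculation; the effect sizes and standard errors are hypothetical placeholders, not TECAC data, and the abstract does not say which software or weighting variant the consortium used.

```python
from math import sqrt


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted pooled effect, its standard error, and z-score."""
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / SE_i^2
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return pooled, pooled_se, pooled / pooled_se


# HYPOTHETICAL per-study log odds ratios and standard errors for one SNP.
study_betas = [0.21, 0.18, 0.25, 0.15, 0.20]
study_ses = [0.06, 0.08, 0.10, 0.07, 0.09]

beta, se, z = fixed_effects_meta(study_betas, study_ses)
print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

    Studies with smaller standard errors dominate the pooled estimate, which is why combining several GWAS can reveal loci that no single study detects.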

  4. Genome-wide CRISPR/Cas9 Screen Identifies Host Factors Essential for Influenza Virus Replication

    Directory of Open Access Journals (Sweden)

    Julianna Han

    2018-04-01

    Summary: The emergence of influenza A viruses (IAVs) from zoonotic reservoirs poses a great threat to human health. As seasonal vaccines are ineffective against zoonotic strains, and newly transmitted viruses can quickly acquire drug resistance, there remains a need for host-directed therapeutics against IAVs. Here, we performed a genome-scale CRISPR/Cas9 knockout screen in human lung epithelial cells with a human isolate of an avian H5N1 strain. Several genes involved in sialic acid biosynthesis and related glycosylation pathways were highly enriched post-H5N1 selection, including SLC35A1, a sialic acid transporter essential for IAV receptor expression and thus viral entry. Importantly, we have identified capicua (CIC) as a negative regulator of cell-intrinsic immunity, as loss of CIC resulted in heightened antiviral responses and restricted replication of multiple viruses. Therefore, our study demonstrates that the CRISPR/Cas9 system can be utilized for the discovery of host factors critical for the replication of intracellular pathogens. Using a genome-wide CRISPR/Cas9 screen, Han et al. demonstrate that the major hit, the sialic acid transporter SLC35A1, is an essential host factor for IAV entry. In addition, they identify the DNA-binding transcriptional repressor CIC as a negative regulator of cell-intrinsic immunity. Keywords: CRISPR/Cas9 screen, GeCKO, influenza virus, host factors, sialic acid pathway, SLC35A1, Capicua, CIC, cell-intrinsic immunity, H5N1

  5. A Genome-Wide Association Study Identifies Risk Loci to Equine Recurrent Uveitis in German Warmblood Horses

    Science.gov (United States)

    Kulbrock, Maike; Lehner, Stefanie; Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar

    2013-01-01

    Equine recurrent uveitis (ERU) is a common eye disease affecting up to 3–15% of the horse population. A genome-wide association study (GWAS) using the Illumina equine SNP50 bead chip was performed to identify loci conferring risk to ERU. The sample included a total of 144 German warmblood horses. A GWAS showed a significant single nucleotide polymorphism (SNP) on horse chromosome (ECA) 20 at 49.3 Mb, with IL-17A and IL-17F being the closest genes. This locus explained 23% of the phenotypic variance for ERU. A GWAS taking into account the severity of ERU revealed a SNP on ECA18 near the crystallin gene cluster CRYGA-CRYGF. For both genomic regions on ECA18 and 20, significantly associated haplotypes containing the genome-wide significant SNPs could be demonstrated. In conclusion, our results are indicative of a genetic component regulating the possible critical role of IL-17A and IL-17F in the pathogenesis of ERU. The associated SNP on ECA18 may be indicative of cataract formation in the course of ERU. PMID:23977091

  6. A meta-analysis of genome-wide association studies to identify prostate cancer susceptibility loci associated with aggressive and non-aggressive disease

    DEFF Research Database (Denmark)

    Amin Al Olama, Ali; Kote-Jarai, Zsofia; Schumacher, Fredrick R

    2013-01-01

    Genome-wide association studies (GWAS) have identified multiple common genetic variants associated with an increased risk of prostate cancer (PrCa), but these explain less than one-third of the heritability. To identify further susceptibility alleles, we conducted a meta-analysis of four GWAS inc...

  7. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Background: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results: The Bacterial Isolate Genome Sequence Database (BIGSdb) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSdb are illustrated with the genera Neisseria and Streptococcus. The BIGSdb source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions: Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSdb
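
    The idea of defining loci, identifying allelic variants within stored sequences, and grouping loci into schemes can be illustrated with a toy model. This is not BIGSdb code or its API: every name, sequence, and profile below is hypothetical, and exact substring matching is a simplifying assumption standing in for whatever sequence search a real system uses.

```python
def call_alleles(sequence, loci):
    """Return {locus: allele_id or None} by exact substring matching."""
    profile = {}
    for locus, alleles in loci.items():
        profile[locus] = next(
            (allele_id for allele_id, allele_seq in alleles.items()
             if allele_seq in sequence),
            None,  # no known allele of this locus found in the sequence
        )
    return profile


def match_scheme(profile, scheme_profiles):
    """Return the scheme type whose allele profile equals `profile`, if any."""
    for type_name, type_profile in scheme_profiles.items():
        if type_profile == profile:
            return type_name
    return None


# HYPOTHETICAL loci, alleles, and scheme definitions.
loci = {
    "locusA": {1: "ACGTAC", 2: "ACGTTC"},
    "locusB": {1: "GGATCC", 2: "GGTTCC"},
}
scheme_profiles = {
    "ST-1": {"locusA": 1, "locusB": 1},
    "ST-2": {"locusA": 2, "locusB": 2},
}

isolate_sequence = "TTTACGTTCAAAGGTTCCTTT"
profile = call_alleles(isolate_sequence, loci)
print(profile, match_scheme(profile, scheme_profiles))
```

    Keeping allele definitions separate from scheme definitions is what lets the same stored sequence be typed under any number of alternative schemes.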

  8. A genome scale RNAi screen identifies GLI1 as a novel gene regulating vorinostat sensitivity.

    Science.gov (United States)

    Falkenberg, K J; Newbold, A; Gould, C M; Luu, J; Trapani, J A; Matthews, G M; Simpson, K J; Johnstone, R W

    2016-07-01

    Vorinostat is an FDA-approved histone deacetylase inhibitor (HDACi) that has proven clinical success in some patients; however, it remains unclear why certain patients remain unresponsive to this agent and other HDACis. Constitutive STAT (signal transducer and activator of transcription) activation, overexpression of prosurvival Bcl-2 proteins and loss of HR23B have been identified as potential biomarkers of HDACi resistance; however, none have yet been used to aid the clinical utility of HDACi. Herein, we aimed to further elucidate vorinostat-resistance mechanisms through a functional genomics screen to identify novel genes that when knocked down by RNA interference (RNAi) sensitized cells to vorinostat-induced apoptosis. A synthetic lethal functional screen using a whole-genome protein-coding RNAi library was used to identify genes that when knocked down cooperated with vorinostat to induce tumor cell apoptosis in otherwise resistant cells. Through iterative screening, we identified 10 vorinostat-resistance candidate genes that sensitized specifically to vorinostat. One of these vorinostat-resistance genes was GLI1, an oncogene not previously known to regulate the activity of HDACi. Treatment of vorinostat-resistant cells with the GLI1 small-molecule inhibitor, GANT61, phenocopied the effect of GLI1 knockdown. The mechanism by which GLI1 loss of function sensitized tumor cells to vorinostat-induced apoptosis is at least in part through interactions with vorinostat to alter gene expression in a manner that favored apoptosis. Upon GLI1 knockdown and vorinostat treatment, BCL2L1 expression was repressed and overexpression of BCL2L1 inhibited GLI1-knockdown-mediated vorinostat sensitization. Taken together, we present the identification and characterization of GLI1 as a new HDACi resistance gene, providing a strong rationale for development of GLI1 inhibitors for clinical use in combination with HDACi therapy.

  9. Systematic Functional Interrogation of Rare Cancer Variants Identifies Oncogenic Alleles | Office of Cancer Genomics

    Science.gov (United States)

    Cancer genome characterization efforts now provide an initial view of the somatic alterations in primary tumors. However, most point mutations occur at low frequency, and the function of these alleles remains undefined. We have developed a scalable systematic approach to interrogate the function of cancer-associated gene variants. We subjected 474 mutant alleles curated from 5,338 tumors to pooled in vivo tumor formation assays and gene expression profiling. We identified 12 transforming alleles, including two in genes (PIK3CB, POT1) that have not been shown to be tumorigenic.

  10. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene.

    Science.gov (United States)

    Nicolas, Aude; Kenna, Kevin P; Renton, Alan E; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A; Kenna, Brendan J; Nalls, Mike A; Keagle, Pamela; Rivera, Alberto M; van Rheenen, Wouter; Murphy, Natalie A; van Vugt, Joke J F A; Geiger, Joshua T; Van der Spek, Rick A; Pliner, Hannah A; Shankaracharya; Smith, Bradley N; Marangi, Giuseppe; Topp, Simon D; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D; Kenna, Aoife; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B; Gitler, Aaron D; Harris, Tim; Myers, Richard M; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Svendsen, Clive N; Thompson, Leslie M; Van Eyk, Jennifer E; Berry, James D; Miller, Timothy M; Kolb, Stephen J; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P; Sorarù, Gianni; Cereda, Cristina; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W; Sidle, Katie C; Malaspina, Andrea; Hardy, John; Singleton, Andrew B; Johnson, Janel O; Arepalli, Sampath; Sapp, Peter C; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; Ten Asbroek, Anneloor L M A; Muñoz-Blanco, José Luis; 
Hernandez, Dena G; Ding, Jinhui; Gibbs, J Raphael; Scholz, Sonja W; Floeter, Mary Kay; Campbell, Roy H; Landi, Francesco; Bowser, Robert; Pulst, Stefan M; Ravits, John M; MacGowan, Daniel J L; Kirby, Janine; Pioro, Erik P; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L; Brady, Christopher B; Kowall, Neil W; Troncoso, Juan C; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D; Kamel, Freya; Van Den Bosch, Ludo; Baloh, Robert H; Strom, Tim M; Meitinger, Thomas; Shatunov, Aleksey; Van Eijk, Kristel R; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell L; Van Es, Michael A; Weber, Markus; Boylan, Kevin B; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen E; Basak, A Nazli; Mora, Jesús S; Drory, Vivian E; Shaw, Pamela J; Turner, Martin R; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L; Fifita, Jennifer A; Nicholson, Garth A; Blair, Ian P; Rouleau, Guy A; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W; Maragakis, Nicholas J; Rothstein, Jeffrey D; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A; Feldman, Eva L; Gibson, Summer B; Taroni, Franco; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C; Andersen, Peter M; Weishaupt, Jochen H; Camu, William; Trojanowski, John Q; Van Deerlin, Vivianna M; Brown, Robert H; van den Berg, Leonard H; Veldink, Jan H; Harms, Matthew B; Glass, Jonathan D; Stone, David J; Tienari, Pentti; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E; Traynor, Bryan J; Landers, John E

    2018-03-21

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494 controls. Through both approaches, we identified kinesin family member 5A (KIF5A) as a novel gene associated with ALS. Interestingly, mutations predominantly in the N-terminal motor domain of KIF5A are causative for two neurodegenerative diseases: hereditary spastic paraplegia (SPG10) and Charcot-Marie-Tooth type 2 (CMT2). In contrast, ALS-associated mutations are primarily located at the C-terminal cargo-binding tail domain and patients harboring loss-of-function mutations displayed an extended survival relative to typical ALS cases. Taken together, these results broaden the phenotype spectrum resulting from mutations in KIF5A and strengthen the role of cytoskeletal defects in the pathogenesis of ALS. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4.

    LENUS (Irish Health Repository)

    Sklar, Pamela

    2011-10-01

    We conducted a combined genome-wide association study (GWAS) of 7,481 individuals with bipolar disorder (cases) and 9,250 controls as part of the Psychiatric GWAS Consortium. Our replication study tested 34 SNPs in 4,496 independent cases with bipolar disorder and 42,422 independent controls and found that 18 of 34 SNPs had P < 0.05, with 31 of 34 SNPs having signals with the same direction of effect (P = 3.8 × 10(-7)). An analysis of all 11,974 bipolar disorder cases and 51,792 controls confirmed genome-wide significant evidence of association for CACNA1C and identified a new intronic variant in ODZ4. We identified a pathway comprised of subunits of calcium channels enriched in bipolar disorder association intervals. Finally, a combined GWAS analysis of schizophrenia and bipolar disorder yielded strong association evidence for SNPs in CACNA1C and in the region of NEK4-ITIH1-ITIH3-ITIH4. Our replication results imply that increasing sample sizes in bipolar disorder will confirm many additional loci.
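
    The direction-of-effect result above (31 of 34 replication SNPs agreeing in sign) can be checked with a one-sided binomial sign test under the null hypothesis that each direction is equally likely. The abstract does not name the test used, so treating it as a one-sided sign test is an assumption; it does reproduce the reported value.

```python
from math import comb


def sign_test_one_sided(successes, n):
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


p = sign_test_one_sided(31, 34)
print(f"{p:.2e}")  # 3.83e-07
```

    The tail contains only 6,580 of the 2^34 equally likely sign patterns, which is why so few discordant SNPs still give a very small p-value.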

  12. Genome-wide association study identifies novel locus for neuroticism and shows polygenic association with Major Depressive Disorder

    Science.gov (United States)

    de Moor, Marleen H.M.; van den Berg, Stéphanie M.; Verweij, Karin J.H.; Krueger, Robert F.; Luciano, Michelle; Vasquez, Alejandro Arias; Matteson, Lindsay K.; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D.; Hansell, Narelle K.; Hart, Amy B.; Seppälä, Ilkka; Huffman, Jennifer E.; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abecasis, Goncalo R.; Adkins, Daniel E.; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B.; Busonero, Fabio; Campbell, Harry; Costa, Paul T.; Smith, George Davey; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E.; Eriksson, Johan G.; Fedko, Iryna O.; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M.; Heath, Andrew C.; Heinonen, Kati; Henders, Anjali K.; Homuth, Georg; Hottenga, Jouke-Jan; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P.; Kirkpatrick, Matthew G.; Latvala, Antti; Lehtimäki, Terho; Liewald, David C.; Madden, Pamela A.F.; Magri, Chiara; Magnusson, Patrik K.E.; Marten, Jonathan; Maschio, Andrea; Medland, Sarah E.; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W.; Nauck, Matthias; Ouwens, Klaasjan G.; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T.; Realo, Anu; Rose, Richard J.; Ruggiero, Daniela; Schmidt, Carsten O.; Slutske, Wendy S.; Sorice, Rossella; Starr, John M.; Pourcain, Beate St; Sutin, Angelina R.; Timpson, Nicholas J.; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J.; Zgaga, Lina; Scotland, Generation; Porteous, David; Minelli, Alessandra; Palmer, Abraham A.; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J.; Räikkönen, Katri; Wilson, James F.; Keltikangas-Järvinen, Liisa; Bierut, Laura J.; Hettema, John M.; Grabe, Hans J.; van Duijn, Cornelia M.; Evans, 
David M.; Schlessinger, David; Pedersen, Nancy L.; Terracciano, Antonio; McGue, Matt; Penninx, Brenda W.J.H.; Martin, Nicholas G.; Boomsma, Dorret I.

    2015-01-01

    Importance: Neuroticism is a personality trait that is briefly defined by emotional instability. It is a robust genetic risk factor for Major Depressive Disorder (MDD) and other psychiatric disorders. Hence, neuroticism is an important phenotype for psychiatric genetics. The Genetics of Personality Consortium (GPC) has created a resource for genome-wide association analyses of personality traits in over 63,000 participants (including MDD cases). Objective: To identify genetic variants associated with neuroticism by performing a meta-analysis of genome-wide association (GWA) results based on 1000Genomes imputation, to evaluate if common genetic variants as assessed by Single Nucleotide Polymorphisms (SNPs) explain variation in neuroticism by estimating SNP-based heritability, and to examine whether SNPs that predict neuroticism also predict MDD. Setting: 30 cohorts with genome-wide genotype, personality and MDD data from the GPC. Participants: The study included 63,661 participants from 29 discovery cohorts and 9,786 participants from a replication cohort. Participants came from Europe, the United States or Australia. Main outcome measure(s): Neuroticism scores harmonized across all cohorts by Item Response Theory (IRT) analysis, and clinically assessed MDD case-control status. Results: A genome-wide significant SNP was found in the MAGI1 gene (rs35855737; P=9.26 × 10−9 in the discovery meta-analysis, and P=2.38 × 10−8 in the meta-analysis of all 30 cohorts). Common genetic variants explain 15% of the variance in neuroticism. Polygenic scores based on the meta-analysis of neuroticism in 27 of the discovery cohorts significantly predicted neuroticism in 2 independent cohorts. Importantly, polygenic scores also predicted MDD in these cohorts. Conclusions and relevance: This study identifies a novel locus for neuroticism. The variant is located in a known gene that has been associated with bipolar disorder and schizophrenia in previous studies. In addition, the study
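
    A polygenic score of the kind described above is, in its simplest form, a weighted sum of a person's effect-allele counts, with weights taken from discovery-cohort effect estimates. The sketch below shows only that basic arithmetic; the SNP names, weights, and genotypes are hypothetical, and the study's actual scoring procedure (SNP selection, thresholds, software) is not described in the abstract.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosage (0, 1, or 2) times per-SNP weight."""
    shared = dosages.keys() & weights.keys()  # only SNPs present in both inputs
    return sum(dosages[snp] * weights[snp] for snp in shared)


# HYPOTHETICAL discovery-cohort effect sizes and one person's genotype dosages.
weights = {"snp1": 0.02, "snp2": -0.01, "snp3": 0.015}
person = {"snp1": 2, "snp2": 1, "snp3": 0}

print(round(polygenic_score(person, weights), 4))
```

    Scores built this way in discovery cohorts can then be tested for association with the trait, or with a related outcome such as MDD status, in independent cohorts.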

  13. Genome-wide association study meta-analysis of European and Asian-ancestry samples identifies three novel loci associated with bipolar disorder.

    Science.gov (United States)

    Chen, D T; Jiang, X; Akula, N; Shugart, Y Y; Wendland, J R; Steele, C J M; Kassem, L; Park, J-H; Chatterjee, N; Jamain, S; Cheng, A; Leboyer, M; Muglia, P; Schulze, T G; Cichon, S; Nöthen, M M; Rietschel, M; McMahon, F J; Farmer, A; McGuffin, P; Craig, I; Lewis, C; Hosang, G; Cohen-Woods, S; Vincent, J B; Kennedy, J L; Strauss, J

    2013-02-01

    Meta-analyses of bipolar disorder (BD) genome-wide association studies (GWAS) have identified several genome-wide significant signals in European-ancestry samples, but so far account for little of the inherited risk. We performed a meta-analysis of ∼750,000 high-quality genetic markers on a combined sample of ∼14,000 subjects of European and Asian-ancestry (phase I). The most significant findings were further tested in an extended sample of ∼17,700 cases and controls (phase II). The results suggest novel association findings near the genes TRANK1 (LBA1), LMAN2L and PTGFR. In phase I, the most significant single nucleotide polymorphism (SNP), rs9834970 near TRANK1, was significant at the P=2.4 × 10(-11) level, with no heterogeneity. Supportive evidence for prior association findings near ANK3 and a locus on chromosome 3p21.1 was also observed. The phase II results were similar, although the heterogeneity test became significant for several SNPs. On the basis of these results and other established risk loci, we used the method developed by Park et al. to estimate the number, and the effect size distribution, of BD risk loci that could still be found by GWAS methods. We estimate that >63,000 case-control samples would be needed to identify the ∼105 BD risk loci discoverable by GWAS, and that these will together explain <6% of the inherited risk. These results support previous GWAS findings and identify three new candidate genes for BD. Further studies are needed to replicate these findings and may potentially lead to identification of functional variants. Sample size will remain a limiting factor in the discovery of common alleles associated with BD.

  14. Identifying selectively important amino acid positions associated with alternative habitat environments in fish mitochondrial genomes.

    Science.gov (United States)

    Xia, Jun Hong; Li, Hong Lian; Zhang, Yong; Meng, Zi Ning; Lin, Hao Ran

    2018-05-01

    Fish species inhabiting seawater (SW) or freshwater (FW) habitats have to develop genetic adaptations to alternative environmental factors, especially salinity. Functional consequences of the protein variations associated with habitat environments in fish mitochondrial genomes have not yet received much attention. We analyzed 829 complete fish mitochondrial genomes and compared the amino acid differences of 13 mitochondrial protein families between FW and SW fish groups. We identified 47 specificity determining sites (SDS) that are associated with FW or SW environments from 12 mitochondrial protein families. Thirty-two (68%) of the SDS are hydrophobic, 13 (28%) are neutral, and the remaining sites are acidic or basic. Seven of those SDS from ND1, ND2 and ND5 were scored as probably damaging to the protein structures. Furthermore, phylogenetic-tree-based Bayes Empirical Bayes analysis also detected 63 positive sites associated with alternative habitat environments across ten mtDNA proteins. These signatures could be important for studying mitochondrial genetic variation relevant to fish physiology and ecology.

  15. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  16. Population-level genomics identifies the emergence and global spread of a human transmissible multidrug-resistant nontuberculous mycobacterium

    Science.gov (United States)

    Rodriguez-Rincon, Daniela; Everall, Isobel; Brown, Karen P; Moreno, Pablo; Verma, Deepshikha; Hill, Emily; Drijkoningen, Judith; Gilligan, Peter; Esther, Charles R; Noone, Peadar G; Giddings, Olivia; Bell, Scott C.; Thomson, Rachel; Wainwright, Claire E.; Coulter, Chris; Pandey, Sushil; Wood, Michelle E; Stockwell, Rebecca E; Ramsay, Kay A; Sherrard, Laura J; Kidd, Timothy J; Jabbour, Nassib; Johnson, Graham R; Knibbs, Luke D; Morawska, Lidia; Sly, Peter D; Jones, Andrew; Bilton, Diana; Laurenson, Ian; Ruddy, Michael; Bourke, Stephen; Bowler, Ian CJW; Chapman, Stephen J; Clayton, Andrew; Cullen, Mairi; Daniels, Thomas; Dempsey, Owen; Denton, Miles; Desai, Maya; Drew, Richard J; Edenborough, Frank; Evans, Jason; Folb, Jonathan; Humphrey, Helen; Isalska, Barbara; Jensen-Fangel, Søren; Jönsson, Bodil; Jones, Andrew M.; Katzenstein, Terese L; Lillebaek, Troels; MacGregor, Gordon; Mayell, Sarah; Millar, Michael; Modha, Deborah; Nash, Edward F; O’Brien, Christopher; O’Brien, Deirdre; Ohri, Chandra; Pao, Caroline S; Peckham, Daniel; Perrin, Felicity; Perry, Audrey; Pressler, Tania; Prtak, Laura; Qvist, Tavs; Robb, Ali; Rodgers, Helen; Schaffer, Kirsten; Shafi, Nadia; van Ingen, Jakko; Walshaw, Martin; Watson, Danie; West, Noreen; Whitehouse, Joanna; Haworth, Charles S; Harris, Simon R; Ordway, Diane; Parkhill, Julian; Floto, R. Andres

    2016-01-01

    Lung infections with Mycobacterium abscessus, a species of multidrug resistant nontuberculous mycobacteria, are emerging as an important global threat to individuals with cystic fibrosis (CF) where they accelerate inflammatory lung damage leading to increased morbidity and mortality. Previously, M. abscessus was thought to be independently acquired by susceptible individuals from the environment. However, using whole genome analysis of a global collection of clinical isolates, we show that the majority of M. abscessus infections are acquired through transmission, potentially via fomites and aerosols, of recently emerged dominant circulating clones that have spread globally. We demonstrate that these clones are associated with worse clinical outcomes, show increased virulence in cell-based and mouse infection models, and thus represent an urgent international infection challenge. PMID:27846606

  17. The mitochondrial genome of Frankliniella intonsa: insights into the evolution of mitochondrial genomes at lower taxonomic levels in Thysanoptera.

    Science.gov (United States)

    Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin

    2014-10-01

    Thrips are an ideal group for studying the evolution of mitochondrial (mt) genomes at the genus and family levels because of independent rearrangements within this order. In this study, the complete mitochondrial DNA (mtDNA) sequence of the flower thrips Frankliniella intonsa was determined and annotated. The circular genome is 15,215 bp long, has an A+T content of 75.9%, and contains the typical 37 genes and three putative control regions (CRs). Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew, which is reflected in the nucleotide composition and in codon and amino acid usage. Although the known thrips have massive gene rearrangements, no reversal of strand asymmetry was observed. Gene rearrangements have been found at the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of the mt genomes of all three thrips species differ massively from that of the ancestral insect, they are all very similar to each other, indicating that a large rearrangement occurred before the most recent common ancestor of these three species and that very little genomic evolution or rearrangement has occurred since. The extremely similar sequences among the CRs suggest that they are undergoing concerted evolution. Analyses of the sequences upstream and downstream of the CRs reveal that CR2 is the ancestral CR. The three CRs occupy the same positions in each of the three thrips mt genomes, which share identical inverted genes. These characteristics may have been inherited from the most recent common ancestor of these three thrips. These observations suggest that the mt genomes of the three thrips retain a single massive rearrangement from their common ancestor and have low evolutionary rates. Copyright © 2014 Elsevier Inc. All rights reserved.
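
    The composition statistics quoted in this abstract (A+T content and strand skew) can be computed directly from a sequence. The sketch below is illustrative only: the function names are hypothetical, the skew formula is the conventional (X - Y) / (X + Y) definition, which the record itself does not spell out, and the toy sequences are not from F. intonsa.

```python
def at_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are A or T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def skew(seq: str, first: str, second: str) -> float:
    """Conventional skew (first - second) / (first + second).

    skew(seq, "G", "C") gives GC skew; skew(seq, "A", "T") gives AT skew.
    Returns 0.0 when neither base occurs.
    """
    seq = seq.upper()
    n_first, n_second = seq.count(first), seq.count(second)
    if n_first + n_second == 0:
        return 0.0
    return (n_first - n_second) / (n_first + n_second)
```

    A gene on one strand with positive GC skew and a gene on the other strand with negative GC skew would be an instance of the "opposite" skew the abstract describes.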

  18. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants.

    Science.gov (United States)

    Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; Wietze van der Veen, J P; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R; Santorico, Stephanie A; Spritz, Richard A

    2016-11-01

    Vitiligo is an autoimmune disease in which depigmented skin results from the destruction of melanocytes, with epidemiological association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1 and GWAS2), we identified 27 vitiligo susceptibility loci in patients of European ancestry. We carried out a third GWAS (GWAS3) in European-ancestry subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new significantly associated loci and 7 suggestive loci. Most encode immune and apoptotic regulators, with some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some of which corresponds to expression quantitative trait loci (eQTLs) at these loci. Together, the identified genes provide a framework for the genetic architecture and pathobiology of vitiligo, highlight relationships with other autoimmune diseases and melanoma, and offer potential targets for treatment.
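
    Combining per-study association estimates, as in the meta-analysis of three GWAS described above, is often done with inverse-variance weighting. The record does not say which meta-analysis model the authors used, so the following is a generic fixed-effect sketch with a hypothetical function name, not the study's method.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis of one variant.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z_score).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal length")
    weights = [1.0 / se ** 2 for se in ses]      # precision of each study
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se
```

    Studies with smaller standard errors (typically larger samples) contribute more to the pooled estimate, which is why adding cohorts can push a locus past the genome-wide significance threshold.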

  19. Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles

    Directory of Open Access Journals (Sweden)

    Farshad Farshidfar

    2017-03-01

    Cholangiocarcinoma (CCA) is an aggressive malignancy of the bile ducts, with poor prognosis and limited treatment options. Here, we describe the integrated analysis of somatic mutations, RNA expression, copy number, and DNA methylation by The Cancer Genome Atlas of a set of predominantly intrahepatic CCA cases and propose a molecular classification scheme. We identified an IDH mutant-enriched subtype with distinct molecular features including low expression of chromatin modifiers, elevated expression of mitochondrial genes, and increased mitochondrial DNA copy number. Leveraging the multi-platform data, we observed that ARID1A exhibited DNA hypermethylation and decreased expression in the IDH mutant subtype. More broadly, we found that IDH mutations are associated with an expanded histological spectrum of liver tumors with molecular features that stratify with CCA. Our studies reveal insights into the molecular pathogenesis and heterogeneity of cholangiocarcinoma and provide classification information of potential therapeutic significance.

  20. A genome-wide association analysis of a broad psychosis phenotype identifies three loci for further investigation

    OpenAIRE

    Psychosis Endophenotypes International Consortium; Wellcome Trust Case-Control Consortium; Bramon, E.; Pirinen, M.; Strange, A.; Lin, K.; Freeman, C.; Bellenguez, C.; Su, Z.; Band, G.; Pearson, R.; Vukcevic, D.; Langford, C.; Deloukas, P.; Hunt, S.

    2014-01-01

    BACKGROUND: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. METHODS: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 69...

  1. A Genome-wide Association Analysis of a Broad Psychosis Phenotype Identifies Three Loci for Further Investigation

    OpenAIRE

    Tosato, Sarah; Myin-germeys, Inez; Barroso, Ines; Bender, Stephan; Giegling, Ina; Arranz, Maria J.; Donnelly, Peter; Bellenguez, Celine; Brown, Matthew A.; Lawrie, Stephen; Kalaydjieva, Luba; Vukcevic, Damjan; Kahn, Rene S.; Dronov, Serge; Walshe, Muriel

    2014-01-01

    Background: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. Methods: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,19...

  2. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Background: Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results: We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties. Conclusion: Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling.
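
    The bootstrap support values reported in this record rest on resampling alignment columns with replacement. As a minimal sketch of that one step (the function name and toy alignment are hypothetical, and tree inference on each replicate is deliberately left out), it might look like this:

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a sequence alignment.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths)
    rng:       a random.Random instance, passed in for reproducibility
    Columns are drawn with replacement; the same draw is applied to every
    taxon so each resampled column stays intact across taxa.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][i] for i in columns)
        for name in names
    }
```

    Support for a node is then the fraction of replicates whose inferred tree contains that node; for example, a node recovered in 95 of 100 replicate trees has 95% bootstrap support.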

  3. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins, suggesting differences in phenotype, were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.

  4. Genome-level homology and phylogeny of Shewanella (Gammaproteobacteria: Alteromonadales: Shewanellaceae)

    Directory of Open Access Journals (Sweden)

    Dikow Rebecca B

    2011-05-01

    Background: The explosion in availability of whole genome data provides the opportunity to build phylogenetic hypotheses based on these data as well as the ability to learn more about the genomes themselves. The biological history of genes and genomes can be investigated based on the taxonomic history provided by the phylogeny. A phylogenetic hypothesis based on complete genome data is presented for the genus Shewanella (Gammaproteobacteria: Alteromonadales: Shewanellaceae). Nineteen taxa from Shewanella (16 species and 3 additional strains of one species) as well as three outgroup species representing the genera Aeromonas (Gammaproteobacteria: Aeromonadales: Aeromonadaceae), Alteromonas (Gammaproteobacteria: Alteromonadales: Alteromonadaceae) and Colwellia (Gammaproteobacteria: Alteromonadales: Colwelliaceae) are included for a total of 22 taxa. Results: Putatively homologous regions were found across unannotated genomes and tested with a phylogenetic analysis. Two genome-wide data-sets are considered, one including only those genomic regions for which all taxa are represented, which included 3,361,015 aligned nucleotide base-pairs (bp), and a second that additionally includes those regions present in only subsets of taxa, which totaled 12,456,624 aligned bp. Alignment columns in these large data-sets were then randomly sampled to create smaller data-sets. After the phylogenetic hypothesis was generated, genome annotations were projected onto the DNA sequence alignment to compare the historical hypothesis generated by the phylogeny with the functional hypothesis posited by annotation. Conclusions: Individual phylogenetic analyses of the 243 locally co-linear genome regions all failed to recover the genome topology, but the smaller data-sets that were random samplings of the large concatenated alignments all produced the genome topology. 
It is shown that there is not a single orthologous copy of 16S rRNA across the taxon sampling included in this

  5. A genome-wide screen identifies conserved protein hubs required for cadherin-mediated cell–cell adhesion

    Science.gov (United States)

    Toret, Christopher P.; D’Ambrosio, Michael V.; Vale, Ronald D.; Simon, Michael A.

    2014-01-01

    Cadherins and associated catenins provide an important structural interface between neighboring cells, the actin cytoskeleton, and intracellular signaling pathways in a variety of cell types throughout the Metazoa. However, the full inventory of the proteins and pathways required for cadherin-mediated adhesion has not been established. To this end, we completed a genome-wide (∼14,000 genes) ribonucleic acid interference (RNAi) screen that targeted Ca2+-dependent adhesion in DE-cadherin–expressing Drosophila melanogaster S2 cells in suspension culture. This novel screen eliminated Ca2+-independent cell–cell adhesion, integrin-based adhesion, cell spreading, and cell migration. We identified 17 interconnected regulatory hubs, based on protein functions and protein–protein interactions that regulate the levels of the core cadherin–catenin complex and coordinate cadherin-mediated cell–cell adhesion. Representative proteins from these hubs were analyzed further in Drosophila oogenesis, using targeted germline RNAi, and adhesion was analyzed in Madin–Darby canine kidney mammalian epithelial cell–cell adhesion. These experiments reveal roles for a diversity of cellular pathways that are required for cadherin function in Metazoa, including cytoskeleton organization, cell–substrate interactions, and nuclear and cytoplasmic signaling. PMID:24446484

  6. Integration of genomic, transcriptomic and proteomic data identifies two biologically distinct subtypes of invasive lobular breast cancer.

    Science.gov (United States)

    Michaut, Magali; Chin, Suet-Feung; Majewski, Ian; Severson, Tesa M; Bismeijer, Tycho; de Koning, Leanne; Peeters, Justine K; Schouten, Philip C; Rueda, Oscar M; Bosma, Astrid J; Tarrant, Finbarr; Fan, Yue; He, Beilei; Xue, Zheng; Mittempergher, Lorenza; Kluin, Roelof J C; Heijmans, Jeroen; Snel, Mireille; Pereira, Bernard; Schlicker, Andreas; Provenzano, Elena; Ali, Hamid Raza; Gaber, Alexander; O'Hurley, Gillian; Lehn, Sophie; Muris, Jettie J F; Wesseling, Jelle; Kay, Elaine; Sammut, Stephen John; Bardwell, Helen A; Barbet, Aurélie S; Bard, Floriane; Lecerf, Caroline; O'Connor, Darran P; Vis, Daniël J; Benes, Cyril H; McDermott, Ultan; Garnett, Mathew J; Simon, Iris M; Jirström, Karin; Dubois, Thierry; Linn, Sabine C; Gallagher, William M; Wessels, Lodewyk F A; Caldas, Carlos; Bernards, Rene

    2016-01-05

    Invasive lobular carcinoma (ILC) is the second most frequently occurring histological breast cancer subtype after invasive ductal carcinoma (IDC), accounting for around 10% of all breast cancers. The molecular processes that drive the development of ILC are still largely unknown. We have performed a comprehensive genomic, transcriptomic and proteomic analysis of a large ILC patient cohort and present here an integrated molecular portrait of ILC. Mutations in CDH1 and in the PI3K pathway are the most frequent molecular alterations in ILC. We identified two main subtypes of ILCs: (i) an immune related subtype with mRNA up-regulation of PD-L1, PD-1 and CTLA-4 and greater sensitivity to DNA-damaging agents in representative cell line models; (ii) a hormone related subtype, associated with Epithelial to Mesenchymal Transition (EMT), and gain of chromosomes 1q and 8q and loss of chromosome 11q. Using the somatic mutation rate and eIF4B protein level, we identified three groups with different clinical outcomes, including a group with extremely good prognosis. We provide a comprehensive overview of the molecular alterations driving ILC and have explored links with therapy response. This molecular characterization may help to tailor treatment of ILC through the application of specific targeted, chemo- and/or immune-therapies.

  7. Molluscan Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Simison, W. Brian; Boore, Jeffrey L.

    2005-12-01

    In the last 20 years there have been dramatic advances in techniques of high-throughput DNA sequencing, most recently accelerated by the Human Genome Project, a program that has determined the three billion base pair code on which we are based. Now this tremendous capability is being directed at other genome targets that are being sampled across the broad range of life. This opens up opportunities as never before for evolutionary and organismal biologists to address questions of both processes and patterns of organismal change. We stand at the dawn of a new 'modern synthesis' period, paralleling that of the early 20th century when the fledgling field of genetics first identified the underlying basis for Darwin's theory. We must now unite the efforts of systematists, paleontologists, mathematicians, computer programmers, molecular biologists, developmental biologists, and others in the pursuit of discovering what genomics can teach us about the diversity of life. Genome-level sampling for mollusks to date has mostly been limited to mitochondrial genomes and it is likely that these will continue to provide the best targets for broad phylogenetic sampling in the near future. However, we are just beginning to see an inroad into complete nuclear genome sequencing, with several mollusks and other eutrochozoans having been selected for work about to begin. Here, we provide an overview of the state of molluscan mitochondrial genomics, highlight a few of the discoveries from this research, outline the promise of broadening this dataset, describe upcoming projects to sequence whole mollusk nuclear genomes, and challenge the community to prepare for making the best use of these data.

  8. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants

    NARCIS (Netherlands)

    Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M.; Ben, Songtao; Brownson, Kelly M.; Holland, Paulene J.; Birlea, Stanca A.; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M.; Wolkerstorfer, Albert; Wietze van der Veen, J. P.; Bennett, Dorothy C.; Taïeb, Alain; Ezzedine, Khaled; Kemp, E. Helen; Gawkrodger, David J.; Weetman, Anthony P.; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R.; McCormack, Wayne T.; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B.; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W.; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R.; Santorico, Stephanie A.; Spritz, Richard A.

    2016-01-01

    Vitiligo is an autoimmune disease in which depigmented skin results from the destruction of melanocytes, with epidemiological association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1 and GWAS2), we identified 27 vitiligo susceptibility loci in

  9. A Large-Scale Multi-ancestry Genome-wide Study Accounting for Smoking Behavior Identifies Multiple Significant Loci for Blood Pressure

    NARCIS (Netherlands)

    Sung, Yun J.; Winkler, Thomas W.; de las Fuentes, Lisa; Bentley, Amy R.; Brown, Michael R.; Kraja, Aldi T.; Schwander, Karen; Ntalla, Ioanna; Guo, Xiuqing; Franceschini, Nora; Lu, Yingchang; Cheng, Ching-Yu; Sim, Xueling; Vojinovic, Dina; Marten, Jonathan; Musani, Solomon K.; Li, Changwei; Feitosa, Mary F.; Kilpelainen, Tuomas O.; Richard, Melissa A.; Noordam, Raymond; Aslibekyan, Stella; Aschard, Hugues; Bartz, Traci M.; Dorajoo, Rajkumar; Liu, Yongmei; Manning, Alisa K.; Rankinen, Tuomo; Smith, Albert Vernon; Tajuddin, Salman M.; Tayo, Bamidele O.; Warren, Helen R.; Zhao, Wei; Zhou, Yanhua; Matoba, Nana; Sofer, Tamar; Alver, Maris; Amini, Marzyeh; Boissel, Mathilde; Chai, Jin Fang; Chen, Xu; Divers, Jasmin; Gandin, Ilaria; Gao, Chuan; Giulianini, Franco; Goel, Anuj; Harris, Sarah E.; Hartwig, Fernando Pires; Horimoto, Andrea R. V. R.; Hsu, Fang-Chi; Jackson, Anne U.; Kahonen, Mika; Kasturiratne, Anuradhani; Kuhnel, Brigitte; Leander, Karin; Lee, Wen-Jane; Lin, Keng-Hung; Luan, Jian' an; McKenzie, Colin A.; He Meian,; Nelson, Christopher P.; Rauramaa, Rainer; Schupf, Nicole; Scott, Robert A.; Sheu, Wayne H. H.; Stancakova, Alena; Takeuchi, Fumihiko; van der Most, Peter J.; Varga, Tibor V.; Wang, Heming; Wang, Yajuan; Ware, Erin B.; Weiss, Stefan; Wen, Wanqing; Yanek, Lisa R.; Zhang, Weihua; Zhao, Jing Hua; Afaq, Saima; Alfred, Tamuno; Amin, Najaf; Arking, Dan; Aung, Tin; Barr, R. Graham; Bielak, Lawrence F.; Boerwinkle, Eric; Bottinger, Erwin P.; Braund, Peter S.; Brody, Jennifer A.; Broeckel, Ulrich; Cabrera, Claudia P.; Cade, Brian; Yu Caizheng,; Campbell, Archie; Canouil, Mickael; Chakravarti, Aravinda; Chauhan, Ganesh; Christensen, Kaare; Cocca, Massimiliano; Collins, Francis S.; Connell, John M.; de Mutsert, Renee; de Silva, H. 
Janaka; Debette, Stephanie; Dorr, Marcus; Duan, Qing; Eaton, Charles B.; Ehret, Georg; Evangelou, Evangelos; Faul, Jessica D.; Fisher, Virginia A.; Forouhi, Nita G.; Franco, Oscar H.; Friedlander, Yechiel; Gao, He; Gigante, Bruna; Graff, Misa; Gu, C. Charles; Gu, Dongfeng; Gupta, Preeti; Hagenaars, Saskia P.; Harris, Tamara B.; He, Jiang; Heikkinen, Sami; Heng, Chew-Kiat; Hirata, Makoto; Hofman, Albert; Howard, Barbara V.; Hunt, Steven; Irvin, Marguerite R.; Jia, Yucheng; Joehanes, Roby; Justice, Anne E.; Katsuya, Tomohiro; Kaufman, Joel; Kerrison, Nicola D.; Khor, Chiea Chuen; Koh, Woon-Puay; Koistinen, Heikki A.; Komulainen, Pirjo; Kooperberg, Charles; Krieger, Jose E.; Kubo, Michiaki; Kuusisto, Johanna; Langefeld, Carl D.; Langenberg, Claudia; Launer, Lenore J.; Lehne, Benjamin; Lewis, Cora E.; Li, Yize; Lim, Sing Hui; Lin, Shiow; Liu, Ching-Ti; Liu, Jianjun; Liu, Jingmin; Liu, Kiang; Liu, Yeheng; Loh, Marie; Lohman, Kurt K.; Long, Jirong; Louie, Tin; Magi, Reedik; Mahajan, Anubha; Meitinger, Thomas; Metspalu, Andres; Milani, Lili; Momozawa, Yukihide; Morris, Andrew P.; Mosley, Thomas H.; Munson, Peter; Murray, Alison D.; Nalls, Mike A.; Nasri, Ubaydah; Norris, Jill M.; North, Kari; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R.; Palmer, Nicholette D.; Pankow, James S.; Pedersen, Nancy L.; Peters, Annette; Peyser, Patricia A.; Polasek, Ozren; Raitakari, Olli T.; Renstrom, Frida; Rice, Treva K.; Ridker, Paul M.; Robino, Antonietta; Robinson, Jennifer G.; Rose, Lynda M.; Rudan, Igor; Sabanayagam, Charumathi; Salako, Babatunde L.; Sandow, Kevin; Schmidt, Carsten O.; Schreiner, Pamela J.; Scott, William R.; Seshadri, Sudha; Sever, Peter; Sitlani, Colleen M.; Smith, Jennifer A.; Snieder, Harold; Starr, John M.; Strauch, Konstantin; Tang, Hua; Taylor, Kent D.; Teo, Yik Ying; Tham, Yih Chung; Ultterlinden, Andre G.; Waldenberger, Melanie; Wang, Lihua; Wang, Ya X.; Bin Wei, Wen; Williams, Christine; Wilson, Gregory; Wojczynski, Mary K.; Yao, Jie; Yuan, 
Jian-Min; Zonderman, Alan B.; Becker, Diane M.; Boehnke, Michael; Bowden, Donald W.; Chambers, John C.; Chen, Yii-Der Ida; de Faire, Ulf; Deary, Ian J.; Esko, Tonu; Farrall, Martin; Forrester, Terrence; Franks, Paul W.; Freedman, Barry I.; Froguel, Philippe; Gasparini, Paolo; Gieger, Christian; Horta, Bernardo Lessa; Hung, Yi-Jen; Jonas, Jost B.; Kato, Norihiro; Kooner, Jaspal S.; Laakso, Markku; Lehtimaki, Terho; Liang, Kae-Woei; Magnusson, Patrik K. E.; Newman, Anne B.; Oldehinkel, Albertine J.; Pereira, Alexandre C.; Redline, Susan; Rettig, Rainer; Samani, Nilesh J.; Scott, James; Shu, Xiao-Ou; van der Harst, Pim; Wagenknecht, Lynne E.; Wareham, Nicholas J.; Watkins, Hugh; Weir, David R.; Wickremasinghe, Ananda R.; Wu, Tangchun; Zheng, Wei; Kamatani, Yoichiro; Laurie, Cathy C.; Bouchard, Claude; Cooper, Richard S.; Evans, Michele K.; Gudnason, Vilmundur; Kardia, Sharon L. R.; Kritchevsky, Stephen B.; Levy, Daniel; O'Connell, Jeff R.; Psaty, Bruce M.; van Dam, Rob M.; Sims, Mario; Arnett, Donna K.; Mook-Kanamori, Dennis O.; Kelly, Tanika N.; Fox, Ervin R.; Hayward, Caroline; Fornage, Myriam; Rotimi, Charles N.; Province, Michael A.; van Duijn, Cornelia M.; Tai, E. Shyong; Wong, Tien Yin; Loos, Ruth J. F.; Reiner, Alex P.; Rotter, Jerome I.; Zhu, Xiaofeng; Bierut, Laura J.; Gauderman, W. James; Caulfield, Mark J.; Elliott, Paul; Rice, Kenneth; Munroe, Patricia B.; Morrison, Alanna C.; Cupples, L. Adrienne; Rao, Dabeeru C.; Chasman, Daniel I.; Study, Lifelines Cohort

    2018-01-01

    Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed

  10. Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis.

    Science.gov (United States)

    Yang, Fan; Wang, Jiebiao; Pierce, Brandon L; Chen, Lin S

    2017-11-01

    The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTL mechanism is "mediation" by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are "cis-mediators" of trans-eQTLs, including those "cis-hubs" involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types. © 2017 Yang et al.; Published by Cold Spring Harbor Laboratory Press.
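
    The basic idea of cis-mediation can be illustrated with a plain regression comparison: if a SNP affects a distant gene only through a local transcript, the SNP's effect on the distant gene should shrink toward zero once the local transcript is adjusted for. The sketch below is not GMAC (it performs no confounder selection at all); the function names, effect sizes, and simulated data are hypothetical and chosen only to show the pattern.

```python
import random
from statistics import fmean


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def residuals(y, x):
    """Residuals of y after regressing it on x (with intercept)."""
    b, mx, my = slope(x, y), fmean(x), fmean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def mediation_effects(snp, cis, trans):
    """Return (total, direct, mediated) effects of snp on trans.

    total:    slope from trans ~ snp
    direct:   snp coefficient from trans ~ snp + cis, obtained by
              residualizing both snp and trans on cis
    mediated: total - direct
    """
    total = slope(snp, trans)
    direct = slope(residuals(snp, cis), residuals(trans, cis))
    return total, direct, total - direct


def simulate_full_mediation(n, seed):
    """Toy data where the SNP acts on the trans gene only via the cis gene."""
    rng = random.Random(seed)
    snp = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
    cis = [1.0 * g + rng.gauss(0, 1) for g in snp]
    trans = [0.8 * c + rng.gauss(0, 1) for c in cis]
    return snp, cis, trans
```

    Calling `mediation_effects(*simulate_full_mediation(2000, 0))` should give a total effect near 0.8 and a direct effect near zero. In real expression data, unmeasured confounders of the cis and trans transcripts can distort this comparison, which is the problem the abstract says GMAC is designed to address.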

  11. Genomic Signatures of Reinforcement

    Directory of Open Access Journals (Sweden)

    Austin G. Garner

    2018-04-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved.

  12. Genomic Signatures of Reinforcement

    Science.gov (United States)

    Goulet, Benjamin E.

    2018-01-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved. PMID:29614048

  13. PAPA: a flexible tool for identifying pleiotropic pathways using genome-wide association study summaries.

    Science.gov (United States)

    Wen, Yan; Wang, Wenyu; Guo, Xiong; Zhang, Feng

    2016-03-15

    Pleiotropy is common in the genetic architectures of complex diseases. To the best of our knowledge, no analysis tool has yet been developed for identifying pleiotropic pathways using multiple genome-wide association study (GWAS) summaries. Here, we present PAPA, a flexible tool for pleiotropic pathway analysis utilizing GWAS summary results. The performance of PAPA was validated using publicly available GWAS summaries of body mass index and waist-hip ratio of the GIANT datasets. PAPA identified a set of pleiotropic pathways, which have been demonstrated to be involved in the development of obesity. The PAPA program, documentation and an illustrative example are available at http://sourceforge.net/projects/papav1/files/. Contact: fzhxjtu@mail.xjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Determination of Elizabethkingia Diversity by MALDI-TOF Mass Spectrometry and Whole-Genome Sequencing

    DEFF Research Database (Denmark)

    Eriksen, Helle Brander; Gumpert, Heidi; Faurholt, Cecilie Haase

    2017-01-01

    In a hospital-acquired infection with multidrug-resistant Elizabethkingia, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry and 16S rRNA gene analysis identified the pathogen as Elizabethkingia miricola. Whole-genome sequencing, genus-level core genome analysis, and in...

  15. Genetic variants associated with subjective well-being, depressive symptoms and neuroticism identified through genome-wide analyses

    Science.gov (United States)

    Derringer, Jaime; Gratten, Jacob; Lee, James J; Liu, Jimmy Z; de Vlaming, Ronald; Ahluwalia, Tarunveer S; Buchwald, Jadwiga; Cavadino, Alana; Frazier-Wood, Alexis C; Davies, Gail; Furlotte, Nicholas A; Garfield, Victoria; Geisel, Marie Henrike; Gonzalez, Juan R; Haitjema, Saskia; Karlsson, Robert; van der Laan, Sander W; Ladwig, Karl-Heinz; Lahti, Jari; van der Lee, Sven J; Miller, Michael B; Lind, Penelope A; Liu, Tian; Matteson, Lindsay; Mihailov, Evelin; Minica, Camelia C; Nolte, Ilja M; Mook-Kanamori, Dennis O; van der Most, Peter J; Oldmeadow, Christopher; Qian, Yong; Raitakari, Olli; Rawal, Rajesh; Realo, Anu; Rueedi, Rico; Schmidt, Börge; Smith, Albert V; Stergiakouli, Evie; Tanaka, Toshiko; Taylor, Kent; Thorleifsson, Gudmar; Wedenoja, Juho; Wellmann, Juergen; Westra, Harm-Jan; Willems, Sara M; Zhao, Wei; Amin, Najaf; Bakshi, Andrew; Bergmann, Sven; Bjornsdottir, Gyda; Boyle, Patricia A; Cherney, Samantha; Cox, Simon R; Davis, Oliver S P; Ding, Jun; Direk, Nese; Eibich, Peter; Emeny, Rebecca T; Fatemifar, Ghazaleh; Faul, Jessica D; Ferrucci, Luigi; Forstner, Andreas J; Gieger, Christian; Gupta, Richa; Harris, Tamara B; Harris, Juliette M; Holliday, Elizabeth G; Hottenga, Jouke-Jan; De Jager, Philip L; Kaakinen, Marika A; Kajantie, Eero; Karhunen, Ville; Kolcic, Ivana; Kumari, Meena; Launer, Lenore J; Franke, Lude; Li-Gao, Ruifang; Liewald, David C; Koini, Marisa; Loukola, Anu; Marques-Vidal, Pedro; Montgomery, Grant W; Mosing, Miriam A; Paternoster, Lavinia; Pattie, Alison; Petrovic, Katja E; Pulkki-Råback, Laura; Quaye, Lydia; Räikkönen, Katri; Rudan, Igor; Scott, Rodney J; Smith, Jennifer A; Sutin, Angelina R; Trzaskowski, Maciej; Vinkhuyzen, Anna E; Yu, Lei; Zabaneh, Delilah; Attia, John R; Bennett, David A; Berger, Klaus; Bertram, Lars; Boomsma, Dorret I; Snieder, Harold; Chang, Shun-Chiao; Cucca, Francesco; Deary, Ian J; van Duijn, Cornelia M; Eriksson, Johan G; Bültmann, Ute; de Geus, Eco J C; Groenen, Patrick J F; Gudnason, Vilmundur; Hansen, 
Torben; Hartman, Catharine A; Haworth, Claire M A; Hayward, Caroline; Heath, Andrew C; Hinds, David A; Hyppönen, Elina; Iacono, William G; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L R; Keltikangas-Järvinen, Liisa; Kraft, Peter; Kubzansky, Laura D; Lehtimäki, Terho; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; Metspalu, Andres; Mills, Melinda; de Mutsert, Renée; Oldehinkel, Albertine J; Pasterkamp, Gerard; Pedersen, Nancy L; Plomin, Robert; Polasek, Ozren; Power, Christine; Rich, Stephen S; Rosendaal, Frits R; den Ruijter, Hester M; Schlessinger, David; Schmidt, Helena; Svento, Rauli; Schmidt, Reinhold; Alizadeh, Behrooz Z; Sørensen, Thorkild I A; Spector, Tim D; Starr, John M; Stefansson, Kari; Steptoe, Andrew; Terracciano, Antonio; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tiemeier, Henning; Uitterlinden, André G; Vollenweider, Peter; Wagner, Gert G; Weir, David R; Yang, Jian; Conley, Dalton C; Smith, George Davey; Hofman, Albert; Johannesson, Magnus; Laibson, David I; Medland, Sarah E; Meyer, Michelle N; Pickrell, Joseph K; Esko, Tõnu; Krueger, Robert F; Beauchamp, Jonathan P; Koellinger, Philipp D; Benjamin, Daniel J; Bartels, Meike; Cesarini, David

    2016-01-01

    We conducted genome-wide association studies of three phenotypes: subjective well-being (N = 298,420), depressive symptoms (N = 161,460), and neuroticism (N = 170,910). We identified three variants associated with subjective well-being, two with depressive symptoms, and eleven with neuroticism, including two inversion polymorphisms. The two depressive symptoms loci replicate in an independent depression sample. Joint analyses that exploit the high genetic correlations between the phenotypes (|ρ̂| ≈ 0.8) strengthen the overall credibility of the findings, and allow us to identify additional variants. Across our phenotypes, loci regulating expression in central nervous system and adrenal/pancreas tissues are strongly enriched for association. PMID:27089181

  16. Phylogenetic Conflict in Bears Identified by Automated Discovery of Transposable Element Insertions in Low-Coverage Genomes

    Science.gov (United States)

    Gallus, Susanne; Janke, Axel

    2017-01-01

    Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. PMID:28985298

  17. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available Background: Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results: We report the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource, along with linkage mapping and the existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion: BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that while the level of conservation within local genomic regions is high, a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  18. Genome chaos: survival strategy during crisis.

    Science.gov (United States)

    Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

    2014-01-01

    Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability to observe its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress-induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.

  19. Family genome browser: visualizing genomes with pedigree information.

    Science.gov (United States)

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes at both the individual level and the variant level effectively, by integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibrium, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search for de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  20. Resource base influences genome-wide DNA methylation levels in wild baboons (Papio cynocephalus)

    Science.gov (United States)

    Lea, Amanda J.; Altmann, Jeanne; Alberts, Susan C.; Tung, Jenny

    2015-01-01

    Variation in resource availability commonly exerts strong effects on fitness-related traits in wild animals. However, we know little about the molecular mechanisms that mediate these effects, or about their persistence over time. To address these questions, we profiled genome-wide whole blood DNA methylation levels in two sets of wild baboons: (i) ‘wild-feeding’ baboons that foraged naturally in a savanna environment and (ii) ‘Lodge’ baboons that had ready access to spatially concentrated human food scraps, resulting in high feeding efficiency and low daily travel distances. We identified 1,014 sites (0.20% of sites tested) that were differentially methylated between wild-feeding and Lodge baboons, providing the first evidence that resource availability shapes the epigenome in a wild mammal. Differentially methylated sites tended to occur in contiguous stretches (i.e., in differentially methylated regions or DMRs), in promoters and enhancers, and near metabolism-related genes, supporting their functional importance in gene regulation. In agreement, reporter assay experiments confirmed that methylation at the largest identified DMR, located in the promoter of a key glycolysis-related gene, was sufficient to causally drive changes in gene expression. Intriguingly, all dispersing males carried a consistent epigenetic signature of their membership in a wild-feeding group, regardless of whether males dispersed into or out of this group as adults. Together, our findings support a role for DNA methylation in mediating ecological effects on phenotypic traits in the wild, and emphasize the dynamic environmental sensitivity of DNA methylation levels across the life course. PMID:26508127

  1. PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

    Science.gov (United States)

    Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

    2016-01-01

    Simple Sequence Repeats (SSRs), or microsatellites, are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 from mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purposes at http://app.bioelm.com/ with a mapping tool which can develop circular maps of a selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
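
    As a rough illustration of what identifying SSRs in a sequence involves, the sketch below scans a DNA string for tandem repeats of short motifs using regular-expression backreferences. It is not the pipeline used to build PineElm_SSRdb, which the abstract does not describe; the minimum repeat counts and the function names are assumptions chosen for the example.

```python
import re

# Assumed thresholds (motif length -> minimum number of tandem copies).
# These are illustrative values, not taken from the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )

def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for k, min_count in sorted(min_repeats.items()):
        # A k-base motif followed by at least (min_count - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" reported as a dinucleotide
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)

# Hypothetical test sequence containing one dinucleotide and one trinucleotide SSR.
example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"
print(find_ssrs(example))
```

    Start positions are 0-based. Dedicated SSR tools also handle imperfect and compound repeats, which this sketch ignores.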

  2. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer.

    OpenAIRE

    Michailidou, Kyriaki; Beesley, Jonathan; Lindstrom, Sara; Canisius, Sander; Dennis, Joe; Lush, Michael J; Maranian, Mel J; Bolla, Manjeet K; Wang, Qin; Shah, Mitulkumar Nandlal; Perkins, Barbara J; Czene, Kamila; Eriksson, Mikael; Darabi, Hatef; Brand, Judith S

    2015-01-01

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ~14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748 breast cancer cases and 18,084 controls together with 46,785 cases and 42,892 controls from 41 studies genotyped on a 211,155-marker custom array (iCOGS). Analyses were restricted to women of Europea...

  3. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer

    OpenAIRE

    Michailidou, Kyriaki; Beesley, Jonathan; Lindstrom, Stephen; Canisius, Sander; Dennis, Joe; Lush, Michael; Maranian, Melanie; Bolla, Manjeet; Wang, Qing; Shah, Mitul; Perkins, Barbara; Czene, Kamila; Eriksson, Mikael; Darabi, Hatef; Brand, Judith S.

    2015-01-01

    Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748 breast cancer cases and 18,084 controls together with 46,785 cases and 42,892 controls from 41 studies genotyped on a 211,155-marker custom array (iCOGS). Analyses were restricted to wome...

  4. Exploring Other Genomes: Bacteria.

    Science.gov (United States)

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  5. Comparative Genomic Hybridization Analysis of Yersinia enterocolitica and Yersinia pseudotuberculosis Identifies Genetic Traits to Elucidate Their Different Ecologies

    Directory of Open Access Journals (Sweden)

    Kaisa Jaakkola

    2015-01-01

    Full Text Available Enteropathogenic Yersinia enterocolitica and Yersinia pseudotuberculosis are both etiological agents for intestinal infection known as yersiniosis, but their epidemiology and ecology bear many differences. Swine are the only known reservoir for Y. enterocolitica 4/O:3 strains, which are the most common cause of human disease, while Y. pseudotuberculosis has been isolated from a variety of sources, including vegetables and wild animals. Infections caused by Y. enterocolitica mainly originate from swine, but fresh produce has been the source for widespread Y. pseudotuberculosis outbreaks within recent decades. A comparative genomic hybridization analysis with a DNA microarray based on three Yersinia enterocolitica and four Yersinia pseudotuberculosis genomes was conducted to shed light on the genomic differences between enteropathogenic Yersinia. The hybridization results identified Y. pseudotuberculosis strains to carry operons linked with the uptake and utilization of substances not found in living animal tissues but present in soil, plants, and rotting flesh. Y. pseudotuberculosis also harbors a selection of type VI secretion systems targeting other bacteria and eukaryotic cells. These genetic traits are not found in Y. enterocolitica, and it appears that while Y. pseudotuberculosis has many tools beneficial for survival in varied environments, the Y. enterocolitica genome is more streamlined and adapted to their preferred animal reservoir.

  6. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA

    Science.gov (United States)

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011
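
    For readers unfamiliar with the statistic, the following sketch shows how an unfolded and a "folded" SFS could be tabulated from a small sample. It assumes biallelic sites coded 0 (ancestral) and 1 (derived) in a list of haplotypes; the function names and data are hypothetical, and nothing here reproduces the identifiability proofs of the paper.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i-1 counts sites where i of the n sequences carry the derived allele."""
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    sfs = [0] * (n - 1)
    for site in range(num_sites):
        derived = sum(h[site] for h in haplotypes)
        if 0 < derived < n:  # monomorphic sites are not part of the spectrum
            sfs[derived - 1] += 1
    return sfs

def fold_sfs(sfs):
    """Folded SFS: combine classes i and n - i, used when the ancestral allele is unknown."""
    n = len(sfs) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i
        folded.append(sfs[i - 1] if i == j else sfs[i - 1] + sfs[j - 1])
    return folded

# Hypothetical sample: 4 haplotypes (rows) at 5 sites (columns).
haplotypes = [
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]

unfolded = site_frequency_spectrum(haplotypes)
folded = fold_sfs(unfolded)
print(unfolded, folded)
```

    The folded spectrum has roughly half as many entries because it records only the minor-allele count at each site.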

  7. Next-generation sampling: Pairing genomics with herbarium specimens provides species-level signal in Solidago (Asteraceae).

    Science.gov (United States)

    Beck, James B; Semple, John C

    2015-06-01

    The ability to conduct species delimitation and phylogeny reconstruction with genomic data sets obtained exclusively from herbarium specimens would rapidly enhance our knowledge of large, taxonomically contentious plant genera. In this study, the utility of genotyping by sequencing is assessed in the notoriously difficult genus Solidago (Asteraceae) by attempting to obtain an informative single-nucleotide polymorphism data set from a set of specimens collected between 1970 and 2010. Reduced representation libraries were prepared and Illumina-sequenced from 95 Solidago herbarium specimen DNAs, and resulting reads were processed with the nonreference Universal Network-Enabled Analysis Kit (UNEAK) pipeline. Multidimensional clustering was used to assess the correspondence between genetic groups and morphologically defined species. Library construction and sequencing were successful in 93 of 95 samples. The UNEAK pipeline identified 8470 single-nucleotide polymorphisms, and a filtered data set was analyzed for each of three Solidago subsections. Although results varied, clustering identified genomic groups that often corresponded to currently recognized species or groups of closely related species. These results suggest that genotyping by sequencing is broadly applicable to DNAs obtained from herbarium specimens. The data obtained and their biological signal suggest that pairing genomics with large-scale herbarium sampling is a promising strategy in species-rich plant groups.

  8. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Full Text Available Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  9. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  10. Comparative genome analysis identifies two large deletions in the genome of highly-passaged attenuated Streptococcus agalactiae strain YM001 compared to the parental pathogenic strain HN016.

    Science.gov (United States)

    Wang, Rui; Li, Liping; Huang, Yan; Luo, Fuguang; Liang, Wanwen; Gan, Xi; Huang, Ting; Lei, Aiying; Chen, Ming; Chen, Lianfu

    2015-11-04

    Streptococcus agalactiae (S. agalactiae), also known as group B Streptococcus (GBS), is an important pathogen for neonatal pneumonia, meningitis, bovine mastitis, and fish meningoencephalitis. The global outbreaks of Streptococcus disease in tilapia cause huge economic losses and threaten human food hygiene safety as well. To investigate the mechanism of S. agalactiae pathogenesis in tilapia and develop an attenuated S. agalactiae vaccine, this study sequenced and comparatively analyzed the whole genomes of virulent wild-type S. agalactiae strain HN016 and its highly-passaged attenuated strain YM001 derived from tilapia. We performed Illumina sequencing of DNA prepared from strains HN016 and YM001. Sequenced reads were assembled, and nucleotide comparisons, single nucleotide polymorphisms (SNPs), and indels were analyzed between the draft genomes of HN016 and YM001. Clustered regularly interspaced short palindromic repeats (CRISPRs) and prophage were detected and analyzed in different S. agalactiae strains. The genome of S. agalactiae YM001 was 2,047,957 bp with a GC content of 35.61%; it contained 2044 genes and 88 RNAs. Meanwhile, the genome of S. agalactiae HN016 was 2,064,722 bp with a GC content of 35.66%; it had 2063 genes and 101 RNAs. Comparative genome analysis indicated that, compared with HN016, the YM001 genome had two significant large deletions of 5832 and 11,116 bp, respectively, resulting in the deletion of three rRNA and ten tRNA genes, as well as the deletion and functional damage of ten genes related to metabolism, transport, growth, anti-stress, etc. Besides these two large deletions, ten other deletions and 28 single nucleotide variations (SNVs) were also identified, mainly affecting the metabolism- and growth-related genes. The genome of attenuated S. agalactiae YM001 showed significant variations, resulting in the deletion of 10 functional genes, compared to the parental pathogenic strain HN016. The deleted and mutated functional genes all

  11. Photobiomodulation effects on mRNA levels from genomic and chromosome stabilization genes in injured muscle.

    Science.gov (United States)

    da Silva Neto Trajano, Larissa Alexsandra; Trajano, Eduardo Tavares Lima; da Silva Sergio, Luiz Philippe; Teixeira, Adilson Fonseca; Mencalha, Andre Luiz; Stumbo, Ana Carolina; de Souza da Fonseca, Adenilson

    2018-04-26

    Muscle injuries are the most prevalent type of injury in sports. Many athletes suffer relapses when muscle injuries are not treated properly. Photobiomodulation therapy is an inexpensive and safe technique with many benefits in muscle injury treatment. However, little has been explored about the infrared laser effects on DNA and telomeres in muscle injuries. Thus, the aim of this study was to evaluate photobiomodulation effects on mRNA relative levels from genes related to telomere and genomic stabilization in injured muscle. Wistar male rats were randomly divided into six groups: control, laser 25 mW, laser 75 mW, injury, injury laser 25 mW, and injury laser 75 mW. Photobiomodulation was performed with 904 nm, 3 J/cm² at 25 or 75 mW. Cryoinjury was induced by two applications of a metal probe cooled in liquid nitrogen directly on the tibialis anterior muscle. After euthanasia, skeletal muscle samples were withdrawn and total RNA extracted for evaluation of mRNA levels from genomic (ATM and p53) and chromosome stabilization (TRF1 and TRF2) genes by real-time quantitative polymerase chain reaction. Data show that photobiomodulation reduces the mRNA levels from ATM and p53, as well as mRNA levels from TRF1 and TRF2, at 25 and 75 mW in injured skeletal muscle. In conclusion, photobiomodulation alters mRNA relative levels from genes related to genomic and telomere stabilization in injured skeletal muscle.

  12. Genome-wide association study to identify common variants associated with brachial circumference: a meta-analysis of 14 cohorts.

    Directory of Open Access Journals (Sweden)

    Vesna Boraska

    Brachial circumference (BC), also known as upper arm or mid arm circumference, can be used as an indicator of muscle mass and fat tissue, which are distributed differently in men and women. Analysis of anthropometric measures of peripheral fat distribution such as BC could help in understanding the complex pathophysiology behind overweight and obesity. The purpose of this study is to identify genetic variants associated with BC through a large-scale genome-wide association scan (GWAS) meta-analysis. We used fixed-effects meta-analysis to synthesise summary results across 14 GWAS discovery and 4 replication cohorts comprising overall 22,376 individuals (12,031 women and 10,345 men) of European ancestry. Individual analyses were carried out for men, women, and combined across sexes using linear regression and an additive genetic model under two adjustments: for age, and for age and BMI. We prioritised signals for follow-up in two stages. We did not detect any signals reaching genome-wide significance. The FTO rs9939609 SNP showed nominal evidence for association (p<0.05) in the age-adjusted strata for men and across both sexes. In this first GWAS meta-analysis for BC to date, we have not identified any genome-wide significant signals and do not observe robust association of previously established obesity loci with BC. Large-scale collaborations will be necessary to achieve higher power to detect loci underlying BC.
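
    The fixed-effects pooling mentioned in the abstract above can be illustrated with a minimal inverse-variance sketch. This is a textbook formulation, not the authors' pipeline; the function name `fixed_effects_meta` and the example effect sizes are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects pooling of per-cohort estimates.

    betas: per-cohort effect sizes; ses: their standard errors (hypothetical inputs).
    Returns (pooled beta, pooled SE, z-score, two-sided normal p-value).
    """
    weights = [1.0 / (se * se) for se in ses]       # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))          # two-sided p under N(0, 1)
    return beta, se, z, p

# Two hypothetical cohorts with equal precision
beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
```

    With equal standard errors the pooled estimate is simply the mean of the cohort estimates; cohorts with smaller standard errors would otherwise dominate the pooled value.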

  13. The role of parasite-driven selection in shaping landscape genomic structure in red grouse (Lagopus lagopus scotica).

    Science.gov (United States)

    Wenzel, Marius A; Douglas, Alex; James, Marianne C; Redpath, Steve M; Piertney, Stuart B

    2016-01-01

    Landscape genomics promises to provide novel insights into how neutral and adaptive processes shape genome-wide variation within and among populations. However, there has been little emphasis on examining whether individual-based phenotype-genotype relationships derived from approaches such as genome-wide association (GWAS) manifest themselves as a population-level signature of selection in a landscape context. The two may prove irreconcilable as individual-level patterns become diluted by high levels of gene flow and complex phenotypic or environmental heterogeneity. We illustrate this issue with a case study that examines the role of the highly prevalent gastrointestinal nematode Trichostrongylus tenuis in shaping genomic signatures of selection in red grouse (Lagopus lagopus scotica). Individual-level GWAS involving 384 SNPs has previously identified five SNPs that explain variation in T. tenuis burden. Here, we examine whether these same SNPs display population-level relationships between T. tenuis burden and genetic structure across a small-scale landscape of 21 sites with heterogeneous parasite pressure. Moreover, we identify adaptive SNPs showing signatures of directional selection using F(ST) outlier analysis and relate population- and individual-level patterns of multilocus neutral and adaptive genetic structure to T. tenuis burden. The five candidate SNPs for parasite-driven selection were neither associated with T. tenuis burden on a population level, nor under directional selection. Similarly, there was no evidence of parasite-driven selection in SNPs identified as candidates for directional selection. We discuss these results in the context of red grouse ecology and highlight the broader consequences for the utility of landscape genomics approaches for identifying signatures of selection. © 2015 John Wiley & Sons Ltd.
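
    As a rough illustration of the F(ST) statistic on which the outlier analysis above is based, the sketch below computes Wright's F_ST for one biallelic SNP from subpopulation allele frequencies. It assumes equal subpopulation weights and is not the outlier method used in the study; the function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus, equal subpopulation weights (assumption).

    freqs: allele frequency of one allele in each subpopulation.
    F_ST = (H_T - H_S) / H_T, with H_T the expected heterozygosity of the pooled
    population and H_S the mean expected heterozygosity within subpopulations.
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    if h_t == 0.0:            # locus is monomorphic overall; F_ST is undefined, report 0
        return 0.0
    return (h_t - h_s) / h_t

fst_value = fst_biallelic([0.2, 0.8])   # strongly differentiated pair of sites
```

    Outlier approaches then ask whether a locus shows higher differentiation than expected under neutrality, which requires a null distribution that this sketch does not model.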

  14. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

    Directory of Open Access Journals (Sweden)

    Surovcik Katharina

    2006-03-01

    Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or, more specifically, pathogenicity or symbiotic islands. Results: We have implemented the program SIGI-HMM that predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of codon usage (CU) of each individual gene of a genome under study. CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working on gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in Java and is publicly available. It accepts as input any file created according to the EMBL format. It generates output in the common GFF format readable for genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were both consistent with annotated GIs and with predictions generated by different methods. Conclusion: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows users to interactively analyze genomes in detail and to generate or to test hypotheses about the origin of acquired

  15. DNA sequence explains seemingly disordered methylation levels in partially methylated domains of Mammalian genomes.

    Directory of Open Access Journals (Sweden)

    Dimos Gaidatzis

    2014-02-01

    For the most part metazoan genomes are highly methylated and harbor only small regions with low or absent methylation. In contrast, partially methylated domains (PMDs), recently discovered in a variety of cell lines and tissues, do not fit this paradigm as they show partial methylation for large portions (20%-40%) of the genome. While in PMDs methylation levels are reduced on average, we found that at single CpG resolution, they show extensive variability along the genome outside of CpG islands and DNase I hypersensitive sites (DHS). Methylation levels range from 0% to 100% in a roughly uniform fashion with only little similarity between neighboring CpGs. A comparison of various PMD-containing methylomes showed that these seemingly disordered states of methylation are strongly conserved across cell types for virtually every PMD. Comparative sequence analysis suggests that DNA sequence is a major determinant of these methylation states. This is further substantiated by a purely sequence based model which can predict 31% (R²) of the variation in methylation. The model revealed CpG density as the main driving feature promoting methylation, opposite to what has been shown for CpG islands, followed by various dinucleotides immediately flanking the CpG and a minor contribution from sequence preferences reflecting nucleosome positioning. Taken together we provide a reinterpretation for the nucleotide-specific methylation levels observed in PMDs, demonstrate their conservation across tissues and suggest that they are mainly determined by specific DNA sequence features.

  16. Genome-Wide Analysis in Three Fusarium Pathogens Identifies Rapidly Evolving Chromosomes and Genes Associated with Pathogenicity

    Science.gov (United States)

    Sperschneider, Jana; Gardiner, Donald M.; Thatcher, Louise F.; Lyons, Rebecca; Singh, Karam B.; Manners, John M.; Taylor, Jennifer M.

    2015-01-01

    Pathogens and hosts are in an ongoing arms race and genes involved in host–pathogen interactions are likely to undergo diversifying selection. Fusarium plant pathogens have evolved diverse infection strategies, but how they interact with their hosts in the biotrophic infection stage remains puzzling. To address this, we analyzed the genomes of three Fusarium plant pathogens for genes that are under diversifying selection. We found a two-speed genome structure both on the chromosome and gene group level. Diversifying selection acts strongly on the dispensable chromosomes in Fusarium oxysporum f. sp. lycopersici and on distinct core chromosome regions in Fusarium graminearum, all of which have associations with virulence. Members of two gene groups evolve rapidly, namely those that encode proteins with an N-terminal [SG]-P-C-[KR]-P sequence motif and proteins that are conserved predominantly in pathogens. Specifically, 29 F. graminearum genes are rapidly evolving, in planta induced and encode secreted proteins, strongly pointing toward effector function. In summary, diversifying selection in Fusarium is strongly reflected as genomic footprints and can be used to predict a small gene set likely to be involved in host–pathogen interactions for experimental verification. PMID:25994930

  17. Genome-level comparisons provide insight into the phylogeny and metabolic diversity of species within the genus Lactococcus.

    Science.gov (United States)

    Yu, Jie; Song, Yuqin; Ren, Yan; Qing, Yanting; Liu, Wenjun; Sun, Zhihong

    2017-11-03

    The genomic diversity of different species within the genus Lactococcus and the relationships between genomic differentiation and environmental factors remain unclear. In this study, type isolates of ten Lactococcus species/subspecies were sequenced to assess their genomic characteristics, metabolic diversity, and phylogenetic relationships. The total genome sizes varied between 1.99 (Lactococcus plantarum) and 2.46 megabases (Mb; L. lactis subsp. lactis), and the G + C content ranged from 34.81 (L. lactis subsp. hordniae) to 39.67% (L. raffinolactis) with an average value of 37.02%. Analysis of genome dynamics indicated that the genus Lactococcus has an open pan-genome, while the core genome size decreased with sequential addition at the genus and species group levels. A phylogenetic dendrogram based on the concatenated amino acid sequences of 643 core genes was largely consistent with the phylogenetic tree obtained by 16S ribosomal RNA (rRNA) genes, but it provided a more robust phylogenetic resolution than the 16S rRNA gene-based analysis. Comparative genomics indicated that species in the genus Lactococcus had high degrees of diversity in genome size, gene content, and carbohydrate metabolism. This may be important for the specific adaptations that allow different Lactococcus species to survive in different environments. These results provide a quantitative basis for understanding the genomic and metabolic diversity within the genus Lactococcus, laying the foundation for future studies on taxonomy and functional genomics.

  18. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  19. A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24

    DEFF Research Database (Denmark)

    Goode, Ellen L; Chenevix-Trench, Georgia; Song, Honglin

    2010-01-01

    Ovarian cancer accounts for more deaths than all other gynecological cancers combined. To identify common low-penetrance ovarian cancer susceptibility genes, we conducted a genome-wide association study of 507,094 SNPs in 1,768 individuals with ovarian cancer (cases) and 2,354 controls, with foll...

  20. Genome-wide association studies in dogs and humans identify ADAMTS20 as a risk variant for cleft lip and palate.

    Science.gov (United States)

    Wolf, Zena T; Brand, Harrison A; Shaffer, John R; Leslie, Elizabeth J; Arzi, Boaz; Willet, Cali E; Cox, Timothy C; McHenry, Toby; Narayan, Nicole; Feingold, Eleanor; Wang, Xioajing; Sliskovic, Saundra; Karmi, Nili; Safra, Noa; Sanchez, Carla; Deleyiannis, Frederic W B; Murray, Jeffrey C; Wade, Claire M; Marazita, Mary L; Bannasch, Danika L

    2015-03-01

    Cleft lip with or without cleft palate (CL/P) is the most commonly occurring craniofacial birth defect. We provide insight into the genetic etiology of this birth defect by performing genome-wide association studies in two species: dogs and humans. In the dog, a genome-wide association study of 7 CL/P cases and 112 controls from the Nova Scotia Duck Tolling Retriever (NSDTR) breed identified a significantly associated region on canine chromosome 27 (unadjusted p = 1.1 × 10⁻¹³; adjusted p = 2.2 × 10⁻³). Further analysis in NSDTR families and additional full sibling cases identified a 1.44 Mb homozygous haplotype (chromosome 27: 9.29 - 10.73 Mb) segregating with a more complex phenotype of cleft lip, cleft palate, and syndactyly (CLPS) in 13 cases. Whole-genome sequencing of 3 CLPS cases and 4 controls at 15X coverage led to the discovery of a frameshift mutation within ADAMTS20 (c.1360_1361delAA (p.Lys453Ilefs*3)), which segregated concordant with the phenotype. In a parallel study in humans, a family-based association analysis (DFAM) of 125 CL/P cases, 420 unaffected relatives, and 392 controls from a Guatemalan cohort identified a suggestive association (rs10785430; p = 2.67 × 10⁻⁶) with the same gene, ADAMTS20. Sequencing of cases from the Guatemalan cohort was unable to identify a causative mutation within the coding region of ADAMTS20, but four coding variants were found in additional cases of CL/P. In summary, this study provides genetic evidence for a role of ADAMTS20 in CL/P development in dogs and as a candidate gene for CL/P development in humans.

  1. Genome-wide association analysis identifies three new risk loci for gout arthritis in Han Chinese

    Science.gov (United States)

    Li, Changgui; Li, Zhiqiang; Liu, Shiguo; Wang, Can; Han, Lin; Cui, Lingling; Zhou, Jingguo; Zou, Hejian; Liu, Zhen; Chen, Jianhua; Cheng, Xiaoyu; Zhou, Zhaowei; Ding, Chengcheng; Wang, Meng; Chen, Tong; Cui, Ying; He, Hongmei; Zhang, Keke; Yin, Congcong; Wang, Yunlong; Xing, Shichao; Li, Baojie; Ji, Jue; Jia, Zhaotong; Ma, Lidan; Niu, Jiapeng; Xin, Ying; Liu, Tian; Chu, Nan; Yu, Qing; Ren, Wei; Wang, Xuefeng; Zhang, Aiqing; Sun, Yuping; Wang, Haili; Lu, Jie; Li, Yuanyuan; Qing, Yufeng; Chen, Gang; Wang, Yangang; Zhou, Li; Niu, Haitao; Liang, Jun; Dong, Qian; Li, Xinde; Mi, Qing-Sheng; Shi, Yongyong

    2015-01-01

    Gout is one of the most common types of inflammatory arthritis, caused by the deposition of monosodium urate crystals in and around the joints. Previous genome-wide association studies (GWASs) have identified many genetic loci associated with raised serum urate concentrations. However, hyperuricemia alone is not sufficient for the development of gout arthritis. Here we conduct a multistage GWAS in Han Chinese using 4,275 male gout patients and 6,272 normal male controls (1,255 cases and 1,848 controls were genome-wide genotyped), with an additional 1,644 hyperuricemic controls. We discover three new risk loci, 17q23.2 (rs11653176, P=1.36 × 10⁻¹³, BCAS3), 9p24.2 (rs12236871, P=1.48 × 10⁻¹⁰, RFX3) and 11p15.5 (rs179785, P=1.28 × 10⁻⁸, KCNQ1), which contain inflammatory candidate genes. Our results suggest that these loci are most likely related to the progression from hyperuricemia to inflammatory gout, which will provide new insights into the pathogenesis of gout arthritis. PMID:25967671

  2. Genome-wide association analysis identifies three new risk loci for gout arthritis in Han Chinese.

    Science.gov (United States)

    Li, Changgui; Li, Zhiqiang; Liu, Shiguo; Wang, Can; Han, Lin; Cui, Lingling; Zhou, Jingguo; Zou, Hejian; Liu, Zhen; Chen, Jianhua; Cheng, Xiaoyu; Zhou, Zhaowei; Ding, Chengcheng; Wang, Meng; Chen, Tong; Cui, Ying; He, Hongmei; Zhang, Keke; Yin, Congcong; Wang, Yunlong; Xing, Shichao; Li, Baojie; Ji, Jue; Jia, Zhaotong; Ma, Lidan; Niu, Jiapeng; Xin, Ying; Liu, Tian; Chu, Nan; Yu, Qing; Ren, Wei; Wang, Xuefeng; Zhang, Aiqing; Sun, Yuping; Wang, Haili; Lu, Jie; Li, Yuanyuan; Qing, Yufeng; Chen, Gang; Wang, Yangang; Zhou, Li; Niu, Haitao; Liang, Jun; Dong, Qian; Li, Xinde; Mi, Qing-Sheng; Shi, Yongyong

    2015-05-13

    Gout is one of the most common types of inflammatory arthritis, caused by the deposition of monosodium urate crystals in and around the joints. Previous genome-wide association studies (GWASs) have identified many genetic loci associated with raised serum urate concentrations. However, hyperuricemia alone is not sufficient for the development of gout arthritis. Here we conduct a multistage GWAS in Han Chinese using 4,275 male gout patients and 6,272 normal male controls (1,255 cases and 1,848 controls were genome-wide genotyped), with an additional 1,644 hyperuricemic controls. We discover three new risk loci, 17q23.2 (rs11653176, P=1.36 × 10⁻¹³, BCAS3), 9p24.2 (rs12236871, P=1.48 × 10⁻¹⁰, RFX3) and 11p15.5 (rs179785, P=1.28 × 10⁻⁸, KCNQ1), which contain inflammatory candidate genes. Our results suggest that these loci are most likely related to the progression from hyperuricemia to inflammatory gout, which will provide new insights into the pathogenesis of gout arthritis.

  3. Genome-wide analysis of histone H3 acetylation patterns in AML identifies PRDX2 as an epigenetically silenced tumor suppressor gene

    DEFF Research Database (Denmark)

    Agrawal-Singh, Shuchi; Isken, Fabienne; Agelopoulos, Konstantin

    2012-01-01

    With the use of ChIP on microarray assays in primary leukemia samples, we report that acute myeloid leukemia (AML) blasts exhibit significant alterations in histone H3 acetylation (H3Ac) levels at > 1000 genomic loci compared with CD34+ progenitor cells. Importantly, core promoter regions tended to have lower H3Ac levels in AML compared with progenitor cells, which suggested that a large number of genes are epigenetically silenced in AML. Intriguingly, we identified peroxiredoxin 2 (PRDX2) as a novel potential tumor suppressor gene in AML. H3Ac was decreased at the PRDX2 gene promoter in AML, which correlated with low mRNA and protein expression. We also observed DNA hypermethylation at the PRDX2 promoter in AML. Low protein expression of the antioxidant PRDX2 gene was clinically associated with poor prognosis in patients with AML. Functionally, PRDX2 acted as inhibitor of myeloid cell...

  4. Genomics using the Assembly of the Mink Genome

    DEFF Research Database (Denmark)

    Guldbrandtsen, Bernt; Cai, Zexi; Sahana, Goutam

    2018-01-01

    The American mink (Neovison vison) genome has recently been sequenced. This opens numerous avenues of research, both for studying the basic genetics and physiology of the mink and for genetic improvement in mink. Using genotyping-by-sequencing (GBS) generated marker data for 2,352 Danish farm mink, runs of homozygosity (ROH) were detected in mink genomes. Detectable ROH made up on average 1.7% of the genome, indicating the presence of at most a moderate level of genomic inbreeding. The fraction of genome regions found in ROH varied. Ten percent of the included regions were never found in ROH. The ability to detect ROH in the mink genome also demonstrates the general reliability of the new mink genome assembly. Keywords: American mink, run of homozygosity, genome, selection, genomic inbreeding...
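
    To make the idea of a run of homozygosity concrete, the following simplified sketch scans an ordered list of markers for one animal and reports stretches of consecutive homozygous calls. It is an illustration only, not the method used in the study: the marker-count threshold `min_len` and both function names are assumptions, and real ROH callers also work with physical length, missing data, and genotyping error.

```python
def find_roh(is_het, min_len):
    """Return (start, end) index pairs, end exclusive, of homozygous runs.

    is_het: sequence of truthy/falsy flags, one per marker in genome order,
            truthy where the genotype call is heterozygous.
    min_len: minimum number of consecutive homozygous markers (assumed threshold).
    """
    runs, start = [], None
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i                      # a homozygous stretch begins
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))        # stretch ended at a heterozygous call
            start = None
    if start is not None and len(is_het) - start >= min_len:
        runs.append((start, len(is_het)))      # stretch reaching the last marker
    return runs

def roh_fraction(is_het, min_len):
    """Fraction of markers that fall inside a detected run."""
    if not is_het:
        return 0.0
    covered = sum(end - start for start, end in find_roh(is_het, min_len))
    return covered / len(is_het)

calls = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0]      # hypothetical marker calls
runs = find_roh(calls, min_len=3)
```

    Averaging such a fraction over animals gives a marker-based analogue of the genome share in ROH reported in the abstract, though the study's figure rests on its own definitions.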

  5. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods: Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow-up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results: Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30% to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions: Our study demonstrated true levels of genetic diversity

  6. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  7. SNP-associations and phenotype predictions from hundreds of microbial genomes without genome alignments.

    Science.gov (United States)

    Hall, Barry G

    2014-01-01

    SNP-association studies are a starting point for identifying genes that may be responsible for specific phenotypes, such as disease traits. The vast bulk of tools for SNP-association studies are directed toward SNPs in the human genome, and I am unaware of any tools designed specifically for such studies in bacterial or viral genomes. The PPFS (Predict Phenotypes From SNPs) package described here is an add-on to kSNP, a program that can identify SNPs in a data set of hundreds of microbial genomes. PPFS identifies those SNPs that are non-randomly associated with a phenotype based on the χ² probability, then uses those diagnostic SNPs for two distinct, but related, purposes: (1) to predict the phenotypes of strains whose phenotypes are unknown, and (2) to identify those diagnostic SNPs that are most likely to be causally related to the phenotype. In the example illustrated here, from a set of 68 E. coli genomes, for 67 of which the pathogenicity phenotype was known, there were 418,500 SNPs. Using the phenotypes of 36 of those strains, PPFS identified 207 diagnostic SNPs. The diagnostic SNPs predicted the phenotypes of all of the genomes with 97% accuracy. It then identified 97 SNPs whose probability of being causally related to the pathogenic phenotype was >0.999. In a second example, from a set of 116 E. coli genome sequences, using the phenotypes of 65 strains PPFS identified 101 SNPs that predicted the source host (human or non-human) with 90% accuracy.
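
    The χ² test of association described above can be sketched for a single SNP as a 2×2 table of allele state against phenotype. This is a generic Pearson χ² test without continuity correction, not PPFS's actual implementation, whose internals are not given in the abstract; the function name `chi2_2x2` and the counts are hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table.

    Table layout (hypothetical counts of strains):
                      phenotype +   phenotype -
        allele X           a             b
        other allele       c             d
    Returns (chi-square statistic, p-value).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                 # an empty row or column: no evidence of association
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    p = math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square with 1 d.f.
    return chi2, p

# Allele X present in all 10 pathogenic strains and in none of 10 non-pathogenic ones
chi2, p = chi2_2x2(10, 0, 0, 10)
```

    A SNP with a small p-value under such a test would be a candidate diagnostic SNP; with hundreds of thousands of SNPs, some correction for multiple testing would normally also be needed.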

  8. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  9. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas

    Directory of Open Access Journals (Sweden)

    Theo A. Knijnenburg

    2018-04-01

    Summary: DNA damage repair (DDR) pathways modulate cancer risk, progression, and therapeutic response. We systematically analyzed somatic alterations to provide a comprehensive view of DDR deficiency across 33 cancer types. Mutations with accompanying loss of heterozygosity were observed in over 1/3 of DDR genes, including TP53 and BRCA1/2. Other prevalent alterations included epigenetic silencing of the direct repair genes EXO5, MGMT, and ALKBH3 in ∼20% of samples. Homologous recombination deficiency (HRD) was present at varying frequency in many cancer types, most notably ovarian cancer. However, in contrast to ovarian cancer, HRD was associated with worse outcomes in several other cancers. Protein structure-based analyses allowed us to predict functional consequences of rare, recurrent DDR mutations. A new machine-learning-based classifier developed from gene expression data allowed us to identify alterations that phenocopy deleterious TP53 mutations. These frequent DDR gene alterations in many human cancers have functional consequences that may determine cancer progression and guide therapy. Knijnenburg et al. present The Cancer Genome Atlas (TCGA) Pan-Cancer analysis of DNA damage repair (DDR) deficiency in cancer. They use integrative genomic and molecular analyses to identify frequent DDR alterations across 33 cancer types, correlate gene- and pathway-level alterations with genome-wide measures of genome instability and impaired function, and demonstrate the prognostic utility of DDR deficiency scores. Keywords: The Cancer Genome Atlas PanCanAtlas project, DNA damage repair, somatic mutations, somatic copy-number alterations, epigenetic silencing, DNA damage footprints, mutational signatures, integrative statistical analysis, protein structure analysis

  10. Functional profiling of cyanobacterial genomes and its role in ecological adaptations

    Directory of Open Access Journals (Sweden)

    Ratna Prabha

    2016-09-01

    With the availability of complete genome sequences of many cyanobacterial species, it is becoming feasible to study environmental adaptation from a broad perspective, along with the overall changes at the transcriptional and translational levels in these organisms. Over the course of evolution, niche-specific competitive forces have resulted in specific features of the cyanobacterial genomes. In this study, the functional composition of 84 different cyanobacterial genomes and their adaptations to different environments were examined by identifying the genomic composition for specific cellular processes, which reflects their genomic functional profile and ecological adaptation. Among cyanobacterial genomes, metabolic genes had a major share over other categories, and differentiation of the genomic functional profile was observed for species inhabiting different habitats. The cyanobacteria of freshwater and other habitats accumulate a large number of poorly characterized genes. Strain-specific functions were also reported in many cyanobacterial members, of which an important feature was the occurrence of phage-related sequences. From this study, it can be speculated that habitat is one of the major factors shaping the functional composition of cyanobacterial genomes towards their ecological adaptations.

  11. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations

    Directory of Open Access Journals (Sweden)

    Zhang Guojie

    2011-07-01

    Full Text Available Abstract The recent publication of the draft genome sequences of the Neanderthal and a ~50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.

  12. Identifying signatures of natural selection in Tibetan and Andean populations using dense genome scan data.

    Directory of Open Access Journals (Sweden)

    Abigail Bigham

    2010-09-01

    Full Text Available High-altitude hypoxia (reduced inspired oxygen tension due to decreased barometric pressure) exerts severe physiological stress on the human body. Two high-altitude regions where humans have lived for millennia are the Andean Altiplano and the Tibetan Plateau. Populations living in these regions exhibit unique circulatory, respiratory, and hematological adaptations to life at high altitude. Although these responses have been well characterized physiologically, their underlying genetic basis remains unknown. We performed a genome scan to identify genes showing evidence of adaptation to hypoxia. We looked across each chromosome to identify genomic regions with previously unknown function with respect to altitude phenotypes. In addition, groups of genes functioning in oxygen metabolism and sensing were examined to test the hypothesis that particular pathways have been involved in genetic adaptation to altitude. Applying four population genetic statistics commonly used for detecting signatures of natural selection, we identified selection-nominated candidate genes and gene regions in these two populations (Andeans and Tibetans) separately. The Tibetan and Andean patterns of genetic adaptation are largely distinct from one another, with both populations showing evidence of positive natural selection in different genes or gene regions. Interestingly, one gene previously known to be important in cellular oxygen sensing, EGLN1 (also known as PHD2), shows evidence of positive selection in both Tibetans and Andeans. However, the pattern of variation for this gene differs between the two populations. Our results indicate that several key HIF-regulatory and targeted genes are responsible for adaptation to high altitude in Andeans and Tibetans, and several different chromosomal regions are implicated in the putative response to selection. These data suggest a genetic role in high-altitude adaptation and provide a basis for future genotype/phenotype association

  13. Phylogenetic Conflict in Bears Identified by Automated Discovery of Transposable Element Insertions in Low-Coverage Genomes.

    Science.gov (United States)

    Lammers, Fritjof; Gallus, Susanne; Janke, Axel; Nilsson, Maria A

    2017-10-01

    Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. In Silico Post Genome-Wide Association Studies Analysis of C-Reactive Protein Loci Suggests an Important Role for Interferons

    NARCIS (Netherlands)

    Vaez, Ahmad; Jansen, Rick; Prins, Bram P.; Hottenga, Jouke-Jan; de Geus, Eco J. C.; Boomsma, Dorret I.; Penninx, Brenda W. J. H.; Nolte, Ilja M.; Snieder, Harold; Alizadeh, Behrooz Z.

    Background Genome-wide association studies (GWASs) have successfully identified several single nucleotide polymorphisms (SNPs) associated with serum levels of C-reactive protein (CRP). An important limitation of GWASs is that the identified variants merely flag the nearby genomic region and do not

  15. In Silico Post Genome-Wide Association Studies Analysis of C-Reactive Protein Loci Suggests an Important Role for Interferons

    NARCIS (Netherlands)

    Vaez, A.; Jansen, R.; Prins, B.P.; Hottenga, J.J.; de Geus, E.J.C.; Boomsma, D.I.; Penninx, B.W.J.H.; Nolte, I.M.; Snieder, H.; Alizadeh, BZ

    2015-01-01

    Background - Genome-wide association studies (GWASs) have successfully identified several single nucleotide polymorphisms (SNPs) associated with serum levels of C-reactive protein (CRP). An important limitation of GWASs is that the identified variants merely flag the nearby genomic region and do not

  16. Relationship between Deleterious Variation, Genomic Autozygosity, and Disease Risk: Insights from The 1000 Genomes Project.

    Science.gov (United States)

    Pemberton, Trevor J; Szpiech, Zachary A

    2018-04-05

    Genomic regions of autozygosity (ROAs) represent segments of individual genomes that are homozygous for haplotypes inherited identical-by-descent (IBD) from a common ancestor. ROAs are nonuniformly distributed across the genome, and increased ROA levels are a reported risk factor for numerous complex diseases. Previously, we hypothesized that long ROAs are enriched for deleterious homozygotes as a result of young haplotypes with recent deleterious mutations (relatively untouched by purifying selection) being paired IBD as a consequence of recent parental relatedness, a pattern supported by ROA and whole-exome sequence data on 27 individuals. Here, we significantly bolster support for our hypothesis and expand upon our original analyses using ROA and whole-genome sequence data on 2,436 individuals from The 1000 Genomes Project. Considering CADD deleteriousness scores, we reaffirm our previous observation that long ROAs are enriched for damaging homozygotes worldwide. We show that strongly damaging homozygotes experience greater enrichment than weaker damaging homozygotes, while overall enrichment varies appreciably among populations. Mendelian disease genes and those encoding FDA-approved drug targets have significantly increased rates of gain in damaging homozygotes with increasing ROA coverage relative to all other genes. In genes implicated in eight complex phenotypes for which ROA levels have been identified as a risk factor, rates of gain in damaging homozygotes vary across phenotypes and populations but frequently differ significantly from non-disease genes. These findings highlight the potential confounding effects of population background in the assessment of associations between ROA levels and complex disease risk, which might underlie reported inconsistencies in ROA-phenotype associations. Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  17. Genome-wide association study identifies five new schizophrenia loci

    NARCIS (Netherlands)

    Ripke, S.; Sanders, A. R.; Kendler, K. S.; Levinson, D. F.; Sklar, P.; Holmans, P. A.; Lin, D. Y.; Duan, J.; Ophoff, R. A.; Andreassen, O. A.; Scolnick, E.; Cichon, S.; St Clair, D.; Corvin, A.; Gurling, H.; Werge, T.; Rujescu, D.; Blackwood, D. H.; Pato, C. N.; Malhotra, A. K.; Purcell, S.; Dudbridge, F.; Neale, B. M.; Rossin, L.; Visscher, P. M.; Posthuma, D.; Ruderfer, D. M.; Fanous, A.; Stefansson, H.; Steinberg, S.; Mowry, B. J.; Golimbet, V.; de Hert, M.; Jonsson, E. G.; Bitter, I.; Pietilainen, O. P.; Collier, D. A.; Tosato, S.; Agartz, I.; Albus, M.; Alexander, M.; Amdur, R. L.; Amin, F.; Bass, N.; Bergen, S. E.; Black, D. W.; Borglum, A. D.; Brown, M. A.; Bruggeman, R.; Buccola, N. G.; Byerley, W. F.; Cahn, W.; Cantor, R. M.; Carr, V. J.; Catts, S. V.; Choudhury, K.; Cloninger, C. R.; Cormican, P.; Craddock, N.; Danoy, P. A.; Datta, S.; de Haan, L.; Demontis, D.; Dikeos, D.; Djurovic, S.; Donnely, P.; Donohoe, G.; Duong, L.; Dwyer, S.; Fink-Jensen, A.; Freedman, R.; Freimer, N. B.; Friedl, M.; Georgieva, L.; Giegling, I.; Gill, M.; Glenthoj, B.; Godard, S.; Hamshere, M.; Hansen, M.; Hartmann, A. M.; Henskens, F. A.; Hougaard, D. M.; Hultman, C. M.; Ingason, A.; Jablensky, A. V.; Jakobsen, K. D.; Jay, M.; Jurgens, G.; Kahn, R. S.; Keller, M. C.; Kenis, G.; Kenny, E.; Kim, Y.; Kirov, G. K.; Konnerth, H.; Konte, B.; Krabbendam, L.; Krasucki, R.; Lasseter, V. K.; Laurent, C.; Lawrence, J.; Lencz, T.; Lerer, F. B.; Liang, K. Y.; Lichtenstein, P.; Lieberman, J. A.; Linszen, D. H.; Lonnqvist, J.; Loughland, C. M.; Maclean, A. W.; Maher, B. S.; Maier, W.; Mallet, J.; Malloy, P.; Mattheisen, M.; Mattingsdal, M.; McGhee, K. A.; McGrath, J. J.; McIntosh, A.; McLean, D. E.; McQuillin, A.; Melle, I.; Michie, P. T.; Milanova, V.; Morris, D. W.; Mors, O.; Mortensen, P. B.; Moskvina, V.; Muglia, P.; Myin-Germeys, I.; Nertney, D. A.; Nestadt, G.; Nielsen, J.; Nikolov, I.; Nordentoft, M.; Norton, N.; Nothen, M. M.; O'Dushlaine, C. T.; Olincy, A.; Olsen, L.; O'Neill, F. A.; Orntoft, T. F.; Owen, M. J.; Pantelis, C.; Papadimitriou, G.; Pato, M. T.; Peltonen, L.; Petursson, H.; Pickard, B.; Pimm, J.; Pulver, A. E.; Puri, V.; Quested, D.; Quinn, E. M.; Rasmussen, H. B.; Rethelyi, J. M.; Ribble, R.; Rietschel, M.; Riley, B. P.; Ruggeri, M.; Schall, U.; Schulze, T. G.; Schwab, S. G.; Scott, R. J.; Shi, J.; Sigurdsson, E.; Silvermann, J. M.; Spencer, C. C.; Stefansson, K.; Strange, A.; Strengman, E.; Stroup, T. S.; Suvisaari, J.; Terenius, L.; Thirumalai, S.; Thygesen, J. H.; Timm, S.; Toncheva, D.; van den Oord, E.; van Os, J.; van Winkel, R.; Veldink, J.; Walsh, D.; Wang, A. G.; Wiersma, D.; Wildenauer, D. B.; Williams, H. J.; Williams, N. M.; Wormley, B.; Zammit, S.; Sullivan, P. F.; O'Donovan, M. C.; Daly, M. J.; Gejman, P. V.

    2011-01-01

    We examined the role of common genetic variation in schizophrenia in a genome-wide association study of substantial size: a stage 1 discovery sample of 21,856 individuals of European ancestry and a stage 2 replication sample of 29,839 independent subjects. The combined stage 1 and 2 analysis yielded

  18. BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

    Science.gov (United States)

    Meyer, Michael J; Geske, Philip; Yu, Haiyuan

    2016-05-15

    Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency: molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data is not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE). Contact: haiyuan.yu@cornell.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
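
    The graph-traversal idea mentioned in this abstract can be illustrated with a small sketch. This is not BISQUE's code or API: the identifiers, the edge list, and the function names build_graph and conversion_path are hypothetical, and breadth-first search is simply one plausible traversal, assuming that each database cross-reference is modeled as an undirected edge.

```python
from collections import deque

# Hypothetical cross-reference edges; these identifiers are illustrative
# placeholders, not records from any real database.
EDGES = [
    ("gene:G1", "transcript:T1"),
    ("transcript:T1", "protein:P1"),
    ("protein:P1", "structure:S1"),
    ("gene:G2", "transcript:T2"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (identifier, identifier) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def conversion_path(graph, source, target_prefix):
    """Breadth-first search from `source` to the nearest identifier whose
    name starts with `target_prefix`. Returns the chain of identifiers
    visited, or None if no such identifier is reachable."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node.startswith(target_prefix):
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
print(conversion_path(graph, "gene:G1", "structure:"))
```

    Because the edges are undirected, the same function converts in either direction (for example from a structure back to a gene), which loosely mirrors the "in all directions" property described above.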

  19. ERIC-PCR fingerprinting-based community DNA hybridization to pinpoint genome-specific fragments as molecular markers to identify and track populations common to healthy human guts.

    Science.gov (United States)

    Wei, Guifang; Pan, Li; Du, Huimin; Chen, Junyi; Zhao, Liping

    2004-10-01

    Bacterial populations common to healthy human guts may play important roles in human health. A new strategy for discovering genomic sequences as markers for these bacteria was developed using Enterobacterial Repetitive Intergenic Consensus (ERIC)-PCR fingerprinting. Structural features within microbial communities are compared with ERIC-PCR followed by DNA hybridization to identify genomic fragments shared by samples from healthy human individuals. ERIC-PCR profiles of fecal samples from 12 diseased or healthy human and piglet subjects demonstrated stable, unique banding patterns for each individual tested. Sequence homology of DNA fragments in bands of identical size was examined between samples by hybridization under high stringency conditions with DIG-labeled ERIC-PCR products derived from the fecal sample of one healthy child. Comparative analysis of the hybridization profiles with the original agarose fingerprints identified three predominant bands as signatures for populations associated with healthy human guts with sizes of 500, 800 and 1000 bp. Clone library profiling of the three bands produced 17 genome fragments, three of which showed high similarity only with regions of the Bacteroides thetaiotaomicron genome, while the remainder were orphan sequences. Association of these sequences with healthy guts was validated by sequence-selective PCR experiments, which showed that a single fragment was present in all 32 healthy humans and 13 healthy piglets tested. Two fragments were present in the healthy human group and in 18 children with non-infectious diarrhea but not in eight children with infectious diarrhea. Genome fragments identified with this novel strategy may be used as genome-specific markers for dynamic monitoring and sequence-guided isolation of functionally important bacterial populations in complex communities such as human gut microflora.

  20. Correlation of microRNA levels during hypoxia with predicted target mRNAs through genome-wide microarray analysis

    Directory of Open Access Journals (Sweden)

    Page Grier P

    2009-03-01

    Full Text Available Abstract Background: Low levels of oxygen in tissues, seen in situations such as chronic lung disease, necrotic tumors, and high altitude exposures, initiate a signaling pathway that results in active transcription of genes possessing a hypoxia response element (HRE). The aim of this study was to investigate whether a change in miRNA expression following hypoxia could account for changes in the cellular transcriptome based on currently available miRNA target prediction tools. Methods: To identify changes induced by hypoxia, we conducted mRNA- and miRNA-array-based experiments in HT29 cells, and performed comparative analysis of the resulting data sets based on multiple target prediction algorithms. To date, few studies have investigated an environmental perturbation for effects on genome-wide miRNA levels, or their consequent influence on mRNA output. Results: Comparison of miRNAs with predicted mRNA targets indicated a lower level of concordance than expected. We did, however, find preliminary evidence of combinatorial regulation of mRNA expression by miRNA. Conclusion: Target prediction programs and expression profiling techniques do not yet adequately represent the complexity of miRNA-mediated gene repression, and new methods may be required to better elucidate these pathways. Our data suggest the physiologic impact of miRNAs on cellular transcription results from a multifaceted network of miRNA and mRNA relationships, working together in an interconnected system and in context of hundreds of RNA species. The methods described here for comparative analysis of cellular miRNA and mRNA will be useful for understanding genome wide regulatory responsiveness and refining miRNA predictive algorithms.

  1. Genome-wide association studies identify four ER negative-specific breast cancer risk loci

    DEFF Research Database (Denmark)

    Garcia-Closas, Montserrat; Couch, Fergus J; Lindstrom, Sara

    2013-01-01

    differences in genetic predisposition. To identify susceptibility loci specific to ER-negative disease, we combined in a meta-analysis 3 genome-wide association studies of 4,193 ER-negative breast cancer cases and 35,194 controls with a series of 40 follow-up studies (6,514 cases and 41,455 controls......), genotyped using a custom Illumina array, iCOGS, developed by the Collaborative Oncological Gene-environment Study (COGS). SNPs at four loci, 1q32.1 (MDM4, P = 2.1 × 10(-12) and LGR6, P = 1.4 × 10(-8)), 2p24.1 (P = 4.6 × 10(-8)) and 16q12.2 (FTO, P = 4.0 × 10(-8)), were associated with ER-negative but not ER...

  2. Identifying novel glioma associated pathways based on systems biology level meta-analysis.

    Science.gov (United States)

    Hu, Yangfan; Li, Jinquan; Yan, Wenying; Chen, Jiajia; Li, Yin; Hu, Guang; Shen, Bairong

    2013-01-01

    Recent advances in microarray technology, including genomics, proteomics, and metabolomics, bring a great challenge in integrating these "-omics" data to analyze complex disease. Glioma is an extremely aggressive and lethal form of brain tumor, and thus the study of the molecular mechanism underlying glioma remains very important. To date, most studies focus on detecting the differentially expressed genes in glioma. However, meta-analysis for pathway analysis based on multiple microarray datasets has not been systematically pursued. In this study, we therefore developed a systems biology based approach by integrating three types of omics data to identify common pathways in glioma. Firstly, a meta-analysis was performed to study the overlap of signatures at different levels based on the microarray gene expression data of glioma. Among these gene expression datasets, 12 pathways were found in the GeneGO database that were shared by four stages. Then, microRNA expression profiles and ChIP-seq data were integrated for further pathway enrichment analysis. As a result, we suggest that 5 of these pathways could serve as putative pathways in glioma. Among them, the pathway of TGF-beta-dependent induction of EMT via SMAD is of particular importance. Our results demonstrate that meta-analysis at the systems biology level provides a more useful approach to study the molecular mechanism of complex disease. The integration of different types of omics data, including gene expression microarrays, microRNA and ChIP-seq data, suggests some common pathways correlated with glioma. These findings will offer useful potential candidates for targeted therapeutic intervention of glioma.

  3. Meta-analysis of Genome Wide Association Studies Identifies Genetic Markers of Late Toxicity Following Radiotherapy for Prostate Cancer

    Directory of Open Access Journals (Sweden)

    Sarah L. Kerns

    2016-08-01

    Full Text Available Nearly 50% of cancer patients undergo radiotherapy. Late radiotherapy toxicity affects quality-of-life in long-term cancer survivors and risk of side-effects in a minority limits doses prescribed to the majority of patients. Development of a test predicting risk of toxicity could benefit many cancer patients. We aimed to meta-analyze individual level data from four genome-wide association studies from prostate cancer radiotherapy cohorts including 1564 men to identify genetic markers of toxicity. Prospectively assessed two-year toxicity endpoints (urinary frequency, decreased urine stream, rectal bleeding, overall toxicity) and single nucleotide polymorphism (SNP) associations were tested using multivariable regression, adjusting for clinical and patient-related risk factors. A fixed-effects meta-analysis identified two SNPs: rs17599026 on 5q31.2 with urinary frequency (odds ratio [OR] 3.12, 95% confidence interval [CI] 2.08–4.69, p-value = 4.16 × 10(-8)) and rs7720298 on 5p15.2 with decreased urine stream (OR 2.71, 95% CI 1.90–3.86, p-value = 3.21 × 10(-8)). These SNPs lie within genes that are expressed in tissues adversely affected by pelvic radiotherapy including bladder, kidney, rectum and small intestine. The results show that heterogeneous radiotherapy cohorts can be combined to identify new moderate-penetrance genetic variants associated with radiotherapy toxicity. The work provides a basis for larger collaborative efforts to identify enough variants for a future test involving polygenic risk profiling.
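
    As a rough illustration of how the reported odds ratio, confidence interval and p-value relate to each other, the sketch below recovers a standard error from the 95% CI and computes a two-sided p-value. It assumes a Wald-type interval on the log-odds scale, which the abstract does not state. It also shows generic inverse-variance fixed-effects pooling, which is one common form of fixed-effects meta-analysis and not necessarily the exact procedure used in the study. All function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error of log(OR), assuming a symmetric Wald 95% CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def two_sided_p(beta, se):
    """Two-sided normal p-value for the Wald statistic beta / se."""
    z = abs(beta) / se
    return math.erfc(z / math.sqrt(2))


def fixed_effects(betas, ses):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# rs17599026 as reported above: OR 3.12, 95% CI 2.08-4.69
beta = math.log(3.12)
se = se_from_ci(2.08, 4.69)
p = two_sided_p(beta, se)
print(f"log(OR) = {beta:.3f}, SE = {se:.3f}, p = {p:.2e}")
```

    Under these assumptions the recomputed p-value lands in the same order of magnitude as the reported 4.16 × 10(-8); small differences would be expected from rounding of the published OR and CI.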

  4. A genome-wide shRNA screen identifies GAS1 as a novel melanoma metastasis suppressor gene.

    Science.gov (United States)

    Gobeil, Stephane; Zhu, Xiaochun; Doillon, Charles J; Green, Michael R

    2008-11-01

    Metastasis suppressor genes inhibit one or more steps required for metastasis without affecting primary tumor formation. Due to the complexity of the metastatic process, the development of experimental approaches for identifying genes involved in metastasis prevention has been challenging. Here we describe a genome-wide RNAi screening strategy to identify candidate metastasis suppressor genes. Following expression in weakly metastatic B16-F0 mouse melanoma cells, shRNAs were selected based upon enhanced satellite colony formation in a three-dimensional cell culture system and confirmed in a mouse experimental metastasis assay. Using this approach we discovered 22 genes whose knockdown increased metastasis without affecting primary tumor growth. We focused on one of these genes, Gas1 (Growth arrest-specific 1), because we found that it was substantially down-regulated in highly metastatic B16-F10 melanoma cells, which contributed to the high metastatic potential of this mouse cell line. We further demonstrated that Gas1 has all the expected properties of a melanoma tumor suppressor including: suppression of metastasis in a spontaneous metastasis assay, promotion of apoptosis following dissemination of cells to secondary sites, and frequent down-regulation in human melanoma metastasis-derived cell lines and metastatic tumor samples. Thus, we developed a genome-wide shRNA screening strategy that enables the discovery of new metastasis suppressor genes.

  5. Genomics and the human genome project: implications for psychiatry

    OpenAIRE

    Kelsoe, J R

    2004-01-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...

  6. Parent-of-origin effects in autism identified through genome-wide linkage analysis of 16,000 SNPs.

    Directory of Open Access Journals (Sweden)

    Delphine Fradin

    2010-09-01

    Full Text Available Autism is a common heritable neurodevelopmental disorder with complex etiology. Several genome-wide linkage and association scans have been carried out to identify regions harboring genes related to autism or autism spectrum disorders, with mixed results. Given the overlap in autism features with genetic abnormalities known to be associated with imprinting, one possible reason for lack of consistency would be the influence of parent-of-origin effects that may mask the ability to detect linkage and association. We have performed a genome-wide linkage scan that accounts for potential parent-of-origin effects using 16,311 SNPs among families from the Autism Genetic Resource Exchange (AGRE) and the National Institute of Mental Health (NIMH) autism repository. We report parametric (GH, Genehunter) and allele-sharing linkage (Aspex) results using a broad spectrum disorder case definition. Paternal-origin genome-wide statistically significant linkage was observed on chromosomes 4 (LOD(GH) = 3.79, empirical p<0.005 and LOD(Aspex) = 2.96, p = 0.008), 15 (LOD(GH) = 3.09, empirical p<0.005 and LOD(Aspex) = 3.62, empirical p = 0.003) and 20 (LOD(GH) = 3.36, empirical p<0.005 and LOD(Aspex) = 3.38, empirical p = 0.006). These regions may harbor imprinted sites associated with the development of autism and offer fruitful domains for molecular investigation into the role of epigenetic mechanisms in autism.

  7. The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

    Science.gov (United States)

    Argout, X; Martin, G; Droc, G; Fouet, O; Labadie, K; Rivals, E; Aury, J M; Lanaud, C

    2017-09-15

    Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large insert size mate paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2 while the scaffold N50 size increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).
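
    For readers unfamiliar with the scaffold N50 statistic quoted above, a minimal sketch of its usual definition follows: the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the total assembly size. The function name n50 and the toy lengths are illustrative only and are not taken from the cacao assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted from longest to shortest; N50 is the length of the
    scaffold at which the cumulative sum first reaches half the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy assembly of eight scaffolds (arbitrary units), total length 32
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_scaffolds))
```

    Merging many short scaffolds into fewer long ones raises N50, which is why the drop from 4,792 to 554 scaffolds goes together with the increase in N50 reported for version 2.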

  8. Genome-wide gene-environment study identifies glutamate receptor gene GRIN2A as a Parkinson's disease modifier gene via interaction with coffee.

    OpenAIRE

    Taye H Hamza; Honglei Chen; Erin M Hill-Burns; Shannon L Rhodes; Jennifer Montimurro; Denise M Kay; Albert Tenesa; Victoria I Kusel; Patricia Sheehan; Muthukrishnan Eaaswarkhanth; Dora Yearout; Ali Samii; John W Roberts; Pinky Agarwal; Yvette Bordelon

    2011-01-01

    Our aim was to identify genes that influence the inverse association of coffee with the risk of developing Parkinson's disease (PD). We used genome-wide genotype data and lifetime caffeinated-coffee-consumption data on 1,458 persons with PD and 931 without PD from the NeuroGenetics Research Consortium (NGRC), and we performed a genome-wide association and interaction study (GWAIS), testing each SNP's main-effect plus its interaction with coffee, adjusting for sex, age, and two principal compo...

  9. Genome-wide association study identifies a novel canine glaucoma locus.

    Directory of Open Access Journals (Sweden)

    Saija J Ahonen

    Full Text Available Glaucoma is an optic neuropathy and one of the leading causes of blindness. Its hereditary forms are classified into primary closed-angle (PCAG), primary open-angle (POAG) and primary congenital glaucoma (PCG). Although many loci have been mapped in humans, only a few genes have been identified that are associated with the development of glaucoma and the genetic basis of the disease remains poorly understood. Glaucoma has also been described in many dog breeds, including Dandie Dinmont Terriers (DDT) in which it is a late-onset (>7 years) disease. We designed clinical and genetic studies to better define the clinical features of glaucoma in the DDT and to identify the genetic cause. Clinical diagnosis was based on ophthalmic examinations of the affected dogs and 18 additionally investigated unaffected DDTs. We collected DNA from over 400 DDTs and a genome wide association study was performed in a cohort of 23 affected and 23 controls, followed by a fine mapping, a replication study and candidate gene sequencing. The clinical study suggested that ocular abnormalities including abnormal iridocorneal angles and pectinate ligament dysplasia are common (50% and 72%, respectively) in the breed and the disease resembles human PCAG. The genetic study identified a novel 9.5 Mb locus on canine chromosome 8 including the 1.6 Mb best associated region (p = 1.63 × 10(-10), OR = 32 for homozygosity). Mutation screening in five candidate genes did not reveal any causative variants. This study indicates that although ocular abnormalities are common in DDTs, the genetic risk for glaucoma is conferred by a novel locus on CFA8. The canine locus shares synteny to a region in human chromosome 14q, which harbors several loci associated with POAG and PCG. Our study reveals a new locus for canine glaucoma and ongoing molecular studies will likely help to understand the genetic etiology of the disease.

  10. Genome-wide association studies in dogs and humans identify ADAMTS20 as a risk variant for cleft lip and palate.

    Directory of Open Access Journals (Sweden)

    Zena T Wolf

    2015-03-01

    Cleft lip with or without cleft palate (CL/P) is the most commonly occurring craniofacial birth defect. We provide insight into the genetic etiology of this birth defect by performing genome-wide association studies in two species: dogs and humans. In the dog, a genome-wide association study of 7 CL/P cases and 112 controls from the Nova Scotia Duck Tolling Retriever (NSDTR) breed identified a significantly associated region on canine chromosome 27 (unadjusted p = 1.1 × 10^-13; adjusted p = 2.2 × 10^-3). Further analysis in NSDTR families and additional full sibling cases identified a 1.44 Mb homozygous haplotype (chromosome 27: 9.29 - 10.73 Mb) segregating with a more complex phenotype of cleft lip, cleft palate, and syndactyly (CLPS) in 13 cases. Whole-genome sequencing of 3 CLPS cases and 4 controls at 15X coverage led to the discovery of a frameshift mutation within ADAMTS20 (c.1360_1361delAA (p.Lys453Ilefs*3)), which segregated concordant with the phenotype. In a parallel study in humans, a family-based association analysis (DFAM) of 125 CL/P cases, 420 unaffected relatives, and 392 controls from a Guatemalan cohort identified a suggestive association (rs10785430; p = 2.67 × 10^-6) with the same gene, ADAMTS20. Sequencing of cases from the Guatemalan cohort was unable to identify a causative mutation within the coding region of ADAMTS20, but four coding variants were found in additional cases of CL/P. In summary, this study provides genetic evidence for a role of ADAMTS20 in CL/P development in dogs and as a candidate gene for CL/P development in humans.

  11. Genome-wide RNAi Screen Identifies Networks Involved in Intestinal Stem Cell Regulation in Drosophila

    Directory of Open Access Journals (Sweden)

    Xiankun Zeng

    2015-02-01

    The intestinal epithelium is the most rapidly self-renewing tissue in adult animals and is maintained by intestinal stem cells (ISCs) in both Drosophila and mammals. To comprehensively identify genes and pathways that regulate ISC fates, we performed a genome-wide transgenic RNAi screen in adult Drosophila intestine and identified 405 genes that regulate ISC maintenance and lineage-specific differentiation. By integrating these genes into publicly available interaction databases, we further developed functional networks that regulate ISC self-renewal, ISC proliferation, ISC maintenance of diploid status, ISC survival, ISC-to-enterocyte (EC) lineage differentiation, and ISC-to-enteroendocrine (EE) lineage differentiation. By comparing regulators among ISCs, female germline stem cells, and neural stem cells, we found that factors related to basic stem cell cellular processes are commonly required in all stem cells, and stem-cell-specific, niche-related signals are required only in the unique stem cell type. Our findings provide valuable insights into stem cell maintenance and lineage-specific differentiation.

  12. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  13. Genome Sequencing Identifies Two Nearly Unchanged Strains of Persistent Listeria monocytogenes Isolated at Two Different Fish Processing Plants Sampled 6 Years Apart

    DEFF Research Database (Denmark)

    Holch, Anne; Webb, Kristen; Lukjancenko, Oksana

    2013-01-01

    Listeria monocytogenes is a food-borne human-pathogenic bacterium that can cause infections with a high mortality rate. It has a remarkable ability to persist in food processing facilities. Here we report the genome sequences for two L. monocytogenes strains (N53-1 and La111) that were isolated 6...... that has been isolated as a persistent subtype in several European countries. The purpose of this study was to use genome analyses to identify genes or proteins that could contribute to persistence. In a genome comparison, the two persistent strains were extremely similar and collectively differed from...... are required to determine if the absence of these genes promotes persistence. While the genome comparison did not point to a clear physiological explanation of the persistent phenotype, the remarkable similarity between the two strains indicates that subtypes with specific traits are selected for in the food...

  14. Systems-based analysis of the Sarcocystis neurona genome identifies pathways that contribute to a heteroxenous life cycle.

    Science.gov (United States)

    Blazejewski, Tomasz; Nursimulu, Nirvana; Pszenny, Viviana; Dangoudoubiyam, Sriveny; Namasivayam, Sivaranjani; Chiasson, Melissa A; Chessman, Kyle; Tonkin, Michelle; Swapna, Lakshmipuram S; Hung, Stacy S; Bridgers, Joshua; Ricklefs, Stacy M; Boulanger, Martin J; Dubey, Jitender P; Porcella, Stephen F; Kissinger, Jessica C; Howe, Daniel K; Grigg, Michael E; Parkinson, John

    2015-02-10

    Sarcocystis neurona is a member of the coccidia, a clade of single-celled parasites of medical and veterinary importance including Eimeria, Sarcocystis, Neospora, and Toxoplasma. Unlike Eimeria, a single-host enteric pathogen, Sarcocystis, Neospora, and Toxoplasma are two-host parasites that infect and produce infectious tissue cysts in a wide range of intermediate hosts. As a genus, Sarcocystis is one of the most successful protozoan parasites; all vertebrates, including birds, reptiles, fish, and mammals are hosts to at least one Sarcocystis species. Here we sequenced Sarcocystis neurona, the causal agent of fatal equine protozoal myeloencephalitis. The S. neurona genome is 127 Mbp, more than twice the size of other sequenced coccidian genomes. Comparative analyses identified conservation of the invasion machinery among the coccidia. However, many dense-granule and rhoptry kinase genes, responsible for altering host effector pathways in Toxoplasma and Neospora, are absent from S. neurona. Further, S. neurona has a divergent repertoire of SRS proteins, previously implicated in tissue cyst formation in Toxoplasma. Systems-based analyses identified a series of metabolic innovations, including the ability to exploit alternative sources of energy. Finally, we present an S. neurona model detailing conserved molecular innovations that promote the transition from a purely enteric lifestyle (Eimeria) to a heteroxenous parasite capable of infecting a wide range of intermediate hosts. Sarcocystis neurona is a member of the coccidia, a clade of single-celled apicomplexan parasites responsible for major economic and health care burdens worldwide. A cousin of Plasmodium, Cryptosporidium, Theileria, and Eimeria, Sarcocystis is one of the most successful parasite genera; it is capable of infecting all vertebrates (fish, reptiles, birds, and mammals-including humans). The past decade has witnessed an increasing number of human outbreaks of clinical significance associated with

  15. The Population Genomics of Sunflowers and Genomic Determinants of Protein Evolution Revealed by RNAseq

    Directory of Open Access Journals (Sweden)

    Loren H. Rieseberg

    2012-10-01

    Few studies have investigated the causes of evolutionary rate variation among plant nuclear genes, especially in recently diverged species still capable of hybridizing in the wild. The recent advent of Next Generation Sequencing (NGS) permits investigation of genome-wide rates of protein evolution and the role of selection in generating and maintaining divergence. Here, we use individual whole-transcriptome sequencing (RNAseq) to refine our understanding of the population genomics of wild species of sunflowers (Helianthus spp.) and the factors that affect rates of protein evolution. We aligned 35 GB of transcriptome sequencing data and identified 433,257 polymorphic sites (SNPs) in a reference transcriptome comprising 16,312 genes. Using SNP markers, we identified strong population clustering largely corresponding to the three species analyzed here (Helianthus annuus, H. petiolaris, H. debilis), with one distinct early generation hybrid. Then, we calculated the proportions of adaptive substitution fixed by selection (alpha) and identified gene ontology categories with elevated values of alpha. The “response to biotic stimulus” category had the highest mean alpha across the three interspecific comparisons, implying that natural selection imposed by other organisms plays an important role in driving protein evolution in wild sunflowers. Finally, we examined the relationship between protein evolution (dN/dS ratio) and several genomic factors predicted to co-vary with protein evolution (gene expression level, divergence and specificity, genetic divergence [FST], and nucleotide diversity [pi]). We find that variation in rates of protein divergence was correlated with gene expression level and specificity, consistent with results from a broad range of taxa and timescales. This would in turn imply that these factors govern protein evolution both at a microevolutionary and macroevolutionary timescale. Our results contribute to a general understanding of the
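
    The two summary statistics named in this abstract, alpha and dN/dS, can be illustrated with a minimal sketch. The abstract does not say which estimator of alpha was used; the McDonald-Kreitman-style formula below is an assumption chosen because it is a common one, and the function names and input counts are hypothetical.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


def mk_alpha(d_n: int, d_s: int, p_n: int, p_s: int) -> float:
    """McDonald-Kreitman-style estimate of the proportion of adaptive substitutions.

    d_n, d_s: nonsynonymous / synonymous fixed differences between species.
    p_n, p_s: nonsynonymous / synonymous polymorphisms within a species.
    Assumed estimator (not confirmed by the abstract):
        alpha = 1 - (d_s * p_n) / (d_n * p_s)
    """
    if d_n <= 0 or p_s <= 0:
        raise ValueError("d_n and p_s must be positive")
    return 1.0 - (d_s * p_n) / (d_n * p_s)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(mk_alpha(d_n=20, d_s=40, p_n=10, p_s=40))
    print(dn_ds_ratio(0.02, 0.1))
```

    Under this estimator, an alpha near 0 means divergence is about what polymorphism predicts, and a larger alpha means an excess of nonsynonymous divergence.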

  16. Genome-wide association scan meta-analysis identifies three Loci influencing adiposity and fat distribution.

    Directory of Open Access Journals (Sweden)

    Cecilia M Lindgren

    2009-06-01

    To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9 × 10^-11) and MSRA (WC, P = 8.9 × 10^-9). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6 × 10^-8). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity.
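
    The abstract does not state how the 16 studies were combined. As a minimal sketch, assuming a fixed-effect inverse-variance meta-analysis (a common choice for GWAS, not confirmed here), per-study effect estimates for one SNP could be pooled as follows; the function name and inputs are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect sizes with inverse-variance weights.

    betas: per-study effect estimates for one SNP.
    ses:   matching standard errors (all > 0).
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical per-study estimates, for illustration only.
    print(fixed_effect_meta([0.10, 0.12, 0.08], [0.05, 0.04, 0.06]))
```

    Studies with smaller standard errors get larger weights, so adding cohorts shrinks the pooled standard error and can turn individually modest signals into a genome-wide significant one.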

  17. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  18. Application of Chemical Genomics to Plant-Bacteria Communication: A High-Throughput System to Identify Novel Molecules Modulating the Induction of Bacterial Virulence Genes by Plant Signals.

    Science.gov (United States)

    Vandelle, Elodie; Puttilli, Maria Rita; Chini, Andrea; Devescovi, Giulia; Venturi, Vittorio; Polverari, Annalisa

    2017-01-01

    The life cycle of bacterial phytopathogens consists of a benign epiphytic phase, during which the bacteria grow in the soil or on the plant surface, and a virulent endophytic phase involving the penetration of host defenses and the colonization of plant tissues. Innovative strategies are urgently required to integrate copper treatments that control the epiphytic phase with complementary tools that control the virulent endophytic phase, thus reducing the quantity of chemicals applied to economically and ecologically acceptable levels. Such strategies include targeted treatments that weaken bacterial pathogens, particularly those inhibiting early infection steps rather than tackling established infections. This chapter describes a reporter gene-based chemical genomic high-throughput screen for the induction of bacterial virulence by plant molecules. Specifically, we describe a chemical genomic screening method to identify agonist and antagonist molecules for the induction of targeted bacterial virulence genes by plant extracts, focusing on the experimental controls required to avoid false positives and thus ensuring the results are reliable and reproducible.

  19. Universal features in the genome-level evolution of protein domains.

    Science.gov (United States)

    Cosentino Lagomarsino, Marco; Sellerio, Alessandro L; Heijning, Philip D; Bassetti, Bruno

    2009-01-01

    Protein domains can be used to study proteome evolution at a coarse scale. In particular, they are found on genomes with notable statistical distributions. It is known that the distribution of domains with a given topology follows a power law. We focus on a further aspect: these distributions, and the number of distinct topologies, follow collective trends, or scaling laws, depending on the total number of domains only, and not on genome-specific features. We present a stochastic duplication/innovation model, in the class of the so-called 'Chinese restaurant processes', that explains this observation with two universal parameters, representing a minimal number of domains and the relative weight of innovation to duplication. Furthermore, we study a model variant where new topologies are related to occurrence in genomic data, accounting for fold specificity. Both models have general quantitative agreement with data from hundreds of genomes, which indicates that the domains of a genome are built with a combination of specificity and robust self-organizing phenomena. The latter are related to the basic evolutionary 'moves' of duplication and innovation, and give rise to the observed scaling laws, a priori of the specific evolutionary history of a genome. We interpret this as the concurrent effect of neutral and selective drives, which increase duplication and decrease innovation in larger and more complex genomes. The validity of our model would imply that the empirical observation of a small number of folds in nature may be a consequence of their evolution.
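
    The duplication/innovation process described above can be sketched as a short simulation. This is a minimal illustration that assumes the standard two-parameter (Pitman-Yor) form of the Chinese restaurant process; the abstract does not give the exact parameterization, so `alpha` and `theta` only loosely stand in for the paper's two parameters, and all names are hypothetical.

```python
import random


def simulate_domain_classes(n_domains, alpha, theta, seed=0):
    """Grow a genome one domain at a time and return the size of each domain class.

    Each new domain is either an innovation (founds a new class) with probability
    (theta + alpha * k) / (n + theta), or a duplication of existing class i with
    probability (size_i - alpha) / (n + theta), where n is the current number of
    domains and k the current number of classes.
    Requires 0 <= alpha < 1 and theta > -alpha.
    """
    rng = random.Random(seed)
    sizes = [1]  # the first domain founds the first class
    total = 1
    while total < n_domains:
        k = len(sizes)
        p_new = (theta + alpha * k) / (total + theta)
        if rng.random() < p_new:
            sizes.append(1)  # innovation
        else:
            # duplication: larger classes are more likely to be copied
            weights = [s - alpha for s in sizes]
            idx = rng.choices(range(k), weights=weights)[0]
            sizes[idx] += 1
        total += 1
    return sizes


if __name__ == "__main__":
    sizes = simulate_domain_classes(2000, alpha=0.5, theta=1.0)
    print(len(sizes), "classes; largest has", max(sizes), "domains")
```

    Because duplication favors classes that are already large, repeated runs produce heavy-tailed class-size distributions, which is the qualitative behavior the abstract attributes to the model.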

  20. Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function.

    Science.gov (United States)

    Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; 
Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid B; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Kao, W H Linda; Fox, Caroline S; Köttgen, Anna

    2012-12-15

    In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10^-9) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10^-4 to 2.2 × 10^-7. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
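
    The effect of restricting the analysis to a prior-knowledge candidate set can be illustrated with a Bonferroni calculation. This is a sketch under stated assumptions: the abstract does not say which correction was used, and the candidate-set size below is hypothetical.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    # Roughly one million independent tests gives the conventional 5e-8 threshold.
    genome_wide = bonferroni_threshold(1_000_000)
    # Hypothetical candidate set of 100 SNPs chosen from prior biological knowledge.
    candidate_set = bonferroni_threshold(100)
    print(genome_wide, candidate_set)
```

    With fewer tests the per-test threshold is less stringent, which is how SNPs with P-values that fall short of genome-wide significance can still pass a multiple-testing correction inside a focused candidate set.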

  1. Quantification of trace-level DNA by real-time whole genome amplification.

    Science.gov (United States)

    Kang, Min-Jung; Yu, Hannah; Kim, Sook-Kyung; Park, Sang-Ryoul; Yang, Inchul

    2011-01-01

    Quantification of trace amounts of DNA is a challenge in analytical applications where the concentration of a target DNA is very low or only limited amounts of samples are available for analysis. PCR-based methods including real-time PCR are highly sensitive and widely used for quantification of low-level DNA samples. However, ordinary PCR methods require at least one copy of a specific gene sequence for amplification and may not work for a sub-genomic amount of DNA. We suggest a real-time whole genome amplification method adopting the degenerate oligonucleotide primed PCR (DOP-PCR) for quantification of sub-genomic amounts of DNA. This approach enabled quantification of sub-picogram amounts of DNA independently of their sequences. When the method was applied to human placental DNA whose amount had been accurately determined by inductively coupled plasma-optical emission spectroscopy (ICP-OES), an accurate and stable quantification capability for DNA samples ranging from 80 fg to 8 ng was obtained. In blind tests of laboratory-prepared DNA samples, measurement accuracies of 7.4%, -2.1%, and -13.9% with analytical precisions around 15% were achieved for 400-pg, 4-pg, and 400-fg DNA samples, respectively. A similar quantification capability was also observed for other DNA species from calf, E. coli, and lambda phage. Therefore, when provided with an appropriate standard DNA, the suggested real-time DOP-PCR method can be used as a universal method for quantification of trace amounts of DNA.

  2. Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index

    Science.gov (United States)

    Felix, Janine F.; Bradfield, Jonathan P.; Monnereau, Claire; van der Valk, Ralf J.P.; Stergiakouli, Evie; Chesi, Alessandra; Gaillard, Romy; Feenstra, Bjarke; Thiering, Elisabeth; Kreiner-Møller, Eskil; Mahajan, Anubha; Pitkänen, Niina; Joro, Raimo; Cavadino, Alana; Huikari, Ville; Franks, Steve; Groen-Blokhuis, Maria M.; Cousminer, Diana L.; Marsh, Julie A.; Lehtimäki, Terho; Curtin, John A.; Vioque, Jesus; Ahluwalia, Tarunveer S.; Myhre, Ronny; Price, Thomas S.; Vilor-Tejedor, Natalia; Yengo, Loïc; Grarup, Niels; Ntalla, Ioanna; Ang, Wei; Atalay, Mustafa; Bisgaard, Hans; Blakemore, Alexandra I.; Bonnefond, Amelie; Carstensen, Lisbeth; Eriksson, Johan; Flexeder, Claudia; Franke, Lude; Geller, Frank; Geserick, Mandy; Hartikainen, Anna-Liisa; Haworth, Claire M.A.; Hirschhorn, Joel N.; Hofman, Albert; Holm, Jens-Christian; Horikoshi, Momoko; Hottenga, Jouke Jan; Huang, Jinyan; Kadarmideen, Haja N.; Kähönen, Mika; Kiess, Wieland; Lakka, Hanna-Maaria; Lakka, Timo A.; Lewin, Alexandra M.; Liang, Liming; Lyytikäinen, Leo-Pekka; Ma, Baoshan; Magnus, Per; McCormack, Shana E.; McMahon, George; Mentch, Frank D.; Middeldorp, Christel M.; Murray, Clare S.; Pahkala, Katja; Pers, Tune H.; Pfäffle, Roland; Postma, Dirkje S.; Power, Christine; Simpson, Angela; Sengpiel, Verena; Tiesler, Carla M. T.; Torrent, Maties; Uitterlinden, André G.; van Meurs, Joyce B.; Vinding, Rebecca; Waage, Johannes; Wardle, Jane; Zeggini, Eleftheria; Zemel, Babette S.; Dedoussis, George V.; Pedersen, Oluf; Froguel, Philippe; Sunyer, Jordi; Plomin, Robert; Jacobsson, Bo; Hansen, Torben; Gonzalez, Juan R.; Custovic, Adnan; Raitakari, Olli T.; Pennell, Craig E.; Widén, Elisabeth; Boomsma, Dorret I.; Koppelman, Gerard H.; Sebert, Sylvain; Järvelin, Marjo-Riitta; Hyppönen, Elina; McCarthy, Mark I.; Lindi, Virpi; Harri, Niinikoski; Körner, Antje; Bønnelykke, Klaus; Heinrich, Joachim; Melbye, Mads; Rivadeneira, Fernando; Hakonarson, Hakon; Ring, Susan M.; Smith, George Davey; Sørensen, Thorkild I.A.; Timpson, Nicholas J.; Grant, Struan F.A.; Jaddoe, Vincent W.V.

    2016-01-01

    A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We included 35 668 children from 20 studies in the discovery phase and 11 873 children from 13 studies in the replication phase. In total, 15 loci reached genome-wide significance (P-value < 5 × 10^-8) in the joint discovery and replication analysis, of which 12 are previously identified loci in or close to ADCY3, GNPDA2, TMEM18, SEC16B, FAIM2, FTO, TFAP2B, TNNI3K, MC4R, GPR61, LMX1B and OLFM4 associated with adult body mass index or childhood obesity. We identified three novel loci: rs13253111 near ELP3, rs8092503 near RAB27B and rs13387838 near ADAM23. Per additional risk allele, body mass index increased 0.04 Standard Deviation Score (SDS) [Standard Error (SE) 0.007], 0.05 SDS (SE 0.008) and 0.14 SDS (SE 0.025), for rs13253111, rs8092503 and rs13387838, respectively. A genetic risk score combining all 15 SNPs showed that each additional average risk allele was associated with a 0.073 SDS (SE 0.011, P-value = 3.12 × 10^-10) increase in childhood body mass index in a population of 1955 children. This risk score explained 2% of the variance in childhood body mass index. This study highlights the shared genetic background between childhood and adult body mass index and adds three novel loci. These loci likely represent age-related differences in strength of the associations with body mass index. PMID:26604143
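
    The per-allele effects reported for the three novel loci can be combined into a simple additive score. This is a minimal sketch assuming purely additive, independent effects; it covers only the three loci whose effect sizes appear in the abstract rather than the full 15-SNP score, and the genotype used in the example is hypothetical.

```python
# Per-risk-allele effects on childhood BMI, in SDS, as reported in the abstract.
EFFECT_SDS = {
    "rs13253111": 0.04,
    "rs8092503": 0.05,
    "rs13387838": 0.14,
}


def expected_bmi_shift(risk_allele_counts):
    """Sum per-allele effects over a genotype given as {SNP id: 0, 1 or 2 risk alleles}.

    SNPs missing from the genotype are treated as carrying zero risk alleles.
    """
    shift = 0.0
    for snp, effect in EFFECT_SDS.items():
        count = risk_allele_counts.get(snp, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1 or 2")
        shift += effect * count
    return shift


if __name__ == "__main__":
    # Hypothetical child: two risk alleles at rs13253111, one at rs8092503.
    genotype = {"rs13253111": 2, "rs8092503": 1, "rs13387838": 0}
    print(round(expected_bmi_shift(genotype), 3))
```

    The abstract's own risk score reports a smaller average per-allele effect (0.073 SDS across 15 SNPs), so this three-locus sum is an illustration of the arithmetic, not a reproduction of that analysis.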

  3. A genome-wide association study reveals variants in ARL15 that influence adiponectin levels

    NARCIS (Netherlands)

    J.B. Richards (Brent); D. Waterworth (Dawn); S. O'Rahilly (Stephen); M.-F. Hivert (Marie-France); R.J.F. Loos (Ruth); J.R.B. Perry (John); T. Tanaka (Toshiko); N.J. Timpson (Nicholas); R.K. Semple (Robert); N. Soranzo (Nicole); K. Song (Kijoung); N. Rocha (Nuno); E. Grundberg (Elin); J. Dupuis (Josée); J.C. Florez (Jose); C. Langenberg (Claudia); I. Prokopenko (Inga); R. Saxena (Richa); R. Sladek (Rob); Y.S. Aulchenko (Yurii); D.M. Evans (David); G. Waeber (Gérard); M.S. Burnett; N. Sattar (Naveed); J. Devaney (Joseph); C. Willenborg (Christina); A. Hingorani (Aroon); J.C.M. Witteman (Jacqueline); P. Vollenweider (Peter); B. Glaser (Beate); C. Hengstenberg (Christian); L. Ferrucci (Luigi); D. Melzer (David); K. Stark (Klaus); J. Deanfield (John); J. Winogradow (Janina); M. Grassl (Martina); A.S. Hall (Alistair); J.M. Egan (Josephine); J.R. Thompson (John); S.L. Ricketts (Sally); I.R. König (Inke); W. Reinhard (Wibke); S.M. Grundy (Scott); H.E. Wichmann (Heinz Erich); P. Barter (Phil); R. Mahley (Robert); Y.A. Kesaniemi (Antero); D.J. Rader (Daniel); M.P. Reilly (Muredach); S.E. Epstein (Stephen); A.F.R. Stewart (Alexandre); P. Tikka-Kleemola (Päivi); H. Schunkert (Heribert); K.A. Burling (Keith); J. Erdmann (Jeanette); P. Deloukas (Panagiotis); T. Pastinen (Tomi); N.J. Samani (Nilesh); R. McPherson (Ruth); G.D. Smith; T.M. Frayling (Timothy); N.J. Wareham (Nick); J.B. Meigs (James); V. Mooser (Vincent); T.D. Spector (Timothy)

    2009-01-01

    The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D) and coronary heart disease (CHD). We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531) and sought validation of

  4. Genome-wide association study identifies novel loci associated with circulating phospho- and sphingolipid concentrations.

    Directory of Open Access Journals (Sweden)

    Ayşe Demirkan

    Phospho- and sphingolipids are crucial cellular and intracellular compounds. These lipids are required for active transport, a number of enzymatic processes, membrane formation, and cell signalling. Disruption of their metabolism leads to several diseases, with diverse neurological, psychiatric, and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57 phosphatidylcholines (PC), 20 lysophosphatidylcholines (LPC), 27 phosphatidylethanolamines (PE), and 16 PE-based plasmalogens (PLPE), as well as their proportions in each major class. This effort yielded 25 genome-wide significant loci for phospholipids (smallest P-value = 9.88 × 10^-204) and 10 loci for sphingolipids (smallest P-value = 3.10 × 10^-57). After a correction for multiple comparisons (P-value < 2.2 × 10^-9), we observed four novel loci significantly associated with phospholipids (PAQR9, AGPAT1, PKD2L1, PDXDC1) and two with sphingolipids (PLD2 and APOE), explaining up to 3.1% of the variance. Further analysis of the top findings with respect to within-class molar proportions uncovered three additional loci for phospholipids (PNLIPRP2, PCDH20, and ABDH3), suggesting their involvement in either fatty acid elongation/saturation processes or fatty acid specific turnover mechanisms. Among those, 14 loci (KCNH7, AGPAT1, PNLIPRP2, SYT9, FADS1-2-3, DLG2, APOA1, ELOVL2, CDK17, LIPC, PDXDC1, PLD2, LASS4, and APOE) mapped into the glycerophospholipid and 12 loci (ILKAP, ITGA9, AGPAT1, FADS1-2-3, APOA1, PCDH20, LIPC, PDXDC1, SGPP1, APOE, LASS4, and PLD2) to the sphingolipid pathways. In large meta-analyses, associations between FADS1-2-3 and carotid intima media thickness, AGPAT1 and type 2 diabetes, and APOA1 and coronary artery disease were observed. In conclusion, our

  5. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Science.gov (United States)

    Jing, Shengli; Zhang, Lei; Ma, Yinhua; Liu, Bingfang; Zhao, Yan; Yu, Hangjin; Zhou, Xi; Qin, Rui; Zhu, Lili; He, Guangcun

    2014-01-01

    Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for controlling this most

  6. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    Directory of Open Access Journals (Sweden)

    Shengli Jing

    Full Text Available Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for

  7. Identification of genome-specific transcripts in wheat–rye translocation lines

    Directory of Open Access Journals (Sweden)

    Tong Geon Lee

    2015-09-01

    Full Text Available Studying gene expression in wheat–rye translocation lines is complicated due to the presence of homeologs in hexaploid wheat and high levels of synteny between wheat and rye genomes (Naranjo and Fernandez-Rueda, 1991 [1]; Devos et al., 1995 [2]; Lee et al., 2010 [3]; Lee et al., 2013 [4]). To overcome limitations of current gene expression studies on wheat–rye translocation lines and identify genome-specific transcripts, we developed a custom Roche NimbleGen Gene Expression microarray that contains probes derived from the sequence of hexaploid wheat, diploid rye and diploid progenitors of hexaploid wheat genome (Lee et al., 2014). Using the array developed, we identified genome-specific transcripts in a wheat–rye translocation line (Lee et al., 2014). Expression data are deposited in the NCBI Gene Expression Omnibus (GEO) under accession number GSE58678. Here we report the details of the methods used in the array workflow and data analysis.

  8. Genome-wide association study identifies six new loci influencing pulse pressure and mean arterial pressure

    Science.gov (United States)

    Wain, Louise V; Verwoert, Germaine C; O’Reilly, Paul F; Shi, Gang; Johnson, Toby; Johnson, Andrew D; Bochud, Murielle; Rice, Kenneth M; Henneman, Peter; Smith, Albert V; Ehret, Georg B; Amin, Najaf; Larson, Martin G; Mooser, Vincent; Hadley, David; Dörr, Marcus; Bis, Joshua C; Aspelund, Thor; Esko, Tõnu; Janssens, A Cecile JW; Zhao, Jing Hua; Heath, Simon; Laan, Maris; Fu, Jingyuan; Pistis, Giorgio; Luan, Jian’an; Arora, Pankaj; Lucas, Gavin; Pirastu, Nicola; Pichler, Irene; Jackson, Anne U; Webster, Rebecca J; Zhang, Feng; Peden, John F; Schmidt, Helena; Tanaka, Toshiko; Campbell, Harry; Igl, Wilmar; Milaneschi, Yuri; Hotteng, Jouke-Jan; Vitart, Veronique; Chasman, Daniel I; Trompet, Stella; Bragg-Gresham, Jennifer L; Alizadeh, Behrooz Z; Chambers, John C; Guo, Xiuqing; Lehtimäki, Terho; Kühnel, Brigitte; Lopez, Lorna M; Polašek, Ozren; Boban, Mladen; Nelson, Christopher P; Morrison, Alanna C; Pihur, Vasyl; Ganesh, Santhi K; Hofman, Albert; Kundu, Suman; Mattace-Raso, Francesco US; Rivadeneira, Fernando; Sijbrands, Eric JG; Uitterlinden, Andre G; Hwang, Shih-Jen; Vasan, Ramachandran S; Wang, Thomas J; Bergmann, Sven; Vollenweider, Peter; Waeber, Gérard; Laitinen, Jaana; Pouta, Anneli; Zitting, Paavo; McArdle, Wendy L; Kroemer, Heyo K; Völker, Uwe; Völzke, Henry; Glazer, Nicole L; Taylor, Kent D; Harris, Tamara B; Alavere, Helene; Haller, Toomas; Keis, Aime; Tammesoo, Mari-Liis; Aulchenko, Yurii; Barroso, Inês; Khaw, Kay-Tee; Galan, Pilar; Hercberg, Serge; Lathrop, Mark; Eyheramendy, Susana; Org, Elin; Sõber, Siim; Lu, Xiaowen; Nolte, Ilja M; Penninx, Brenda W; Corre, Tanguy; Masciullo, Corrado; Sala, Cinzia; Groop, Leif; Voight, Benjamin F; Melander, Olle; O’Donnell, Christopher J; Salomaa, Veikko; d’Adamo, Adamo Pio; Fabretto, Antonella; Faletra, Flavio; Ulivi, Sheila; Del Greco, M Fabiola; Facheris, Maurizio; Collins, Francis S; Bergman, Richard N; Beilby, John P; Hung, Joseph; Musk, A William; Mangino, Massimo; Shin, So-Youn; Soranzo, Nicole; Watkins, Hugh; 
Goel, Anuj; Hamsten, Anders; Gider, Pierre; Loitfelder, Marisa; Zeginigg, Marion; Hernandez, Dena; Najjar, Samer S; Navarro, Pau; Wild, Sarah H; Corsi, Anna Maria; Singleton, Andrew; de Geus, Eco JC; Willemsen, Gonneke; Parker, Alex N; Rose, Lynda M; Buckley, Brendan; Stott, David; Orru, Marco; Uda, Manuela; van der Klauw, Melanie M; Zhang, Weihua; Li, Xinzhong; Scott, James; Chen, Yii-Der Ida; Burke, Gregory L; Kähönen, Mika; Viikari, Jorma; Döring, Angela; Meitinger, Thomas; Davies, Gail; Starr, John M; Emilsson, Valur; Plump, Andrew; Lindeman, Jan H; ’t Hoen, Peter AC; König, Inke R; Felix, Janine F; Clarke, Robert; Hopewell, Jemma C; Ongen, Halit; Breteler, Monique; Debette, Stéphanie; DeStefano, Anita L; Fornage, Myriam; Mitchell, Gary F; Smith, Nicholas L; Holm, Hilma; Stefansson, Kari; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Samani, Nilesh J; Preuss, Michael; Rudan, Igor; Hayward, Caroline; Deary, Ian J; Wichmann, H-Erich; Raitakari, Olli T; Palmas, Walter; Kooner, Jaspal S; Stolk, Ronald P; Jukema, J Wouter; Wright, Alan F; Boomsma, Dorret I; Bandinelli, Stefania; Gyllensten, Ulf B; Wilson, James F; Ferrucci, Luigi; Schmidt, Reinhold; Farrall, Martin; Spector, Tim D; Palmer, Lyle J; Tuomilehto, Jaakko; Pfeufer, Arne; Gasparini, Paolo; Siscovick, David; Altshuler, David; Loos, Ruth JF; Toniolo, Daniela; Snieder, Harold; Gieger, Christian; Meneton, Pierre; Wareham, Nicholas J; Oostra, Ben A; Metspalu, Andres; Launer, Lenore; Rettig, Rainer; Strachan, David P; Beckmann, Jacques S; Witteman, Jacqueline CM; Erdmann, Jeanette; van Dijk, Ko Willems; Boerwinkle, Eric; Boehnke, Michael; Ridker, Paul M; Jarvelin, Marjo-Riitta; Chakravarti, Aravinda; Abecasis, Goncalo R; Gudnason, Vilmundur; Newton-Cheh, Christopher; Levy, Daniel; Munroe, Patricia B; Psaty, Bruce M; Caulfield, Mark J; Rao, Dabeeru C

    2012-01-01

    Numerous genetic loci influence systolic blood pressure (SBP) and diastolic blood pressure (DBP) in Europeans [1-3]. We now report genome-wide association studies of pulse pressure (PP) and mean arterial pressure (MAP). In discovery (N = 74,064) and follow-up studies (N = 48,607), we identified at genome-wide significance (P = 2.7×10^-8 to P = 2.3×10^-13) four novel PP loci (at 4q12 near CHIC2/PDGFRAI, 7q22.3 near PIK3CG, 8q24.12 in NOV, 11q24.3 near ADAMTS-8), two novel MAP loci (3p21.31 in MAP4, 10q25.3 near ADRB1) and one locus associated with both traits (2q24.3 near FIGN) which has recently been associated with SBP in east Asians. For three of the novel PP signals, the estimated effect for SBP was opposite to that for DBP, in contrast to the majority of common SBP- and DBP-associated variants which show concordant effects on both traits. These findings indicate novel genetic mechanisms underlying blood pressure variation, including pathways that may differentially influence SBP and DBP. PMID:21909110
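
    The abstract above does not spell out how PP and MAP are derived, so the following is only a minimal sketch using the conventional definitions (PP = SBP - DBP, MAP approximated as DBP + PP/3). The function names and the per-allele effect sizes in the usage note are hypothetical and are not taken from the study. Because both traits are linear in SBP and DBP, the sketch also shows why a variant with opposite effects on SBP and DBP can stand out for PP while nearly cancelling for MAP.

```python
from __future__ import annotations


def pulse_pressure(sbp: float, dbp: float) -> float:
    """Pulse pressure in mmHg: systolic minus diastolic pressure."""
    return sbp - dbp


def mean_arterial_pressure(sbp: float, dbp: float) -> float:
    """Conventional approximation: MAP = DBP + PP / 3 = (SBP + 2 * DBP) / 3."""
    return dbp + pulse_pressure(sbp, dbp) / 3.0


def implied_trait_effects(beta_sbp: float, beta_dbp: float) -> tuple[float, float]:
    """Per-allele effects on (PP, MAP) implied by effects on SBP and DBP.

    Assumes the linear definitions above; the inputs are illustrative numbers,
    not estimates reported in the abstract.
    """
    beta_pp = beta_sbp - beta_dbp
    beta_map = (beta_sbp + 2.0 * beta_dbp) / 3.0
    return beta_pp, beta_map
```

    For example, with hypothetical effects of +0.5 mmHg on SBP and -0.2 mmHg on DBP, implied_trait_effects(0.5, -0.2) gives an effect of about 0.7 mmHg on PP but only about 0.03 mmHg on MAP.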

  9. Genome Wide Association Study of SNP-, Gene-, and Pathway-based Approaches to Identify Genes Influencing Susceptibility to Staphylococcus aureus Infections

    Directory of Open Access Journals (Sweden)

    Zhan Ye

    2014-05-01

    Full Text Available Background: We conducted a genome-wide association study (GWAS) to identify specific genetic variants that underlie susceptibility to disease caused by Staphylococcus aureus in humans. Methods: Cases (n=309) and controls (n=2,925) were genotyped at 508,921 single nucleotide polymorphisms (SNPs). Cases had at least one laboratory and clinician confirmed disease caused by S. aureus whereas controls did not. R-package (for SNP association), EIGENSOFT (to estimate and adjust for population stratification) and gene- (VEGAS) and pathway-based (DAVID, PANTHER, and Ingenuity Pathway Analysis) analyses were performed. Results: No SNP reached genome-wide significance. Four SNPs exceeded the p… Conclusion: We identified potential susceptibility genes for S. aureus diseases in this preliminary study but confirmation by other studies is needed. The observed associations could be relevant given the complexity of S. aureus as a pathogen and its ability to exploit multiple biological pathways to cause infections in humans.

  10. My sister's keeper?: genomic research and the identifiability of siblings

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2008-07-01

    Full Text Available Background: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. Methods: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy. Results: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygotic major, with a minor allele frequency ≤ 0.20 (N = 452684, 65.1%), we achieve 91.9% inference accuracy for sibling genotypes. Conclusion: These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.
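
    To give a feel for why sibling genotypes are predictable at such SNPs, here is a back-of-envelope sketch. It is not the authors' framework: it assumes Hardy-Weinberg proportions, random mating, full siblings, and no genotyping error, and the function name is hypothetical. If the index child is homozygous for the major allele, each parent carries at least one major allele; a sibling receives that same allele with probability 1/2, and otherwise receives the parent's other allele, which is major with probability 1 - MAF.

```python
def sibling_hom_major_probability(maf: float) -> float:
    """P(sibling is homozygous major | index child is homozygous major).

    Assumptions (not from the source): Hardy-Weinberg equilibrium, random
    mating, full siblings, error-free genotypes. `maf` is the minor allele
    frequency at the SNP.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    major_freq = 1.0 - maf
    # Chance that one parent passes a major allele to the sibling:
    # half the time the allele already seen in the index child (major),
    # half the time the parent's other allele (major with prob. major_freq).
    per_parent = 0.5 + 0.5 * major_freq
    # The two parents transmit independently.
    return per_parent ** 2
```

    Under these assumptions the probability is 0.81 at a minor allele frequency of 0.20 and rises toward 1 as the minor allele becomes rarer, so guessing "homozygous major" for the sibling is right most of the time across such SNPs.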

  11. Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

    Science.gov (United States)

    Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

    2011-11-01

    Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human RefSeq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequent FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic aberrations which may be important for OS progression. Large scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.

  12. Genome-Wide Approaches to Drosophila Heart Development

    Directory of Open Access Journals (Sweden)

    Manfred Frasch

    2016-05-01

    Full Text Available The development of the dorsal vessel in Drosophila is one of the first systems in which key mechanisms regulating cardiogenesis have been defined in great detail at the genetic and molecular level. Due to evolutionary conservation, these findings have also provided major inputs into studies of cardiogenesis in vertebrates. Many of the major components that control Drosophila cardiogenesis were discovered based on candidate gene approaches and their functions were defined by employing the outstanding genetic tools and molecular techniques available in this system. More recently, approaches have been taken that aim to interrogate the entire genome in order to identify novel components and describe genomic features that are pertinent to the regulation of heart development. Apart from classical forward genetic screens, the availability of the thoroughly annotated Drosophila genome sequence made new genome-wide approaches possible, which include the generation of massive numbers of RNA interference (RNAi) reagents that were used in forward genetic screens, as well as studies of the transcriptomes and proteomes of the developing heart under normal and experimentally manipulated conditions. Moreover, genome-wide chromatin immunoprecipitation experiments have been performed with the aim of defining the full set of genomic binding sites of the major cardiogenic transcription factors, their relevant target genes, and a more complete picture of the regulatory network that drives cardiogenesis. This review will give an overview of these genome-wide approaches to Drosophila heart development and of computational analyses of the obtained information that ultimately aim to provide a description of this process at the systems level.

  13. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C.P.; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C.; Laukaitis, Christina M.; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A.; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J.; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M.; Odom, Duncan T.; Flicek, Paul

    2018-01-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. PMID:29563166

  14. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed to define the pan-genome and core genome and to characterize their composition and function. The results showed that the total of 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44% of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a particularly important role in the growth and reproduction of Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of the intracellular survival in Brucella spp.
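
    The three-way split of gene clusters reported above (1710 core + 1182 strain-specific + 2477 dispensable = 5369) can be illustrated with a small sketch. The abstract does not state the exact thresholds used, so the rule here is an assumption: a cluster is "core" if present in every genome, "strain_specific" if present in exactly one, and "dispensable" otherwise. The function name and the toy cluster and genome labels are hypothetical.

```python
from __future__ import annotations


def partition_pan_genome(
    presence: dict[str, set[str]], n_genomes: int
) -> dict[str, list[str]]:
    """Split gene clusters into core, dispensable and strain-specific sets.

    `presence` maps a cluster ID to the set of genomes that contain it.
    Thresholds are an assumed convention, not taken from the abstract.
    """
    parts: dict[str, list[str]] = {
        "core": [],
        "dispensable": [],
        "strain_specific": [],
    }
    for cluster, genomes in presence.items():
        count = len(genomes)
        if count < 1 or count > n_genomes:
            raise ValueError(f"cluster {cluster!r} has an invalid genome count")
        if count == n_genomes:
            parts["core"].append(cluster)          # present in every genome
        elif count == 1:
            parts["strain_specific"].append(cluster)  # unique to one genome
        else:
            parts["dispensable"].append(cluster)   # present in some genomes
    return parts


# Toy input with three genomes (labels are made up for illustration).
toy = {
    "c1": {"g1", "g2", "g3"},
    "c2": {"g1"},
    "c3": {"g1", "g3"},
    "c4": {"g2", "g1", "g3"},
}
result = partition_pan_genome(toy, n_genomes=3)
```

    On the toy input, clusters c1 and c4 are core, c2 is strain-specific, and c3 is dispensable; applied to 42 genomes, the same counting rule would produce a partition whose three parts sum to the total number of clusters.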

  15. Complete genome sequence and comparative genomics of the probiotic yeast Saccharomyces boulardii.

    Science.gov (United States)

    Khatri, Indu; Tomar, Rajul; Ganesan, K; Prasad, G S; Subramanian, Srikrishna

    2017-03-23

    The probiotic yeast Saccharomyces boulardii (Sb) is known to be effective against many gastrointestinal disorders and antibiotic-associated diarrhea. To understand the molecular basis of the probiotic properties ascribed to Sb, we determined the complete genomes of two strains of Sb, i.e., Biocodex and unique28, and the draft genomes of three other Sb strains that are marketed as probiotics in India. We compared these genomes with 145 strains of S. cerevisiae (Sc) to understand genome-level similarities and differences between these yeasts. A feature distinguishing Sb from other Sc is the absence of Ty elements Ty1, Ty3, Ty4 and associated LTR. However, we could identify complete Ty2 and Ty5 elements in Sb. The genes for hexose transporters HXT11 and HXT9, and asparagine-utilization are absent in all Sb strains. We find differences in repeat periods and copy numbers of repeats in flocculin genes that are likely related to the differential adhesion of Sb as compared to Sc. Core-proteome based taxonomy places Sb strains along with wine strains of Sc. We find the introgression of five genes from Z. bailii into the chromosome IV of Sb and wine strains of Sc. Intriguingly, genes involved in conferring known probiotic properties to Sb are conserved in most Sc strains.

  16. Genome-Wide Tuning of Protein Expression Levels to Rapidly Engineer Microbial Traits.

    Science.gov (United States)

    Freed, Emily F; Winkler, James D; Weiss, Sophie J; Garst, Andrew D; Mutalik, Vivek K; Arkin, Adam P; Knight, Rob; Gill, Ryan T

    2015-11-20

    The reliable engineering of biological systems requires quantitative mapping of predictable and context-independent expression over a broad range of protein expression levels. However, current techniques for modifying expression levels are cumbersome and are not amenable to high-throughput approaches. Here we present major improvements to current techniques through the design and construction of E. coli genome-wide libraries using synthetic DNA cassettes that can tune expression over a ∼10^4 range. The cassettes also contain molecular barcodes that are optimized for next-generation sequencing, enabling rapid and quantitative tracking of alleles that have the highest fitness advantage. We show these libraries can be used to determine which genes and expression levels confer greater fitness to E. coli under different growth conditions.

  17. Functional genomics identifies specific vulnerabilities in PTEN-deficient breast cancer.

    Science.gov (United States)

    Tang, Yew Chung; Ho, Szu-Chi; Tan, Elisabeth; Ng, Alvin Wei Tian; McPherson, John R; Goh, Germaine Yen Lin; Teh, Bin Tean; Bard, Frederic; Rozen, Steven G

    2018-03-22

    Phosphatase and tensin homolog (PTEN) is one of the most frequently inactivated tumor suppressors in breast cancer. While PTEN itself is not considered a druggable target, PTEN synthetic-sick or synthetic-lethal (PTEN-SSL) genes are potential drug targets in PTEN-deficient breast cancers. Therefore, with the aim of identifying potential targets for precision breast cancer therapy, we sought to discover PTEN-SSL genes present in a broad spectrum of breast cancers. To discover broad-spectrum PTEN-SSL genes in breast cancer, we used a multi-step approach that started with (1) a genome-wide short interfering RNA (siRNA) screen of ~ 21,000 genes in a pair of isogenic human mammary epithelial cell lines, followed by (2) a short hairpin RNA (shRNA) screen of ~ 1200 genes focused on hits from the first screen in a panel of 11 breast cancer cell lines; we then determined reproducibility of hits by (3) identification of overlaps between our results and reanalyzed data from 3 independent gene-essentiality screens, and finally, for selected candidate PTEN-SSL genes we (4) confirmed PTEN-SSL activity using either drug sensitivity experiments in a panel of 19 cell lines or mutual exclusivity analysis of publicly available pan-cancer somatic mutation data. The screens (steps 1 and 2) and the reproducibility analysis (step 3) identified six candidate broad-spectrum PTEN-SSL genes (PIK3CB, ADAMTS20, AP1M2, HMMR, STK11, and NUAK1). PIK3CB was previously identified as PTEN-SSL, while the other five genes represent novel PTEN-SSL candidates. Confirmation studies (step 4) provided additional evidence that NUAK1 and STK11 have PTEN-SSL patterns of activity. Consistent with PTEN-SSL status, inhibition of the NUAK1 protein kinase by the small molecule drug HTH-01-015 selectively impaired viability in multiple PTEN-deficient breast cancer cell lines, while mutations affecting STK11 and PTEN were largely mutually exclusive across large pan-cancer data sets. Six genes showed PTEN

  18. Sequencing of mitochondrial genomes of nine Aspergillus and Penicillium species identifies mobile introns and accessory genes as main sources of genome size variability.

    Science.gov (United States)

    Joardar, Vinita; Abrams, Natalie F; Hostetler, Jessica; Paukstelis, Paul J; Pakala, Suchitra; Pakala, Suman B; Zafar, Nikhat; Abolude, Olukemi O; Payne, Gary; Andrianopoulos, Alex; Denning, David W; Nierman, William C

    2012-12-12

    The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated. Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25-36 Kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus), contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing the Aspergillus systematics. The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population

  19. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome

    Science.gov (United States)

    Cornick, Jennifer E.; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R.; Gray, Katherine J.; Kiran, Anmol M.; Molyneux, Elizabeth; French, Neil; Faragher, Brian E.; Everett, Dean B.; Bentley, Stephen D.

    2015-01-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813

  20. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Directory of Open Access Journals (Sweden)

    Mark Ravinet

    2018-05-01

    Full Text Available Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  1. The genomic landscape at a late stage of stickleback speciation: High genomic divergence interspersed by small localized regions of introgression.

    Science.gov (United States)

    Ravinet, Mark; Yoshida, Kohta; Shigenobu, Shuji; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2018-05-01

    Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Previous genomic studies on young species pairs have revealed peaks of divergence and heterogeneous genomic differentiation. Yet less known is how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation in the presence of persistent gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little knowledge of the genomic signatures of divergence and introgression in older stickleback systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian Computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to exchange genes at a low rate throughout divergence. Population genomic data revealed that, despite gene flow, a high level of genomic differentiation is maintained across the majority of the genome. However, we identified multiple, small regions of introgression, occurring mainly in areas of low recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

  2. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Science.gov (United States)

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
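
    The multi-level idea in this abstract can be illustrated with a toy sketch. It is not the Genome Partitioner's algorithm, which the abstract does not describe; it only cuts a sequence into fixed-size segments and then cuts each segment into subblocks. The function name `partition` and the default sizes are assumptions chosen to mirror the 20 kb and 1 kb figures quoted above, and a real tool would also have to handle assembly overlaps and synthesis constraints.

```python
def partition(sequence, segment_size=20_000, subblock_size=1_000):
    """Split a sequence into segments, then split each segment into subblocks.

    Hypothetical helper for illustration only: plain fixed-size cuts,
    with no overlaps and no check for hard-to-synthesize regions.
    """
    levels = []
    for seg_start in range(0, len(sequence), segment_size):
        segment = sequence[seg_start:seg_start + segment_size]
        subblocks = [
            segment[i:i + subblock_size]
            for i in range(0, len(segment), subblock_size)
        ]
        levels.append(subblocks)
    return levels


# Toy 50 kb design; the final segment is shorter than the others.
design = "ACGT" * 12_500
segments = partition(design)
print(len(segments), [len(subblocks) for subblocks in segments])
```

    Joining the subblocks back together in order reproduces the input, which is the minimum property any partitioning scheme needs before ordering the pieces.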

  3. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Directory of Open Access Journals (Sweden)

    Matthias Christen

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  4. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana; Marcatili, Paolo; Tramontano, Anna

    2010-01-01

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and-if so-whether they also affect different isoforms of the same gene, is a time consuming and often frustrating task. RESULTS: The PICMI server is an easy to use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  5. PICMI: mapping point mutations on genomes.

    KAUST Repository

    Le Pera, Loredana

    2010-10-12

    MOTIVATION: Several international collaborations and local projects are producing extensive catalogues of genomic variations that are supplementing existing collections such as the OMIM catalogue. The flood of this type of data will keep increasing and, especially, it will be relevant to a wider user base, including not only molecular biologists, geneticists and bioinformaticians, but also clinical researchers. Mapping the observed variations, sometimes only described at the amino acid level, on a genome, identifying whether they affect a gene and-if so-whether they also affect different isoforms of the same gene, is a time consuming and often frustrating task. RESULTS: The PICMI server is an easy to use tool for quickly mapping one or more amino acid or nucleotide variations on a genome and its products, including alternatively spliced isoforms. AVAILABILITY: The server is available at www.biocomputing.it/picmi.

  6. Novel genetic loci underlying human intracranial volume identified through genome-wide association

    Science.gov (United States)

    Adams, Hieab HH; Hibar, Derrek P; Chouraki, Vincent; Stein, Jason L; Nyquist, Paul A; Rentería, Miguel E; Trompet, Stella; Arias-Vasquez, Alejandro; Seshadri, Sudha; Desrivières, Sylvane; Beecham, Ashley H; Jahanshad, Neda; Wittfeld, Katharina; Van der Lee, Sven J; Abramovic, Lucija; Alhusaini, Saud; Amin, Najaf; Andersson, Micael; Arfanakis, Konstantinos; Aribisala, Benjamin S; Armstrong, Nicola J; Athanasiu, Lavinia; Axelsson, Tomas; Beiser, Alexa; Bernard, Manon; Bis, Joshua C; Blanken, Laura ME; Blanton, Susan H; Bohlken, Marc M; Boks, Marco P; Bralten, Janita; Brickman, Adam M; Carmichael, Owen; Chakravarty, M Mallar; Chauhan, Ganesh; Chen, Qiang; Ching, Christopher RK; Cuellar-Partida, Gabriel; Den Braber, Anouk; Doan, Nhat Trung; Ehrlich, Stefan; Filippi, Irina; Ge, Tian; Giddaluru, Sudheer; Goldman, Aaron L; Gottesman, Rebecca F; Greven, Corina U; Grimm, Oliver; Griswold, Michael E; Guadalupe, Tulio; Hass, Johanna; Haukvik, Unn K; Hilal, Saima; Hofer, Edith; Hoehn, David; Holmes, Avram J; Hoogman, Martine; Janowitz, Deborah; Jia, Tianye; Kasperaviciute, Dalia; Kim, Sungeun; Klein, Marieke; Kraemer, Bernd; Lee, Phil H; Liao, Jiemin; Liewald, David CM; Lopez, Lorna M; Luciano, Michelle; Macare, Christine; Marquand, Andre; Matarin, Mar; Mather, Karen A; Mattheisen, Manuel; Mazoyer, Bernard; McKay, David R; McWhirter, Rebekah; Milaneschi, Yuri; Mirza-Schreiber, Nazanin; Muetzel, Ryan L; Maniega, Susana Muñoz; Nho, Kwangsik; Nugent, Allison C; Olde Loohuis, Loes M; Oosterlaan, Jaap; Papmeyer, Martina; Pappa, Irene; Pirpamer, Lukas; Pudas, Sara; Pütz, Benno; Rajan, Kumar B; Ramasamy, Adaikalavan; Richards, Jennifer S; Risacher, Shannon L; Roiz-Santiañez, Roberto; Rommelse, Nanda; Rose, Emma J; Royle, Natalie A; Rundek, Tatjana; Sämann, Philipp G; Satizabal, Claudia L; Schmaal, Lianne; Schork, Andrew J; Shen, Li; Shin, Jean; Shumskaya, Elena; Smith, Albert V; Sprooten, Emma; Strike, Lachlan T; Teumer, Alexander; Thomson, Russell; Tordesillas-Gutierrez, Diana; 
Toro, Roberto; Trabzuni, Daniah; Vaidya, Dhananjay; Van der Grond, Jeroen; Van der Meer, Dennis; Van Donkelaar, Marjolein MJ; Van Eijk, Kristel R; Van Erp, Theo GM; Van Rooij, Daan; Walton, Esther; Westlye, Lars T; Whelan, Christopher D; Windham, Beverly G; Winkler, Anderson M; Woldehawariat, Girma; Wolf, Christiane; Wolfers, Thomas; Xu, Bing; Yanek, Lisa R; Yang, Jingyun; Zijdenbos, Alex; Zwiers, Marcel P; Agartz, Ingrid; Aggarwal, Neelum T; Almasy, Laura; Ames, David; Amouyel, Philippe; Andreassen, Ole A; Arepalli, Sampath; Assareh, Amelia A; Barral, Sandra; Bastin, Mark E; Becker, Diane M; Becker, James T; Bennett, David A; Blangero, John; van Bokhoven, Hans; Boomsma, Dorret I; Brodaty, Henry; Brouwer, Rachel M; Brunner, Han G; Buckner, Randy L; Buitelaar, Jan K; Bulayeva, Kazima B; Cahn, Wiepke; Calhoun, Vince D; Cannon, Dara M; Cavalleri, Gianpiero L; Chen, Christopher; Cheng, Ching-Yu; Cichon, Sven; Cookson, Mark R; Corvin, Aiden; Crespo-Facorro, Benedicto; Curran, Joanne E; Czisch, Michael; Dale, Anders M; Davies, Gareth E; De Geus, Eco JC; De Jager, Philip L; de Zubicaray, Greig I; Delanty, Norman; Depondt, Chantal; DeStefano, Anita L; Dillman, Allissa; Djurovic, Srdjan; Donohoe, Gary; Drevets, Wayne C; Duggirala, Ravi; Dyer, Thomas D; Erk, Susanne; Espeseth, Thomas; Evans, Denis A; Fedko, Iryna O; Fernández, Guillén; Ferrucci, Luigi; Fisher, Simon E; Fleischman, Debra A; Ford, Ian; Foroud, Tatiana M; Fox, Peter T; Francks, Clyde; Fukunaga, Masaki; Gibbs, J Raphael; Glahn, David C; Gollub, Randy L; Göring, Harald HH; Grabe, Hans J; Green, Robert C; Gruber, Oliver; Gudnason, Vilmundur; Guelfi, Sebastian; Hansell, Narelle K; Hardy, John; Hartman, Catharina A; Hashimoto, Ryota; Hegenscheid, Katrin; Heinz, Andreas; Le Hellard, Stephanie; Hernandez, Dena G; Heslenfeld, Dirk J; Ho, Beng-Choon; Hoekstra, Pieter J; Hoffmann, Wolfgang; Hofman, Albert; Holsboer, Florian; Homuth, Georg; Hosten, Norbert; Hottenga, Jouke-Jan; Hulshoff Pol, Hilleke E; Ikeda, Masashi; 
Ikram, M Kamran; Jack, Clifford R; Jenkinson, Mark; Johnson, Robert; Jönsson, Erik G; Jukema, J Wouter; Kahn, René S; Kanai, Ryota; Kloszewska, Iwona; Knopman, David S; Kochunov, Peter; Kwok, John B; Lawrie, Stephen M; Lemaître, Hervé; Liu, Xinmin; Longo, Dan L; Longstreth, WT; Lopez, Oscar L; Lovestone, Simon; Martinez, Oliver; Martinot, Jean-Luc; Mattay, Venkata S; McDonald, Colm; McIntosh, Andrew M; McMahon, Katie L; McMahon, Francis J; Mecocci, Patrizia; Melle, Ingrid; Meyer-Lindenberg, Andreas; Mohnke, Sebastian; Montgomery, Grant W; Morris, Derek W; Mosley, Thomas H; Mühleisen, Thomas W; Müller-Myhsok, Bertram; Nalls, Michael A; Nauck, Matthias; Nichols, Thomas E; Niessen, Wiro J; Nöthen, Markus M; Nyberg, Lars; Ohi, Kazutaka; Olvera, Rene L; Ophoff, Roel A; Pandolfo, Massimo; Paus, Tomas; Pausova, Zdenka; Penninx, Brenda WJH; Pike, G Bruce; Potkin, Steven G; Psaty, Bruce M; Reppermund, Simone; Rietschel, Marcella; Roffman, Joshua L; Romanczuk-Seiferth, Nina; Rotter, Jerome I; Ryten, Mina; Sacco, Ralph L; Sachdev, Perminder S; Saykin, Andrew J; Schmidt, Reinhold; Schofield, Peter R; Sigurdsson, Sigurdur; Simmons, Andy; Singleton, Andrew; Sisodiya, Sanjay M; Smith, Colin; Smoller, Jordan W; Soininen, Hilkka; Srikanth, Velandai; Steen, Vidar M; Stott, David J; Sussmann, Jessika E; Thalamuthu, Anbupalam; Tiemeier, Henning; Toga, Arthur W; Traynor, Bryan J; Troncoso, Juan; Turner, Jessica A; Tzourio, Christophe; Uitterlinden, Andre G; Valdés Hernández, Maria C; Van der Brug, Marcel; Van der Lugt, Aad; Van der Wee, Nic JA; Van Duijn, Cornelia M; Van Haren, Neeltje EM; Van 't Ent, Dennis; Van Tol, Marie-Jose; Vardarajan, Badri N; Veltman, Dick J; Vernooij, Meike W; Völzke, Henry; Walter, Henrik; Wardlaw, Joanna M; Wassink, Thomas H; Weale, Michael E; Weinberger, Daniel R; Weiner, Michael W; Wen, Wei; Westman, Eric; White, Tonya; Wong, Tien Y; Wright, Clinton B; Zielke, H Ronald; Zonderman, Alan B; Deary, Ian J; DeCarli, Charles; Schmidt, Helena; Martin, Nicholas G; 
De Craen, Anton JM; Wright, Margaret J; Launer, Lenore J; Schumann, Gunter; Fornage, Myriam; Franke, Barbara; Debette, Stéphanie; Medland, Sarah E; Ikram, M Arfan; Thompson, Paul M

    2016-01-01

    Intracranial volume reflects the maximally attained brain size during development, and remains stable with loss of tissue in late life. It is highly heritable, but the underlying genes remain largely undetermined. In a genome-wide association study of 32,438 adults, we discovered five novel loci for intracranial volume and confirmed two known signals. Four of the loci are also associated with adult human stature, but these remained associated with intracranial volume after adjusting for height. We found a high genetic correlation with child head circumference (ρgenetic=0.748), which indicated a similar genetic background and allowed for the identification of four additional loci through meta-analysis (Ncombined = 37,345). Variants for intracranial volume were also related to childhood and adult cognitive function, Parkinson’s disease, and enriched near genes involved in growth pathways including PI3K–AKT signaling. These findings identify biological underpinnings of intracranial volume and provide genetic support for theories on brain reserve and brain overgrowth. PMID:27694991

  7. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  8. HLA diversity in the 1000 genomes dataset.

    Directory of Open Access Journals (Sweden)

    Pierre-Antoine Gourraud

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.

  9. Genome-wide copy number variation analysis identified deletions in SFMBT1 associated with fasting plasma glucose in a Han Chinese population.

    Science.gov (United States)

    Chung, Ren-Hua; Chiu, Yen-Feng; Hung, Yi-Jen; Lee, Wen-Jane; Wu, Kwan-Dun; Chen, Hui-Ling; Lin, Ming-Wei; Chen, Yii-Der I; Quertermous, Thomas; Hsiung, Chao A

    2017-08-08

    Fasting glucose and fasting insulin are glycemic traits closely related to diabetes, and understanding the role of genetic factors in these traits can help reveal the etiology of type 2 diabetes. Although single nucleotide polymorphisms (SNPs) in several candidate genes have been found to be associated with fasting glucose and fasting insulin, copy number variations (CNVs), which have been reported to be associated with several complex traits, have not been reported for association with these two traits. We aimed to identify CNVs associated with fasting glucose and fasting insulin. We conducted a genome-wide CNV association analysis for fasting plasma glucose (FPG) and fasting plasma insulin (FPI) using a family-based genome-wide association study sample from a Han Chinese population in Taiwan. A family-based CNV association test was developed in this study to identify common CNVs (i.e., CNVs with frequencies ≥ 5%), and a generalized estimating equation approach was used to test the associations between the traits and counts of global rare CNVs (i.e., CNVs with frequencies < 5%). We found a significant genome-wide association for common deletions with a frequency of 5.2% in the Scm-like with four mbt domains 1 (SFMBT1) gene with FPG (association p-value = 2 × 10^-4 and an adjusted p-value = 0.0478 for multiple testing). No significant association was observed between global rare CNVs and FPG or FPI. The deletions in 20 individuals with DNA samples available were successfully validated using PCR-based amplification. The association of the deletions in SFMBT1 with FPG was further evaluated using an independent population-based replication sample obtained from the Taiwan Biobank. An association p-value of 0.065, which was close to the significance level of 0.05, for FPG was obtained by testing 9 individuals with CNVs in the SFMBT1 gene region and 11,692 individuals with normal copies in the replication cohort. Previous studies have found that SNPs in SFMBT1 are

  10. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes

    DEFF Research Database (Denmark)

    Zeggini, Eleftheria; Scott, Laura J; Saxena, Richa

    2008-01-01

    Genome-wide association (GWA) studies have identified multiple loci at which common variants modestly but reproducibly influence risk of type 2 diabetes (T2D). Established associations to common and rare variants explain only a small proportion of the heritability of T2D. As previously published analyses had limited power to identify variants with modest effects, we carried out meta-analysis of three T2D GWA scans comprising 10,128 individuals of European descent and approximately 2.2 million SNPs (directly genotyped and imputed), followed by replication testing in an independent sample...

  11. Optical Whole-Genome Restriction Mapping as a Tool for Rapidly Distinguishing and Identifying Bacterial Contaminants in Clinical Samples

    Science.gov (United States)

    2015-08-01

    Dates covered: Oct 2011 – Aug 2012. ...multiple bacteria could be uniquely identified within mixtures. In the first set of experiments, three unique organisms (Bacillus subtilis subsp. globigii...be useful in monitoring nosocomial outbreaks in neonatal and intensive care wards, or even as an initial screen for antibiotic resistant strains

  12. Comparative genome analysis to identify SNPs associated with high oleic acid and elevated protein content in soybean.

    Science.gov (United States)

    Kulkarni, Krishnanand P; Patil, Gunvant; Valliyodan, Babu; Vuong, Tri D; Shannon, J Grover; Nguyen, Henry T; Lee, Jeong-Dong

    2018-03-01

    The objective of this study was to determine the genetic relationship between the oleic acid and protein content. The genotypes having high oleic acid and elevated protein (HOEP) content were crossed with five elite lines having normal oleic acid and average protein (NOAP) content. The selected accessions were grown at six environments in three different locations and phenotyped for protein, oil, and fatty acid components. The mean protein content of parents, HOEP, and NOAP lines was 34.6%, 38%, and 34.9%, respectively. The oleic acid concentration of parents, HOEP, and NOAP lines was 21.7%, 80.5%, and 20.8%, respectively. The HOEP plants carried both FAD2-1A (S117N) and FAD2-1B (P137R) mutant alleles contributing to the high oleic acid phenotype. Comparative genome analysis using whole-genome resequencing data identified six genes having single nucleotide polymorphism (SNP) significantly associated with the traits analyzed. A single SNP in the putative gene Glyma.10G275800 was associated with the elevated protein content, and palmitic, oleic, and linoleic acids. The genes from the marker intervals of previously identified QTL did not carry SNPs associated with protein content and fatty acid composition in the lines used in this study, indicating that all the genes except Glyma.10G278000 may be the new genes associated with the respective traits.

  13. Genome-wide association identifies genetic variants associated with lentiform nucleus volume in N = 1345 young and elderly subjects.

    Science.gov (United States)

    Hibar, Derrek P; Stein, Jason L; Ryles, April B; Kohannim, Omid; Jahanshad, Neda; Medland, Sarah E; Hansell, Narelle K; McMahon, Katie L; de Zubicaray, Greig I; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Saykin, Andrew J; Jack, Clifford R; Weiner, Michael W; Toga, Arthur W; Thompson, Paul M

    2013-06-01

    Deficits in lentiform nucleus volume and morphometry are implicated in a number of genetically influenced disorders, including Parkinson's disease, schizophrenia, and ADHD. Here we performed genome-wide searches to discover common genetic variants associated with differences in lentiform nucleus volume in human populations. We assessed structural MRI scans of the brain in two large genotyped samples: the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 706) and the Queensland Twin Imaging Study (QTIM; N = 639). Statistics of association from each cohort were combined meta-analytically using a fixed-effects model to boost power and to reduce the prevalence of false positive findings. We identified a number of associations in and around the flavin-containing monooxygenase (FMO) gene cluster. The most highly associated SNP, rs1795240, was located in the FMO3 gene; after meta-analysis, it showed genome-wide significant evidence of association with lentiform nucleus volume (P_MA = 4.79 × 10^-8). This commonly-carried genetic variant accounted for 2.68% and 0.84% of the trait variability in the ADNI and QTIM samples, respectively, even though the QTIM sample was on average 50 years younger. Pathway enrichment analysis revealed significant contributions of this gene to the cytochrome P450 pathway, which is involved in metabolizing numerous therapeutic drugs for pain, seizures, mania, depression, anxiety, and psychosis. The genetic variants we identified provide replicated, genome-wide significant evidence for the FMO gene cluster's involvement in lentiform nucleus volume differences in human populations.
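
    The abstract says only that per-cohort statistics were combined with a fixed-effects model. One common way to do that is inverse-variance weighting, sketched below as an assumption rather than as the authors' actual pipeline. The function name and the effect sizes and standard errors in the example are hypothetical and are not taken from the study.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis.

    betas: per-cohort effect estimates for one SNP
    ses:   matching standard errors
    Returns (pooled beta, pooled standard error, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-cohort estimates for a single SNP in two cohorts.
beta, se, z, p = fixed_effects_meta([0.10, 0.06], [0.02, 0.03])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

    Because each cohort is weighted by the inverse of its variance, the more precise estimate pulls the pooled effect towards itself, and the pooled standard error is smaller than either input.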

  14. A novel data mining method to identify assay-specific signatures in functional genomic studies

    Directory of Open Access Journals (Sweden)

    Guidarelli Jack W

    2006-08-01

    Background: The highly dimensional data produced by functional genomic (FG) studies makes it difficult to visualize relationships between gene products and experimental conditions (i.e., assays). Although dimensionality reduction methods such as principal component analysis (PCA) have been very useful, their application to identify assay-specific signatures has been limited by the lack of appropriate methodologies. This article proposes a new and powerful PCA-based method for the identification of assay-specific gene signatures in FG studies. Results: The proposed method (PM) is unique for several reasons. First, it is the only one, to our knowledge, that uses gene contribution, a product of the loading and expression level, to obtain assay signatures. The PM develops and exploits two types of assay-specific contribution plots, which are new to the application of PCA in the FG area. The first type plots the assay-specific gene contribution against the given order of the genes and reveals variations in distribution between assay-specific gene signatures as well as outliers within assay groups indicating the degree of importance of the most dominant genes. The second type plots the contribution of each gene in ascending or descending order against a constantly increasing index. This type of plot reveals assay-specific gene signatures defined by the inflection points in the curve. In addition, sharp regions within the signature define the genes that contribute the most to the signature. We proposed and used the curvature as an appropriate metric to characterize these sharp regions, thus identifying the subset of genes contributing the most to the signature. Finally, the PM uses the full dataset to determine the final gene signature, thus eliminating the chance of gene exclusion by poor screening in earlier steps. The strengths of the PM are demonstrated using a simulation study, and two studies of real DNA microarray data – a study of

  15. Genetic architecture of circulating lipid levels

    DEFF Research Database (Denmark)

    Demirkan, Ayşe; Amin, Najaf; Isaacs, Aaron

    2011-01-01

    Serum concentrations of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglycerides (TGs) and total cholesterol (TC) are important heritable risk factors for cardiovascular disease. Although genome-wide association studies (GWASs) of circulating lipid...... the ENGAGE Consortium GWAS on serum lipids, were applied to predict lipid levels in an independent population-based study, the Rotterdam Study-II (RS-II). We additionally tested for evidence of a shared genetic basis for different lipid phenotypes. Finally, the polygenic score approach was used to identify...... an alternative genome-wide significance threshold before pathway analysis and those results were compared with those based on the classical genome-wide significance threshold. Our study provides evidence suggesting that many loci influencing circulating lipid levels remain undiscovered. Cross-prediction models...

  16. The Functional Genomics Initiative at Oak Ridge National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Dabney; Justice, Monica; Beattle, Ken; Buchanan, Michelle; Ramsey, Michael; Ramsey, Rose; Paulus, Michael; Ericson, Nance; Allison, David; Kress, Reid; Mural, Richard; Uberbacher, Ed; Mann, Reinhold

    1997-12-31

    The Functional Genomics Initiative at the Oak Ridge National Laboratory integrates outstanding capabilities in mouse genetics, bioinformatics, and instrumentation. The 50 year investment by the DOE in mouse genetics/mutagenesis has created a one-of-a-kind resource for generating mutations and understanding their biological consequences. It is generally accepted that, through the mouse as a surrogate for human biology, we will come to understand the function of human genes. In addition to this world class program in mammalian genetics, ORNL has also been a world leader in developing bioinformatics tools for the analysis, management and visualization of genomic data. Combining this expertise with new instrumentation technologies will provide a unique capability to understand the consequences of mutations in the mouse at both the organism and molecular levels. The goal of the Functional Genomics Initiative is to develop the technology and methodology necessary to understand gene function on a genomic scale and apply these technologies to megabase regions of the human genome. The effort is scoped so as to create an effective and powerful resource for functional genomics. ORNL is partnering with the Joint Genome Institute and other large scale sequencing centers to sequence several multimegabase regions of both human and mouse genomic DNA, to identify all the genes in these regions, and to conduct fundamental surveys to examine gene function at the molecular and organism level. The Initiative is designed to be a pilot for larger scale deployment in the post-genome era. Technologies will be applied to the examination of gene expression and regulation, metabolism, gene networks, physiology and development.

  17. Genome-wide meta-analysis identifies new susceptibility loci for migraine

    NARCIS (Netherlands)

    Anttila, Verneri; Winsvold, Bendik S.; Gormley, Padhraig; Kurth, Tobias; Bettella, Francesco; McMahon, George; Kallela, Mikko; Malik, Rainer; de Vries, Boukje; Terwindt, Gisela; Medland, Sarah E.; Todt, Unda; McArdle, Wendy L.; Quaye, Lydia; Koiranen, Markku; Ikram, M. Arfan; Lehtimaki, Terho; Stam, Anine H.; Ligthart, Lannie; Wedenoja, Juho; Dunham, Ian; Neale, Benjamin M.; Palta, Priit; Hamalainen, Eija; Schuerks, Markus; Rose, Lynda M.; Buring, Julie E.; Ridker, Paul M.; Steinberg, Stacy; Stefansson, Hreinn; Jakobsson, Finnbogi; Lawlor, Debbie A.; Evans, David M.; Ring, Susan M.; Farkkila, Markus; Artto, Ville; Kaunisto, Mari A.; Freilinger, Tobias; Schoenen, Jean; Frants, Rune R.; Pelzer, Nadine; Weller, Claudia M.; Zielman, Ronald; Heath, Andrew C.; Madden, Pamela A. F.; Montgomery, Grant W.; Martin, Nicholas G.; Borck, Guntram; Goebel, Hartmut; Heinze, Axel

    Migraine is the most common brain disorder, affecting approximately 14% of the adult population, but its molecular mechanisms are poorly understood. We report the results of a meta-analysis across 29 genome-wide association studies, including a total of 23,285 individuals with migraine (cases) and

  18. Genome-wide meta-analysis identifies new susceptibility loci for migraine

    NARCIS (Netherlands)

    Anttila, V.; Winsvold, B.S.; Gormley, P.; Kurth, T.; Bettella, F.; McMahon, G.; Kallela, M.; Malik, R.; de Vries, B.; Terwindt, G.; Medland, S.E.; Todt, U.; McArdle, W.L.; Quaye, L.; Koiranen, M.; Ikram, M.A.; Lehtimäki, T.; Stam, A.H.; Ligthart, R.S.L.; Wedenoja, J.; Dunham, I.; Neale, B. M.; Palta, P.; Hamalainen, E.; Schürks, M.; Rose, L.M.; Buring, J.E.; Ridker, P.M.; Steinberg, S.; Stefansson, H.; Jakobsson, F.; Lawlor, D.A.; Evans, D.M.; Ring, S.M.; Färkkilä, M.; Artto, V.; Kaunisto, M.A.; Freilinger, T.; Schoenen, J.; Frants, R.R.; Pelzer, N.; Weller, C.M.; Zielman, R.; Heath, A.C.; Madden, P.A.F.; Montgomery, G.W.; Martin, N.G.; Borck, G.; Göbel, H.; Heinze, A.; Heinze-Kuhn, K.; Williams, F.M.; Hartikainen, A.-L.; Pouta, A.; van den Ende, J..; Uitterlinden, A.G.; Hofman, A.; Amin, N.; Hottenga, J.J.; Vink, J.M.; Heikkilä, K.; Alexander, M.; Muller-Myhsok, B.; Schreiber, S; Meitinger, T.; Wichmann, H. E.; Aromaa, A.; Eriksson, J.G.; Traynor, B.J.; Trabzuni, D.; Rossin, E.; Lage, K.; Jacobs, S.B.; Gibbs, J.R.; Birney, E.; Kaprio, J.; Penninx, B.W.J.H.; Boomsma, D.I.; van Duijn, C.M.; Raitakari, O.; Jarvelin, M.-R.; Zwart, J.A.; Cherkas, L.; Strachan, D.P.; Kubisch, C.; Ferrari, M.D.; van den Maagdenberg, A.M.J.M.; Dichgans, M.; Wessman, M.; Smith, G.D.; Stefansson, K.; Daly, M.J.; Nyholt, DR; Chasman, D.I.; Palotie, A.

    2013-01-01

    Migraine is the most common brain disorder, affecting approximately 14% of the adult population, but its molecular mechanisms are poorly understood. We report the results of a meta-analysis across 29 genome-wide association studies, including a total of 23,285 individuals with migraine (cases) and

  19. Genome-wide meta-analysis identifies new susceptibility loci for migraine

    DEFF Research Database (Denmark)

    Anttila, Verneri; Winsvold, Bendik S; Gormley, Padhraig

    2013-01-01

    Migraine is the most common brain disorder, affecting approximately 14% of the adult population, but its molecular mechanisms are poorly understood. We report the results of a meta-analysis across 29 genome-wide association studies, including a total of 23,285 individuals with migraine (cases) an...

  20. DeepBipolar: Identifying genomic mutations for bipolar disorder via deep learning.

    Science.gov (United States)

    Laksshman, Sundaram; Bhat, Rajendra Rana; Viswanath, Vivek; Li, Xiaolin

    2017-09-01

    Bipolar disorder, also known as manic depression, is a brain disorder that affects the brain structure of a patient. It results in extreme mood swings, severe states of depression, and overexcitement simultaneously. It is estimated that roughly 3% of the population of the United States (about 5.3 million adults) suffers from bipolar disorder. Recent research efforts like the Twin studies have demonstrated a high heritability factor for the disorder, making genomics a viable alternative for detecting and treating bipolar disorder, in addition to the conventional lengthy and costly postsymptom clinical diagnosis. Motivated by this study, leveraging several emerging deep learning algorithms, we design an end-to-end deep learning architecture (called DeepBipolar) to predict bipolar disorder based on limited genomic data. DeepBipolar adopts the Deep Convolutional Neural Network (DCNN) architecture that automatically extracts features from genotype information to predict the bipolar phenotype. We participated in the Critical Assessment of Genome Interpretation (CAGI) bipolar disorder challenge and DeepBipolar was considered the most successful by the independent assessor. In this work, we thoroughly evaluate the performance of DeepBipolar and analyze the type of signals we believe could have affected the classifier in distinguishing the case samples from the control set. © 2017 Wiley Periodicals, Inc.

  1. Signatures of selection in tilapia revealed by whole genome resequencing.

    Science.gov (United States)

    Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua

    2015-09-16

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10 to 100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.

  2. Genome-Enhanced Detection and Identification (GEDI) of plant pathogens

    Directory of Open Access Journals (Sweden)

    Nicolas Feau

    2018-02-01

    Full Text Available Plant diseases caused by fungi and Oomycetes represent worldwide threats to crops and forest ecosystems. Effective prevention and appropriate management of emerging diseases rely on rapid detection and identification of the causal pathogens. The increase in genomic resources makes it possible to generate novel genome-enhanced DNA detection assays that can exploit whole genomes to discover candidate genes for pathogen detection. A pipeline was developed to identify genome regions that discriminate taxa or groups of taxa and can be converted into PCR assays. The modular pipeline is comprised of four components: (1) selection and genome sequencing of phylogenetically related taxa, (2) identification of clusters of orthologous genes, (3) elimination of false positives by filtering, and (4) assay design. This pipeline was applied to some of the most important plant pathogens across three broad taxonomic groups: Phytophthoras (Stramenopiles, Oomycota), Dothideomycetes (Fungi, Ascomycota) and Pucciniales (Fungi, Basidiomycota). Comparison of 73 fungal and Oomycete genomes led to the discovery of 5,939 gene clusters that were unique to the targeted taxa and an additional 535 that were common at higher taxonomic levels. Approximately 28% of the 299 tested were converted into qPCR assays that met our set of specificity criteria. This work demonstrates that a genome-wide approach can efficiently identify multiple taxon-specific genome regions that can be converted into highly specific PCR assays. The possibility to easily obtain multiple alternative regions to design highly specific qPCR assays should be of great help in tackling challenging cases for which higher taxon-resolution is needed.

  3. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Full Text Available Abstract Background Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA), supertrees, Average Amino Acid Identity (AAI), genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.). A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in
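
    The species thresholds quoted in this abstract can be read as a conjunction of four pairwise criteria. The following is a minimal sketch only, assuming the four measurements have already been computed for a pair of strains. The `PairwiseMetrics` type, its field names, and the function name are hypothetical illustrations and are not taken from the cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseMetrics:
    """Hypothetical container for precomputed comparisons between two strains."""
    mlsa_identity_pct: float        # % DNA identity in MLSA / supertree analysis
    aai_pct: float                  # average amino acid identity, in %
    signature_dissimilarity: float  # genome signature dissimilarity
    proteome_identity_pct: float    # % proteome identity


def same_vibrio_species(m: PairwiseMetrics) -> bool:
    """Apply the thresholds stated in the abstract; all four must hold."""
    return (
        m.mlsa_identity_pct > 95.0
        and m.aai_pct > 96.0
        and m.signature_dissimilarity <= 10.0
        and m.proteome_identity_pct > 61.0
    )
```

    Treating the criteria as a strict conjunction is an interpretation of the abstract's wording; a pair that fails any single threshold is rejected here.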

  4. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic
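
    The 30 % read frequency cut-off described in this abstract amounts to a simple filter over variant calls. The sketch below is illustrative only: the `VariantCall` record and its fields are hypothetical, and treating the cut-off as inclusive (at or above 30 %) is an assumption, since the abstract does not say how the boundary is handled. It does not reproduce the authors' in-house pipeline.

```python
from typing import Iterable, List, NamedTuple


class VariantCall(NamedTuple):
    """Hypothetical record for one candidate variant."""
    position: int     # genomic coordinate of the variant
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position


def high_confidence_variants(
    calls: Iterable[VariantCall], cutoff: float = 0.30
) -> List[VariantCall]:
    """Keep calls whose variant read frequency is at or above the cut-off."""
    kept = []
    for call in calls:
        if call.total_reads == 0:
            continue  # no coverage, so no frequency can be computed
        if call.alt_reads / call.total_reads >= cutoff:
            kept.append(call)
    return kept
```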

  5. CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

    Science.gov (United States)

    Lee, Mikyung; Kim, Yangseok

    2009-12-16

    test. By successive operations of two modules, users can clarify how gene expression levels are affected by the phenotype specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.

  6. Genome-wide association study identifies new prostate cancer susceptibility loci

    DEFF Research Database (Denmark)

    Schumacher, Fredrick R.; Berndt, Sonja I.; Siddiq, Afshan

    2011-01-01

    Prostate cancer (PrCa) is the most common non-skin cancer diagnosed among males in developed countries and the second leading cause of cancer mortality, yet little is known regarding its etiology and factors that influence clinical outcome. Genome-wide association studies (GWAS) of PrCa have iden...

  7. Runs of homozygosity and distribution of functional variants in cattle genome

    DEFF Research Database (Denmark)

    Zhang, Qianqian; Guldbrandtsen, Bernt; Bosse, Mirte

    Runs of homozygosity (ROH) are identified in four dairy cattle breeds using NGS data. Cattle populations have been exposed to strong artificial selection for some generations. Genomic regions under selection will show increased levels of ROH. By investigating the relationship between ROH and distribution of predicted deleterious and tolerated variants, we can gain insight into how selection shapes the distribution of functional variants in inbred regions. We observe that predicted deleterious variants are more enriched in ROHs than predicted tolerated variants. Moreover, increase of enrichment...
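
    As a rough illustration of what identifying a run of homozygosity involves, the sketch below scans an ordered list of genotype calls for stretches of consecutive homozygous SNPs. It is a deliberately simplified assumption, not the method of the cited study: the genotype encoding (0, 1, 2 alternate-allele counts), the function name, and the `min_length` parameter are hypothetical, and real ROH callers typically also consider physical length, SNP density, and a tolerance for occasional heterozygous or missing calls.

```python
from typing import List, Optional, Sequence, Tuple


def runs_of_homozygosity(
    genotypes: Sequence[Optional[int]], min_length: int = 3
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, inclusive, of homozygous runs.

    genotypes holds alternate-allele counts per SNP in genomic order:
    0 or 2 is homozygous, 1 is heterozygous, None is a missing call.
    Heterozygous and missing calls both end the current run.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs
```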

  8. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul

    2018-04-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. © 2018 Thybert et al.; Published by Cold Spring Harbor Laboratory Press.

  9. Genome-Wide Association Identifies Multiple Genomic Regions Associated with Susceptibility to and Control of Ovine Lentivirus

    Science.gov (United States)

    2012-10-17

    to varying degrees of dyspnea (respiratory distress), cachexia (body condition wasting), mastitis, arthritis, and/or encephalitis [5,6]. One of the...General Transcription Factor IIH, polypeptide 5), the gene order does not agree with other mammal genomes including cow, human, dog, and mouse, and it may

  10. Genome U-Plot: a whole genome visualization.

    Science.gov (United States)

    Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

    2018-05-15

    The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. There are several diverse advantages regarding both these representations, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allow researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making apparent important features and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. Contact: vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.

  11. Unleashing the genome of Brassica rapa

    Directory of Open Access Journals (Sweden)

    Haibao Tang

    2012-07-01

    Full Text Available The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with Arabidopsis thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from Arabidopsis thaliana is used to find duplicated orthologs in Brassica rapa. These TOC1 genes are further analyzed to identify conserved noncoding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each 'cookbook style' analysis includes a step-by-step walkthrough with links to CoGe to quickly reproduce each step of the analytical process.

  12. Advances in Exercise, Fitness, and Performance Genomics in 2015.

    Science.gov (United States)

    Sarzynski, Mark A; Loos, Ruth J F; Lucia, Alejandro; Pérusse, Louis; Roth, Stephen M; Wolfarth, Bernd; Rankinen, Tuomo; Bouchard, Claude

    2016-10-01

    This review of the exercise genomics literature encompasses the highest-quality articles published in 2015 across seven broad topics: physical activity behavior, muscular strength and power, cardiorespiratory fitness and endurance performance, body weight and adiposity, insulin and glucose metabolism, lipid and lipoprotein metabolism, and hemodynamic traits. One study used a quantitative trait locus for wheel running in mice to identify single nucleotide polymorphisms (SNPs) in humans associated with physical activity levels. Two studies examined the association of candidate gene ACTN3 R577X genotype on muscular performance. Several studies examined gene-physical activity interactions on cardiometabolic traits. One study showed that physical inactivity exacerbated the body mass index (BMI)-increasing effect of an FTO SNP but only in individuals of European ancestry, whereas another showed that high-density lipoprotein cholesterol (HDL-C) SNPs from genome-wide association studies exerted a smaller effect in active individuals. Increased levels of moderate-to-vigorous-intensity physical activity were associated with higher Matsuda insulin sensitivity index in PPARG Ala12 carriers but not Pro12 homozygotes. One study combined genome-wide and transcriptome-wide profiling to identify genes and SNPs associated with the response of triglycerides (TG) to exercise training. The genome-wide association study results showed that four SNPs accounted for all of the heritability of ΔTG, whereas the baseline expression of 11 genes predicted 27% of ΔTG. A composite SNP score based on the top eight SNPs derived from the genomic and transcriptomic analyses was the strongest predictor of ΔTG, explaining 14% of the variance. The review concludes with a discussion of a conceptual framework defining some of the critical conditions for exercise genomics studies and highlights the importance of the recently launched National Institutes of Health Common Fund program titled "Molecular

  13. A hybrid reference-guided de novo assembly approach for generating Cyclospora mitochondrion genomes.

    Science.gov (United States)

    Gopinath, G R; Cinar, H N; Murphy, H R; Durigan, M; Almeria, M; Tall, B D; DaSilva, A J

    2018-01-01

    Cyclospora cayetanensis is a coccidian parasite associated with large and complex foodborne outbreaks worldwide. Linking samples from cyclosporiasis patients during foodborne outbreaks with suspected contaminated food sources, using conventional epidemiological methods, has been a persistent challenge. To address this issue, development of new methods based on potential genomically-derived markers for strain-level identification has been a priority for the food safety research community. The absence of reference genomes to identify nucleotide and structural variants with a high degree of confidence has limited the application of using sequencing data for source tracking during outbreak investigations. In this work, we determined the quality of a high resolution, curated, public mitochondrial genome assembly to be used as a reference genome by applying bioinformatic analyses. Using this reference genome, three new mitochondrial genome assemblies were built starting with metagenomic reads generated by sequencing DNA extracted from oocysts present in stool samples from cyclosporiasis patients. Nucleotide variants were identified in the new and other publicly available genomes in comparison with the mitochondrial reference genome. A consolidated workflow, presented here, to generate new mitochondrion genomes using our reference-guided de novo assembly approach could be useful in facilitating the generation of other mitochondrion sequences, and in their application for subtyping C. cayetanensis strains during foodborne outbreak investigations.

  14. Identifying genetic relatives without compromising privacy.

    Science.gov (United States)

    He, Dan; Furlotte, Nicholas A; Hormozdiari, Farhad; Joo, Jong Wha J; Wadia, Akshay; Ostrovsky, Rafail; Sahai, Amit; Eskin, Eleazar

    2014-04-01

    The development of high-throughput genomic technologies has impacted many areas of genetic research. While many applications of these technologies focus on the discovery of genes involved in disease from population samples, applications of genomic technologies to an individual's genome or personal genomics have recently gained much interest. One such application is the identification of relatives from genetic data. In this application, genetic information from a set of individuals is collected in a database, and each pair of individuals is compared in order to identify genetic relatives. An inherent issue that arises in the identification of relatives is privacy. In this article, we propose a method for identifying genetic relatives without compromising privacy by taking advantage of novel cryptographic techniques customized for secure and private comparison of genetic information. We demonstrate the utility of these techniques by allowing a pair of individuals to discover whether or not they are related without compromising their genetic information or revealing it to a third party. The idea is that individuals only share enough special-purpose cryptographically protected information with each other to identify whether or not they are relatives, but not enough to expose any information about their genomes. We show in HapMap and 1000 Genomes data that our method can recover first- and second-order genetic relationships and, through simulations, show that our method can identify relationships as distant as third cousins while preserving privacy.

  15. Cancer associated epigenetic transitions identified by genome-wide histone methylation binding profiles in human colorectal cancer samples and paired normal mucosa

    International Nuclear Information System (INIS)

    Enroth, Stefan; Rada-Iglesias, Alvaro; Andersson, Robin; Wallerman, Ola; Wanders, Alkwin; Påhlman, Lars; Komorowski, Jan; Wadelius, Claes

    2011-01-01

    Despite their well-established functional roles, histone modifications have received less attention than DNA methylation in the cancer field. In order to evaluate their importance in colorectal cancer (CRC), we generated the first genome-wide histone modification profiles in paired normal colon mucosa and tumor samples. Chromatin immunoprecipitation and microarray hybridization (ChIP-chip) was used to identify promoters enriched for histone H3 trimethylated on lysine 4 (H3K4me3) and lysine 27 (H3K27me3) in paired normal colon mucosa and tumor samples from two CRC patients and for the CRC cell line HT29. By comparing histone modification patterns in normal mucosa and tumors, we found that alterations predicted to have major functional consequences were quite rare. Furthermore, when normal or tumor tissue samples were compared to HT29, high similarities were observed for H3K4me3. However, the differences found for H3K27me3, which is important in determining cellular identity, indicate that cell lines do not represent optimal tissue models. Finally, using public expression data, we uncovered previously unknown changes in CRC expression patterns. Genes positive for H3K4me3 in normal and/or tumor samples, which are typically already active in normal mucosa, became hyperactivated in tumors, while genes with H3K27me3 in normal and/or tumor samples and which are expressed at low levels in normal mucosa, became hypersilenced in tumors. Genome-wide histone modification profiles can be used to find epigenetic aberrations in genes associated with cancer. This strategy gives further insights into the epigenetic contribution to the oncogenic process and may identify new biomarkers

  16. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we revealed the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  17. Genome-wide meta-analysis of 241,258 adults accounting for smoking behaviour identifies novel loci for obesity traits

    DEFF Research Database (Denmark)

    Justice, Anne E; Winkler, Thomas W; Feitosa, Mary F

    2017-01-01

    Few genome-wide association studies (GWAS) account for environmental exposures, like smoking, potentially impacting the overall trait variance when investigating the genetic contribution to obesity-related traits. Here, we use GWAS data from 51,080 current smokers and 190,178 nonsmokers (87% European descent) to identify loci influencing BMI and central adiposity, measured as waist circumference and waist-to-hip ratio both adjusted for BMI. We identify 23 novel genetic loci, and 9 loci with convincing evidence of gene-smoking interaction (GxSMK) on obesity-related traits. We show consistent direction of effect for all identified loci and significance for 18 novel and for 5 interaction loci in an independent study sample. These loci highlight novel biological functions, including response to oxidative stress, addictive behaviour, and regulatory functions emphasizing the importance of accounting...

  18. RUMINANT NUTRITION SYMPOSIUM: Use of genomics and transcriptomics to identify strategies to lower ruminal methanogenesis.

    Science.gov (United States)

    McAllister, T A; Meale, S J; Valle, E; Guan, L L; Zhou, M; Kelly, W J; Henderson, G; Attwood, G T; Janssen, P H

    2015-04-01

    Globally, methane (CH4) emissions account for 40% to 45% of greenhouse gas emissions from ruminant livestock, with over 90% of these emissions arising from enteric fermentation. Reduction of carbon dioxide to CH4 is critical for efficient ruminal fermentation because it prevents the accumulation of reducing equivalents in the rumen. Methanogens exist in a symbiotic relationship with rumen protozoa and fungi and within biofilms associated with feed and the rumen wall. Genomics and transcriptomics are playing an increasingly important role in defining the ecology of ruminal methanogenesis and identifying avenues for its mitigation. Metagenomic approaches have provided information on changes in abundances as well as the species composition of the methanogen community among ruminants that vary naturally in their CH4 emissions, their feed efficiency, and their response to CH4 mitigators. Sequencing the genomes of rumen methanogens has provided insight into surface proteins that may prove useful in the development of vaccines and has allowed assembly of biochemical pathways for use in chemogenomic approaches to lowering ruminal CH4 emissions. Metagenomics and metatranscriptomic analysis of entire rumen microbial communities are providing new perspectives on how methanogens interact with other members of this ecosystem and how these relationships may be altered to reduce methanogenesis. Identification of community members that produce antimethanogen agents that either inhibit or kill methanogens could lead to the identification of new mitigation approaches. Discovery of a lytic archaeophage that specifically lyses methanogens is one such example. Efforts in using genomic data to alter methanogenesis have been hampered by a lack of sequence information that is specific to the microbial community of the rumen. Programs such as Hungate1000 and the Global Rumen Census are increasing the breadth and depth of our understanding of global ruminal microbial communities, steps that

  19. Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

    Science.gov (United States)

    Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2017-07-05

    Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economically important caridean shrimp, is a potentially ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large, complex genome (5.73 Gb), roughly twice the size of those of many decapod shrimps. As such, comparative genomic analyses were applied to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined to be the major factor influencing genome expansion. A total of 2 Gb of repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Retrotransposons of both recent (Jockey and Gypsy) and ancestral (DIRS) origin were responsible for the genome evolution. The E. carinicauda genome also showed potential for genomic and experimental research on shrimps.

  20. The genome of the extremophile crucifer Thellungiella parvula

    KAUST Repository

    Dassanayake, Maheshi; Oh, Dongha; Haas, Jeffrey S.; Hernández, Álvaro Gonzalez; Hong, Hyewon; Ali, Shahjahan; Yun, Daejin; Bressan, Ray Anthony; Zhu, Jian-Kang; Bohnert, Hans Jürgen; Cheeseman, John McP

    2011-01-01

    Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance. © 2011 Nature America, Inc. All rights reserved.

  1. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high-quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; and (5) expand functional genomic resources for Plant Flagship genomes.

  2. Short and long-term genome stability analysis of prokaryotic genomes.

    Science.gov (United States)

    Brilli, Matteo; Liò, Pietro; Lacroix, Vincent; Sagot, Marie-France

    2013-05-08

    Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often enables the characterization of pathogens. There is therefore a strong interest in understanding the variability of this trait and the possible correlations with life-style. Two kinds of events affect genome organization: on the one hand, translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. We developed an approach where we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In the first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species allowed us to identify genomes whose gene order diverges from that of their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In the second part of the text, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. In this work, we studied genome organization divergence taking into account the contribution of both genome order rearrangements and genome content dynamics. By studying species with multiple sequenced genomes available, we were

  3. ICSNPathway: identify candidate causal SNPs and pathways from genome-wide association study by one analytical framework.

    Science.gov (United States)

    Zhang, Kunlin; Chang, Suhua; Cui, Sijia; Guo, Liyuan; Zhang, Liuyan; Wang, Jing

    2011-07-01

    Genome-wide association study (GWAS) is widely utilized to identify genes involved in human complex disease or some other trait. One key challenge for GWAS data interpretation is to identify causal SNPs and provide strong evidence on how they affect the trait. Currently, research focuses on identifying candidate causal variants from the most significant SNPs of GWAS, while support from biological mechanisms, as represented by pathways, is lacking. Although pathway-based analysis (PBA) has been designed to identify disease-related pathways by analyzing the full list of SNPs from GWAS, it does not emphasize interpreting causal SNPs. To our knowledge, so far there is no web server available to solve the challenge for GWAS data interpretation within one analytical framework. ICSNPathway is developed to identify candidate causal SNPs and their corresponding candidate causal pathways from GWAS by integrating linkage disequilibrium (LD) analysis, functional SNP annotation and PBA. ICSNPathway provides a feasible solution to bridge the gap between GWAS and disease mechanism study by generating hypotheses of the form SNP → gene → pathway(s). The ICSNPathway server is freely available at http://icsnpathway.psych.ac.cn/.

  4. A whole genome screening and RNA interference identify a juvenile hormone esterase-like gene of the diamondback moth, Plutella xylostella.

    Science.gov (United States)

    Gu, Xiaojun; Kumar, Sunil; Kim, Eunjin; Kim, Yonggyun

    2015-09-01

    Juvenile hormone (JH) plays a crucial role in preventing precocious metamorphosis and stimulating reproduction. Thus, its hemolymph titer should be under tight control. As a negative controller, juvenile hormone esterase (JHE) performs a rapid breakdown of residual JH in the hemolymph during the last instar to induce a larval-to-pupal metamorphosis. The whole genome of the diamondback moth (DBM), Plutella xylostella, has been annotated, and 11 JHE candidates have been proposed. Sequence analysis using conserved motifs commonly found in other JHEs proposed a putative JHE (Px004817). Px004817 (64.61 kDa, pI=5.28) exhibited a characteristic JHE expression pattern, showing a high peak at the early last instar, at which JHE enzyme activity was also at a maximal level. RNA interference of Px004817 reduced JHE activity and interrupted pupal development with a significant increase of the larval period. This study identifies Px004817 as a JHE-like gene of P. xylostella. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two... ...was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  6. Identification of prophages in bacterial genomes by dinucleotide relative abundance difference.

    Directory of Open Access Journals (Sweden)

    K V Srividhya

    Full Text Available BACKGROUND: Prophages are integrated viral forms in bacterial genomes that have been found to contribute to interstrain genetic variability. Many virulence-associated genes are reported to be prophage encoded. Present computational methods detect prophages either by identifying possible essential proteins such as integrases or by an extension of this technique, which involves identifying a region containing proteins similar to those occurring in prophages. These methods suffer from the problem of low sequence similarity at the protein level, which suggests that a nucleotide-based approach could be useful. METHODOLOGY: Dinucleotide relative abundance (DRA) has previously been used to identify regions in genomes that deviate from their neighboring areas. We have used the difference in dinucleotide relative abundance (DRAD) between bacterial and prophage DNA to help locate DNA stretches that could be of prophage origin in bacterial genomes. Prophage sequences that deviate from bacterial regions in their dinucleotide frequencies are detected by scanning bacterial genome sequences. The method was validated using a subset of genomes with prophage data from literature reports. A web interface for prophage scan based on this method is available at http://bicmku.in:8082/prophagedb/dra.html. Two hundred bacterial genomes which do not have annotated prophages have been scanned for prophage regions using this method. CONCLUSIONS: The relative dinucleotide distribution difference helps detect prophage regions in genome sequences. The usefulness of this method is seen in the identification of 461 highly probable prophage loci that had not previously been annotated as such. This work emphasizes the need to extend the efforts to detect and annotate prophage elements in genome sequences.
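
    The windowed scan described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common single-strand definition of relative abundance, rho_xy = f_xy / (f_x * f_y), scores each window by the mean absolute difference from the whole-genome profile, and uses hypothetical function names, window size and step.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def dinucleotide_relative_abundance(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides (single strand)."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono.values())
    n_di = sum(di[d] for d in DINUCS)
    rho = {}
    for d in DINUCS:
        fx = mono[d[0]] / n_mono if n_mono else 0.0
        fy = mono[d[1]] / n_mono if n_mono else 0.0
        fxy = di[d] / n_di if n_di else 0.0
        # Dinucleotides whose bases are absent get 0 (an assumption of this sketch).
        rho[d] = fxy / (fx * fy) if fx and fy else 0.0
    return rho


def dra_difference(rho_a, rho_b):
    """Mean absolute difference between two relative-abundance profiles."""
    return sum(abs(rho_a[d] - rho_b[d]) for d in DINUCS) / len(DINUCS)


def scan_windows(genome, window=1000, step=500):
    """Score each window against the whole-genome profile; high scores are atypical."""
    background = dinucleotide_relative_abundance(genome)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        rho = dinucleotide_relative_abundance(genome[start:start + window])
        scores.append((start, dra_difference(rho, background)))
    return scores
```

    Windows with the highest scores would be the candidates for closer inspection; a real scan would also need a cutoff, which this sketch deliberately leaves open.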

  7. Investigation of 95 variants identified in a genome-wide study for association with mortality after acute coronary syndrome

    Directory of Open Access Journals (Sweden)

    Winkelmann Bernhard R

    2011-09-01

    Full Text Available Background: Genome-wide association studies (GWAS) have identified new candidate genes for the occurrence of acute coronary syndrome (ACS), but possible effects of such genes on survival following ACS have yet to be investigated. Methods: We examined 95 polymorphisms in 69 distinct gene regions identified in a GWAS for premature myocardial infarction for their association with post-ACS mortality among 811 whites recruited from university-affiliated hospitals in Kansas City, Missouri. We then sought replication of a positive genetic association in a large, racially diverse cohort of myocardial infarction patients (N = 2284) using Kaplan-Meier survival analyses and Cox regression to adjust for relevant covariates. Finally, we investigated the apparent association further in 6086 additional coronary artery disease patients. Results: After Cox adjustment for other ACS risk factors, of 95 SNPs tested in 811 whites only the association with rs6922269 in MTHFD1L was statistically significant, with a 2.6-fold mortality hazard (P = 0.007). The recessive A/A genotype was of borderline significance in an age- and race-adjusted analysis of the entire combined cohort (N = 3095; P = 0.052), but this finding was not confirmed in independent cohorts (N = 6086). Conclusions: We found no support for the hypothesis that the GWAS-identified variants in this study substantially alter the probability of post-ACS survival. Large-scale, collaborative, genome-wide studies may be required in order to detect genetic variants that are robustly associated with survival in patients with coronary artery disease.

  8. Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function.

    Directory of Open Access Journals (Sweden)

    Dana B Hancock

    Full Text Available Genome-wide association studies have identified numerous genetic loci for spirometric measures of pulmonary function, forced expiratory volume in one second (FEV(1)), and its ratio to forced vital capacity (FEV(1)/FVC). Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA) of single nucleotide polymorphism (SNP) and SNP-by-smoking (ever-smoking or pack-years) associations on FEV(1) and FEV(1)/FVC across 19 studies (total N = 50,047). We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest P(JMA) = 5.00×10(-11)), HLA-DQB1 and HLA-DQA2 (smallest P(JMA) = 4.35×10(-9)), and KCNJ2 and SOX9 (smallest P(JMA) = 1.28×10(-8)) were associated with FEV(1)/FVC or FEV(1) in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years) interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects.

  9. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    NARCIS (Netherlands)

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  10. Delineating slowly and rapidly evolving fractions of the Drosophila genome.

    Science.gov (United States)

    Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

    2008-05-01

    Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.

  11. Identifying Heterogeneities in Subsurface Environment using the Level Set Method

    Energy Technology Data Exchange (ETDEWEB)

    Lei, Hongzhuan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Lu, Zhiming [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Vesselinov, Velimir Valentinov [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-08-25

    These are slides from a presentation on identifying heterogeneities in the subsurface environment using the level set method. The slides begin with the motivation, then explain the Level Set Method (LSM) and the algorithms, give some examples, and conclude with future work.

  12. A genome-wide association study of atopic dermatitis identifies loci with overlapping effects on asthma and psoriasis.

    Science.gov (United States)

    Weidinger, Stephan; Willis-Owen, Saffron A G; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M; Winge, Mårten C G; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I; McLean, W H Irwin; Brown, Sara J; Cookson, William O C; Lathrop, G Mark; Irvine, Alan D; Moffatt, Miriam F

    2013-12-01

    Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci.

  13. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    Science.gov (United States)

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. Published by Elsevier Ireland Ltd.

  14. Genome-wide association study of classical Hodgkin lymphoma identifies key regulators of disease susceptibility

    NARCIS (Netherlands)

    Sud, A. (Amit); Thomsen, H. (Hauke); Law, P.J. (Philip J.); A. Försti (Asta); Filho, M.I.D.S. (Miguel Inacio Da Silva); Holroyd, A. (Amy); P. Broderick (Peter); Orlando, G. (Giulia); Lenive, O. (Oleg); Wright, L. (Lauren); R. Cooke (Rosie); D.F. Easton (Douglas); P.D.P. Pharoah (Paul); A.M. Dunning (Alison); J. Peto (Julian); F. Canzian (Federico); Eeles, R. (Rosalind); Z. Kote-Jarai; K.R. Muir (K.); Pashayan, N. (Nora); B.E. Henderson (Brian); C.A. Haiman (Christopher); S. Benlloch (Sara); F.R. Schumacher (Fredrick R); Olama, A.A.A. (Ali Amin Al); S.I. Berndt (Sonja); G. Conti (Giario); F. Wiklund (Fredrik); S.J. Chanock (Stephen); Stevens, V.L. (Victoria L.); C.M. Tangen (Catherine M.); Batra, J. (Jyotsna); Clements, J. (Judith); H. Grönberg (Henrik); Schleutker, J. (Johanna); D. Albanes (Demetrius); Weinstein, S. (Stephanie); K. Wolk (Kerstin); West, C. (Catharine); Mucci, L. (Lorelei); Cancel-Tassin, G. (Géraldine); Koutros, S. (Stella); Sorensen, K.D. (Karina Dalsgaard); L. Maehle; D. Neal (David); S.P.L. Travis (Simon); Hamilton, R.J. (Robert J.); S.A. Ingles (Sue); B.S. Rosenstein (Barry S.); Lu, Y.-J. (Yong-Jie); Giles, G.G. (Graham G.); A. Kibel (Adam); Vega, A. (Ana); M. Kogevinas (Manolis); Penney, K.L. (Kathryn L.); Park, J.Y. (Jong Y.); Stanford, J.L. (Janet L.); C. Cybulski (Cezary); B.G. Nordestgaard (Børge); Brenner, H. (Hermann); Maier, C. (Christiane); Kim, J. (Jeri); E.M. John (Esther); P.J. Teixeira; Neuhausen, S.L. (Susan L.); De Ruyck, K. (Kim); Razack, A. (Azad); Newcomb, L.F. (Lisa F.); Lessel, D. (Davor); Kaneva, R. (Radka); N. Usmani (Nawaid); F. Claessens; Townsend, P.A. (Paul A.); Dominguez, M.G. (Manuela Gago); Roobol, M.J. (Monique J.); F. Menegaux (Florence); P. Hoffmann (Per); M.M. Nöthen (Markus); K.-H. Jöckel (Karl-Heinz); Strandmann, E.P.V. (Elke Pogge Von); Lightfoot, T. (Tracy); Kane, E. (Eleanor); Roman, E. (Eve); Lake, A. (Annette); Montgomery, D. (Dorothy); Jarrett, R.F. (Ruth F.); A.J. Swerdlow (Anthony); A. Engert (Andreas); N. Orr (Nick); K. Hemminki (Kari); Houlston, R.S. (Richard S.)

    2017-01-01

    Several susceptibility loci for classical Hodgkin lymphoma have been reported. However, much of the heritable risk is unknown. Here, we perform a meta-analysis of two existing genome-wide association studies, a new genome-wide association study, and replication totalling 5,314 cases and

  15. Genome-wide RNAi screening identifies genes inhibiting the migration of glioblastoma cells.

    Directory of Open Access Journals (Sweden)

    Jian Yang

    Full Text Available Glioblastoma Multiforme (GBM) cells are highly invasive, infiltrating into the surrounding normal brain tissue, making it impossible to completely eradicate GBM tumors by surgery or radiation. Increasing evidence also shows that these migratory cells are highly resistant to cytotoxic reagents, but decreasing their migratory capability can re-sensitize them to chemotherapy. This evidence suggests that the migratory cell population may serve as a better therapeutic target for more effective treatment of GBM. In order to understand the regulatory mechanism underlying the motile phenotype, we carried out a genome-wide RNAi screen for genes inhibiting the migration of GBM cells. The screen identified a total of twenty-five primary hits; seven of them were confirmed by secondary screening. Further study showed that three of the genes, FLNA, KHSRP and HCFC1, also functioned in vivo, and knocking them down caused multifocal tumors in a mouse model. Interestingly, two genes, KHSRP and HCFC1, were also found to be correlated with the clinical outcome of GBM patients. These two genes have not been previously associated with cell migration.

  16. Informative genomic microsatellite markers for efficient genotyping applications in sugarcane.

    Science.gov (United States)

    Parida, Swarup K; Kalia, Sanjay K; Kaul, Sunita; Dalal, Vivek; Hemaprabha, G; Selvi, Athiappan; Pandit, Awadhesh; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; Srivastava, Prem Shankar; Singh, Nagendra K; Mohapatra, Trilochan

    2009-01-01

    Genomic microsatellite markers are capable of revealing a high degree of polymorphism. Sugarcane (Saccharum sp.), which has a complex polyploid genome, requires a larger number of such informative markers for various applications in genetics and breeding. With the objective of generating a large set of microsatellite markers designated as Sugarcane Enriched Genomic MicroSatellite (SEGMS), 6,318 clones from genomic libraries of two hybrid sugarcane cultivars enriched with 18 different microsatellite repeat-motifs were sequenced to generate 4.16 Mb of high-quality sequence. Microsatellites were identified in 1,261 of the 5,742 non-redundant clones, which accounted for 22% enrichment of the libraries. Retro-transposon association was observed for 23.1% of the identified microsatellites. The utility of the microsatellite-containing genomic sequences was demonstrated by high primer designing potential (90%) and PCR amplification efficiency (87.4%). A total of 1,315 markers including 567 class I microsatellite markers were designed and placed in the public domain for unrestricted use. The level of polymorphism detected by these markers among sugarcane species, genera, and varieties was 88.6%, while the cross-transferability rate was 93.2% within the Saccharum complex and 25% to cereals. Cloning and sequencing of size variant amplicons revealed that the variation in the number of repeat-units was the main source of SEGMS fragment length polymorphism. The high level of polymorphism and wide range of genetic diversity (0.16-0.82 with an average of 0.44) assayed with the SEGMS markers suggested their usefulness in various genotyping applications in sugarcane.
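
    As a rough illustration of how microsatellites can be located in clone sequences, the sketch below scans a DNA string for tandem repeats with a regular expression. It is not the pipeline used in this study; the function name, the unit-length range (2 to 6 bp) and the minimum of four repeats are assumptions chosen for the example.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each tandem repeat found in seq.

    The lazy quantifier prefers the shortest repeat unit, so a homopolymer
    run such as AAAAAAAA is reported with unit "AA" (a limitation of this sketch).
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits
```

    Each hit gives the 0-based start position, the repeat unit and the number of tandem copies, which is the information needed to decide where flanking primers could be placed.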

  17. Effect of genomics-related literacy on non-communicable diseases.

    Science.gov (United States)

    Nakamura, Sho; Narimatsu, Hiroto; Katayama, Kayoko; Sho, Ri; Yoshioka, Takashi; Fukao, Akira; Kayama, Takamasa

    2017-09-01

    Recent progress in genomic research has raised expectations for the development of personalized preventive medicine, although genomics-related literacy of patients will be essential. Thus, enhancing genomics-related literacy is crucial, particularly for individuals with low genomics-related literacy because they might otherwise miss the opportunity to receive personalized preventive care. This should be especially emphasized when a lack of genomics-related literacy is associated with elevated disease risk, because patients could therefore be deprived of the added benefits of preventive interventions; however, whether such an association exists is unclear. Association between genomics-related literacy, calculated as the genomics literacy score (GLS), and the prevalence of non-communicable diseases was assessed using propensity score matching on 4646 participants (males: 1891; 40.7%). Notably, the low-GLS group (score below median) presented a higher risk of hypertension (relative risk (RR) 1.09, 95% confidence interval (CI) 1.03-1.16) and obesity (RR 1.11, 95% CI 1.01-1.22) than the high-GLS group. Our results suggest that a low level of genomics-related literacy could represent a risk factor for hypertension and obesity. Evaluating genomics-related literacy could be used to identify a more appropriate population for health and educational interventions.

  18. Molecular markers for tolerance of European ash (Fraxinus excelsior) to dieback disease identified using Associative Transcriptomics

    DEFF Research Database (Denmark)

    Harper, Andrea L.; McKinney, Lea Vig; Nielsen, Lene Rostgaard

    2016-01-01

    Tree disease epidemics are a global problem, impacting food security, biodiversity and national economies. The potential for conservation and breeding in trees is hampered by complex genomes and long lifecycles, with most species lacking genomic resources. The European Ash tree Fraxinus excelsior... panel scored for disease symptoms and identified markers strongly associated with canopy damage in infected trees. Using these markers we predicted phenotypes in a test panel of unrelated trees, successfully identifying individuals with a low level of susceptibility to the disease. Co...

  19. Genomic analyses identify hundreds of variants associated with age at menarche and support a role for puberty timing in cancer risk

    NARCIS (Netherlands)

    Day, Felix R; Thompson, Deborah J; Helgason, Hannes; Chasman, Daniel I; Finucane, Hilary; Sulem, Patrick; Ruth, Katherine S; Whalen, Sean; Sarkar, Abhishek K; Albrecht, Eva; Altmaier, Elisabeth; Amini, Marzyeh; Barbieri, Caterina M; Boutin, Thibaud; Campbell, Archie; Demerath, Ellen; Giri, Ayush; He, Chunyan; Hottenga, Jouke J; Karlsson, Robert; Kolcic, Ivana; Loh, Po-Ru; Lunetta, Kathryn L; Mangino, Massimo; Marco, Brumat; McMahon, George; Medland, Sarah E; Nolte, Ilja M; Noordam, Raymond; Nutile, Teresa; Paternoster, Lavinia; Perjakova, Natalia; Porcu, Eleonora; Rose, Lynda M; Schraut, Katharina E; Segrè, Ayellet V; Smith, Albert V; Stolk, Lisette; Teumer, Alexander; Andrulis, Irene L; Bandinelli, Stefania; Beckmann, Matthias W; Benitez, Javier; Bergmann, Sven; Bochud, Murielle; de Geus, Eco J C N; Mbarek, Hamdi; Willemsen, Gonneke; Boomsma, Dorret I; Visser, Jenny A

    2017-01-01

    The timing of puberty is a highly polygenic childhood trait that is epidemiologically associated with various adult diseases. Using 1000 Genomes Project-imputed genotype data in up to ∼370,000 women, we identify 389 independent signals (P < 5 × 10^-8) for age at menarche, a milestone in female

  20. Parasitism drives host genome evolution: Insights from the Pasteuria ramosa-Daphnia magna system.

    Science.gov (United States)

    Bourgeois, Yann; Roulin, Anne C; Müller, Kristina; Ebert, Dieter

    2017-04-01

    Because parasitism is thought to play a major role in shaping host genomes, it has been predicted that genomic regions associated with resistance to parasites should stand out in genome scans, revealing signals of selection above the genomic background. To test whether parasitism is indeed such a major factor in host evolution and to better understand host-parasite interaction at the molecular level, we studied genome-wide polymorphisms in 97 genotypes of the planktonic crustacean Daphnia magna originating from three localities across Europe. Daphnia magna is known to coevolve with the bacterial pathogen Pasteuria ramosa for which host genotypes (clonal lines) are either resistant or susceptible. Using association mapping, we identified two genomic regions involved in resistance to P. ramosa, one of which was already known from a previous QTL analysis. We then performed a naïve genome scan to test for signatures of positive selection and found that the two regions identified with the association mapping further stood out as outliers. Several other regions with evidence for selection were also found, but no link between these regions and phenotypic variation could be established. Our results are consistent with the hypothesis that parasitism is driving host genome evolution. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.

  1. Harnessing genomics to identify environmental determinants of heritable disease

    Science.gov (United States)

    Yauk, Carole Lyn; Argueso, J. Lucas; Auerbach, Scott S.; Awadalla, Philip; Davis, Sean R.; DeMarini, David M.; Douglas, George R.; Dubrova, Yuri E.; Elespuru, Rosalie K.; Glover, Thomas W.; Hales, Barbara F.; Hurles, Matthew E.; Klein, Catherine B.; Lupski, James R.; Manchester, David K.; Marchetti, Francesco; Montpetit, Alexandre; Mulvihill, John J.; Robaire, Bernard; Robbins, Wendie A.; Rouleau, Guy A.; Shaughnessy, Daniel T.; Somers, Christopher M.; Taylor, James G.; Trasler, Jacquetta; Waters, Michael D.; Wilson, Thomas E.; Witt, Kristine L.; Bishop, Jack B.

    2012-01-01

    Next-generation sequencing technologies can now be used to directly measure heritable de novo DNA sequence mutations in humans. However, these techniques have not been used to examine environmental factors that induce such mutations and their associated diseases. To address this issue, a working group on environmentally induced germline mutation analysis (ENIGMA) met in October 2011 to propose the necessary foundational studies, which include sequencing of parent–offspring trios from highly exposed human populations, and controlled dose–response experiments in animals. These studies will establish background levels of variability in germline mutation rates and identify environmental agents that influence these rates and heritable disease. Guidance on the types of exposures to examine comes from rodent studies that have identified agents such as cancer chemotherapeutic drugs, ionizing radiation, cigarette smoke, and air pollution as germ-cell mutagens. Research is urgently needed to establish the health consequences of parental exposures on subsequent generations. PMID:22935230

  2. CRISPRseek: a bioconductor package to identify target-specific guide RNAs for CRISPR-Cas9 genome-editing systems.

    Directory of Open Access Journals (Sweden)

    Lihua J Zhu

    Full Text Available CRISPR-Cas systems are a diverse family of RNA-protein complexes in bacteria that target foreign DNA sequences for cleavage. Derivatives of these complexes have been engineered to cleave specific target sequences depending on the sequence of a CRISPR-derived guide RNA (gRNA) and the source of the Cas9 protein. Important considerations for the design of gRNAs are to maximize aimed activity at the desired target site while minimizing off-target cleavage. Because of the rapid advances in the understanding of existing CRISPR-Cas9-derived RNA-guided nucleases and the development of novel RNA-guided nuclease systems, it is critical to have computational tools that can accommodate a wide range of different parameters for the design of target-specific RNA-guided nuclease systems. We have developed CRISPRseek, a highly flexible, open source software package to identify gRNAs that target a given input sequence while minimizing off-target cleavage at other sites within any selected genome. CRISPRseek will identify potential gRNAs that target a sequence of interest for CRISPR-Cas9 systems from different bacterial species and generate a cleavage score for potential off-target sequences utilizing published or user-supplied weight matrices with position-specific mismatch penalty scores. Identified gRNAs may be further filtered to only include those that occur in paired orientations for increased specificity and/or those that overlap restriction enzyme sites. For applications where gRNAs are desired to discriminate between two related sequences, CRISPRseek can rank gRNAs based on the difference between predicted cleavage scores in each input sequence. CRISPRseek is implemented as a Bioconductor package within the R statistical programming environment, allowing it to be incorporated into computational pipelines to automate the design of gRNAs for target sequences identified in a wide variety of genome-wide analyses. CRISPRseek is available under the GNU General
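
    The idea of scoring an off-target site with position-specific mismatch penalties can be illustrated with a minimal Python sketch. This is not CRISPRseek's code or its documented scoring formula: the multiplicative form, the function names, and the weight vector HYPOTHETICAL_WEIGHTS are all assumptions chosen only to show how a per-position weight matrix might be applied.

```python
from typing import List, Sequence

# Hypothetical penalties for a 20-nt guide (illustration only, not published values):
# mismatches in the first 12 positions cost nothing, the last 8 halve the score.
HYPOTHETICAL_WEIGHTS: List[float] = [0.0] * 12 + [0.5] * 8


def mismatch_positions(grna: str, site: str) -> List[int]:
    """Return the 0-based positions at which the guide and the site differ."""
    if len(grna) != len(site):
        raise ValueError("gRNA and candidate site must have the same length")
    return [i for i, (a, b) in enumerate(zip(grna.upper(), site.upper())) if a != b]


def off_target_score(grna: str, site: str, weights: Sequence[float]) -> float:
    """Score a candidate site from 0 to 100 (assumed multiplicative penalty model).

    Each mismatch at position i multiplies the score by (1 - weights[i]),
    so a perfect match keeps the maximum score of 100.
    """
    if len(weights) != len(grna):
        raise ValueError("need exactly one weight per guide position")
    score = 1.0
    for i in mismatch_positions(grna, site):
        score *= 1.0 - weights[i]
    return 100.0 * score


guide = "GACGTTACCGGATCAAGCTT"
print(off_target_score(guide, "GACGTTACCGGATCAAGCTA", HYPOTHETICAL_WEIGHTS))
```

    Under this assumed model, candidate sites could be ranked by score so that guides with high-scoring off-target sites are deprioritised; user-supplied weights would simply replace the hypothetical vector.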

  3. Genome-Wide Analyses Suggest Mechanisms Involving Early B-Cell Development in Canine IgA Deficiency.

    Directory of Open Access Journals (Sweden)

    Mia Olsson

    Full Text Available Immunoglobulin A deficiency (IgAD) is the most common primary immune deficiency disorder in both humans and dogs, characterized by recurrent mucosal tract infections and a predisposition for allergic and other immune mediated diseases. In several dog breeds, low IgA levels have been observed at a high frequency and with a clinical resemblance to human IgAD. In this study, we used genome-wide association studies (GWAS) to identify genomic regions associated with low IgA levels in dogs as a comparative model for human IgAD. We used a novel percentile groups-approach to establish breed-specific cut-offs and to perform analyses in a close to continuous manner. GWAS performed in four breeds prone to low IgA levels (German shepherd, Golden retriever, Labrador retriever and Shar-Pei) identified 35 genomic loci suggestively associated (p < 0.0005) with IgA levels. In German shepherd, three genomic regions (candidate genes include KIRREL3 and SERPINA9) were genome-wide significantly associated (p < 0.0002) with IgA levels. A ~20kb long haplotype on CFA28, significantly associated (p = 0.0005) with IgA levels in Shar-Pei, was positioned within the first intron of the gene SLIT1. Both KIRREL3 and SLIT1 are highly expressed in the central nervous system and in bone marrow and are potentially important during B-cell development. SERPINA9 expression is restricted to B-cells and peaks at the time-point when B-cells proliferate into antibody-producing plasma cells. The suggestively associated regions were enriched for genes in Gene Ontology gene sets involving inflammation and early immune cell development.

  4. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Hua Dong

    2015-12-01

    Full Text Available Interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We reported human genome copy number changes and frequent transcriptome variations (e.g. TP53, CTNNB1 mutation, especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and almost exclusive to HBV-MLL4 integration.

  5. Genome-wide association study of classical Hodgkin lymphoma identifies key regulators of disease susceptibility

    DEFF Research Database (Denmark)

    Sud, Amit; Thomsen, Hauke; Law, Philip J.

    2017-01-01

    Several susceptibility loci for classical Hodgkin lymphoma have been reported. However, much of the heritable risk is unknown. Here, we perform a meta-analysis of two existing genome-wide association studies, a new genome-wide association study, and replication totalling 5,314 cases and 16,749 co...

  6. Genome-wide association study of primary tooth eruption identifies pleiotropic loci associated with height and craniofacial distances

    DEFF Research Database (Denmark)

    Fatemifar, Ghazaleh; Hoggart, Clive J; Paternoster, Lavinia

    2013-01-01

    Twin and family studies indicate that the timing of primary tooth eruption is highly heritable, with estimates typically exceeding 80%. To identify variants involved in primary tooth eruption, we performed a population-based genome-wide association study of 'age at first tooth' and 'number of teeth' using 5998 and 6609 individuals, respectively, from the Avon Longitudinal Study of Parents and Children (ALSPAC) and 5403 individuals from the 1966 Northern Finland Birth Cohort (NFBC1966). We tested 2 446 724 SNPs imputed in both studies. Analyses were controlled for the effect of gestational age, sex...

  7. Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies

    Science.gov (United States)

    Elks, Cathy E.; Perry, John R.B.; Sulem, Patrick; Chasman, Daniel I.; Franceschini, Nora; He, Chunyan; Lunetta, Kathryn L.; Visser, Jenny A.; Byrne, Enda M.; Cousminer, Diana L.; Gudbjartsson, Daniel F.; Esko, Tõnu; Feenstra, Bjarke; Hottenga, Jouke-Jan; Koller, Daniel L.; Kutalik, Zoltán; Lin, Peng; Mangino, Massimo; Marongiu, Mara; McArdle, Patrick F.; Smith, Albert V.; Stolk, Lisette; van Wingerden, Sophie W.; Zhao, Jing Hua; Albrecht, Eva; Corre, Tanguy; Ingelsson, Erik; Hayward, Caroline; Magnusson, Patrik K.E.; Smith, Erin N.; Ulivi, Shelia; Warrington, Nicole M.; Zgaga, Lina; Alavere, Helen; Amin, Najaf; Aspelund, Thor; Bandinelli, Stefania; Barroso, Ines; Berenson, Gerald S.; Bergmann, Sven; Blackburn, Hannah; Boerwinkle, Eric; Buring, Julie E.; Busonero, Fabio; Campbell, Harry; Chanock, Stephen J.; Chen, Wei; Cornelis, Marilyn C.; Couper, David; Coviello, Andrea D.; d’Adamo, Pio; de Faire, Ulf; de Geus, Eco J.C.; Deloukas, Panos; Döring, Angela; Smith, George Davey; Easton, Douglas F.; Eiriksdottir, Gudny; Emilsson, Valur; Eriksson, Johan; Ferrucci, Luigi; Folsom, Aaron R.; Foroud, Tatiana; Garcia, Melissa; Gasparini, Paolo; Geller, Frank; Gieger, Christian; Gudnason, Vilmundur; Hall, Per; Hankinson, Susan E.; Ferreli, Liana; Heath, Andrew C.; Hernandez, Dena G.; Hofman, Albert; Hu, Frank B.; Illig, Thomas; Järvelin, Marjo-Riitta; Johnson, Andrew D.; Karasik, David; Khaw, Kay-Tee; Kiel, Douglas P.; Kilpeläinen, Tuomas O.; Kolcic, Ivana; Kraft, Peter; Launer, Lenore J.; Laven, Joop S.E.; Li, Shengxu; Liu, Jianjun; Levy, Daniel; Martin, Nicholas G.; McArdle, Wendy L.; Melbye, Mads; Mooser, Vincent; Murray, Jeffrey C.; Murray, Sarah S.; Nalls, Michael A.; Navarro, Pau; Nelis, Mari; Ness, Andrew R.; Northstone, Kate; Oostra, Ben A.; Peacock, Munro; Palmer, Lyle J.; Palotie, Aarno; Paré, Guillaume; Parker, Alex N.; Pedersen, Nancy L.; Peltonen, Leena; Pennell, Craig E.; Pharoah, Paul; Polasek, Ozren; Plump, Andrew S.; Pouta, Anneli; Porcu, Eleonora; Rafnar, 
Thorunn; Rice, John P.; Ring, Susan M.; Rivadeneira, Fernando; Rudan, Igor; Sala, Cinzia; Salomaa, Veikko; Sanna, Serena; Schlessinger, David; Schork, Nicholas J.; Scuteri, Angelo; Segrè, Ayellet V.; Shuldiner, Alan R.; Soranzo, Nicole; Sovio, Ulla; Srinivasan, Sathanur R.; Strachan, David P.; Tammesoo, Mar-Liis; Tikkanen, Emmi; Toniolo, Daniela; Tsui, Kim; Tryggvadottir, Laufey; Tyrer, Jonathon; Uda, Manuela; van Dam, Rob M.; van Meurs, Joyve B.J.; Vollenweider, Peter; Waeber, Gerard; Wareham, Nicholas J.; Waterworth, Dawn M.; Weedon, Michael N.; Wichmann, H. Erich; Willemsen, Gonneke; Wilson, James F.; Wright, Alan F.; Young, Lauren; Zhai, Guangju; Zhuang, Wei Vivian; Bierut, Laura J.; Boomsma, Dorret I.; Boyd, Heather A.; Crisponi, Laura; Demerath, Ellen W.; van Duijn, Cornelia M.; Econs, Michael J.; Harris, Tamara B.; Hunter, David J.; Loos, Ruth J.F.; Metspalu, Andres; Montgomery, Grant W.; Ridker, Paul M.; Spector, Tim D.; Streeten, Elizabeth A.; Stefansson, Kari; Thorsteinsdottir, Unnur; Uitterlinden, André G.; Widen, Elisabeth; Murabito, Joanne M.; Ong, Ken K.; Murray, Anna

    2011-01-01

    To identify loci for age at menarche, we performed a meta-analysis of 32 genome-wide association studies in 87,802 women of European descent, with replication in up to 14,731 women. In addition to the known loci at LIN28B (P = 5.4×10^−60) and 9q31.2 (P = 2.2×10^−33), we identified 30 novel menarche loci (all P < 5×10^−8) and found suggestive evidence for a further 10 loci (P < 1.9×10^−6). New loci included four previously associated with BMI (in/near FTO, SEC16B, TRA2B and TMEM18), three in/near other genes implicated in energy homeostasis (BSX, CRTC1, and MCHR2), and three in/near genes implicated in hormonal regulation (INHBA, PCSK2 and RXRG). Ingenuity and MAGENTA pathway analyses identified coenzyme A and fatty acid biosynthesis as biological processes related to menarche timing. PMID:21102462
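
    How per-study association results can be pooled in a meta-analysis may be easier to follow with a small Python sketch. The abstract does not state which pooling model was used, so the fixed-effect inverse-variance approach below is an assumption, and the function name and example numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(
    betas: Sequence[float], standard_errors: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Pool per-study effect sizes for one variant (fixed-effect model, assumed).

    Each study is weighted by 1 / SE^2. Returns the pooled effect, its standard
    error, the z statistic, and a two-sided p-value from the normal distribution.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect size")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-study estimates for a single SNP (made-up numbers).
beta, se, z, p = inverse_variance_meta([0.10, 0.12, 0.08], [0.02, 0.03, 0.025])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
print("genome-wide significant" if p < 5e-8 else "not genome-wide significant")
```

    The comparison against 5e-8 mirrors the genome-wide significance threshold quoted in the abstract; pooling shrinks the standard error, which is why a meta-analysis can reach that threshold when individual studies do not.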

  8. Comparative genomics of chondrichthyan Hoxa clusters

    Directory of Open Access Journals (Sweden)

    Zhong Ying-Fu

    2009-09-01

    Full Text Available Abstract Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays) occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea) and compared it to the published Hoxa cluster of the Horn Shark (Heterodontus francisci) and to available data from the Elephant Shark (Callorhinchus milii) genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%, 100 bp) found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements.

  9. The 1000 Genomes Project: new opportunities for research and social challenges

    Science.gov (United States)

    2010-01-01

    The 1000 Genomes Project, an international collaboration, is sequencing the whole genome of approximately 2,000 individuals from different worldwide populations. The central goal of this project is to describe most of the genetic variation that occurs at a population frequency greater than 1%. The results of this project will allow scientists to identify genetic variation at an unprecedented degree of resolution and will also help improve the imputation methods for determining unobserved genetic variants that are not represented on current genotyping arrays. By identifying novel or rare functional genetic variants, researchers will be able to pinpoint disease-causing genes in genomic regions initially identified by association studies. This level of detailed sequence information will also improve our knowledge of the evolutionary processes and the genomic patterns that have shaped the human species as we know it today. The new data will also lay the foundation for future clinical applications, such as prediction of disease susceptibility and drug response. However, the forthcoming availability of whole genome sequences at affordable prices will raise ethical concerns and pose potential threats to individual privacy. Nevertheless, we believe that these potential risks are outweighed by the benefits in terms of diagnosis and research, so long as rigorous safeguards are kept in place through legislation that prevents discrimination on the basis of the results of genetic testing. PMID:20193048

  10. A High-Throughput Computational Framework for Identifying Significant Copy Number Aberrations from Array Comparative Genomic Hybridisation Data

    Directory of Open Access Journals (Sweden)

    Ian Roberts

    2012-01-01

    Full Text Available Reliable identification of copy number aberrations (CNA) from comparative genomic hybridization data would be improved by the availability of a generalised method for processing large datasets. To this end, we developed swatCGH, a data analysis framework and region detection heuristic for computational grids. swatCGH analyses sequentially displaced (sliding) windows of neighbouring probes and applies adaptive thresholds of varying stringency to identify the 10% of each chromosome that contains the most frequently occurring CNAs. We used the method to analyse a published dataset, comparing data preprocessed using four different DNA segmentation algorithms, and two methods for prioritising the detected CNAs. The consolidated list of the most commonly detected aberrations confirmed the value of swatCGH as a simplified high-throughput method for identifying biologically significant CNA regions of interest.
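
    The sliding-window idea described above can be sketched in a few lines of Python. This is a simplified illustration rather than the swatCGH implementation: it replaces the adaptive thresholds with a plain ranking of window means, and the function names, window width, and per-probe counts are all hypothetical.

```python
from typing import List, Sequence


def sliding_window_means(counts: Sequence[int], width: int) -> List[float]:
    """Mean aberration count for every window of `width` neighbouring probes."""
    if width < 1 or width > len(counts):
        raise ValueError("window width must be between 1 and the number of probes")
    window_sum = sum(counts[:width])
    means = [window_sum / width]
    for start in range(1, len(counts) - width + 1):
        # Slide by one probe: add the probe entering, drop the probe leaving.
        window_sum += counts[start + width - 1] - counts[start - 1]
        means.append(window_sum / width)
    return means


def top_fraction_windows(
    counts: Sequence[int], width: int, fraction: float = 0.10
) -> List[int]:
    """Start indices of the highest-scoring windows (top `fraction` of windows)."""
    means = sliding_window_means(counts, width)
    keep = max(1, round(len(means) * fraction))
    ranked = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    return sorted(ranked[:keep])


# Hypothetical data: number of samples carrying a CNA at each probe along one
# chromosome, with probes ordered by genomic position.
probe_counts = [0, 0, 1, 5, 6, 4, 1] + [0] * 15
print(top_fraction_windows(probe_counts, width=3))
```

    Ranking windows rather than single probes smooths over isolated noisy probes, which is the usual motivation for a windowed heuristic; a fuller version would vary the threshold stringency as the abstract describes.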

  11. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any

  12. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

    DEFF Research Database (Denmark)

    Nielsen, Henrik Bjørn; Almeida, Mathieu; Juncker, Agnieszka

    2014-01-01

    of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify...

  13. A genome-wide RNAi screen identifies regulators of cholesterol-modified hedgehog secretion in Drosophila.

    Directory of Open Access Journals (Sweden)

    Reid Aikin

    Full Text Available Hedgehog (Hh) proteins are secreted molecules that function as organizers in animal development. In addition to being palmitoylated, Hh is the only metazoan protein known to possess a covalently-linked cholesterol moiety. The absence of either modification severely disrupts the organization of numerous tissues during development. It is currently not known how lipid-modified Hh is secreted and released from producing cells. We have performed a genome-wide RNAi screen in Drosophila melanogaster cells to identify regulators of Hh secretion. We found that cholesterol-modified Hh secretion is strongly dependent on coat protein complex I (COPI) but not COPII vesicles, suggesting that cholesterol modification alters the movement of Hh through the early secretory pathway. We provide evidence that both proteolysis and cholesterol modification are necessary for the efficient trafficking of Hh through the ER and Golgi. Finally, we identified several putative regulators of protein secretion and demonstrate a role for some of these genes in Hh and Wingless (Wg) morphogen secretion in vivo. These data open new perspectives for studying how morphogen secretion is regulated, as well as provide insight into regulation of lipid-modified protein secretion.

  14. The genome of the extremophile crucifer Thellungiella parvula

    KAUST Repository

    Dassanayake, Maheshi

    2011-08-07

    Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance. © 2011 Nature America, Inc. All rights reserved.

  15. Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

    Directory of Open Access Journals (Sweden)

    Guillermo Nourdin-Galindo

    2017-10-01

    Full Text Available Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remains lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these
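
    The pan-genome, core-genome, and genogroup-exclusive counts reported above rest on simple set operations over per-strain gene content, which a short Python sketch can make concrete. The strain names, gene-family identifiers, and function names below are hypothetical, and the sketch assumes genes have already been clustered into families; it does not reproduce the study's pipeline.

```python
from typing import Dict, Iterable, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Pan-genome = genes found in any strain; core-genome = genes found in all."""
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def group_exclusive(genomes: Dict[str, Set[str]], members: Iterable[str]) -> Set[str]:
    """Genes present in at least one member strain and absent from all others."""
    member_names = set(members)
    inside = set().union(*(genes for name, genes in genomes.items() if name in member_names))
    outside = set().union(*(genes for name, genes in genomes.items() if name not in member_names))
    return inside - outside


# Hypothetical toy data: gene-family identifiers per strain (not real P. salmonis data).
toy_genomes = {
    "LF-a": {"g1", "g2", "g3"},
    "LF-b": {"g1", "g2", "g4"},
    "EM-a": {"g1", "g5"},
}
pan, core = pan_and_core(toy_genomes)
print(len(pan), len(core))
print(sorted(group_exclusive(toy_genomes, ["LF-a", "LF-b"])))
```

    A pan-genome is usually called open when its size keeps growing as further strains are added, so the same union could be recomputed over increasing subsets of strains to examine that trend.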

  16. Genome-wide analysis suggests high level of microsynteny and purifying selection affect the evolution of EIN3/EIL family in Rosaceae.

    Science.gov (United States)

    Cao, Yunpeng; Han, Yahui; Meng, Dandan; Li, Dahui; Jin, Qing; Lin, Yi; Cai, Yongping

    2017-01-01

    The ethylene-insensitive3/ethylene-insensitive3-like (EIN3/EIL) proteins are a type of nuclear-localized protein with DNA-binding activity in plants. Although the EIN3/EIL gene family has been studied in several plant species, a comprehensive study of the EIN3/EIL gene family in Rosaceae is lacking. In this study, ten, five, four, and five EIN3/EIL genes were identified in the genomes of pear (Pyrus bretschneideri), mei (Prunus mume), peach (Prunus persica) and strawberry (Fragaria vesca), respectively. Twenty-eight chromosomal segments of the EIL/EIN3 gene family were found in the four Rosaceae species, and these segments could form seven orthologous or paralogous groups based on interspecies or intraspecies gene colinearity (microsynteny) analysis. Moreover, highly conserved regions of microsynteny were found in the four Rosaceae species. Subsequently it was found that both whole genome duplication and tandem duplication events significantly contributed to the EIL/EIN3 gene family expansion. Gene expression analysis of the EIL/EIN3 genes in the pear revealed subfunctionalization for several PbEIL genes derived from whole genome duplication. It is noteworthy that, according to environmental selection pressure analysis, strong purifying selection should dominate the maintenance of the EIL/EIN3 gene family in the four Rosaceae species. These results provided useful information on Rosaceae EIL/EIN3 genes, as well as insights into the evolution of this gene family in the four Rosaceae species. Furthermore, the high level of microsynteny in the four Rosaceae plants suggested that a large-scale genome duplication event in the EIL/EIN3 gene family predated speciation.

  17. Genome-wide analysis suggests high level of microsynteny and purifying selection affect the evolution of EIN3/EIL family in Rosaceae

    Directory of Open Access Journals (Sweden)

    Yunpeng Cao

    2017-05-01

    Full Text Available The ethylene-insensitive3/ethylene-insensitive3-like (EIN3/EIL) proteins are a type of nuclear-localized protein with DNA-binding activity in plants. Although the EIN3/EIL gene family has been studied in several plant species, a comprehensive study of the EIN3/EIL gene family in Rosaceae is lacking. In this study, ten, five, four, and five EIN3/EIL genes were identified in the genomes of pear (Pyrus bretschneideri), mei (Prunus mume), peach (Prunus persica) and strawberry (Fragaria vesca), respectively. Twenty-eight chromosomal segments of the EIL/EIN3 gene family were found in the four Rosaceae species, and these segments could form seven orthologous or paralogous groups based on interspecies or intraspecies gene colinearity (microsynteny) analysis. Moreover, highly conserved regions of microsynteny were found in the four Rosaceae species. Subsequently it was found that both whole genome duplication and tandem duplication events significantly contributed to the EIL/EIN3 gene family expansion. Gene expression analysis of the EIL/EIN3 genes in the pear revealed subfunctionalization for several PbEIL genes derived from whole genome duplication. It is noteworthy that, according to environmental selection pressure analysis, strong purifying selection should dominate the maintenance of the EIL/EIN3 gene family in the four Rosaceae species. These results provided useful information on Rosaceae EIL/EIN3 genes, as well as insights into the evolution of this gene family in the four Rosaceae species. Furthermore, the high level of microsynteny in the four Rosaceae plants suggested that a large-scale genome duplication event in the EIL/EIN3 gene family predated speciation.

  18. Genomes-based phylogeny of the genus Xanthomonas

    Directory of Open Access Journals (Sweden)

    Rodriguez-R Luis M

    2012-03-01

    Full Text Available Background: The genus Xanthomonas comprises several plant pathogenic bacteria affecting a wide range of hosts. Despite the economic, industrial and biological importance of Xanthomonas, the classification and phylogenetic relationships within the genus are still under active debate. Some of the relationships between pathovars and species have not been thoroughly clarified, with old pathovars becoming new species. A change in the genus name has been recently suggested for Xanthomonas albilineans, an early branching species currently located in this genus, but a thorough phylogenomic reconstruction would aid in solving these and other discrepancies in this genus. Results: Here we report the results of the genome-wide analysis of DNA sequences from 989 orthologous groups from 17 Xanthomonas spp. genomes available to date, representing all major lineages within the genus. The phylogenetic and computational analyses used in this study have been automated in a Perl package designated Unus, which provides a framework for phylogenomic analyses which can be applied to other datasets at the genomic level. Unus can also be easily incorporated into other phylogenomic pipelines. Conclusions: Our phylogeny agrees with previous phylogenetic topologies on the genus, but revealed that the genomes of Xanthomonas citri and Xanthomonas fuscans belong to the same species, and that of Xanthomonas albilineans is basal to the joint clade of Xanthomonas and Xylella fastidiosa. Genome reduction was identified in the species Xanthomonas vasicola in addition to the previously identified reduction in Xanthomonas albilineans. Lateral gene transfer was also observed in two gene clusters.

  19. Genome update: the 1000th genome - a cautionary tale

    DEFF Research Database (Denmark)

    Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

    2010-01-01

    conclusions for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families more genes than are recognized in the human genome. Moreover......There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genomes level...... for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false...

  20. Genome-wide screen of Pseudomonas aeruginosa in Saccharomyces cerevisiae identifies new virulence factors

    Directory of Open Access Journals (Sweden)

    Rafat Zrieq

    2015-11-01

    Full Text Available Pseudomonas aeruginosa is a human opportunistic pathogen that causes mortality in cystic fibrosis and immunocompromised patients. While many virulence factors of this pathogen have already been identified, several remain to be discovered. In this respect we set up an unprecedented genome-wide screen of a P. aeruginosa expression library based on a yeast growth phenotype. 51 candidates were selected in a three-round screening process. The robustness of the screen was validated by the selection of three well known secreted proteins including one demonstrated virulence factor, the protease LepA. Further in silico sorting of the 51 candidates highlighted three potential new Pseudomonas effector candidates (Pec). By testing the cytotoxicity of wild type P. aeruginosa vs pec mutants towards macrophages and the virulence in the Caenorhabditis elegans model, we demonstrated that the three selected Pecs are novel virulence factors of P. aeruginosa. Additional cellular localization experiments in the host revealed specific localization for Pec1 and Pec2 that could inform about their respective functions.

  1. Balancing the risks and benefits of genomic data sharing: genome research participants' perspectives.

    Science.gov (United States)

    Oliver, J M; Slashinski, M J; Wang, T; Kelly, P A; Hilsenbeck, S G; McGuire, A L

    2012-01-01

    Technological advancements are rapidly propelling the field of genome research forward, while lawmakers attempt to keep apace with the risks these advances bear. Balancing normative concerns of maximizing data utility and protecting human subjects, whose privacy is at risk due to the identifiability of DNA data, are central to policy decisions. Research on genome research participants making real-time data sharing decisions is limited; yet, these perspectives could provide critical information to ongoing deliberations. We conducted a randomized trial of 3 consent types affording varying levels of control over data release decisions. After debriefing participants about the randomization process, we invited them to a follow-up interview to assess their attitudes toward genetic research, privacy and data sharing. Participants were more restrictive in their reported data sharing preferences than in their actual data sharing decisions. They saw both benefits and risks associated with sharing their genomic data, but risks were seen as less concrete or happening in the future, and were largely outweighed by purported benefits. Policymakers must respect that participants' assessment of the risks and benefits of data sharing and their privacy-utility determinations, which are associated with their final data release decisions, vary. In order to advance the ethical conduct of genome research, proposed policy changes should carefully consider these stakeholder perspectives. Copyright © 2011 S. Karger AG, Basel.

  2. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute’s genomic medicine portfolio

    Science.gov (United States)

    Manolio, Teri A.

    2016-01-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual’s genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of “Genomic Medicine Meetings,” under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI’s genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. PMID:27612677

  3. DEFINING THE CHEMICAL SPACE OF PUBLIC GENOMIC ...

    Science.gov (United States)

    The current project aims to chemically index the genomics content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. By defining the chemical space of public genomic data, it is possible to identify classes of chemicals on which to develop methodologies for the integration of chemogenomic data into predictive toxicology. The chemical space of public genomic data will be presented as well as the methodologies and tools developed to identify this chemical space.

  4. Case-control genome-wide association study of attention-deficit/hyperactivity disorder.

    NARCIS (Netherlands)

    Neale, B.M.; Medland, S.; Ripke, S.; Anney, R.J.; Asherson, P.; Buitelaar, J.K.; Franke, B.; Gill, M.; Kent, L.; Holmans, P.; Middleton, F.; Thapar, A.; Lesch, K.P.; Faraone, S.V.; Daly, M.; Nguyen, T.T.; Schafer, H.; Steinhausen, H.C.; Reif, A.; Renner, T.J.; Romanos, M.; Romanos, J.; Warnke, A.; Walitza, S.; Freitag, C.; Meyer, J.; Palmason, H.; Rothenberger, A.; Hawi, Z.; Sergeant, J.A.; Roeyers, H.; Mick, E.; Biederman, J.

    2010-01-01

    OBJECTIVE: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. Thus additional genomewide association studies (GWAS) are needed.

  5. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    Science.gov (United States)

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.

    Science.gov (United States)

    Zou, Meng; Liu, Zhaoqi; Zhang, Xiang-Sun; Wang, Yong

    2015-10-15

    In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are in pressing need. In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem to a binary classification problem. Then an optimization model is formulated to directly maximize AUC and meanwhile minimize the number of selected features to construct a predictor in the nearest centroid classifier framework. NCC-AUC performs well when validated on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meanings. Also, NCC-AUC separates low and high risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1 penalized in Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further, in an independent clinical data set, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and L1-Cox model in grouping patients into high and low risk categories. In summary, NCC-AUC provides a rigorous optimization framework to
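
    The connection between AUC and Harrell's concordance index mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' NCC-AUC optimization; it only computes the two scores and shows that they coincide when survival is reduced to two groups. The function names and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] is truthy) strictly before time times[j]. It is concordant
    when the subject who fails first has the higher risk score; ties in
    score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): label 1 = high risk, failing at time 1;
# label 0 = low risk, failing at time 2.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
times = [1, 1, 2, 2]
events = [1, 1, 1, 1]
```

    With only two distinct event times, the comparable pairs are exactly the (high-risk, low-risk) pairs, so the concordance index and the AUC are computed over the same pairs and agree.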

  7. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur , amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  8. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features.

    Directory of Open Access Journals (Sweden)

    Marcel Kool

    Full Text Available BACKGROUND: Medulloblastoma is the most common malignant brain tumor in children. Despite recent improvements in cure rates, prediction of disease outcome remains a major challenge and survivors suffer from serious therapy-related side-effects. Recent data showed that patients with WNT-activated tumors have a favorable prognosis, suggesting that these patients could be treated less intensively, thereby reducing the side-effects. This illustrates the potential benefits of a robust classification of medulloblastoma patients and a detailed knowledge of associated biological mechanisms. METHODS AND FINDINGS: To get a better insight into the molecular biology of medulloblastoma we established mRNA expression profiles of 62 medulloblastomas and analyzed 52 of them also by comparative genomic hybridization (CGH) arrays. Five molecular subtypes were identified, characterized by WNT signaling (A; 9 cases), SHH signaling (B; 15 cases), expression of neuronal differentiation genes (C and D; 16 and 11 cases, respectively) or photoreceptor genes (D and E; both 11 cases). Mutations in beta-catenin were identified in all 9 type A tumors, but not in any other tumor. PTCH1 mutations were exclusively identified in type B tumors. CGH analysis identified several fully or partly subtype-specific chromosomal aberrations. Monosomy of chromosome 6 occurred only in type A tumors, loss of 9q mostly occurred in type B tumors, whereas chromosome 17 aberrations, most common in medulloblastoma, were strongly associated with type C or D tumors. Loss of the inactivated X-chromosome was highly specific for female cases of type C, D and E tumors. Gene expression levels faithfully reflected the chromosomal copy number changes. Clinicopathological features significantly different between the 5 subtypes included metastatic disease and age at diagnosis and histology. Metastatic disease at diagnosis was significantly associated with subtypes C and D and most strongly with subtype E

  9. Variations in the G6PC2/ABCB11 genomic region are associated with fasting glucose levels

    DEFF Research Database (Denmark)

    Chen, Wei-Min; Erdos, Michael R; Jackson, Anne U

    2008-01-01

    Identifying the genetic variants that regulate fasting glucose concentrations may further our understanding of the pathogenesis of diabetes. We therefore investigated the association of fasting glucose levels with SNPs in 2 genome-wide scans including a total of 5,088 nondiabetic individuals from...... Finland and Sardinia. We found a significant association between the SNP rs563694 and fasting glucose concentrations (P = 3.5 x 10(-7)). This association was further investigated in an additional 18,436 nondiabetic individuals of mixed European descent from 7 different studies. The combined P value...... for association in these follow-up samples was 6.9 x 10(-26), and combining results from all studies resulted in an overall P value for association of 6.4 x 10(-33). Across these studies, fasting glucose concentrations increased 0.01-0.16 mM with each copy of the major allele, accounting for approximately 1...

  10. Identification of novel biomass-degrading enzymes from genomic dark matter: Populating genomic sequence space with functional annotation.

    Science.gov (United States)

    Piao, Hailan; Froula, Jeff; Du, Changbin; Kim, Tae-Wan; Hawley, Erik R; Bauer, Stefan; Wang, Zhong; Ivanova, Nathalia; Clark, Douglas S; Klenk, Hans-Peter; Hess, Matthias

    2014-08-01

    Although recent nucleotide sequencing technologies have significantly enhanced our understanding of microbial genomes, the function of ∼35% of genes identified in a genome currently remains unknown. To improve the understanding of microbial genomes and consequently of microbial processes it will be crucial to assign a function to this "genomic dark matter." Due to the urgent need for additional carbohydrate-active enzymes for improved production of transportation fuels from lignocellulosic biomass, we screened the genomes of more than 5,500 microorganisms for hypothetical proteins that are located in the proximity of already known cellulases. We identified, synthesized and expressed a total of 17 putative cellulase genes with insufficient sequence similarity to currently known cellulases to be identified as such using traditional sequence annotation techniques that rely on significant sequence similarity. The recombinant proteins of the newly identified putative cellulases were subjected to enzymatic activity assays to verify their hydrolytic activity towards cellulose and lignocellulosic biomass. Eleven (65%) of the tested enzymes had significant activity towards at least one of the substrates. This high success rate highlights that a gene context-based approach can be used to assign function to genes that are otherwise categorized as "genomic dark matter" and to identify biomass-degrading enzymes that have little sequence similarity to already known cellulases. The ability to assign function to genes that have no related sequence representatives with functional annotation will be important to enhance our understanding of microbial processes and to identify microbial proteins for a wide range of applications. © 2014 Wiley Periodicals, Inc.

  11. A genome-wide scan study identifies a single nucleotide substitution in ASIP associated with white versus non-white coat-colour variation in sheep (Ovis aries)

    OpenAIRE

    Li, M-H; Tiirikka, T; Kantanen, J

    2013-01-01

    In sheep, coat colour (and pattern) is one of the important traits of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three kno...

  12. The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

    Science.gov (United States)

    Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

    2018-02-05

    The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set ( n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.

  13. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    Full Text Available As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently when ABCB6 was reported as a causative gene of DUH. We performed genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene; zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Genome-wide linkage (assuming autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.

  14. Genome-Wide Association Meta-Analyses to Identify Common Genetic Variants Associated with Hallux Valgus in Caucasian and African Americans

    Science.gov (United States)

    Hsu, Yi-Hsiang; Liu, Youfang; Hannan, Marian T.; Maixner, William; Smith, Shad B.; Diatchenko, Luda; Golightly, Yvonne M.; Menz, Hylton B.; Kraus, Virginia B.; Doherty, Michael; Wilson, A.G.; Jordan, Joanne M.

    2016-01-01

    Objective: Hallux valgus (HV) affects ~36% of Caucasian adults. Although considered highly heritable, the underlying genetic determinants are unclear. We conducted the first genome-wide association study (GWAS) aimed to identify genetic variants associated with HV. Methods: HV was assessed in 3 Caucasian cohorts (n=2,263, n=915, and n=1,231 participants, respectively). In each cohort, a GWAS was conducted using 2.5M imputed single nucleotide polymorphisms (SNPs). Mixed-effect regression with the additive genetic model adjusted for age, sex, weight and within-family correlations was used for both sex-specific and combined analyses. To combine GWAS results across cohorts, fixed-effect inverse-variance meta-analyses were used. Following meta-analyses, top-associated findings were also examined in an African American cohort (n=327). Results: The proportion of HV variance explained by genome-wide genotyped SNPs was 50% in men and 48% in women. A higher proportion of genetic determinants of HV was sex-specific. The most significantly associated SNP in men was rs9675316 located on chr17q23-a24 near the AXIN2 gene (p=5.46×10^-7); the most significantly associated SNP in women was rs7996797 located on chr13q14.1-q14.2 near the ESD gene (p=7.21×10^-7). Genome-wide significant SNP-by-sex interaction was found for SNP rs1563374 located on chr11p15.1 near the MRGPRX3 gene (interaction p-value =4.1×10^-9). The association signals diminished when combining men and women. Conclusion: Findings suggest that the potential pathophysiological mechanisms of HV are complex and strongly underlined by sex-specific interactions. The identified genetic variants imply contribution of biological pathways observed in osteoarthritis as well as new pathways, influencing skeletal development and inflammation. PMID:26337638
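
    The fixed-effect inverse-variance meta-analysis named in the Methods above can be sketched as follows. This is a generic textbook formulation, not the authors' pipeline: each cohort's effect estimate is weighted by the inverse of its squared standard error, and the pooled estimate is tested with a two-sided normal test. The function name and the input values are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-cohort effect sizes with fixed-effect inverse-variance weights.

    betas: per-cohort effect estimates for one SNP
    ses:   their standard errors (all must be positive)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se * se) for se in ses]       # w_k = 1 / se_k^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under the standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the study.
    print(inverse_variance_meta([0.10, 0.20, 0.15], [0.05, 0.08, 0.06]))
```

    Cohorts with smaller standard errors (typically the larger cohorts) dominate the pooled estimate, which is why the pooled standard error is never larger than the smallest per-cohort one.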

  15. The human genome project

    International Nuclear Information System (INIS)

    Worton, R.

    1996-01-01

    The Human Genome Project is a massive international research project, costing 3 to 5 billion dollars and expected to take 15 years, which will identify all the genes in the human genome, i.e. the complete sequence of bases in human DNA. The prize will be the ability to identify genes causing or predisposing to disease, and in some cases the development of gene therapy, but this new knowledge will raise important ethical issues.

  16. Genome-wide meta-analyses identify multiple loci associated with smoking behavior

    NARCIS (Netherlands)

    H. Furberg (Helena); Y. Kim (Yunjung); J. Dackor (Jennifer); E.A. Boerwinkle (Eric); N. Franceschini (Nora); D. Ardissino (Diego); L. Bernardinelli (Luisa); P.M. Mannucci (Pier); F. Mauri (Francesco); P.A. Merlini (Piera); D. Absher (Devin); T.L. Assimes (Themistocles); S.P. Fortmann (Stephen); C. Iribarren (Carlos); J.W. Knowles (Joshua); T. Quertermous (Thomas); L. Ferrucci (Luigi); T. Tanaka (Toshiko); J.C. Bis (Joshua); T. Haritunians (Talin); B. McKnight (Barbara); B.M. Psaty (Bruce); K.D. Taylor (Kent); E.L. Thacker (Evan); P. Almgren (Peter); L. Groop (Leif); C. Ladenvall (Claes); M. Boehnke (Michael); A.U. Jackson (Anne); K.L. Mohlke (Karen); H.M. Stringham (Heather); J. Tuomilehto (Jaakko); E.J. Benjamin (Emelia); S.J. Hwang; D. Levy (Daniel); S.R. Preis; R.S. Vasan (Ramachandran Srini); J. Duan (Jubao); P.V. Gejman (Pablo); D.F. Levinson (Douglas); A.R. Sanders (Alan); J. Shi (Jianxin); E.H. Lips (Esther); J.D. McKay (James); A. Agudo (Antonio); L. Barzan (Luigi); V. Bencko (Vladimir); S. Benhamou (Simone); X. Castellsagué (Xavier); C. Canova (Cristina); D.I. Conway (David); E. Fabianova (Eleonora); L. Foretova (Lenka); V. Janout (Vladimir); C.M. Healy (Claire); I. Holcátová (Ivana); K. Kjaerheim (Kristina); P. Lagiou; J. Lissowska (Jolanta); R. Lowry (Ray); T.V. MacFarlane (Tatiana); D. Mates (Dana); L. Richiardi (Lorenzo); P. Rudnai (Peter); N. Szeszenia-Dabrowska (Neonilia); D. Zaridze; A. Znaor (Ariana); M. Lathrop (Mark); P. Brennan (Paul); S. Bandinelli (Stefania); T.M. Frayling (Timothy); J.M. Guralnik (Jack); Y. Milaneschi (Yuri); J.R.B. Perry (John); D. Altshuler (David); R. Elosua (Roberto); S. Kathiresan (Sekar); G. Lucas (Gavin); O. Melander (Olle); V. Salomaa (Veikko); S.M. Schwartz (Stephen); B.F. Voight (Benjamin); B.W.J.H. Penninx (Brenda); J.H. Smit (Johannes); N. Vogelzangs (Nicole); D.I. Boomsma (Dorret); E.J.C. de Geus (Eco); J.M. Vink (Jacqueline); G.A.H.M. Willemsen (Gonneke); S.J. Chanock (Stephen); F. Gu (Fangyi); S.E. Hankinson (Susan); D. Hunter (David); A. Hofman (Albert); H.W. Tiemeier (Henning); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); S. Walter (Stefan); D.I. Chasman (Daniel); B.M. Everett (Brendan); G. Pare (Guillaume); P.M. Ridker (Paul); M.D. Li (Ming); H.H. Maes (Hermine); J. Audrain-Mcgovern (Janet); D. Posthuma (Danielle); L.M. Thornton (Laura); C. Lerman (Caryn); J. Kaprio (Jaakko); J.E. Rose (Jed); J.P.A. Ioannidis (John); P. Kraft (Peter); D.Y. Lin (Dan); P.F. Sullivan (Patrick); C.J. O'Donnell (Christopher)

    2010-01-01

    Consistent but indirect evidence has implicated genetic factors in smoking behavior. We report meta-analyses of several smoking phenotypes within cohorts of the Tobacco and Genetics Consortium (n = 74,053). We also partnered with the European Network of Genetic and Genomic Epidemiology

  17. From genomes to vaccines: Leishmania as a model.

    Science.gov (United States)

    Almeida, Renata; Norrish, Alan; Levick, Mark; Vetrie, David; Freeman, Tom; Vilo, Jaak; Ivens, Alasdair; Lange, Uta; Stober, Carmel; McCann, Sharon; Blackwell, Jenefer M

    2002-01-01

    The 35 Mb genome of Leishmania should be sequenced by late 2002. It contains approximately 8500 genes that will probably translate into more than 10 000 proteins. In the laboratory we have been piloting strategies to try to harness the power of the genome-proteome for rapid screening of new vaccine candidates. To this end, microarray analysis of 1094 unique genes identified using an EST analysis of 2091 cDNA clones from spliced leader libraries prepared from different developmental stages of Leishmania has been employed. The plan was to identify amastigote-expressed genes that could be used in high-throughput DNA-vaccine screens to identify potential new vaccine candidates. Despite the lack of transcriptional regulation that polycistronic transcription in Leishmania dictates, the data provide evidence for a high level of post-transcriptional regulation of RNA abundance during the developmental cycle of promastigotes in culture and in lesion-derived amastigotes of Leishmania major. This has provided 147 candidates from the 1094 unique genes that are specifically upregulated in amastigotes and are being used in vaccine studies. Using DNA vaccination, it was demonstrated that pooling strategies can work to identify protective vaccines, but it was found that some potentially protective antigens are masked by other disease-exacerbatory antigens in the pool. A total of 100 new vaccine candidates are currently being tested separately and in pools to extend this analysis, and to facilitate retrospective bioinformatic analysis to develop predictive algorithms for sequences that constitute potentially protective antigens. We are also working with other members of the Leishmania Genome Network to determine whether RNA expression determined by microarray analyses parallels expression at the protein level. We believe we are making good progress in developing strategies that will allow rapid translation of the sequence of Leishmania into potential interventions for disease

  18. Whole genome analysis of selected human and animal rotaviruses identified in Uganda from 2012 to 2014 reveals complex genome reassortment events between human, bovine, caprine and porcine strains.

    Science.gov (United States)

    Bwogi, Josephine; Jere, Khuzwayo C; Karamagi, Charles; Byarugaba, Denis K; Namuwulya, Prossy; Baliraine, Frederick N; Desselberger, Ulrich; Iturriza-Gomara, Miren

    2017-01-01

    Rotaviruses of species A (RVA) are a common cause of diarrhoea in children and the young of various other mammals and birds worldwide. To investigate possible interspecies transmission of RVAs, whole genomes of 18 human and 6 domestic animal RVA strains identified in Uganda between 2012 and 2014 were sequenced using the Illumina HiSeq platform. The backbone of the human RVA strains had either a Wa- or a DS-1-like genetic constellation. One human strain was a Wa-like mono-reassortant containing a DS-1-like VP2 gene of possible animal origin. All eleven genes of one bovine RVA strain were closely related to those of human RVAs. One caprine strain had a mixed genotype backbone, suggesting that it emerged from multiple reassortment events involving different host species. The porcine RVA strains had mixed genotype backbones with possible multiple reassortant events with strains of human and bovine origin. Overall, whole genome characterisation of rotaviruses found in domestic animals in Uganda strongly suggested the presence of human-to-animal RVA transmission, with concomitant circulation of multi-reassortant strains potentially derived from complex interspecies transmission events. However, whole genome data from the human RVA strains causing moderate and severe diarrhoea in under-fives in Uganda indicated that they were primarily transmitted from person-to-person.
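
    The backbone terms used in this abstract can be illustrated with a minimal sketch. It assumes the commonly used RVA whole-genome notation Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx (VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5), in which a Wa-like backbone carries genotype 1 in all nine non-G/P genes and a DS-1-like backbone carries genotype 2. The function name, the labels it returns, and the example strings are hypothetical and are not data from the study.

```python
def classify_backbone(constellation: str) -> str:
    """Label the 9-gene backbone of an RVA genotype constellation.

    Expects a string such as 'G1-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1'.
    The first two fields (VP7 and VP4 genotypes) are ignored; the
    remaining nine fields form the backbone.
    """
    fields = constellation.split("-")
    if len(fields) != 11:
        raise ValueError("expected 11 genotype fields")

    # Each backbone field is one letter followed by a genotype number.
    numbers = [int(field[1:]) for field in fields[2:]]
    ones = numbers.count(1)
    twos = numbers.count(2)

    if ones == 9:
        return "Wa-like"
    if twos == 9:
        return "DS-1-like"
    # Assumption: "mono-reassortant" means exactly one backbone gene
    # departs from an otherwise uniform constellation.
    if ones == 8:
        return "Wa-like mono-reassortant"
    if twos == 8:
        return "DS-1-like mono-reassortant"
    return "mixed"


if __name__ == "__main__":
    # Illustrative constellation: Wa-like backbone with a genotype 2 VP2 (C2).
    print(classify_backbone("G1-P[8]-I1-R1-C2-M1-A1-N1-T1-E1-H1"))
```

    Under these assumptions, a strain like the Wa-like mono-reassortant with a DS-1-like VP2 gene described above would carry C2 in an otherwise all-genotype-1 backbone.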

  19. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types

    Science.gov (United States)

    Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

    2015-01-01

    Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics. PMID:26352260

  20. Genome-wide association study identifies candidate genes for starch content regulation in maize kernels

    Directory of Open Access Journals (Sweden)

    Na Liu

    2016-07-01

    Kernel starch content is an important trait in maize (Zea mays L.) as it accounts for 65% to 75% of the dry kernel weight and positively correlates with seed yield. A number of starch synthesis-related genes have been identified in maize in recent years. However, many loci underlying variation in starch content among maize inbred lines still remain to be identified. The current study is a genome-wide association study that used a set of 263 maize inbred lines. In this panel, the average kernel starch content was 66.99%, ranging from 60.60% to 71.58% over the three study years. These inbred lines were genotyped with the SNP50 BeadChip maize array, which is comprised of 56,110 evenly spaced, random SNPs. Population structure was controlled by a mixed linear model (MLM) as implemented in the software package TASSEL. After the statistical analyses, four SNPs were identified as significantly associated with starch content (P ≤ 0.0001), among which one each are located on chromosomes 1 and 5 and two are on chromosome 2. Furthermore, 77 candidate genes associated with starch synthesis were found within the 100-kb intervals containing these four QTLs, and four highly associated genes were within 20-kb intervals of the associated SNPs. Among the four genes, Glucose-1-phosphate adenylyltransferase (APS1; Gene ID GRMZM2G163437) is known as an important regulator of kernel starch content. The identified SNPs, QTLs, and candidate genes may not only be readily used for germplasm improvement by marker-assisted selection in breeding, but can also elucidate the genetic basis of starch content. Further studies on these identified candidate genes may help determine the molecular mechanisms regulating kernel starch content in maize and other important cereal crops.

  1. The genome of the endophytic bacterium H. frisingense GSF30T identifies diverse strategies in the Herbaspirillum genus to interact with plants

    Directory of Open Access Journals (Sweden)

    Daniel Straub

    2013-06-01

    The diazotrophic, bacterial endophyte Herbaspirillum frisingense GSF30T has been identified in biomass grasses grown in temperate climate, including the highly nitrogen-efficient grass Miscanthus. Its genome was annotated and compared with related Herbaspirillum species from diverse habitats, including H. seropedicae, and further well-characterized endophytes. The analysis revealed that Herbaspirillum frisingense lacks a type III secretion system that is present in some related Herbaspirillum grass endophytes. Together with the lack of components of the type II secretion system, the genomic inventory indicates distinct interaction scenarios of endophytic Herbaspirillum strains with plants. Differences in respiration, carbon, nitrogen and cell wall metabolism among Herbaspirillum isolates partially correlate with their different habitats. Herbaspirillum frisingense is closely related to strains isolated from the rhizosphere of phragmites and from well water, but these lack nitrogen fixation and metabolism genes. Within grass endophytes, the high diversity in their genomic inventory suggests that even individual plant species provide distinct, highly diverse metabolic niches for successful endophyte-plant associations.

  2. The genome of the endophytic bacterium H. frisingense GSF30(T) identifies diverse strategies in the Herbaspirillum genus to interact with plants.

    Science.gov (United States)

    Straub, Daniel; Rothballer, Michael; Hartmann, Anton; Ludewig, Uwe

    2013-01-01

    The diazotrophic, bacterial endophyte Herbaspirillum frisingense GSF30(T) has been identified in biomass grasses grown in temperate climate, including the highly nitrogen-efficient grass Miscanthus. Its genome was annotated and compared with related Herbaspirillum species from diverse habitats, including H. seropedicae, and further well-characterized endophytes. The analysis revealed that Herbaspirillum frisingense lacks a type III secretion system that is present in some related Herbaspirillum grass endophytes. Together with the lack of components of the type II secretion system, the genomic inventory indicates distinct interaction scenarios of endophytic Herbaspirillum strains with plants. Differences in respiration, carbon, nitrogen and cell wall metabolism among Herbaspirillum isolates partially correlate with their different habitats. Herbaspirillum frisingense is closely related to strains isolated from the rhizosphere of phragmites and from well water, but these lack nitrogen fixation and metabolism genes. Within grass endophytes, the high diversity in their genomic inventory suggests that even individual plant species provide distinct, highly diverse metabolic niches for successful endophyte-plant associations.

  3. The genome of the endophytic bacterium H. frisingense GSF30T identifies diverse strategies in the Herbaspirillum genus to interact with plants

    Science.gov (United States)

    Straub, Daniel; Rothballer, Michael; Hartmann, Anton; Ludewig, Uwe

    2013-01-01

    The diazotrophic, bacterial endophyte Herbaspirillum frisingense GSF30T has been identified in biomass grasses grown in temperate climate, including the highly nitrogen-efficient grass Miscanthus. Its genome was annotated and compared with related Herbaspirillum species from diverse habitats, including H. seropedicae, and further well-characterized endophytes. The analysis revealed that Herbaspirillum frisingense lacks a type III secretion system that is present in some related Herbaspirillum grass endophytes. Together with the lack of components of the type II secretion system, the genomic inventory indicates distinct interaction scenarios of endophytic Herbaspirillum strains with plants. Differences in respiration, carbon, nitrogen and cell wall metabolism among Herbaspirillum isolates partially correlate with their different habitats. Herbaspirillum frisingense is closely related to strains isolated from the rhizosphere of phragmites and from well water, but these lack nitrogen fixation and metabolism genes. Within grass endophytes, the high diversity in their genomic inventory suggests that even individual plant species provide distinct, highly diverse metabolic niches for successful endophyte-plant associations. PMID:23825472

  4. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    ... Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...

  5. Association Between Chromosome 9p21 Variants and the Ankle-Brachial Index Identified by a Meta-Analysis of 21 Genome-Wide Association Studies

    DEFF Research Database (Denmark)

    Murabito, Joanne M; White, Charles C; Kavousi, Maryam

    2012-01-01

    BACKGROUND: Genetic determinants of peripheral arterial disease (PAD) remain largely unknown. To identify genetic variants associated with the ankle-brachial index (ABI), a noninvasive measure of PAD, we conducted a meta-analysis of genome-wide association study data from 21 population-based coh...

  6. Association Between Chromosome 9p21 Variants and the Ankle-Brachial Index Identified by a Meta-Analysis of 21 Genome-Wide Association Studies

    NARCIS (Netherlands)

    Murabito, Joanne M.; White, Charles C.; Kavousi, Maryam; Sun, Yan V.; Feitosa, Mary F.; Nambi, Vijay; Lamina, Claudia; Schillert, Arne; Coassin, Stefan; Bis, Joshua C.; Broer, Linda; Crawford, Dana C.; Franceschini, Nora; Frikke-Schmidt, Ruth; Haun, Margot; Holewijn, Suzanne; Huffman, Jennifer E.; Hwang, Shih-Jen; Kiechl, Stefan; Kollerits, Barbara; Montasser, May E.; Nolte, Ilja M.; Rudock, Megan E.; Senft, Andrea; Teumer, Alexander; van der Harst, Pim; Vitart, Veronique; Waite, Lindsay L.; Wood, Andrew R.; Wassel, Christina L.; Absher, Devin M.; Allison, Matthew A.; Amin, Najaf; Arnold, Alice; Asselbergs, Folkert W.; Aulchenko, Yurii; Bandinelli, Stefania; Barbalic, Maja; Boban, Mladen; Brown-Gentry, Kristin; Couper, David J.; Criqui, Michael H.; Dehghan, Abbas; den Heijer, Martin; Dieplinger, Benjamin; Ding, Jingzhong; Doerr, Marcus; Espinola-Klein, Christine; Felix, Stephan B.; Ferrucci, Luigi; Folsom, Aaron R.; Fraedrich, Gustav; Gibson, Quince; Goodloe, Robert; Gunjaca, Grgo; Haltmayer, Meinhard; Heiss, Gerardo; Hofman, Albert; Kieback, Arne; Kiemeney, Lambertus A.; Kolcic, Ivana; Kullo, Iftikhar J.; Kritchevsky, Stephen B.; Lackner, Karl J.; Li, Xiaohui; Lieb, Wolfgang; Lohman, Kurt; Meisinger, Christa; Melzer, David; Mohler, Emile R.; Mudnic, Ivana; Mueller, Thomas; Navis, Gerjan; Oberhollenzer, Friedrich; Olin, Jeffrey W.; O'Connell, Jeff; O'Donnell, Christopher J.; Palmas, Walter; Penninx, Brenda W.; Petersmann, Astrid; Polasek, Ozren; Psaty, Bruce M.; Rantner, Barbara; Rice, Ken; Rivadeneira, Fernando; Rotter, Jerome I.; Seldenrijk, Adrie; Stadler, Marietta; Summerer, Monika; Tanaka, Toshiko; Tybjaerg-Hansen, Anne; Uitterlinden, Andre G.; van Gilst, Wiek H.; Vermeulen, Sita H.; Wild, Sarah H.; Wild, Philipp S.; Willeit, Johann; Zeller, Tanja; Zemunik, Tatijana; Zgaga, Lina; Assimes, Themistocles L.; Blankenberg, Stefan; Campbell, Harry; Boerwinkle, Eric; Cooke, John P.; de Graaf, Jacqueline; Herrington, David; Kardia, Sharon L. 
R.; Mitchell, Braxton D.; Murray, Anna; Muenzel, Thomas; Newman, Anne B.; Oostra, Ben A.; Rudan, Igor; Shuldiner, Alan R.; Snieder, Harold; van Duijn, Cornelia M.; Voelker, Uwe; Wright, Alan F.; Wichmann, H. -Erich; Wilson, James F.; Witteman, Jacqueline C. M.; Liu, Yongmei; Hayward, Caroline; Borecki, Ingrid B.; Ziegler, Andreas; North, Kari E.; Cupples, L. Adrienne; Kronenberg, Florian; Dorr, M.; Munzel, T.; Volker, U.

    Background: Genetic determinants of peripheral arterial disease (PAD) remain largely unknown. To identify genetic variants associated with the ankle-brachial index (ABI), a noninvasive measure of PAD, we conducted a meta-analysis of genome-wide association study data from 21 population-based cohorts.

  7. Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    LENUS (Irish Health Repository)

    2011-08-30

    Background: The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there are no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results: The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644 isolated from humans, suggested a slightly larger genome of 2.138 Mb that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions: The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host
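
    The G+C content quoted in this record is the percentage of bases in the sequence that are guanine or cytosine. As a minimal sketch of that calculation (the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this sketch, not a detail taken from the study):

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N or gap characters, is skipped (an assumption of this sketch).
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Toy sequence for illustration only, not L. ruminis data.
    print(round(gc_content("ATGCGCNNTA"), 1))
```

    Applied to a complete chromosome sequence, the same ratio gives a single genome-wide figure of the kind reported above.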

  8. HGVA: the Human Genome Variation Archive

    OpenAIRE

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-01-01

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic...

  9. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus... function, chromosome segregation, telomere length). The purpose of this review is to describe the crucial aspects of genome instability, to outline the ways in which environmental chemicals can affect this cancer hallmark and to identify candidate chemicals for further study. The overall aim is to make...

  10. Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles

    DEFF Research Database (Denmark)

    Farshidfar, Farshad; Zheng, Siyuan; Gingras, Marie-Claude

    2017-01-01

    Cholangiocarcinoma (CCA) is an aggressive malignancy of the bile ducts, with poor prognosis and limited treatment options. Here, we describe the integrated analysis of somatic mutations, RNA expression, copy number, and DNA methylation by The Cancer Genome Atlas of a set of predominantly intrahep...

  11. Genome-wide association study identifies 74 loci associated with educational attainment

    NARCIS (Netherlands)

    A. Okbay (Aysu); J.P. Beauchamp (Jonathan); Fontana, M.A. (Mark Alan); J.J. Lee (James J.); T.H. Pers (Tune); Rietveld, C.A. (Cornelius A.); P. Turley (Patrick); Chen, G.-B. (Guo-Bo); V. Emilsson (Valur); Meddens, S.F.W. (S. Fleur W.); Oskarsson, S. (Sven); Pickrell, J.K. (Joseph K.); Thom, K. (Kevin); Timshel, P. (Pascal); R. de Vlaming (Ronald); A. Abdellaoui (Abdel); T.S. Ahluwalia (Tarunveer Singh); J. Bacelis (Jonas); C. Baumbach (Clemens); Bjornsdottir, G. (Gyda); J.H. Brandsma (Johan); Pina Concas, M. (Maria); J. Derringer; Furlotte, N.A. (Nicholas A.); T.E. Galesloot (Tessel); S. Girotto; Gupta, R. (Richa); L.M. Hall (Leanne M.); S.E. Harris (Sarah); E. Hofer; Horikoshi, M. (Momoko); J.E. Huffman (Jennifer E.); Kaasik, K. (Kadri); I.-P. Kalafati (Ioanna-Panagiota); R. Karlsson (Robert); A. Kong (Augustine); J. Lahti (Jari); S.J. van der Lee (Sven); Deleeuw, C. (Christiaan); P.A. Lind (Penelope); Lindgren, K.-O. (Karl-Oskar); Liu, T. (Tian); M. Mangino (Massimo); J. Marten (Jonathan); E. Mihailov (Evelin); M. Miller (Mike); P.J. van der Most (Peter); C. Oldmeadow (Christopher); A. Payton (Antony); N. Pervjakova (Natalia); W.J. Peyrot (Wouter ); Qian, Y. (Yong); O. Raitakari (Olli); Rueedi, R. (Rico); Salvi, E. (Erika); Schmidt, B. (Börge); Schraut, K.E. (Katharina E.); Shi, J. (Jianxin); A.V. Smith (Albert Vernon); R.A. Poot (Raymond); B. St Pourcain (Beate); A. Teumer (Alexander); G. Thorleifsson (Gudmar); N. Verweij (Niek); D. Vuckovic (Dragana); Wellmann, J. (Juergen); H.J. Westra (Harm-Jan); Yang, J. (Jingyun); Zhao, W. (Wei); Zhu, Z. (Zhihong); B.Z. Alizadeh (Behrooz); N. Amin (Najaf); Bakshi, A. (Andrew); S.E. Baumeister (Sebastian); G. Biino (Ginevra); K. Bønnelykke (Klaus); P.A. Boyle (Patricia); H. Campbell (Harry); Cappuccio, F.P. (Francesco P.); G. Davies (Gail); J.E. de Neve (Jan-Emmanuel); P. Deloukas (Panagiotis); I. Demuth (Ilja); Ding, J. (Jun); Eibich, P. (Peter); Eisele, L. (Lewin); N. Eklund (Niina); D.M. Evans (David); J.D. 
Faul (Jessica D.); M.F. Feitosa (Mary Furlan); A.J. Forstner (Andreas); I. Gandin (Ilaria); Gunnarsson, B. (Bjarni); B.V. Halldorsson (Bjarni); T.B. Harris (Tamara); E.G. Holliday (Elizabeth); A.C. Heath (Andrew C.); L.J. Hocking; G. Homuth (Georg); M. Horan (Mike); J.J. Hottenga (Jouke Jan); P.L. de Jager (Philip); P.K. Joshi (Peter); A. Juqessur (Astanand); M. Kaakinen (Marika); M. Kähönen (Mika); S. Kanoni (Stavroula); Keltigangas-Järvinen, L. (Liisa); L.A.L.M. Kiemeney (Bart); I. Kolcic (Ivana); Koskinen, S. (Seppo); A. Kraja (Aldi); Kroh, M. (Martin); Z. Kutalik (Zoltán); A. Latvala (Antti); L.J. Launer (Lenore); Lebreton, M.P. (Maël P.); D.F. Levinson (Douglas F.); P. Lichtenstein (Paul); P. Lichtner (Peter); D.C. Liewald (David C.); A. Loukola (Anu); P.A. Madden (Pamela); R. Mägi (Reedik); Mäki-Opas, T. (Tomi); R.E. Marioni (Riccardo); P. Marques-Vidal; Meddens, G.A. (Gerardus A.); G. Mcmahon (George); C. Meisinger (Christa); T. Meitinger (Thomas); Milaneschi, Y. (Yusplitri); L. Milani (Lili); G.W. Montgomery (Grant); R. Myhre (Ronny); C.P. Nelson (Christopher P.); D.R. Nyholt (Dale); W.E.R. Ollier (William); A. Palotie (Aarno); L. Paternoster (Lavinia); N.L. Pedersen (Nancy); K. Petrovic (Katja); D.J. Porteous (David J.); K. Räikkönen (Katri); Ring, S.M. (Susan M.); A. Robino (Antonietta); O. Rostapshova (Olga); I. Rudan (Igor); A. Rustichini (Aldo); V. Salomaa (Veikko); Sanders, A.R. (Alan R.); A.-P. Sarin; R. Schmidt (Reinhold); R.J. Scott (Rodney); B.H. Smith (Blair); J.A. Smith (Jennifer A); J.A. Staessen (Jan); E. Steinhagen-Thiessen (Elisabeth); K. Strauch (Konstantin); A. Terracciano; M.D. Tobin (Martin); S. Ulivi (Shelia); S. Vaccargiu (Simona); L. Quaye (Lydia); F.J.A. van Rooij (Frank); C. Venturini (Cristina); A.A.E. Vinkhuyzen (Anna A.); U. Völker (Uwe); Völzke, H. (Henry); J.M. Vonk (Judith); D. Vozzi (Diego); J. Waage (Johannes); E.B. Ware (Erin B.); G.A.H.M. Willemsen (Gonneke); J. Attia (John); D.A. Bennett (David A.); Berger, K. (Klaus); L. 
Bertram (Lars); H. Bisgaard (Hans); D.I. Boomsma (Dorret); I.B. Borecki (Ingrid); U. Bültmann (Ute); C.F. Chabris (Christopher F.); F. Cucca (Francesco); D. Cusi (Daniele); I.J. Deary (Ian J.); G.V. Dedoussis (George); C.M. van Duijn (Cornelia); K. Hagen (Knut); B. Franke (Barbara); L. Franke (Lude); P. Gasparini (Paolo); P.V. Gejman (Pablo); C. Gieger (Christian); H.J. Grabe (Hans Jörgen); J. Gratten (Jacob); P.J.F. Groenen (Patrick); V. Gudnason (Vilmundur); P. van der Harst (Pim); C. Hayward (Caroline); D.A. Hinds (David A.); W. Hoffmann (Wolfgang); E. Hypponen (Elina); W.G. Iacono (William); B. Jacobsson (Bo); M.-R. Jarvelin (Marjo-Riitta); K.-H. JöCkel (Karl-Heinz); J. Kaprio (Jaakko); S.L.R. Kardia (Sharon); T. Lehtimäki (Terho); Lehrer, S.F. (Steven F.); P.K. Magnusson (Patrik); N.G. Martin (Nicholas); M. McGue (Matt); A. Metspalu (Andres); N. Pendleton (Neil); B.W.J.H. Penninx (Brenda); M. Perola (Markus); N. Pirastu (Nicola); M. Pirastu (Mario); O. Polasek (Ozren); D. Posthuma (Danielle); C. Power (Christopher); M.A. Province (Mike); N.J. Samani (Nilesh); Schlessinger, D. (David); R. Schmidt (Reinhold); T.I.A. Sørensen (Thorkild); T.D. Spector (Timothy); J-A. Zwart (John-Anker); U. Thorsteinsdottir (Unnur); A.R. Thurik (Roy); Timpson, N.J. (Nicholas J.); H.W. Tiemeier (Henning); J.Y. Tung (Joyce Y.); A.G. Uitterlinden (André); Vitart, V. (Veronique); P. Vollenweider (Peter); D.R. Weir (David); J.F. Wilson (James F.); A.F. Wright (Alan); Conley, D.C. (Dalton C.); R.F. Krueger; G.D. Smith; Hofman, A. (Albert); D. Laibson (David); S.E. Medland (Sarah Elizabeth); M.N. Meyer (Michelle N.); J. Yang (Joanna); M. Johannesson (Magnus); P.M. Visscher (Peter); T. Esko (Tõnu); Ph.D. Koellinger (Philipp); D. Cesarini (David); D.J. Benjamin (Daniel J.)

    2016-01-01

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that

  12. Genome-wide association study identifies 74 loci associated with educational attainment

    NARCIS (Netherlands)

    Okbay, A.; Beauchamp, J.; Fontana, M.A.; Lee, J.J.; Pers, T.H.; Rietveld, C.A.; Turley, P.; Chen, G.B.; Emilsson, V.; Meddens, S.F.W.; de Vlaming, R.; Abdellaoui, A.; Peyrot, W.; Vinkhuyzen, A.A.E.; Hottenga, J.J.; Willemsen, G.; Boomsma, D.I.; Penninx, B.W.J.H.; Laibson, D.; Medland, S.E.; Meyer, M.N.; Yang, J.; Johannesson, M.; Visscher, P.M.; Esko, T.; Koellinger, P.D.; Cesarini, D.; Benjamin, D.J.

    2016-01-01

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our

  13. Genome-wide association study identifies 74 loci associated with educational attainment

    NARCIS (Netherlands)

    Okbay, Aysu; Beauchamp, Jonathan P.; Fontana, Mark Alan; Lee, James J.; Pers, Tune H.; Rietveld, Cornelius A.; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S. Fleur W.; Oskarsson, Sven; Pickrell, Joseph K.; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S.; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H.; Concas, Maria Pina; Derringer, Jaime; Furlotte, Nicholas A.; Galesloot, Tessel E.; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M.; Harris, Sarah E.; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E.; Kaasik, Kadri; Kalafati, Ioanna P.; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J.; de Leeuw, Christiaan; Lind, Penelope A.; Lindgren, Karl-Oskar; Liu, Tian; van der Most, Peter J.; Verweij, Niek; Alizadeh, Behrooz Z.; Vonk, Judith M.; Bultmann, Ute; Franke, Lude; van der Harst, Pim; Penninx, Brenda W. J. H.

    2016-01-01

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals(1). Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends

  14. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    Science.gov (United States)

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  15. Association Mapping and the Genomic Consequences of Selection in Sunflower

    Science.gov (United States)

    Mandel, Jennifer R.; Nambeesan, Savithri; Bowers, John E.; Marek, Laura F.; Ebert, Daniel; Rieseberg, Loren H.; Knapp, Steven J.; Burke, John M.

    2013-01-01

    The combination of large-scale population genomic analyses and trait-based mapping approaches has the potential to provide novel insights into the evolutionary history and genome organization of crop plants. Here, we describe the detailed genotypic and phenotypic analysis of a sunflower (Helianthus annuus L.) association mapping population that captures nearly 90% of the allelic diversity present within the cultivated sunflower germplasm collection. We used these data to characterize overall patterns of genomic diversity and to perform association analyses on plant architecture (i.e., branching) and flowering time, successfully identifying numerous associations underlying these agronomically and evolutionarily important traits. Overall, we found variable levels of linkage disequilibrium (LD) across the genome. In general, islands of elevated LD correspond to genomic regions underlying traits that are known to have been targeted by selection during the evolution of cultivated sunflower. In many cases, these regions also showed significantly elevated levels of differentiation between the two major sunflower breeding groups, consistent with the occurrence of divergence due to strong selection. One of these regions, which harbors a major branching locus, spans a surprisingly long genetic interval (ca. 25 cM), indicating the occurrence of an extended selective sweep in an otherwise recombinogenic interval. PMID:23555290
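
    Linkage disequilibrium between two biallelic loci is often summarised as r². The abstract does not say which estimator the authors used, so the following is only a sketch of the textbook definition computed from phased haplotype counts. The function name and the example counts are hypothetical.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Compute r^2 between two biallelic loci from haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are counts of the four possible haplotypes.
    D = p_AB - p_A * p_B, and r^2 = D^2 / (p_A * p_a * p_B * p_b).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")

    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2

    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")

    d = p_AB - p_A * p_B
    return d * d / denominator


if __name__ == "__main__":
    # Illustrative counts only, not sunflower data.
    print(ld_r_squared(50, 0, 0, 50))    # alleles always co-occur
    print(ld_r_squared(25, 25, 25, 25))  # alleles are independent
```

    In this sketch a value near 1 corresponds to the "islands of elevated LD" language in the abstract, and a value near 0 to loci that recombine freely.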

  16. New Sequence Variants in HLA Class II/III Region Associated with Susceptibility to Knee Osteoarthritis Identified by Genome-Wide Association Study

    Science.gov (United States)

    Nakajima, Masahiro; Takahashi, Atsushi; Kou, Ikuyo; Rodriguez-Fontenla, Cristina; Gomez-Reino, Juan J.; Furuichi, Tatsuya; Dai, Jin; Sudo, Akihiro; Uchida, Atsumasa; Fukui, Naoshi; Kubo, Michiaki; Kamatani, Naoyuki; Tsunoda, Tatsuhiko; Malizos, Konstantinos N.; Tsezou, Aspasia; Gonzalez, Antonio; Nakamura, Yusuke; Ikegawa, Shiro

    2010-01-01

    Osteoarthritis (OA) is a common disease that has a definite genetic component. However, only a few OA susceptibility genes with definite functional evidence and replicated association have been reported. Through a genome-wide association study and a replication using a total of ∼4,800 Japanese subjects, we identified two single nucleotide polymorphisms (SNPs) (rs7775228 and rs10947262) associated with susceptibility to knee OA. The two SNPs were in a region containing HLA class II/III genes and their association reached genome-wide significance (combined P = 2.43×10⁻⁸ for rs7775228 and 6.73×10⁻⁸ for rs10947262). Our results suggest that an immunologic mechanism is implicated in the etiology of OA. PMID:20305777

  17. Big Data Analytics for Genomic Medicine.

    Science.gov (United States)

    He, Karen Y; Ge, Dongliang; He, Max M

    2017-02-15

    Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights by examining large and varied data sets. While integrating and manipulating diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure pose challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identifying clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. We also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.

  18. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole-genome approaches that maximize discovery must be developed to fully leverage this information at the level of strain diversity. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.
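
    To make the idea of an "ordered restriction map" concrete, the sketch below digests a linear sequence in silico and returns the fragment lengths in order. It is only an illustration: the function name, the toy sequences, and the choice of the EcoRI site (GAATTC, cut after the first G) are assumptions for this example, and the abstract does not state which enzymes or alignment software were used.

```python
def restriction_map(sequence, site, cut_offset=0):
    """Return ordered fragment lengths from digesting a linear sequence.

    site: recognition sequence; cut_offset: position of the cut
    within the site, counted from the site's first base.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy reference and a variant carrying a 4-base insertion between two sites.
reference = "AAGAATTCGGGAATTCTT"
variant = "AAGAATTCGGCCCCGAATTCTT"
print(restriction_map(reference, "GAATTC", cut_offset=1))  # [3, 8, 7]
print(restriction_map(variant, "GAATTC", cut_offset=1))    # [3, 12, 7]
```

    Comparing the two ordered lists shows the middle fragment growing from 8 to 12 bases, which is how an insertion would appear when two maps are aligned.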

  19. MinGenome: An In Silico Top-Down Approach for the Synthesis of Minimized Genomes.

    Science.gov (United States)

    Wang, Lin; Maranas, Costas D

    2018-02-16

    Genome-minimized strains offer advantages as production chassis by reducing transcriptional cost, eliminating competing functions and limiting unwanted regulatory interactions. Existing approaches for identifying stretches of DNA to remove are largely ad hoc, relying on information about presumably dispensable regions drawn from experimentally determined nonessential genes and comparative genomics. Here we introduce a versatile genome reduction algorithm, MinGenome, that uses mixed-integer linear programming (MILP) to identify, in descending order of size, all dispensable contiguous sequences whose removal does not affect the organism's growth or other desirable traits. Known essential genes or genes that cause significant fitness or performance loss can be flagged and their deletion can be prohibited. MinGenome also preserves needed transcription factors and promoter regions, ensuring that retained genes will be properly transcribed, while also avoiding the simultaneous deletion of synthetic lethal pairs. The potential benefit of removing even larger contiguous stretches of DNA if only one or two essential genes (to be reinserted elsewhere) are within the deleted sequence is explored. We applied the algorithm to design a minimized E. coli strain and found that we were able to recapitulate the long deletions identified in previous experimental studies and discover alternative combinations of deletions that have not yet been explored in vivo.
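
    The sketch below is a deliberately simplified stand-in for the idea of listing dispensable contiguous stretches in descending order of size. It is not the MinGenome MILP: it only scans an ordered gene list and breaks runs at genes flagged as essential, ignoring growth constraints, transcription factors, promoters, and synthetic lethal pairs. The function name, the tuple layout, and the toy coordinates are all hypothetical.

```python
def dispensable_stretches(genes, essential):
    """List runs of consecutive non-essential genes, longest first.

    genes: list of (name, start, end) tuples sorted by start coordinate.
    essential: set of gene names that must not be deleted.
    Returns a list of (start, end, [names]) tuples.
    """
    runs, current = [], []
    for name, start, end in genes:
        if name in essential:
            # An essential gene closes the current deletable run.
            if current:
                runs.append(current)
            current = []
        else:
            current.append((name, start, end))
    if current:
        runs.append(current)
    stretches = [(r[0][1], r[-1][2], [g[0] for g in r]) for r in runs]
    # Report the largest candidate deletions first.
    return sorted(stretches, key=lambda s: s[1] - s[0], reverse=True)


# Hypothetical toy genome with one essential gene, "c".
toy_genes = [("a", 0, 100), ("b", 100, 300), ("c", 300, 350),
             ("d", 350, 900), ("e", 900, 1000)]
for start, end, names in dispensable_stretches(toy_genes, {"c"}):
    print(start, end, names)
```

    A real formulation would replace the linear scan with binary decision variables per gene and a growth constraint from a metabolic model, which is what lets an optimization approach weigh deletions against fitness.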

  20. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Background: Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity within the Giardia intestinalis species complex, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results: We identified 5012 protein-coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions: Our results indicate that despite a well-conserved core of genes there is significant genome variation between Giardia isolates in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.