WorldWideScience

Sample records for cell whole-genome expression

  1. Whole genome expression profile in neuroblastoma cells exposed to 1-methyl-4-phenylpyridine

    OpenAIRE

    Mazzio, E; Soliman, KFA

    2012-01-01

    Mitochondrial dysfunction and subsequent energy failure are contributing factors to degeneration of the substantia nigra pars compacta associated with Parkinson’s disease (PD). In this study, we investigate molecular events triggered by 1-methyl-4-phenylpyridine (MPP+) using whole-genome expression microarray, western blot and HPLC quantification of metabolites. The data show that MPP+ (500 μM) evokes obstruction of mitochondrial respiration/oxidative phosphorylation (OXPHOS) in mouse neuroblast...

  2. A global view of Staphylococcus aureus whole genome expression upon internalization in human epithelial cells

    Directory of Open Access Journals (Sweden)

    Vaudaux Pierre

    2007-06-01

    Background: Staphylococcus aureus, a leading cause of chronic or acute infections, is traditionally considered an extracellular pathogen despite repeated reports of S. aureus internalization by a variety of non-myeloid cells in vitro. This property potentially contributes to bacterial persistence, protection from antibiotics and evasion of immune defenses. Mechanisms contributing to internalization have been partly elucidated, but bacterial processes triggered intracellularly are largely unknown. Results: We have developed an in vitro model using human lung epithelial cells that shows intracellular bacterial persistence for up to 2 weeks. Using an original approach we successfully collected and amplified low amounts of bacterial RNA recovered from infected eukaryotic cells. Transcriptomic analysis using an oligoarray covering the whole S. aureus genome was performed at two post-internalization times and compared to gene expression of non-internalized bacteria. No signs of cellular death were observed after prolonged internalization of Staphylococcus aureus 6850 in epithelial cells. Following internalization, extensive alterations of bacterial gene expression were observed. Whereas major metabolic pathways including cell division, nutrient transport and regulatory processes were drastically down-regulated, numerous genes involved in iron scavenging and virulence were up-regulated. This initial adaptation was followed by a transcriptional increase in several metabolic functions. However, expression of several toxin genes known to affect host cell integrity appeared strictly limited. Conclusion: These molecular insights correlated with phenotypic observations and demonstrated that S. aureus modulates gene expression at early times post infection to promote survival. Staphylococcus aureus appears adapted to intracellular survival in non-phagocytic cells.

  3. Whole-Genome Expression Analysis and Signal Pathway Screening of Synovium-Derived Mesenchymal Stromal Cells in Rheumatoid Arthritis

    Directory of Open Access Journals (Sweden)

    Jingyi Hou

    2016-01-01

    Synovium-derived mesenchymal stromal cells (SMSCs) may play an important role in the pathogenesis of rheumatoid arthritis (RA) and show promise for therapeutic applications in RA. In this study, a whole-genome microarray analysis was used to detect differential gene expression in SMSCs from RA patients and healthy donors (HDs). Our results showed that there were 4828 differentially expressed genes in the RA group compared to the HD group; 3117 genes were upregulated, and 1711 genes were downregulated. A Gene Ontology analysis showed significantly enriched terms of differentially expressed genes in the biological process, cellular component, and molecular function domains. A Kyoto Encyclopedia of Genes and Genomes analysis showed that the MAPK signaling and rheumatoid arthritis pathways were upregulated and that the p53 signaling pathway was downregulated in RA SMSCs. Quantitative real-time polymerase chain reaction was applied to verify the expression variations of some of the genes mentioned above, and a western blot analysis was used to determine the expression levels of p53, p-JNK, p-ERK, and p-p38. Our study found that differentially expressed genes in the MAPK signaling, rheumatoid arthritis, and p53 signaling pathways may help to explain the pathogenic mechanism of RA and lead to therapeutic applications of SMSCs in RA.

  4. Grouping and Classifying Electrophysiologically-Defined Classes of Neocortical Neurons by Single Cell, Whole-Genome Expression Profiling

    OpenAIRE

    Subkhankulova, Tatiana; Yano, Kojiro; Robinson, Hugh P. C.; Livesey, Frederick J

    2010-01-01

    The diversity of neuronal cell types and how to classify them are perennial questions in neuroscience. The advent of global gene expression analysis raised the possibility that comprehensive transcription profiling will resolve neuronal cell types into groups that reflect some or all aspects of their phenotype. This approach has been successfully used to compare gene expression between groups of neurons defined by a common property. Here we extend this approach to ask whether single neuron ge...

  5. Grouping and classifying electrophysiologically-defined classes of neocortical neurons by single cell, whole-genome expression profiling

    OpenAIRE

    Tatiana Subkhankulova; Kojiro Yano; Hugh Robinson; Livesey, Frederick J

    2010-01-01

    The diversity of neuronal cell types and how to classify them are perennial questions in neuroscience. The advent of global gene expression analysis raised the possibility that comprehensive transcription profiling will resolve neuronal cell types into groups that reflect some or all aspects of their phenotype. This approach has been successfully used to compare gene expression between groups of neurons defined by a common property. Here we extend this approach to ask whether single neuron ge...

  6. Whole-Genome Expression Analysis of Human Mesenchymal Stromal Cells Exposed to Ultrasmooth Tantalum vs. Titanium Oxide Surfaces

    DEFF Research Database (Denmark)

    Stiehler, C.; Bunger, C.; Overall, R. W.;

    2013-01-01

    Durable osseointegration of metallic bone implants requires that progenitor cells attach, proliferate and differentiate on the implant surface. Previously, we demonstrated superior biocompatibility of human mesenchymal stromal cells (MSCs) cultivated on ultrasmooth tantalum (Ta) as compared...... MSCs cultivated on plain sputter-coated surfaces of Ta or Ti for 1, 2, 4, and 8 days were hybridized to n = 16 U133 Plus 2.0 arrays (Affymetrix®). Functional annotation, cluster and pathway analyses were performed. The vast majority of genes were differentially regulated after 4 days...

  7. Whole-genome gene expression modifications associated with nitrosamine exposure and micronucleus frequency in human blood cells

    DEFF Research Database (Denmark)

    Hebels, Dennie G A J; Jennen, Danyel G J; van Herwijnen, Marcel H M;

    2011-01-01

    N-nitroso compounds (NOCs) are suspected human carcinogens and relevant in human exposure. NOCs also induce micronuclei (MN) formation in vivo. Since lymphocytic MN represent a validated biomarker of human cancer risk, establishing a link between NOC exposure and MN frequency in humans...... association between MN frequency and urinary NOCs (r = 0.41, P = 0.025) and identified modifications in, among others, cytoskeleton remodeling, cell cycle, apoptosis and survival, signal transduction, immune response, G-protein signaling and development pathways, which indicate a response to NOC......-induced genotoxicity. Moreover, we established a network of genes, the most important of which include FBXW7, BUB3, Caspase 2, Caspase 8, SMAD3, Huntingtin and MGMT, which are involved in processes relevant in carcinogenesis. The modified genetic processes and genes found in this study may be of interest...

  8. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    OpenAIRE

    Jingsong Shi; Song Jiang; Dandan Qiu; Weibo Le; Xiao Wang; Yinhui Lu; Zhihong Liu

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify pote...

  9. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    Energy Technology Data Exchange (ETDEWEB)

    Fröhlich, Eleonore, E-mail: eleonore.froehlich@medunigraz.at [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Meindl, Claudia; Wagner, Karin [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Leitinger, Gerd [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Institute for Cell Biology, Histology and Embryology, Medical University of Graz, Harrachgasse 21, 8010 Graz (Austria); Roblegg, Eva [Institute of Pharmaceutical Sciences, Department of Pharmaceutical Technology, Karl-Franzens-University of Graz, Universitätsplatz 1, 8010 Graz (Austria)

    2014-10-15

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay.

  10. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    International Nuclear Information System (INIS)

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay

  11. Current Developments in Prokaryotic Single Cell Whole Genome Amplification

    Energy Technology Data Exchange (ETDEWEB)

    Goudeau, Danielle; Nath, Nandita; Ciobanu, Doina; Cheng, Jan-Fang; Malmstrom, Rex

    2014-03-14

    Our approach to prokaryotic single-cell Whole Genome Amplification (WGA) at the JGI continues to evolve. To increase both the quality and number of single-cell genomes produced, we explore all aspects of the process from cell sorting to sequencing. For example, we now utilize specialized reagents, acoustic liquid handling, and reduced reaction volumes to eliminate non-target DNA contamination in WGA reactions. More specifically, we use a cleaner commercial WGA kit from Qiagen that employs a UV decontamination procedure initially developed at the JGI, and we use the Labcyte Echo for tip-less liquid transfer to set up 2 µL reactions. Acoustic liquid handling also dramatically reduces reagent costs. In addition, we are exploring new cell lysis methods including treatment with Proteinase K, lysozyme, and other detergents, in order to complement standard alkaline lysis and allow for more efficient disruption of a wider range of cells. Incomplete lysis represents a major hurdle for WGA on some environmental samples, especially rhizosphere, peatland, and other soils. Finding effective lysis strategies that are also compatible with WGA is challenging, and we are currently assessing the impact of various strategies on genome recovery.

  12. Copy Number Variation Analysis by Array Analysis of Single Cells Following Whole Genome Amplification.

    Science.gov (United States)

    Dimitriadou, Eftychia; Zamani Esteki, Masoud; Vermeesch, Joris Robert

    2015-01-01

    Whole genome amplification is required to ensure the availability of sufficient material for copy number variation analysis of a genome deriving from an individual cell. Here, we describe the protocols we use for copy number variation analysis of non-fixed single cells by array-based approaches following single-cell isolation and whole genome amplification. We focus on two alternative protocols, an isothermal and a PCR-based whole genome amplification method, followed by either array comparative genomic hybridization (aCGH) or SNP array analysis, respectively.

  13. Estrogen receptor-mediated effects of isoflavone supplementation were not observed in whole-genome gene expression profiles of peripheral blood mononuclear cells in postmenopausal, equol-producing women.

    Science.gov (United States)

    van der Velpen, Vera; Geelen, Anouk; Schouten, Evert G; Hollman, Peter C; Afman, Lydia A; van 't Veer, Pieter

    2013-06-01

    Isoflavones (genistein, daidzein, and glycitein) are suggested to have benefits as well as risks for human health. Approximately one-third of the Western population is able to metabolize daidzein into the more potent metabolite equol. Having little endogenous estradiol, equol-producing postmenopausal women who use isoflavone supplements to relieve their menopausal symptoms could potentially be at high risk of adverse effects of isoflavone supplementation. The current trial aimed to study the effects of intake of an isoflavone supplement rich in daidzein compared with placebo on whole-genome gene expression profiles of peripheral blood mononuclear cells (PBMCs) in equol-producing, postmenopausal women. Thirty participants received an isoflavone supplement or a placebo for 8 wk each in a double-blind, randomized cross-over design. The isoflavone supplement was rich in daidzein (60%) and provided 94 mg isoflavones (aglycone equivalents) daily. Gene expression in PBMCs was significantly changed (P ...) after the isoflavone intervention compared with placebo. Gene set enrichment analysis revealed downregulated clusters of gene sets involved in inflammation, oxidative phosphorylation, and cell cycle. The expression of estrogen receptor (ER) target genes and gene sets related to ER signaling were not significantly altered, which may be explained by the low ERα and ERβ expression in PBMCs. The observed downregulated gene sets point toward potential beneficial effects of isoflavone supplementation with respect to prevention of cancer and cardiovascular disease. However, whether ER-related effects of isoflavones are beneficial or harmful should be studied in tissues that express ERs. PMID:23616509

  14. Whole genome RNA expression profiling for the identification of novel biomarkers in the diagnosis and prognosis of biliary tract cancer

    OpenAIRE

    Chapman, M H

    2011-01-01

    Biliary tract cancer (BTC) is difficult to diagnose, in part related to the lack of reliable tumour markers. The aim of this project was to use whole genome RNA expression profiling in order to identify novel biomarkers for diagnosis and prognosis in biliary tract cancer. Chapter 1 summarises clinical aspects of BTC as well as current diagnostic and prognostic tests. Chapter 2 addresses the identification of circulating tumour cells for the diagnosis of BTC. It includes d...

  15. Whole Genome Expression Profiling and Signal Pathway Screening of MSCs in Ankylosing Spondylitis

    Directory of Open Access Journals (Sweden)

    Yuxi Li

    2014-01-01

    The pathogenesis of dysfunctional immunoregulation of mesenchymal stem cells (MSCs) in ankylosing spondylitis (AS) is thought to be a complex process that involves multiple genetic alterations. In this study, MSCs derived from both healthy donors and AS patients were cultured in normal media or media mimicking an inflammatory environment. Whole genome expression profiling analysis of 33,351 genes was performed and differentially expressed genes related to AS were analyzed by GO term analysis and KEGG pathway analysis. Our results showed that in normal media 676 genes were differentially expressed in AS, 354 upregulated and 322 downregulated, while in an inflammatory environment 1767 genes were differentially expressed in AS, 1230 upregulated and 537 downregulated. GO analysis showed that these genes were mainly related to cellular processes, physiological processes, biological regulation, regulation of biological processes, and binding. In addition, by KEGG pathway analysis, 14 key genes from the MAPK signaling and 8 key genes from the TLR signaling pathway were identified as differentially regulated. The results of qRT-PCR verified the expression variation of the 9 genes mentioned above. Our study found that in an inflammatory environment ankylosing spondylitis pathogenesis may be related to activation of the MAPK and TLR signaling pathways.

  16. Pathway Processor: A Tool for Integrating Whole-Genome Expression Results into Metabolic Networks

    OpenAIRE

    Grosu, Paul; Townsend, Jeffrey P.; Hartl, Daniel L.; Cavalieri, Duccio

    2002-01-01

    We have developed a new tool to visualize expression data on metabolic pathways and to evaluate which metabolic pathways are most affected by transcriptional changes in whole-genome expression experiments. Using the Fisher Exact Test, the method scores biochemical pathways according to the probability that as many or more genes in a pathway would be significantly altered in a given experiment by chance alone. This method has been validated on diauxic shift experiments and reproduces well know...
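
    The pathway scoring described in this record can be illustrated with a one-sided Fisher exact (hypergeometric tail) calculation. The sketch below is only an illustration under stated assumptions: the function name, argument names and the toy numbers are hypothetical, and it is not taken from the Pathway Processor code.

```python
from math import comb


def pathway_enrichment_p(total_genes, altered_genes, pathway_size, altered_in_pathway):
    """One-sided Fisher exact p-value for a single pathway.

    Returns the probability that `altered_in_pathway` or more of the
    `pathway_size` genes in a pathway would be among the `altered_genes`
    significantly changed genes by chance alone, out of `total_genes`.
    All names here are hypothetical, chosen for this illustration.
    """
    max_hits = min(pathway_size, altered_genes)
    denominator = comb(total_genes, altered_genes)
    # Sum the hypergeometric probabilities for k, k+1, ..., max_hits hits.
    tail = sum(
        comb(pathway_size, k) * comb(total_genes - pathway_size, altered_genes - k)
        for k in range(altered_in_pathway, max_hits + 1)
    )
    return tail / denominator


# Toy numbers (not from the paper): 6000 genes measured, 300 altered,
# a 40-gene pathway with 8 altered members.
p_value = pathway_enrichment_p(6000, 300, 40, 8)
print(f"p = {p_value:.3g}")
```

    A small p-value means the pathway contains more altered genes than chance would predict; ranking pathways by this value gives one possible ordering of the "most affected" pathways.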

  17. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    Directory of Open Access Journals (Sweden)

    Jingsong Shi

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR ... 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN.

  18. Whole-genome expression analysis reveals genes associated with treatment response to escitalopram in major depression.

    Science.gov (United States)

    Pettai, Kristi; Milani, Lili; Tammiste, Anu; Võsa, Urmo; Kolde, Raivo; Eller, Triin; Nutt, David; Metspalu, Andres; Maron, Eduard

    2016-09-01

    The reasons for variability in treatment response in major depressive disorder (MDD) are not fully understood, but there is accumulating evidence suggesting that therapeutic outcomes of antidepressants can be influenced by genetic factors. In the present study we applied the Illumina microarray platform for whole genome expression profiling in depressive patients treated with escitalopram in order to identify genes underlying response to antidepressant treatment. The initial study sample consisted of 135 outpatients with major depressive disorder (mean age 31.1±11.6 years, 68% females) treated with escitalopram 10-20 mg/day for 12 weeks, of whom 87 patients (55 females) were included in the gene expression analysis. The gene expression profiles were measured in peripheral blood cells at baseline, at week 4 and at the end of treatment (week 12) using Illumina BeadChips. The fold change was used to express the change in average gene expression between the studied groups. Statistical analyses were performed using the false discovery rate (FDR). The most interesting gene, which showed a predictive effect on treatment outcome by delineating low-dose responders and treatment-resistant patients at the beginning of medication, was NLGN2, which belongs to a family of neuronal cell surface proteins involved in synapse formation. In addition, several gene clusters, related to immune response, signal transduction and the neurotrophin pathway, distinguished responders from non-responders at week 4 of treatment. After 4 weeks of escitalopram treatment (10 mg/day), the YWHAZ gene showed the highest transcriptional change in responders as compared with non-responders. Finally, at the end of the treatment we noticed that at least three genes (NR2C2, ZNF641, FKBP1A) were strongly associated with resistance to escitalopram. Thus the results of this study support that exploration of peripheral gene expression is a useful tool in the further
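
    The two quantities named in this record, fold change and false discovery rate, can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure for FDR control; the abstract does not say which FDR method was used, and the function names and example values are hypothetical.

```python
from math import log2


def log2_fold_change(mean_group_a, mean_group_b):
    """Log2 ratio of average expression in group A versus group B."""
    return log2(mean_group_a / mean_group_b)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as a common way to control the FDR;
    the study's actual method is not stated in the abstract.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-gene p-values from a responder vs non-responder comparison.
raw_p = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in q_values]
print(q_values, significant)
```

    Genes are then usually reported only when both the adjusted p-value and the absolute log2 fold change pass chosen thresholds.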

  19. Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

    DEFF Research Database (Denmark)

    Hou, Yong; Wu, Kui; Shi, Xulian;

    2015-01-01

    BACKGROUND: Single-cell resequencing (SCRS) provides many biomedical advances in variations detection at the single-cell level, but it currently relies on whole genome amplification (WGA). Three methods are commonly used for WGA: multiple displacement amplification (MDA), degenerate-oligonucleoti...

  20. Whole genome analysis of p38 SAPK-mediated gene expression upon stress

    Directory of Open Access Journals (Sweden)

    Lopez-Bigas Nuria

    2010-03-01

    Background: Cells have the ability to respond and adapt to environmental changes through activation of stress-activated protein kinases (SAPKs). Although p38 SAPK signalling is known to participate in the regulation of gene expression, little is known about the molecular mechanisms used by this SAPK to regulate stress-responsive genes and the overall set of genes regulated by p38 in response to different stimuli. Results: Here, we report whole genome expression analyses of mouse embryonic fibroblasts (MEFs) treated with three different p38 SAPK-activating stimuli, namely osmostress, the cytokine TNFα and the protein synthesis inhibitor anisomycin. We have found that the activation kinetics of p38α SAPK in response to these insults differ and lead to a complex gene response pattern specific for a given stress, with a restricted set of overlapping genes. In addition, we have analysed the contribution of p38α, the major p38 family member present in MEFs, to the overall stress-induced transcriptional response by using both a chemical inhibitor (SB203580) and p38α-deficient (p38α-/-) MEFs. We show here that p38 SAPK dependency ranged between 60% and 88% depending on the treatments and that there is a very good overlap between the inhibitor treatment and the knockout cells. Furthermore, we have found that the dependency on SAPK varies depending on the time the cells are subjected to osmostress. Conclusions: Our genome-wide transcriptional analyses show a selective response to specific stimuli and a restricted common response of up to 20% of the stress up-regulated early genes that involves an important set of transcription factors, which might be critical for either cell adaptation or preparation for continuous extra-cellular changes. Interestingly, up to 85% of the up-regulated genes are under the transcriptional control of p38 SAPK. Thus, activation of p38 SAPK is critical to elicit the early gene expression program required for cell
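
    Percentages such as "p38 SAPK dependency" and the "common response" can be read as simple set ratios over lists of up-regulated genes. The sketch below shows one plausible way to compute them; the exact definitions used in the study are not given in this excerpt, so both formulas, the function names and the gene labels are assumptions made for illustration.

```python
def fraction_dependent(induced_in_wild_type, induced_without_p38):
    """Share of stress-induced genes that are no longer induced without p38.

    Assumed definition: a gene counts as p38-dependent if it is induced in
    wild-type cells but not in inhibitor-treated or knockout cells.
    """
    lost = induced_in_wild_type - induced_without_p38
    return len(lost) / len(induced_in_wild_type)


def common_response(*induced_sets):
    """Share of all induced genes that are induced by every stimulus (assumed definition)."""
    shared = set.intersection(*induced_sets)
    everything = set.union(*induced_sets)
    return len(shared) / len(everything)


# Hypothetical gene labels, for illustration only.
osmostress = {"g1", "g2", "g3", "g4", "g5"}
tnf_alpha = {"g1", "g6", "g7"}
anisomycin = {"g1", "g2", "g8", "g9"}
osmostress_without_p38 = {"g5"}

print(fraction_dependent(osmostress, osmostress_without_p38))
print(common_response(osmostress, tnf_alpha, anisomycin))
```

    The same `fraction_dependent` call can be run once with the inhibitor-treated set and once with the knockout set to compare how well the two approaches agree.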

  1. Whole-genome fingerprint of the DNA methylome during human B cell differentiation

    OpenAIRE

    Kulis, Marta; Merkel, Angelika; Heath, Simon; Queirós, Ana C; Schuyler, Ronald P.; Castellano, Giancarlo; Beekman, Renée; Raineri, Emanuele; Esteve, Anna; Clot, Guillem; Verdaguer-Dot, Néria; Duran-Ferrer, Martí; Russiñol, Nuria; Vilarrasa-Blasi, Roser; Ecker, Simone

    2015-01-01

    We analyzed the DNA methylome of ten subpopulations spanning the entire B cell differentiation program by whole-genome bisulfite sequencing and high-density microarrays. We observed that non-CpG methylation disappeared upon B cell commitment, whereas CpG methylation changed extensively during B cell maturation, showing an accumulative pattern and affecting around 30% of all measured CpG sites. Early differentiation stages mainly displayed enhancer demethylation, whic...

  2. Single Cell HLA Matching Feasibility by Whole Genomic Amplification and Nested PCR

    Institute of Scientific and Technical Information of China (English)

    Xiao-hong Li; Fang-yin Meng

    2004-01-01

    PCR-based single-cell DNA analysis has been widely used in forensic science, preimplantation genetic diagnosis and other fields. However, the original sample cannot be efficiently retrieved following single-cell PCR, so the amount of information gained is limited. The HLA system is so complex that it is very hard to complete HLA typing from a single cell. A Taq polymerase-based method that uses random primers to amplify the whole genome, termed whole genome amplification (WGA), has been demonstrated to be a useful method for increasing the copy number of minimal samples. In this study we establish a technique to amplify the HLA-A and HLA-B loci at the same time in a single cell using WGA.

  3. Is gene activity in plant cells affected by UMTS-irradiation? A whole genome approach

    Directory of Open Access Journals (Sweden)

    Julia C Engelmann

    2008-10-01

    Julia C Engelmann (3,*), Rosalia Deeken (1,*), Tobias Müller (3), Günter Nimtz (2), M Rob G Roelfsema (1), Rainer Hedrich (1). Affiliations: (1) Molecular Plant Physiology and Biophysics, Julius-von-Sachs Institute for Biosciences; (2) Institute of Physics II, University of Cologne, Cologne, Germany; (3) Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany; (*) these authors contributed equally to this work.

    Abstract: Mobile phone technology makes use of radio frequency (RF) electromagnetic fields transmitted through a dense network of base stations in Europe. Possible harmful effects of RF fields on humans and animals are discussed, but their effect on plants has received little attention. In search of physiological processes of plant cells sensitive to RF fields, cell suspension cultures of Arabidopsis thaliana were exposed for 24 h to a RF field protocol representing typical microwave exposure in an urban environment. mRNA of exposed cultures and controls was used to hybridize Affymetrix-ATH1 whole genome microarrays. Differential expression analysis revealed significant changes in transcription of 10 genes, but they did not exceed a fold change of 2.5. Apart from 3 of them being dark-inducible, their functions do not point to any known responses of plants to environmental stimuli. The changes in transcription of these genes were compared with published microarray datasets and revealed a weak similarity of the microwave to light treatment experiments. Considering the large changes described in published experiments, it is questionable if the small alterations caused by a 24 h continuous microwave exposure would have any impact on the growth and reproduction of whole plants.

    Keywords: suspension cultured plant cells, radio frequency electromagnetic fields, microarrays, Arabidopsis thaliana

  4. Whole genome amplification from a single cell: implications for genetic analysis.

    OpenAIRE

    Zhang, L.; Cui, X.; Schmitt, K.; Hubert, R.; Navidi, W.; Arnheim, N.

    1992-01-01

    We have developed an in vitro method for amplifying a large fraction of the DNA sequences present in a single haploid cell by repeated primer extensions using a mixture of 15-base random oligonucleotides. We studied 12 genetic loci and estimate that the probability of amplifying any sequence in the genome to a minimum of 30 copies is not less than 0.78 (95% confidence). Whole genome amplification beginning with a single cell, or other samples with very small amounts of DNA, has significant im...
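
    A bound of the form "not less than 0.78 (95% confidence)" from 12 loci is what a one-sided binomial lower confidence limit gives when every trial succeeds. The sketch below reproduces that arithmetic under the assumption, not stated in this truncated abstract, that all 12 loci were successfully amplified; the function name is hypothetical.

```python
def lower_bound_all_successes(n_trials, confidence=0.95):
    """One-sided lower confidence bound on a success probability
    when all `n_trials` independent trials succeeded.

    If the true probability were p, the chance of n successes in a row is
    p ** n. The bound is the p at which that chance drops to 1 - confidence.
    """
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n_trials)


# Assumption for illustration: 12 loci tested, all amplified.
bound = lower_bound_all_successes(12)
print(round(bound, 2))
```

    With 12 successes out of 12 the bound rounds to 0.78, which matches the figure quoted in the record; more loci would tighten the bound toward 1.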

  5. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples.

    Directory of Open Access Journals (Sweden)

    Craig April

    BACKGROUND: We have developed a gene expression assay (Whole-Genome DASL), capable of generating whole-genome gene expression profiles from degraded samples such as formalin-fixed, paraffin-embedded (FFPE) specimens. METHODOLOGY/PRINCIPAL FINDINGS: We demonstrated a similar level of sensitivity in gene detection between matched fresh-frozen (FF) and FFPE samples, with the number and overlap of probes detected in the FFPE samples being approximately 88% and 95% of that in the corresponding FF samples, respectively; 74% of the differentially expressed probes overlapped between the FF and FFPE pairs. The WG-DASL assay is also able to detect 1.3- to 1.5-fold and 1.5- to 2-fold changes in intact and FFPE samples, respectively. The dynamic range for the assay is approximately 3 logs. Comparing the WG-DASL assay with an in vitro transcription-based labeling method yielded fold-change correlations of R² ≈ 0.83, while fold-change comparisons with quantitative RT-PCR assays yielded R² ≈ 0.86 and R² ≈ 0.55 for intact and FFPE samples, respectively. Additionally, the WG-DASL assay yielded high self-correlations (R² > 0.98) with low intact RNA inputs ranging from 1 ng to 100 ng; reproducible expression profiles were also obtained with 250 pg total RNA (R² ≈ 0.92), with approximately 71% of the probes detected in 100 ng total RNA also detected at the 250 pg level. When FFPE samples were assayed, 1 ng total RNA yielded self-correlations of R² ≈ 0.80, while still maintaining a correlation of R² ≈ 0.75 with standard FFPE inputs (200 ng). CONCLUSIONS/SIGNIFICANCE: Taken together, these results show that the WG-DASL assay provides a reliable platform for genome-wide expression profiling in archived materials. It also possesses utility within clinical settings where only limited quantities of samples may be available (e.g. microdissected material) or when minimally invasive procedures are performed (e
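
    The two kinds of figures reported in this record, squared correlations between expression profiles and the percentage of probes detected in both sample types, can be computed as sketched below. This is an illustrative sketch only: the function names, probe identifiers and values are hypothetical, and the record does not describe the authors' actual computation.

```python
def r_squared(x_values, y_values):
    """Squared Pearson correlation between two equally long profiles."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    syy = sum((y - mean_y) ** 2 for y in y_values)
    return (sxy * sxy) / (sxx * syy)


def detection_overlap(reference_probes, test_probes):
    """Fraction of probes detected in the reference that are also detected in the test sample."""
    return len(reference_probes & test_probes) / len(reference_probes)


# Hypothetical log2 fold changes for the same probes in matched FF and FFPE samples.
ff_log_fc = [1.2, -0.8, 2.1, 0.3, -1.5]
ffpe_log_fc = [1.0, -0.5, 1.7, 0.1, -1.1]
print(round(r_squared(ff_log_fc, ffpe_log_fc), 3))

# Hypothetical detected-probe sets.
ff_detected = {"p1", "p2", "p3", "p4"}
ffpe_detected = {"p1", "p2", "p9"}
print(detection_overlap(ff_detected, ffpe_detected))
```

    Comparing replicate runs of the same sample with `r_squared` corresponds to the "self-correlation" mentioned in the abstract, while comparing FF with FFPE profiles corresponds to the cross-sample correlations.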

  6. Whole-genome sequencing of a malignant granular cell tumor with metabolic response to pazopanib

    Science.gov (United States)

    Wei, Lei; Liu, Song; Conroy, Jeffrey; Wang, Jianmin; Papanicolau-Sengos, Antonios; Glenn, Sean T.; Murakami, Mitsuko; Liu, Lu; Hu, Qiang; Conroy, Jacob; Miles, Kiersten Marie; Nowak, David E.; Liu, Biao; Qin, Maochun; Bshara, Wiam; Omilian, Angela R.; Head, Karen; Bianchi, Michael; Burgher, Blake; Darlak, Christopher; Kane, John; Merzianu, Mihai; Cheney, Richard; Fabiano, Andrew; Salerno, Kilian; Talati, Chetasi; Khushalani, Nikhil I.; Trump, Donald L.; Johnson, Candace S.; Morrison, Carl D.

    2015-01-01

    Granular cell tumors are an uncommon soft tissue neoplasm. Malignant granular cell tumors comprise ... T transitions, particularly when immediately preceded by a 5′ G. A loss-of-function mutation was detected in a newly recognized tumor suppressor candidate, BRD7. No mutations were found in known targets of pazopanib. However, we identified a receptor tyrosine kinase pathway mutation in GFRA2 that warrants further evaluation. To the best of our knowledge, this is only the second reported case of a malignant granular cell tumor exhibiting a response to pazopanib, and the first whole-genome sequencing of this uncommon tumor type. The findings provide insight into the genetic basis of malignant granular cell tumors and identify potential targets for further investigation. PMID:27148567

  7. Gene evolution and gene expression after whole genome duplication in fish: the PhyloFish database.

    Science.gov (United States)

    Pasquier, Jeremy; Cabau, Cédric; Nguyen, Thaovi; Jouanno, Elodie; Severac, Dany; Braasch, Ingo; Journot, Laurent; Pontarotti, Pierre; Klopp, Christophe; Postlethwait, John H; Guiguen, Yann; Bobe, Julien

    2016-01-01

    With more than 30,000 species, ray-finned fish represent approximately half of vertebrates. The evolution of ray-finned fish was impacted by several whole genome duplication (WGD) events including a teleost-specific WGD event (TGD) that occurred at the root of the teleost lineage about 350 million years ago (Mya) and more recent WGD events in salmonids, carps, suckers and others. In plants and animals, WGD events are associated with adaptive radiations and evolutionary innovations. WGD-spurred innovation may be especially relevant in the case of teleost fish, which colonized a wide diversity of habitats on earth, including many extreme environments. Fish biodiversity, the use of fish models for human medicine and ecological studies, and the importance of fish in human nutrition, fuel an important need for the characterization of gene expression repertoires and corresponding evolutionary histories of ray-finned fish genes. To this aim, we performed transcriptome analyses and developed the PhyloFish database to provide (i) de novo assembled gene repertoires in 23 different ray-finned fish species including two holosteans (i.e. a group that diverged from teleosts before TGD) and 21 teleosts (including six salmonids), and (ii) gene expression levels in ten different tissues and organs (and embryos for many) in the same species. This resource was generated using a common deep RNA sequencing protocol to obtain the most exhaustive gene repertoire possible in each species that allows between-species comparisons to study the evolution of gene expression in different lineages. The PhyloFish database described here can be accessed and searched using RNAbrowse, a simple and efficient solution to give access to RNA-seq de novo assembled transcripts. PMID:27189481

  8. Whole-genome amplification of single-cell genomes for next-generation sequencing.

    Science.gov (United States)

    Korfhage, Christian; Fisch, Evelyn; Fricke, Evelyn; Baedker, Silke; Loeffert, Dirk

    2013-10-11

    DNA sequence analysis and genotyping of biological samples using next-generation sequencing (NGS), microarrays, or real-time PCR is often limited by the small amount of sample available. A single cell contains only one to four copies of the genomic DNA, depending on the organism (haploid or diploid organism) and the cell-cycle phase. The DNA content of a single cell ranges from a few femtograms in bacteria to picograms in mammals. In contrast, a deep analysis of the genome currently requires a few hundred nanograms up to micrograms of genomic DNA for library formation necessary for NGS or labeling protocols (e.g., microarrays). Consequently, accurate whole-genome amplification (WGA) of single-cell DNA is required for reliable genetic analysis (e.g., NGS) and is particularly important when genomic DNA is limited. The use of single-cell WGA has enabled the analysis of genomic heterogeneity of individual cells (e.g., somatic genomic variation in tumor cells). This unit describes how the genome of single cells can be used for WGA for further genomic studies, such as NGS. Recommendations for isolation of single cells are given and common sources of errors are discussed.
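    The gap between the DNA in one cell and the input needed for a library can be put into numbers with a short sketch. The 6 pg starting mass and the 300 ng target are illustrative assumptions chosen from within the ranges given in the abstract, and the "doublings" figure is an idealized equivalent, not a description of how any particular WGA chemistry proceeds.

```python
import math


def fold_amplification(input_pg: float, target_pg: float) -> float:
    """Factor by which the starting DNA mass must grow to reach the target mass."""
    if input_pg <= 0 or target_pg <= 0:
        raise ValueError("masses must be positive")
    return target_pg / input_pg


def equivalent_doublings(fold: float) -> int:
    """Smallest whole number of perfect doublings that reaches the given fold."""
    if fold < 1:
        raise ValueError("fold must be at least 1")
    return math.ceil(math.log2(fold))


# Assumed example: about 6 pg of DNA in one cell, 300 ng (300,000 pg) wanted.
fold = fold_amplification(6.0, 300_000.0)
doublings = equivalent_doublings(fold)
```

    Under these assumed values the starting material has to be amplified tens of thousands of times, which is why amplification bias and contamination carry so much weight in single-cell work.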

  9. Whole Genome Sequencing

    Science.gov (United States)

    What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  10. Whole genome amplification and de novo assembly of single bacterial cells.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    Full Text Available BACKGROUND: Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells. METHODOLOGY/PRINCIPAL FINDINGS: We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs. CONCLUSIONS/SIGNIFICANCE: The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.

  11. Implementation of exon arrays: alternative splicing during T-cell proliferation as determined by whole genome analysis

    Directory of Open Access Journals (Sweden)

    Whistler Toni

    2010-09-01

    Full Text Available Abstract Background The contribution of alternative splicing and isoform expression to cellular response is emerging as an area of considerable interest, and the newly developed exon arrays allow for systematic study of these processes. We use this pilot study to report on the feasibility of exon array implementation looking to replace the 3' in vitro transcription expression arrays in our laboratory. One of the most widely studied models of cellular response is T-cell activation from exogenous stimulation. Microarray studies have contributed to our understanding of key pathways activated during T-cell stimulation. We use this system to examine whole genome transcription and alternate exon usage events that are regulated during lymphocyte proliferation in an attempt to evaluate the exon arrays. Results Peripheral blood mononuclear cells from healthy donors were activated using phytohemagglutinin, IL2 and ionomycin and harvested at 5 points over a 7 day period. Flow cytometry measured cell cycle events and the Affymetrix exon array platform was used to identify the gene expression and alternate exon usage changes. Gene expression changes were noted in a total of 2105 transcripts, and alternate exon usage was identified in 472 transcript clusters. There was an overlap of 263 transcripts which showed both differential expression and alternate exon usage over time. Gene ontology enrichment analysis showed a broader range of changes in biological processes for the differentially expressed genes, which include cell cycle, cell division, cell proliferation, chromosome segregation, cell death, component organization and biogenesis and metabolic process ontologies. The alternate exon usage ontological enrichments are in metabolism and component organization and biogenesis. We focus on alternate exon usage changes in the transcripts of the spliceosome complex. The real-time PCR validation rates were 86% for transcript expression and 71% for

  12. Genome management and mismanagement--cell-level opportunities and challenges of whole-genome duplication.

    Science.gov (United States)

    Yant, Levi; Bomblies, Kirsten

    2015-12-01

    Whole-genome duplication (WGD) doubles the DNA content in the nucleus and leads to polyploidy. In whole-organism polyploids, WGD has been implicated in adaptability and the evolution of increased genome complexity, but polyploidy can also arise in somatic cells of otherwise diploid plants and animals, where it plays important roles in development and likely environmental responses. As with whole organisms, WGD can also promote adaptability and diversity in proliferating cell lineages, although whether WGD is beneficial is clearly context-dependent. WGD is also sometimes associated with aging and disease and may be a facilitator of dangerous genetic and karyotypic diversity in tumorigenesis. Scaling changes can affect cell physiology, but problems associated with WGD in large part seem to arise from problems with chromosome segregation in polyploid cells. Here we discuss both the adaptive potential and problems associated with WGD, focusing primarily on cellular effects. We see value in recognizing polyploidy as a key player in generating diversity in development and cell lineage evolution, with intriguing parallels across kingdoms.

  13. Single Cell Analysis of Dystrophin and SRY Gene by Using Whole Genome Amplification

    Institute of Scientific and Technical Information of China (English)

    徐晨明; 金帆; 黄荷凤; 陶冶; 叶英辉

    2001-01-01

    Objective To develop a reliable and sensitive method for detecting sex and multiple loci of the Duchenne muscular dystrophy (DMD) gene in single cells. Materials & methods The whole genome of a single cell was amplified using 15-base random primers (primer extension preamplification, PEP), then a small aliquot of the PEP product was analyzed by locus-specific nested PCR amplification. The procedure was evaluated by detecting dystrophin exons 8, 17, 19, 44, 45, 48 and the human testis-determining gene (SRY) in single lymphocytes from known sources and single blastomeres from couples with no family history of DMD. Results The amplification efficiency rates of the six dystrophin exons from single lymphocytes and single blastomeres were 97.2% (175/180) and 100% (60/60), respectively. Results for SRY showed 100% (15/15) amplification in single male-derived lymphocytes and 0% (0/15) amplification in single female-derived lymphocytes. Conclusion The technique of single-cell PEP-nested PCR for dystrophin exons 8, 17, 19, 44, 45, 48 and SRY is highly specific. PEP-nested PCR is suitable for preimplantation genetic diagnosis (PGD) of DMD at the single-cell level.

  14. Whole genome transcription profiling of Anaplasma phagocytophilum in human and tick host cells by tiling array analysis

    Directory of Open Access Journals (Sweden)

    Chavez Adela

    2008-07-01

    Full Text Available Abstract Background Anaplasma phagocytophilum (Ap) is an obligate intracellular bacterium and the agent of human granulocytic anaplasmosis, an emerging tick-borne disease. Ap alternately infects ticks and mammals and a variety of cell types within each. Understanding the biology behind such versatile cellular parasitism may be derived through the use of tiling microarrays to establish high resolution, genome-wide transcription profiles of the organism as it infects cell lines representative of its life cycle (tick; ISE6) and pathogenesis (human; HL-60 and HMEC-1). Results Detailed, host cell specific transcriptional behavior was revealed. There was extensive differential Ap gene transcription between the tick (ISE6) and the human (HL-60 and HMEC-1) cell lines, with far fewer differentially transcribed genes between the human cell lines, and all disproportionately represented by membrane or surface proteins. There were Ap genes exclusively transcribed in each cell line, apparent human- and tick-specific operons and paralogs, and anti-sense transcripts that suggest novel expression regulation processes. Seven virB2 paralogs (of the bacterial type IV secretion system) showed human or tick cell dependent transcription. Previously unrecognized genes and coding sequences were identified, as were the expressed p44/msp2 (major surface proteins) paralogs (of 114 total), through elevated signal produced to the unique hypervariable region of each: 2/114 in HL-60, 3/114 in HMEC-1, and none in ISE6. Conclusion Using these methods, whole genome transcription profiles can likely be generated for Ap, as well as other obligate intracellular organisms, in any host cells and for all stages of the cell infection process. Visual representation of comprehensive transcription data alongside an annotated map of the genome renders complex transcription into discernable patterns.

  15. Whole genome bisulfite sequencing of cell-free DNA and its cellular contributors uncovers placenta hypomethylated domains

    OpenAIRE

    Jensen, Taylor J.; Kim, Sung K; Zhu, Zhanyang; Chin, Christine; Gebhard, Claudia; Lu, Tim; Deciu, Cosmin; Van den Boom, Dirk; Ehrich, Mathias

    2015-01-01

    Background Circulating cell-free fetal DNA has enabled non-invasive prenatal fetal aneuploidy testing without direct discrimination of the maternal and fetal DNA. Testing may be improved by specifically enriching the sample material for fetal DNA. DNA methylation may allow for such a separation of DNA; however, this depends on knowledge of the methylomes of circulating cell-free DNA and its cellular contributors. Results We perform whole genome bisulfite sequencing on a set of unmatched sampl...

  16. Whole genome expression profiling using DNA microarray for determining biocompatibility of polymeric surfaces

    DEFF Research Database (Denmark)

    Stangegaard, Michael; Wang, Zhenyu; Kutter, Jörg Peter;

    2006-01-01

    There is an ever increasing need to find surfaces that are biocompatible for applications like medical implants and microfluidics-based cell culture systems. The biocompatibility of five different surfaces with different hydrophobicity was determined using gene expression profiling as well as more...

  17. Classification and toxicity mechanisms of novel flame retardants (NFRs) based on whole genome expression profiling.

    Science.gov (United States)

    Guan, Miao; Su, Guanyong; Giesy, John P; Zhang, Xiaowei

    2016-02-01

    Recently some novel alternative flame retardants (NFRs), which have been widely applied to meet demands for mandated flame retardation of products, have been detected in various matrices of the environment. However, knowledge of the toxic effects and associated molecular mechanisms of these chemicals is limited. Here, toxic mechanisms of action of six NFRs, bis (2-ethylhexyl) phosphate (BEHP), chlorendic acid (Het acid), 2,2-bis (bromomethyl)-1,3-propanediol (BMP), tris (2-butoxyethyl) phosphate (TBEP), triethyl phosphate (TEP) and tributyl phosphate (TBP), were investigated by use of a library containing ∼1820 modified green fluorescent protein (GFP) expressing promoter reporter vectors constructed from Escherichia coli K12 (E. coli). BEHP, Het acid, BMP, TBEP, TEP and TBP inhibited growth of E. coli with 4 h 10%-inhibition concentrations of 53.0-3102.3 μM. A total of 119, 44, 26, 131, 62 and 103 genes out of 336 genes selected during preliminary screening were significantly altered with fold-changes greater than 1.5 by BEHP, Het acid, BMP, TBEP, TEP and TBP, respectively. GO analyses of responsive genes suggested that RNA and primary metabolism processes were involved in molecular mechanisms of toxicity. Chemical clustering based on expression of 62 multi-responsive genes showed that BEHP, TBP and TBEP were grouped together, which is consistent with the similarity of their chemical structures, especially for BEHP and TBP. Clustering by molecular descriptors and molecular activity by use of the multivariate classification system ToxCast was consistent with that by profiles of multi-responsive genes. The results of this study demonstrated the utility of the E. coli whole-cell assay for determining mechanisms of toxic action of chemicals. PMID:26588597

  18. Differential Gene Expression Analysis of Placentas with Increased Vascular Resistance and Pre-Eclampsia Using Whole-Genome Microarrays

    Directory of Open Access Journals (Sweden)

    M. Centlow

    2011-01-01

    Full Text Available Pre-eclampsia is a pregnancy complication characterized by hypertension and proteinuria. There are several factors associated with an increased risk of developing pre-eclampsia, one of which is increased uterine artery resistance, referred to as “notching”. However, some women do not progress into pre-eclampsia whereas others may have a higher risk of doing so. The placenta, central in pre-eclampsia pathology, may express genes associated with either protection or progression into pre-eclampsia. In order to search for genes associated with protection or progression, whole-genome profiling was performed. Placental tissue from 15 controls, 10 pre-eclamptic, 5 pre-eclampsia with notching, and 5 with notching only was analyzed using microarrays and antibody microarrays to study some of the same gene products and functionally related ones. The microarray showed 148 genes to be significantly altered between the four groups. In the pre-eclamptic group compared to notch only, there was increased expression of genes related to chemotaxis and the NF-kappa B pathway and decreased expression of genes related to antigen processing and presentation, such as human leukocyte antigen B. Our results indicate that progression of pre-eclampsia from notching may involve the development of inflammation. Increased expression of antigen-presenting genes, as seen in the notch-only placenta, may prevent this inflammatory response and, thereby, protect the patient from developing pre-eclampsia.

  19. Analysis of the differences in whole-genome expression related to asthma and obesity

    NARCIS (Netherlands)

    Gruchala-Niedoszytko, Marta; Niedoszytko, Marek; Sanjabi, Bahram; van der Vlies, Pieter; Niedoszytko, Piotr; Jassem, Ewa; Malgorzewicz, Sylwia

    2015-01-01

    Introduction Concomitant obesity significantly impairs asthma control. Obese asthmatics show more severe symptoms and an increased use of medications. Objectives The primary aim of the study was to identify genes that are differentially expressed in the peripheral blood of asthmatic patients with ob

  20. Prospective Evaluation of Whole Genome MicroRNA Expression Profiling in Childhood Acute Lymphoblastic Leukemia

    Directory of Open Access Journals (Sweden)

    Muhterem Duyu

    2014-01-01

    Full Text Available Dysregulation of microRNA (miRNA) expression contributes to the pathogenesis of several clinical conditions. The aim of this study is to evaluate the associations between miRNAs and childhood acute lymphoblastic leukemia (ALL) to discover their role in the course of the disease. Forty-three children with ALL and 14 age-matched healthy controls were included in the study. MicroRNA microarray expression profiling was used for peripheral blood and bone marrow samples. Aberrant miRNA expressions associated with the diagnosis and outcome were prospectively evaluated. Confirmation analysis was performed by real time RT-PCR. miR-128, miR-146a, miR-155, miR-181a, and miR-195 were significantly dysregulated in ALL patients at day 0. Following a six-month treatment period, the change in miRNA levels was determined by real time RT-PCR, and expression of miR-146a, miR-155, miR-181a, and miR-195 significantly decreased. To conclude, these miRNAs may not only be used as biomarkers in the diagnosis and monitoring of ALL but may also provide new insights into their potential roles in leukemogenesis.

  1. Effect of long real space flight on the whole genome mRNA expression properties in medaka Oryzias latipes

    Science.gov (United States)

    Kozlova, Olga; Gusev, Oleg; Levinskikh, Margarita; Sychev, Vladimir; Poddubko, Svetlana

    The current study addresses the complex analysis of whole genome mRNA expression profiles and the formation of splicing variants in different organs of medaka fish exposed to prolonged space flight in the frame of the joint Russia-Japan research program “Aquarium-AQH”. The fish were kept in the AQH joint-aquariums system in October-December 2013, followed by fixation in RNA-preserving buffers and freezing during the space flight. The samples were returned to the Earth frozen in March 2013, and mRNAs from four fish were sequenced in an organ-specific manner using the Illumina HiSeq sequencing platform. Ground-group fish treated in the same way were used as a control. The comparison between the groups revealed a space group-specific mRNA expression pattern. More than 50 genes (including several types of myosins) were down-regulated in the space group. Moreover, we found evidence for the formation of space group-specific splicing variants of mRNA. Taken together, the data suggest that, in spite of the aquatic environment, space flight-associated factors have a strong effect on the activity of the fish genome. This work was supported in part by a subsidy of the Russian Government to support the Program of competitive growth of Kazan Federal University among world class academic centres and universities.

  2. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models.

    Science.gov (United States)

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  4. Quality assessment metrics for whole genome gene expression profiling of paraffin embedded samples

    OpenAIRE

    Mahoney, Douglas W.; Terry M. Therneau; Anderson, S. Keith; Jen, Jin; Kocher, Jean-Pierre A.; Reinholz, Monica M; Perez, Edith A.; Eckel-Passow, Jeanette E

    2013-01-01

    Background Formalin-fixed, paraffin-embedded tissues are most commonly used for routine pathology analysis and for long term tissue preservation in the clinical setting. Many institutions have large archives of formalin-fixed, paraffin-embedded tissues that provide a unique opportunity for understanding genomic signatures of disease. However, genome-wide expression profiling of formalin-fixed, paraffin-embedded samples has been challenging due to RNA degradation. Because of the significant h...

  5. Regulation of the expression of the whole genome of Ustilago maydis by a MAPK pathway.

    Science.gov (United States)

    Martínez-Soto, Domingo; Ruiz-Herrera, José

    2015-05-01

    The operation of mitogen-activated protein kinase (MAPK) signal transduction pathways is one of the most important mechanisms for the transfer of extracellular information into the cell. These pathways are highly conserved in eukaryotic organisms. In fungi, MAPK pathways are involved in the regulation of a number of cellular processes such as metabolism, homeostasis, pathogenesis, cell differentiation and morphogenesis. Considering the importance of these pathways, in the present work we set out to identify all the genes that are regulated by the signal transduction pathway involved in mating, pathogenesis and morphogenesis of Ustilago maydis. Accordingly, we used microarrays to compare the transcriptomes of a wild-type strain and an Ubc2 mutant affected in the interacting protein of this pathway. By this methodology, we identified 939 genes regulated directly or indirectly by the MAPK pathway. Of these, 432 were positively and 507 negatively regulated. By functional grouping, genes encoding cyclin-dependent kinases, transcription factors, and proteins involved in signal transduction, in synthesis of the cell wall and cell membrane, and in dimorphism were identified as differentially regulated. These data reveal the importance of such global studies and the large (and unsuspected) number of functions of the fungus under the control of this MAPK, providing clues to the possible mechanisms involved.

  6. Identifying the genetic pattern of conventional fractionated and hypofractionated radiotherapy using whole genome expression microarray in a non-small-cell lung cancer cell line

    Institute of Scientific and Technical Information of China (English)

    孙健; 刘宁波; 曲晨慧; 王宝虎; 郭华; 王平

    2013-01-01

    Objective: To obtain stable radioresistant non-small-cell lung cancer (NSCLC) cell lines and identify the changes in tumor gene expression after conventional fractionated and hypofractionated radiotherapy. Methods: A549 NSCLC cells were treated with 6 MV X-rays through conventional fractionated (2 Gy × 17 fractions) and hypofractionated (4 Gy × 7 fractions) irradiation to establish a radiation resistance cell model. Tumor cell radioresistance was verified using a clonogenic assay and γ-H2AX immunofluorescence staining combined with confocal microscopy. After extracting mRNA from the cells, a whole genome expression microarray was applied to detect differential gene expression. Genes with at least a twofold change in expression (P<0.05) were analyzed, and pathway analysis (Q<0.05) was used to further analyze the microarray results. Results: Two radioresistant cell lines, A549R2Gy-R and A549R4Gy-R, were obtained. The expression microarray showed 1701 differentially expressed genes between A549 and A549R2Gy-R (357 up-regulated, 1344 down-regulated); between A549 and A549R4Gy-R, 944 genes were up-regulated and 2602 were down-regulated; between A549R2Gy-R and A549R4Gy-R, 318 genes were up-regulated and 699 were down-regulated. Pathway enrichment analysis comparing conventional fractionated with hypofractionated irradiation showed significant differences in kinases of multiple signaling pathways, including the PI3K and ErbB pathways. Conclusion: Multiple genes and signaling pathways are involved in the development of NSCLC resistance to conventional fractionated and hypofractionated radiotherapy. Further study may clarify the mechanisms of NSCLC radioresistance and provide new targets for the development of radiosensitizing drugs.

  7. Multi-platform whole-genome microarray analyses refine the epigenetic signature of breast cancer metastasis with gene expression and copy number.

    Directory of Open Access Journals (Sweden)

    Joseph Andrews

    Full Text Available BACKGROUND: We have previously identified genome-wide DNA methylation changes in a cell line model of breast cancer metastasis. These complex epigenetic changes that we observed, along with concurrent karyotype analyses, have led us to hypothesize that complex genomic alterations in cancer cells (deletions, translocations and ploidy) are superimposed over promoter-specific methylation events that are responsible for gene-specific expression changes observed in breast cancer metastasis. METHODOLOGY/PRINCIPAL FINDINGS: We undertook simultaneous high-resolution, whole-genome analyses of MDA-MB-468GFP and MDA-MB-468GFP-LN human breast cancer cell lines (an isogenic, paired lymphatic metastasis cell line model) using Affymetrix gene expression (U133), promoter (1.0R), and SNP/CNV (SNP 6.0) microarray platforms to correlate data from gene expression, epigenetic (DNA methylation), and combination copy number variant/single nucleotide polymorphism microarrays. Using Partek Software and Ingenuity Pathway Analysis we integrated datasets from these three platforms and detected multiple hypomethylation and hypermethylation events. Many of these epigenetic alterations correlated with gene expression changes. In addition, gene dosage events correlated with the karyotypic differences observed between the cell lines and were reflected in specific promoter methylation patterns. Gene subsets were identified that correlated hyper- (and hypo-) methylation with the loss (or gain) of gene expression and, in parallel, with gene dosage losses and gains, respectively. Individual gene targets from these subsets were also validated for their methylation, expression and copy number status, and susceptible gene pathways were identified that may indicate how selective advantage drives the processes of tumourigenesis and metastasis. CONCLUSIONS/SIGNIFICANCE: Our approach allows more precise profiling of functionally relevant epigenetic signatures that are associated with cancer

  8. Profiling of differential expression in peripheral blood mononuclear cells of asthmatic and healthy children by whole-genome microarray

    Institute of Scientific and Technical Information of China (English)

    孔倩; 黄花荣; 吴葆菁; 李雯静; 陈纯

    2013-01-01

    AIM: To compare whole-genome expression in peripheral blood mononuclear cells (PBMC) between children with asthma and healthy controls, and to look for target genes relevant to the prevention and treatment of asthma. METHODS: Total RNA was extracted from the PBMC of 5 children with asthma and 5 control children and subjected to microarray analysis with a NimbleGen human gene expression array. Differentially expressed genes were selected when the unpaired t-test gave P ≤ 0.05 and the fold change was ≥ 2. Real-time quantitative PCR (qRT-PCR) was performed to verify the microarray results. The differential genes were then analyzed with bioinformatics software by hierarchical clustering and Gene Ontology (GO) functional classification. RESULTS: From 45,033 expression profiles, 758 named genes were differentially expressed more than 2-fold with P ≤ 0.05 between the asthma and control groups (345 up-regulated and 413 down-regulated). Their GO biological process categories mainly involved immune response, response to external stimulus, negative regulation of signal transduction and of molecular function, cell death, and apoptosis and its regulation. Of these genes, 29 were related to asthma, airway inflammation or airway remodeling, with changes consistent with published reports (14 up-regulated and 15 down-regulated), and hierarchical clustering divided them into 9 classes. The qRT-PCR results were consistent with the microarray results. CONCLUSION: Whole-genome expression microarrays can identify genes differentially expressed in the PBMC of asthmatic and control children, and further data mining is likely to reveal target genes or target pathways relevant to the prevention and treatment of asthma.
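
    The selection rule described in this record (unpaired t-test P ≤ 0.05 together with a fold change ≥ 2) can be sketched as below. This is only an illustration under stated assumptions: the function name, the record layout and the gene labels are hypothetical, the p-values are taken as already computed by whatever t-test implementation is used, and expression values are assumed to be on a linear (not log) scale.

```python
from statistics import mean

def select_de_genes(records, p_cutoff=0.05, fc_cutoff=2.0):
    """Keep genes with p <= p_cutoff and fold change >= fc_cutoff.

    records: iterable of (gene, case_values, control_values, p_value),
    where p_value is assumed to come from an unpaired t-test run elsewhere.
    """
    selected = []
    for gene, case, control, p in records:
        ratio = mean(case) / mean(control)          # case relative to control
        fold = ratio if ratio >= 1 else 1 / ratio   # symmetric fold change
        if p <= p_cutoff and fold >= fc_cutoff:
            direction = "up" if ratio > 1 else "down"
            selected.append((gene, direction, round(fold, 2)))
    return selected

# Hypothetical toy data, not taken from the study.
example = [
    ("GENE_A", [8, 10, 12], [4, 5, 6], 0.01),   # 2-fold up, significant
    ("GENE_B", [5, 5, 5], [4, 4, 4], 0.03),     # significant but only 1.25-fold
    ("GENE_C", [2, 2, 2], [8, 8, 8], 0.20),     # 4-fold down, not significant
    ("GENE_D", [1, 1, 1], [3, 3, 3], 0.04),     # 3-fold down, significant
]
print(select_de_genes(example))
```

    Requiring both thresholds at once is what removes GENE_B and GENE_C in the toy data: one passes the significance cutoff only, the other the fold-change cutoff only.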

  9. Monodisperse Picoliter Droplets for Low-Bias and Contamination-Free Reactions in Single-Cell Whole Genome Amplification.

    Directory of Open Access Journals (Sweden)

    Yohei Nishikawa

    Full Text Available Whole genome amplification (WGA) is essential for obtaining genome sequences from single bacterial cells because the quantity of template DNA contained in a single cell is very low. Multiple displacement amplification (MDA), using Phi29 DNA polymerase and random primers, is the most widely used method for single-cell WGA. However, single-cell MDA usually results in uneven genome coverage because of amplification bias, background amplification of contaminating DNA, and formation of chimeras by linking of non-contiguous chromosomal regions. Here, we present a novel MDA method, termed droplet MDA, that minimizes amplification bias and amplification of contaminants by using picoliter-sized droplets for compartmentalized WGA reactions. Extracted DNA fragments from a lysed cell in MDA mixture are divided into 10^5 droplets (67 pL) within minutes via flow through simple microfluidic channels. Compartmentalized genome fragments can be individually amplified in these droplets without the risk of encounter with reagent-borne or environmental contaminants. Following quality assessment of WGA products from single Escherichia coli cells, we showed that droplet MDA minimized unexpected amplification and improved the percentage of genome recovery from 59% to 89%. Our results demonstrate that microfluidic-generated droplets show potential as an efficient tool for effective amplification of low-input DNA for single-cell genomics and greatly reduce the cost and labor investment required for determination of nearly complete genome sequences of uncultured bacteria from environmental samples.

  10. Coriandrum sativum L. (Coriander) essential oil: antifungal activity and mode of action on Candida spp., and molecular targets affected in human whole-genome expression.

    Directory of Open Access Journals (Sweden)

    Irlan de Almeida Freires

    Full Text Available Oral candidiasis is an opportunistic fungal infection of the oral cavity with increasing worldwide prevalence and incidence rates. Novel specifically-targeted strategies to manage this ailment have been proposed using essential oils (EO) known to have antifungal properties. In this study, we aim to investigate the antifungal activity and mode of action of the EO from Coriandrum sativum L. (coriander) leaves on Candida spp. In addition, we detected the molecular targets affected in whole-genome expression in human cells. The EO phytochemical profile indicates monoterpenes and sesquiterpenes as major components, which are likely to negatively impact the viability of yeast cells. There seems to be a synergistic activity of the EO chemical compounds as their isolation into fractions led to a decreased antimicrobial effect. C. sativum EO may bind to membrane ergosterol, increasing ionic permeability and causing membrane damage leading to cell death, but it does not act on cell wall biosynthesis-related pathways. This mode of action is illustrated by photomicrographs showing disruption in biofilm integrity caused by the EO at varied concentrations. The EO also inhibited Candida biofilm adherence to a polystyrene substrate at low concentrations, and decreased the proteolytic activity of Candida albicans at minimum inhibitory concentration. Finally, the EO and its selected active fraction had low cytotoxicity on human cells, with putative mechanisms affecting gene expression in pathways involving chemokines and MAP-kinase (proliferation/apoptosis), as well as adhesion proteins. These findings highlight the potential antifungal activity of the EO from C. sativum leaves and suggest avenues for future translational toxicological research.

  11. Whole Genome Selection

    Science.gov (United States)

    Whole genome selection (WGS) is an approach to using DNA markers that are distributed throughout the entire genome. Genes affecting most economically-important traits are distributed throughout the genome and there are relatively few that have large effects with many more genes with progressively sm...

  12. Effect of Wortmannin on the repair profiles of DNA double-strand breaks in the whole genome and in interstitial telomeric sequences of Chinese hamster cells

    International Nuclear Information System (INIS)

    The DNA breakage detection-fluorescence in situ hybridization (DBD-FISH) procedure was applied to analyze the effect of Wortmannin (WM) on the rejoining kinetics of ionizing radiation-induced DNA double-strand breaks (DSBs) in the whole genome and in the long interstitial telomeric repeat sequence (ITRS) blocks from Chinese hamster cell lines. The results indicate that the ITRS blocks from wild-type Chinese hamster cell lines, CHO9 and V79B, exhibit a slower initial rejoining rate of ionizing radiation-induced DSBs than the genome overall. Neither Rad51C nor the catalytic subunit of DNA-dependent protein kinase (DNA-PKcs) activities, involved in the homologous recombination (HR) and non-homologous end-joining (NHEJ) pathways of DSB repair respectively, influenced the rejoining kinetics within ITRS, in contrast to DNA sequences in the whole genome. Nevertheless, the DSB removal rate within ITRS was decreased in the absence of Ku86 activity, though to a lesser extent than in the whole genome, thus homogenizing both rejoining kinetics rates. WM treatment slowed down the DSB rejoining kinetics rate in ITRS, this effect being more pronounced in the whole genome, resulting in a pattern similar to that of the Ku86 deficient cells. In fact, no WM effect was detected in the Ku86 deficient Chinese hamster cells, so WM probably does not impair DSB rejoining beyond the impairment that results from the absence of Ku activity. The same slowing effect was also observed after treatment of Rad51C and DNA-PKcs defective hamster cells by WM, suggesting that: (1) there is no potentiation of HR when NHEJ is impaired by WM, either in the whole genome or in the ITRS, and (2) this impairment probably involves more targets than DNA-PKcs. These results suggest that there is an intragenomic heterogeneity in DSB repair, as well as in the effect of WM on this process.

  13. A simple method for encapsulating single cells in alginate microspheres allows for direct PCR and whole genome amplification.

    Directory of Open Access Journals (Sweden)

    Saharnaz Bigdeli

    Full Text Available Microdroplets are an effective platform for segregating individual cells and amplifying DNA. However, a key challenge is to recover the contents of individual droplets for downstream analysis. This paper offers a method for embedding cells in alginate microspheres and performing multiple serial operations on the isolated cells. Rhodobacter sphaeroides cells were diluted in alginate polymer and sprayed into microdroplets using a fingertip aerosol sprayer. The encapsulated cells were lysed and subjected either to conventional PCR, or whole genome amplification using either multiple displacement amplification (MDA or a two-step PCR protocol. Microscopic examination after PCR showed that the lumen of the occupied microspheres contained fluorescently stained DNA product, but multiple displacement amplification with phi29 produced only a small number of polymerase colonies. The 2-step WGA protocol was successful in generating fluorescent material, and quantitative PCR from DNA extracted from aliquots of microspheres suggested that the copy number inside the microspheres was amplified up to 3 orders of magnitude. Microspheres containing fluorescent material were sorted by a dilution series and screened with a fluorescent plate reader to identify single microspheres. The DNA was extracted from individual isolates, re-amplified with full-length sequencing adapters, and then a single isolate was sequenced using the Illumina MiSeq platform. After filtering the reads, the only sequences that collectively matched a genome in the NCBI nucleotide database belonged to R. sphaeroides. This demonstrated that sequencing-ready DNA could be generated from the contents of a single microsphere without culturing. However, the 2-step WGA strategy showed limitations in terms of low genome coverage and an uneven frequency distribution of reads across the genome. 
This paper offers a simple method for embedding cells in alginate microspheres and performing PCR on isolated

  14. Murine hyperglycemic vasculopathy and cardiomyopathy: whole-genome gene expression analysis predicts cellular targets and regulatory networks influenced by mannose binding lectin

    Directory of Open Access Journals (Sweden)

    Chenhui Zou

    2012-02-01

    Full Text Available Hyperglycemia, in the absence of type 1 or 2 diabetes, is an independent risk factor for cardiovascular disease. We have previously demonstrated a central role for mannose binding lectin (MBL)-mediated cardiac dysfunction in acute hyperglycemic mice. In this study, we applied whole genome microarray data analysis to investigate MBL’s role in systematic gene expression changes. The data predict possible intracellular events taking place in multiple cellular compartments such as enhanced insulin signaling pathway sensitivity, promoted mitochondrial respiratory function, improved cellular energy expenditure and protein quality control, improved cytoskeleton structure and facilitated intracellular trafficking, all of which may contribute to the organismal health of MBL null mice against acute hyperglycemia. Our data show a tight association between gene expression profile and tissue function which might be a very useful tool in predicting cellular targets and regulatory networks connected with in vivo observations, providing clues for further mechanistic studies.

  15. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    Directory of Open Access Journals (Sweden)

    Pandey Sona

    2010-11-01

    Full Text Available Abstract Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. Of the full-length sequences, 195 genes belong to the A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expression of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an
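
    The co-expression assumption stated in this record (genes with similar expression patterns across a set of conditions may be functionally related) is commonly operationalized with a correlation measure. The sketch below uses Pearson correlation as one plausible choice; the record does not say which measure or threshold the authors used, and the function names, gene labels and the 0.9 cutoff are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed_partners(profiles, query, threshold=0.9):
    """Return genes whose profile correlates with the query gene at or above threshold."""
    q = profiles[query]
    return sorted(
        gene for gene, profile in profiles.items()
        if gene != query and pearson(q, profile) >= threshold
    )

# Hypothetical expression values across four conditions (e.g. four tissues).
profiles = {
    "P450_query": [1.0, 2.0, 3.0, 4.0],
    "gene_a": [2.0, 4.0, 6.0, 8.0],   # same pattern, different scale
    "gene_b": [4.0, 3.0, 2.0, 1.0],   # opposite pattern
    "gene_c": [1.0, 3.0, 2.0, 4.0],   # loosely similar pattern (r = 0.8)
}
print(coexpressed_partners(profiles, "P450_query"))
```

    Correlation ignores absolute expression level, so gene_a is reported as a partner even though its values are twice those of the query; whether that is desirable depends on the annotation question being asked.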

  16. Quality control parameters on a large dataset of regionally dissected human control brains for whole genome expression studies

    OpenAIRE

    Trabzuni, Daniah; Ryten, Mina; Walker, Robert; Smith, Colin; Imran, Sabaena; Ramasamy, Adaikalavan; Weale, Michael E; Hardy, John

    2011-01-01

    We are building an open-access database of regional human brain expression designed to allow the genome-wide assessment of genetic variability on expression. Array and RNA sequencing technologies make assessment of genome-wide expression possible. Human brain tissue is a challenging source for this work because it can only be obtained several and variable hours post-mortem and after varying agonal states. These variables alter RNA integrity in a complex manner. In this report, we assess the e...

  17. Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

    Science.gov (United States)

    Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.

  18. Effects of Temperature on Gene Expression Patterns in Leptospira interrogans Serovar Lai as Assessed by Whole-Genome Microarrays†

    OpenAIRE

    Lo, Miranda; Bulach, Dieter M.; Powell, David R.; David A Haake; Matsunaga, James; Paustian, Michael L.; Zuerner, Richard L.; Adler, Ben

    2006-01-01

    Leptospirosis is an important zoonosis of worldwide distribution. Humans become infected via exposure to pathogenic Leptospira spp. from infected animals or contaminated water or soil. The availability of genome sequences for Leptospira interrogans, serovars Lai and Copenhageni, has opened up opportunities to examine global transcription profiles using microarray technology. Temperature is a key environmental factor known to affect leptospiral protein expression. Leptospira spp. can grow in a...

  19. Whole genome and global gene expression analyses of the model mushroom Flammulina velutipes reveal a high capacity for lignocellulose degradation.

    Directory of Open Access Journals (Sweden)

    Young-Jin Park

    Full Text Available Flammulina velutipes is a fungus with health and medicinal benefits that has been used for consumption and cultivation in East Asia. F. velutipes is also known to degrade lignocellulose and produce ethanol. The overlapping interests of mushroom production and wood bioconversion make F. velutipes an attractive new model for fungal wood related studies. Here, we present the complete sequence of the F. velutipes genome. This is the first sequenced genome for a commercially produced edible mushroom that also degrades wood. The 35.6-Mb genome contained 12,218 predicted protein-encoding genes and 287 tRNA genes assembled into 11 scaffolds corresponding with the 11 chromosomes of strain KACC42780. The 88.4-kb mitochondrial genome contained 35 genes. Well-developed wood degrading machinery with strong potential for lignin degradation (69 auxiliary activities, formerly FOLymes) and carbohydrate degradation (392 CAZymes), along with 58 alcohol dehydrogenase genes, were highly expressed in the mycelium, demonstrating the potential application of this organism to bioethanol production. Thus, the newly uncovered wood degrading capacity and sequential nature of this process in F. velutipes offer interesting possibilities for more detailed studies on either lignin or (hemi-)cellulose degradation in complex wood substrates. The mutual interest in wood degradation by the mushroom industry and (ligno-)cellulose biomass related industries further increases the significance of F. velutipes as a new model.

  20. Analyses and interpretation of whole-genome gene expression from formalin-fixed paraffin-embedded tissue: an illustration with breast cancer tissues

    OpenAIRE

    Argos Maria; Paul-Brutus Rachelle M; Roy Shantanu; Jasmine Farzana; Kibriya Muhammad G; Ahsan Habibul

    2010-01-01

    Abstract Background We evaluated (a) the feasibility of whole genome cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay on formalin-fixed paraffin-embedded (FFPE) tissue and (b) whether similar conclusions can be drawn by examining FFPE samples as proxies for fresh frozen (FF) tissues. We used a whole genome DASL assay (addressing 18,391 genes) on a total of 72 samples from paired breast tumor and surrounding healthy tissues from both FF and FFPE samples. Results Gene det...

  1. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Energy Technology Data Exchange (ETDEWEB)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
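
    The high-quality SNP criterion quoted in this record (contigs with at least four sequences, and the minor allele present in at least two of them) can be illustrated with a small filter. This is a simplified sketch, not the consortium's pipeline: it treats one alignment column as a string of base calls, applies the depth rule to that column rather than to the contig as a whole, and the function name and parameters are hypothetical.

```python
from collections import Counter

def is_high_quality_snp(column, min_depth=4, min_minor=2):
    """Check one alignment column against the depth and minor-allele rule.

    column: string of base calls, one per EST covering this position.
    Characters other than A, C, G, T (gaps, N) are ignored.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    depth = sum(counts.values())
    if depth < min_depth or len(counts) < 2:
        return False                      # too shallow, or no variation at all
    minor_count = counts.most_common()[1][1]  # second most frequent allele
    return minor_count >= min_minor

print(is_high_quality_snp("AAGG"))   # four sequences, minor allele seen twice
print(is_high_quality_snp("AAAG"))   # minor allele seen once only
```

    Requiring the minor allele in at least two sequences is a common guard against calling a single sequencing error as a polymorphism.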

  2. Whole Genome Sequence of Multiple Myeloma-Prone C57BL/KaLwRij Mouse Strain Suggests the Origin of Disease Involves Multiple Cell Types.

    Directory of Open Access Journals (Sweden)

    Sarah R Amend

    Full Text Available Monoclonal gammopathy of undetermined significance (MGUS) is the requisite precursor to multiple myeloma (MM), a malignancy of antibody-producing plasma B-cells. The genetic basis of MGUS and its progression to MM remains poorly understood. C57BL/KaLwRij (KaLwRij) is a spontaneously-derived inbred mouse strain with a high frequency of benign idiopathic paraproteinemia (BIP), a phenotype with similarities to MGUS including progression to MM. Using mouse haplotype analysis, human MM SNP array data, and whole exome and whole genome sequencing of KaLwRij mice, we identified novel KaLwRij gene variants, including deletion of Samsn1 and deleterious point mutations in Tnfrsf22 and Tnfrsf23. These variants significantly affected multiple cell types implicated in MM pathogenesis including B-cells, macrophages, and bone marrow stromal cells. These data demonstrate that multiple cell types contribute to MM development prior to the acquisition of somatic driver mutations in KaLwRij mice, and suggest that MM may be an inherently non-cell autonomous malignancy.

  3. A whole-genome RNAi screen uncovers a novel role for human potassium channels in cell killing by the parasite Entamoeba histolytica.

    Science.gov (United States)

    Marie, Chelsea; Verkerke, Hans P; Theodorescu, Dan; Petri, William A

    2015-09-08

    The parasite Entamoeba histolytica kills human cells resulting in ulceration, inflammation and invasion of the colonic epithelium. We used the cytotoxic properties of ameba to select a genome-wide RNAi library to reveal novel host factors that control susceptibility to amebic killing. We identified 281 candidate susceptibility genes and bioinformatics analyses revealed that ion transporters were significantly enriched among susceptibility genes. Potassium (K(+)) channels were the most common transporter identified. Their importance was further supported by colon biopsy of humans with amebiasis that demonstrated suppressed K(+) channel expression. Inhibition of human K(+) channels by genetic silencing, pharmacologic inhibitors and with excess K(+) protected diverse cell types from E. histolytica-induced death. Contact with E. histolytica parasites triggered K(+) channel activation and K(+) efflux by intestinal epithelial cells, which preceded cell killing. Specific inhibition of Ca(2+)-dependent K(+) channels was highly effective in preventing amebic cytotoxicity in intestinal epithelial cells and macrophages. Blockade of K(+) efflux also inhibited caspase-1 activation, IL-1β secretion and pyroptotic death in THP-1 macrophages. We concluded that K(+) channels are host mediators of amebic cytotoxicity in multiple cell types and of inflammasome activation in macrophages.

  4. Whole Genome and Transcriptome Sequencing of a B3 Thymoma

    OpenAIRE

    Iacopo Petrini; Arun Rajan; Trung Pham; Donna Voeller; Sean Davis; James Gao; Yisong Wang; Giuseppe Giaccone

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomi...

  5. Metabolic Adaptation after Whole Genome Duplication

    NARCIS (Netherlands)

    Hoek, M.J.A. van; Hogeweg, P.

    2009-01-01

    Whole genome duplications (WGDs) have been hypothesized to be responsible for major transitions in evolution. However, the effects of WGD and subsequent gene loss on cellular behavior and metabolism are still poorly understood. Here we develop a genome scale evolutionary model to study the dynamics

  6. Whole genome linkage disequilibrium maps in cattle

    Science.gov (United States)

    Bovine whole genome linkage disequilibrium maps were constructed for eight breeds of cattle. These data provide fundamental information concerning bovine genome organization which will allow the design of studies to associate genetic variation with economically important traits and also provides bac...

  7. Whole-genome identification and expression analysis of K+ efflux antiporter (KEA) and Na+/H+ antiporter (NHX) families under abiotic stress in soybean

    Institute of Scientific and Technical Information of China (English)

    CHEN Hua-tao; CHEN Xin; WU Bing-yue; YUAN Xing-xing; ZHANG Hong-mei; CUI Xiao-yan; LIU Xiao-qing

    2015-01-01

    Sodium toxicity and potassium insufficiency are important factors affecting the growth and development of soybean in saline soil, and the capacity of plants to maintain a high cytosolic K+/Na+ ratio is the key determinant of tolerance under salt stress. The aims of the present study were to identify and analyse expression patterns of the soybean K+ efflux antiporter (KEA) and Na+/H+ antiporter (NHX) gene families, and to explore their roles under abiotic stress. As a result, 12 soybean GmKEA genes and 10 soybean GmNHX genes were identified and analyzed from the soybean genome. Interestingly, the novel soybean KEA gene Glyma16g32821, which encodes 11 transmembrane domains, was strongly up-regulated and remained at a high level until 48 h in root after both the excessive potassium treatment and the lack of potassium treatment. The novel soybean NHX gene Glyma09g02130, which encodes 10 transmembrane domains, was strongly up-regulated and remained at a high level until 48 h in root under NaCl stress. Imaging of the subcellular locations of the Glyma16g32821-GFP and Glyma09g02130-GFP fusion proteins indicated plasma membrane localization for both novel soybean genes. The 3D structures indicated that the two novel soybean proteins Glyma09g02130 (NHX) and Glyma16g32821 (KEA) both belong to the cation/hydrogen antiporter family.

  8. Whole-genome cartography of estrogen receptor alpha binding sites.

    Directory of Open Access Journals (Sweden)

    Chin-Yo Lin

    2007-06-01

    Full Text Available Using a chromatin immunoprecipitation-paired end diTag cloning and sequencing strategy, we mapped estrogen receptor alpha (ERalpha) binding sites in MCF-7 breast cancer cells. We identified 1,234 high confidence binding clusters of which 94% are projected to be bona fide ERalpha binding regions. Only 5% of the mapped estrogen receptor binding sites are located within 5 kb upstream of the transcriptional start sites of adjacent genes, regions containing the proximal promoters, whereas the vast majority of the sites are mapped to intronic or distal locations (>5 kb from 5' and 3' ends of adjacent transcripts), suggesting transcriptional regulatory mechanisms over significant physical distances. Of all the identified sites, 71% harbored putative full estrogen response elements (EREs), 25% bore ERE half sites, and only 4% had no recognizable ERE sequences. Genes in the vicinity of ERalpha binding sites were enriched for regulation by estradiol in MCF-7 cells, and their expression profiles in patient samples segregate ERalpha-positive from ERalpha-negative breast tumors. The expression dynamics of the genes adjacent to ERalpha binding sites suggest a direct induction of gene expression through binding to ERE-like sequences, whereas transcriptional repression by ERalpha appears to be through indirect mechanisms. Our analysis also indicates a number of candidate transcription factor binding sites adjacent to occupied EREs at frequencies much greater than by chance, including the previously reported FOXA1 sites, and demonstrates the potential involvement of one such putative adjacent factor, Sp1, in the global regulation of ERalpha target genes. Unexpectedly, we found that only 22%-24% of the bona fide human ERalpha binding sites overlapped conserved regions in whole genome vertebrate alignments, which suggests limited conservation of functional binding sites.
Taken together, this genome-scale analysis suggests complex but definable rules governing ERalpha

  9. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Full Text Available Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  10. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
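
    Comparing an optical map with a sequence assembly relies on an in silico restriction map of the assembled sequence: an ordered list of fragment lengths between cut sites. The sketch below shows that idea for the three enzymes named in this record, using their standard recognition sites (XbaI T^CTAGA, NheI G^CTAGC, HindIII A^AGCTT). It assumes a linear molecule and exact site matching; the function name and the toy sequence are hypothetical, and this is not the authors' software.

```python
# Recognition sites of the three enzymes; each cuts after the first base of its site.
SITES = {"XbaI": "TCTAGA", "NheI": "GCTAGC", "HindIII": "AAGCTT"}
CUT_OFFSET = 1

def restriction_map(seq, enzyme):
    """Return the ordered fragment lengths from digesting a linear sequence."""
    site = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)       # position of the cut on the top strand
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Toy 20-bp sequence with two HindIII sites and no XbaI site.
toy = "GG" + "AAGCTT" + "CCCC" + "AAGCTT" + "TT"
print(restriction_map(toy, "HindIII"))
print(restriction_map(toy, "XbaI"))
```

    In practice the predicted fragment lengths would be compared with the measured ones within a size tolerance, since optical maps report approximate fragment sizes rather than exact base counts.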

  11. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, S. [Univ. Wisc.-Madison]; Zhou, S. [Univ. Wisc.-Madison]; Place, M. [Univ. Wisc.-Madison]; Zhang, Y. [Univ. Wisc.-Madison]; Briska, A. [Univ. Wisc.-Madison]; Goldstein, S. [Univ. Wisc.-Madison]; Churas, C. [Univ. Wisc.-Madison]; Runnheim, R. [Univ. Wisc.-Madison]; Forrest, D. [Univ. Wisc.-Madison]; Lim, A. [Univ. Wisc.-Madison]; Lapidus, A. [Univ. Wisc.-Madison]; Han, C. S. [Univ. Wisc.-Madison]; Roberts, G. P. [Univ. Wisc.-Madison]; Schwartz, D. C. [Univ. Wisc.-Madison]

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  12. Whole genome sequence analysis suggests intratumoral heterogeneity in dissemination of breast cancer to lymph nodes.

    Directory of Open Access Journals (Sweden)

    Kevin Blighe

    Full Text Available BACKGROUND: Intratumoral heterogeneity may help drive resistance to targeted therapies in cancer. In breast cancer, the presence of nodal metastases is a key indicator of poorer overall survival. The aim of this study was to identify somatic genetic alterations in early dissemination of breast cancer by whole genome next generation sequencing (NGS) of a primary breast tumor, a matched locally-involved axillary lymph node and healthy normal DNA from blood. METHODS: Whole genome NGS was performed on 12 µg (range 11.1-13.3 µg) of DNA isolated from fresh-frozen primary breast tumor, axillary lymph node and peripheral blood following the DNA nanoball sequencing protocol. Single nucleotide variants, insertions, deletions, and substitutions were identified through a bioinformatic pipeline and compared to CIN25, a key set of genes associated with tumor metastasis. RESULTS: Whole genome sequencing revealed overlapping variants between the tumor and node, but also variants that were unique to each. Novel mutations unique to the node included those found in two CIN25 targets, TGIF2 and CCNB2, which are related to transcription cyclin activity and chromosomal stability, respectively, and a unique frameshift in PDS5B, which is required for accurate sister chromatid segregation during cell division. We also identified dominant clonal variants that progressed from tumor to node, including SNVs in TP53 and ARAP3, which mediates rearrangements to the cytoskeleton and cell shape, and an insertion in TOP2A, the expression of which is significantly associated with tumor proliferation and can segregate breast cancers by outcome. CONCLUSION: This case study provides preliminary evidence that primary tumor and early nodal metastasis have largely overlapping somatic genetic alterations. There were very few mutations unique to the involved node. However, significant conclusions regarding early dissemination need analysis of a larger number of patient samples.

  13. Study of Gene Chip Technology on Whole Genome Expression of Liver of Rats with Asthenia Cold Syndrome

    Institute of Scientific and Technical Information of China (English)

    韩冰冰; 王世军; 于华芸; 赵海军; 王媛

    2011-01-01

    Objective: To study whole-genome expression in the liver of rats with asthenia cold syndrome using gene chip technology. Methods: Asthenia cold syndrome rat models were induced with a compound traditional Chinese medicine preparation of raw gypsum, Radix Gentianae, Cortex Phellodendri and Anemarrhenae Rhizoma. Liver gene expression in each group was measured by gene chip. Differentially expressed genes were selected, and their functions were classified and annotated. A subset of genes was used to verify the chip results by RT-PCR (fluorescence quantitative PCR). Results: Compared with the control group, the asthenia cold model group showed 99 differentially expressed genes, mainly related to response to stimulus. Conclusion: Many stimulus-response genes were down-regulated in asthenia cold syndrome rats, possibly leading to reduced immune response, defense response and response to other organisms. The material basis of asthenia cold syndrome may be related to the abnormal expression of these genes.

  14. Whole genome linkage disequilibrium maps in cattle

    Directory of Open Access Journals (Sweden)

    Mannen Hideyuki

    2007-10-01

    Full Text Available Abstract Background Bovine whole genome linkage disequilibrium maps were constructed for eight breeds of cattle. These data provide fundamental information concerning bovine genome organization which will allow the design of studies to associate genetic variation with economically important traits and also provide background information concerning the extent of long range linkage disequilibrium in cattle. Results Linkage disequilibrium was assessed using r² among all pairs of syntenic markers within eight breeds of cattle from the Bos taurus and Bos indicus subspecies. Bos taurus breeds included Angus, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black and Limousin while Bos indicus breeds included Brahman and Nelore. Approximately 2670 markers spanning the entire bovine autosomal genome were used to estimate pairwise r² values. We found that the extent of linkage disequilibrium is no more than 0.5 Mb in these eight breeds of cattle. Conclusion Linkage disequilibrium in cattle has previously been reported to extend several tens of centimorgans. Our results, based on a much larger sample of marker loci and across eight breeds of cattle, indicate that in cattle linkage disequilibrium persists over much more limited distances. Our findings suggest that 30,000–50,000 loci will be needed to conduct whole genome association studies in cattle.
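
    For readers unfamiliar with the pairwise statistic, here is a minimal sketch of the standard r² measure between two biallelic markers. It assumes phased haplotypes with alleles coded 0/1, which the abstract does not state; the function name and the toy data are hypothetical and not taken from the study.

```python
def ld_r2(haplotypes):
    """Compute r^2 between two biallelic markers.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    with alleles coded 0 or 1 (an assumption of this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at marker A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                            # undefined for a monomorphic marker
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    return d * d / denom


# Toy data: complete association versus no association.
perfect = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
r2_perfect = ld_r2(perfect)
r2_independent = ld_r2(independent)
```

    Plotting such values against the physical distance between marker pairs is the usual way to judge how far disequilibrium extends.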

  15. Whole genome sequencing analysis of Plasmodium vivax using whole genome capture

    Directory of Open Access Journals (Sweden)

    Bright A

    2012-06-01

    Full Text Available Abstract Background Malaria caused by Plasmodium vivax is an experimentally neglected severe disease with a substantial burden on human health. Because of technical limitations, little is known about the biology of this important human pathogen. Whole genome analysis methods on patient-derived material are thus likely to have a substantial impact on our understanding of P. vivax pathogenesis and epidemiology. For example, it will allow study of the evolution and population biology of the parasite, allow parasite transmission patterns to be characterized, and may facilitate the identification of new drug resistance genes. Because parasitemias are typically low and the parasite cannot be readily cultured, on-site leukocyte depletion of blood samples is typically needed to remove human DNA that may be 1000X more abundant than parasite DNA. These features have precluded the analysis of archived blood samples and require the presence of laboratories in close proximity to the collection of field samples for optimal pre-cryopreservation sample preparation. Results Here we show that in-solution hybridization capture can be used to extract P. vivax DNA from human contaminating DNA in the laboratory without the need for on-site leukocyte filtration. Using a whole genome capture method, we were able to enrich P. vivax DNA from bulk genomic DNA from less than 0.5% to a median of 55% (range 20%-80%). This level of enrichment allows for efficient analysis of the samples by whole genome sequencing and does not introduce any gross biases into the data. With this method, we obtained greater than 5X coverage across 93% of the P. vivax genome for four P. vivax strains from Iquitos, Peru, which is similar to our results using leukocyte filtration (greater than 5X coverage across 96%). Conclusion The whole genome capture technique will enable more efficient whole genome analysis of P. vivax from a larger geographic region and from valuable archived sample collections.
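
    The two figures quoted above (fold enrichment and breadth of coverage at a depth threshold) are simple ratios. The sketch below shows how they could be computed; the function names and the toy depth vector are hypothetical, and the 0.5% starting fraction is the upper bound given in the abstract, so the resulting fold value is a lower bound.

```python
def fold_enrichment(fraction_before, fraction_after):
    """Ratio of the target DNA fraction after capture to the fraction before."""
    return fraction_after / fraction_before


def breadth_of_coverage(depths, min_depth=5):
    """Fraction of genome positions whose read depth is at least min_depth."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d >= min_depth) / len(depths)


# From "less than 0.5%" to a median of 55%: at least roughly 110-fold.
median_fold = fold_enrichment(0.005, 0.55)

# Hypothetical per-position depths for a ten-base toy genome.
toy_depths = [0, 3, 5, 7, 12, 5, 4, 9, 20, 6]
toy_breadth = breadth_of_coverage(toy_depths, min_depth=5)
```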

  16. Whole-genome sequencing reveals oncogenic mutations in mycosis fungoides.

    Science.gov (United States)

    McGirt, Laura Y; Jia, Peilin; Baerenwald, Devin A; Duszynski, Robert J; Dahlman, Kimberly B; Zic, John A; Zwerner, Jeffrey P; Hucks, Donald; Dave, Utpal; Zhao, Zhongming; Eischen, Christine M

    2015-07-23

    The pathogenesis of mycosis fungoides (MF), the most common cutaneous T-cell lymphoma (CTCL), is unknown. Although genetic alterations have been identified, none are considered consistently causative in MF. To identify potential drivers of MF, we performed whole-genome sequencing of MF tumors and matched normal skin. Targeted ultra-deep sequencing of MF samples and exome sequencing of CTCL cell lines were also performed. Multiple mutations were identified that affected the same pathways, including epigenetic, cell-fate regulation, and cytokine signaling, in MF tumors and CTCL cell lines. Specifically, interleukin-2 signaling pathway mutations, including activating Janus kinase 3 (JAK3) mutations, were detected. Treatment with a JAK3 inhibitor significantly reduced CTCL cell survival. Additionally, the mutation data identified 2 other potential contributing factors to MF, ultraviolet light, and a polymorphism in the tumor suppressor p53 (TP53). Therefore, genetic alterations in specific pathways in MF were identified that may be viable, effective new targets for treatment.

  17. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
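
    As an illustration of clustering with hard cut-offs, the sketch below links two genomes when both their gANI and AF reach a threshold and reports the connected groups. Single-linkage grouping with a union-find structure is an assumption of this sketch; the abstract does not say how clusters were built, and the cut-off values used in the example are placeholders, not the published ones.

```python
def cluster_genomes(genomes, pairs, ani_cutoff, af_cutoff):
    """Group genomes into species-level clusters.

    `pairs` maps (genome_a, genome_b) to (gANI, AF). Two genomes are linked
    when both values reach their cut-offs; clusters are the connected groups.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), (ani, af) in pairs.items():
        if ani >= ani_cutoff and af >= af_cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical genomes and pairwise values; cut-offs below are placeholders.
toy_pairs = {
    ("A", "B"): (98.0, 0.9),
    ("B", "C"): (97.0, 0.8),
    ("C", "D"): (80.0, 0.3),
    ("A", "D"): (99.0, 0.2),   # high identity but low alignment fraction
}
toy_clusters = cluster_genomes(["A", "B", "C", "D"], toy_pairs,
                               ani_cutoff=96.0, af_cutoff=0.6)
```

    The A-D pair shows why both criteria matter: a high identity over a small aligned fraction does not link the two genomes.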

  18. Strategies and tools for whole genome alignments

    Energy Technology Data Exchange (ETDEWEB)

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas; Ishkhanov, Tigran; Ryaboy, Dmitriy; Rubin, Edward; Pachter, Lior; Dubchak, Inna

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  19. Whole Genome Epidemiological Typing of Salmonella

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas

    Technological advances and effective price in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Typing of Salmonella, especially sub-typing within the same serotype or even the same clone, the genetic variation of the target genes being...... used for typing is crucial for successful discrimination. The core genes or the genes that are conserved in all members of a genus or species are potentially good candidates for investigating genomic variation in phylogeny and epidemiology. A total of 2,882 core genes have been observed among 73...... available Salmonella enterica genomes (accessed in April 2011). A consensus tree based on variation of the core genes gives better resolution than 16S rRNA and MLST that rarely provide separation between closely related strains. The performance of the pan-genome tree which is based on the presence...
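
    The notion of core genes (present in every genome) versus the pan-genome (present in at least one) reduces to set operations once genes have been grouped into families. A minimal sketch, assuming gene families are already identified by name; the strain and gene names are hypothetical.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene sets for a mapping of genome name -> set of genes."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)   # genes found in every genome
    pan = set.union(*gene_sets)           # genes found in at least one genome
    return core, pan


# Hypothetical strains and gene family names.
toy_genomes = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneB"},
}
toy_core, toy_pan = core_and_pan(toy_genomes)
```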

  20. BSMAP: whole genome bisulfite sequence MAPping program

    Directory of Open Access Journals (Sweden)

    Li Wei

    2009-07-01

    Full Text Available Abstract Background Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation. Results We developed an efficient bisulfite reads mapping algorithm BSMAP to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible. Conclusion BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at whole genome level with feasible memory and CPU usage. It is freely available under GPL v3 license at http://code.google.com/p/bsmap/.
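
    The asymmetric cytosine-to-thymine matching mentioned above can be shown with a deliberately naive sketch. This is not BSMAP's hashing and bitwise-masking algorithm: it is a brute-force scan of the forward strand only, with hypothetical function names and toy sequences, meant to show why a read T may align to a reference C but a read C may not align to a reference T.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Convert unmethylated C to T; 0-based positions in `methylated` stay C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def bs_matches(ref_window, read):
    """True if the read aligns to the window, allowing a read T over a reference C."""
    if len(ref_window) != len(read):
        return False
    return all(r == q or (r == "C" and q == "T") for r, q in zip(ref_window, read))


def map_read(reference, read):
    """Return all 0-based forward-strand start positions where the read matches."""
    n = len(read)
    return [
        i for i in range(len(reference) - n + 1)
        if bs_matches(reference[i:i + n], read)
    ]


# Toy example: the read comes from reference positions 4-9, with the C at
# read offset 1 methylated (so it survives conversion).
reference = "ACGTTCGACCGT"
read = bisulfite_convert("TCGACC", methylated={1})
hits = map_read(reference, read)
```

    A read base that is still C at a mapped reference C indicates methylation at that position, while a T over a reference C indicates an unmethylated cytosine.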

  1. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with the length ≥ 300 bp. There were 235 contigs from the child genome of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified from our study demonstrated the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.
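
    The Mendelian filter applied to trio variants can be sketched as a simple genotype check. The sketch assumes diploid autosomal sites and unphased genotypes given as allele pairs; the function name is hypothetical and the authors' actual filtering procedure is not described in the abstract.

```python
def is_mendelian_consistent(child, father, mother):
    """True if the child's genotype can be formed from one paternal and one maternal allele.

    Each genotype is a 2-tuple of alleles, e.g. ("A", "G"). Assumes a
    diploid autosomal site (an assumption of this sketch).
    """
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


# Toy genotypes at three hypothetical sites.
site_ok = is_mendelian_consistent(("A", "G"), ("A", "A"), ("G", "G"))
site_bad = is_mendelian_consistent(("G", "G"), ("A", "A"), ("A", "G"))
site_het = is_mendelian_consistent(("A", "A"), ("A", "G"), ("A", "G"))
```

    Sites that fail the check are usually sequencing or calling errors, with rare de novo mutations as the exception.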

  2. Small Sample Whole-Genome Amplification

    Energy Technology Data Exchange (ETDEWEB)

    Hara, C A; Nguyen, C P; Wheeler, E K; Sorensen, K J; Arroyo, E S; Vrankovich, G P; Christian, A T

    2005-09-20

    Many challenges arise when trying to amplify and analyze human samples collected in the field due to limitations in sample quantity, and contamination of the starting material. Tests such as DNA fingerprinting and mitochondrial typing require a certain sample size and are carried out in large volume reactions; in cases where insufficient sample is present whole genome amplification (WGA) can be used. WGA allows very small quantities of DNA to be amplified in a way that enables subsequent DNA-based tests to be performed. A limiting step to WGA is sample preparation. To minimize the necessary sample size, we have developed two modifications of WGA: the first allows for an increase in amplified product from small, nanoscale, purified samples with the use of carrier DNA while the second is a single-step method for cleaning and amplifying samples all in one column. Conventional DNA cleanup involves binding the DNA to silica, washing away impurities, and then releasing the DNA for subsequent testing. We have eliminated losses associated with incomplete sample release, thereby decreasing the required amount of starting template for DNA testing. Both techniques address the limitations of sample size by providing ample copies of genomic samples. Carrier DNA, included in our WGA reactions, can be used when amplifying samples with the standard purification method, or can be used in conjunction with our single-step DNA purification technique to potentially further decrease the amount of starting sample necessary for future forensic DNA-based assays.

  3. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby;

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  4. Allelic imbalance analysis by high-density single-nucleotide polymorphic allele (SNP) array with whole genome amplified DNA

    OpenAIRE

    Wong, Kwong-Kwok; Tsang, Yvonne T.M.; Shen, Jianhe; Cheng, Rita S.; Chang, Yi-Mieng; Man, Tsz-Kwong; Lau, Ching C.

    2004-01-01

    Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosar...

  5. Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness

    Science.gov (United States)

    ... Bacteria that cause disease have millions of different genomes, or sequences of genetic code, each as unique ...

  6. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    OpenAIRE

    Chen, Kevin; Pachter, Lior

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fe...

  7. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  9. Multiple Whole Genome Alignments Without a Reference Organism

    Energy Technology Data Exchange (ETDEWEB)

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  10. Information recovery from low coverage whole-genome bisulfite sequencing.

    Science.gov (United States)

    Libertini, Emanuele; Heath, Simon C; Hamoudi, Rifat A; Gut, Marta; Ziller, Michael J; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G; Frontini, Mattia; Ouwehand, Willem H; Meissner, Alexander; Gut, Ivo G; Beck, Stephan

    2016-06-27

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future.

  11. Information recovery from low coverage whole-genome bisulfite sequencing

    Science.gov (United States)

    Libertini, Emanuele; Heath, Simon C.; Hamoudi, Rifat A.; Gut, Marta; Ziller, Michael J.; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G.; Frontini, Mattia; Ouwehand, Willem H.; Meissner, Alexander; Gut, Ivo G.; Beck, Stephan

    2016-01-01

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future. PMID:27346250

  12. Whole genome transcript profiling from fingerstick blood samples: a comparison and feasibility study

    OpenAIRE

    Williams Adam R; Mondala Tony S; Robison Elizabeth H; Head Steven R; Salomon Daniel R; Kurian Sunil M

    2009-01-01

    Abstract Background Whole genome gene expression profiling has revolutionized research in the past decade especially with the advent of microarrays. Recently, there have been significant improvements in whole blood RNA isolation techniques which, through stabilization of RNA at the time of sample collection, avoid bias and artifacts introduced during sample handling. Despite these improvements, current human whole blood RNA stabilization/isolation kits are limited by the requirement of a veno...

  13. Novel Altered Region for Biomarker Discovery in Hepatocellular Carcinoma (HCC) Using Whole Genome SNP Array

    Directory of Open Access Journals (Sweden)

    Esraa M. Hashem

    2016-04-01

    Full Text Available Cancer is one of the leading medical causes of mortality. Most hepatocellular carcinomas arise from the accumulation of genetic abnormalities, possibly induced by external etiological factors, especially HCV and HBV infections. New tools are needed to analyze the large volume of data and reveal genetic changes that may be critical both for understanding how cancers develop and for determining how they could ultimately be treated. Gene expression profiling may lead to new biomarkers that improve diagnostic accuracy for detecting hepatocellular carcinoma. In this work, a statistical technique (the discrete stationary wavelet transform) is proposed for detecting copy number alterations in high-density single-nucleotide polymorphism array data from 30 cell lines, on specific chromosomes in which alterations are frequently detected in hepatocellular carcinoma. The results demonstrate the feasibility of whole-genome fine mapping of copy number alterations via high-density single-nucleotide polymorphism genotyping. A novel altered chromosomal region was discovered: amplification of region 4q22.1 was detected in 22 of 30 hepatocellular carcinoma cell lines (73%). This region affects AFF1 and DSPP, tumor suppressor genes. This finding has not previously been reported to be involved in liver carcinogenesis; it can be used to discover a new HCC biomarker, which would help in better understanding hepatocellular carcinoma.

  14. Mapping the human genome by using "whole genome" radiation hybrids

    Energy Technology Data Exchange (ETDEWEB)

    Cox, D.R. [Stanford Univ., CA (United States)

    1995-12-31

    An important goal of the Human Genome Project is to construct a map of the human genome at an average resolution of 100 kilobases (kb), which should provide the scientific community with a valuable resource for the localization and isolation of any human DNA sequence of interest. In an effort to complete this map by the projected date of 1998, we have constructed two sets of "whole genome" radiation hybrids. The first set of 83 hamster-human somatic cell hybrids contains human DNA fragments approximately 5 million base pairs in length. Each individual hybrid cell line contains approximately one fifth of the entire human genome. Our mapping results indicate that these whole genome radiation hybrids represent an important resource for constructing the 100 kb map in a timely and cost-effective fashion.

  15. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    KAUST Repository

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  16. Optimized design and assessment of whole genome tiling arrays.

    NARCIS (Netherlands)

    Graf, S.; Nielsen, F.G.G.; Kurtz, S.; Huynen, M.A.; Birney, E.; Stunnenberg, H.G.; Flicek, P.

    2007-01-01

    MOTIVATION: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling arra

  17. Whole-Genome Sequencing of Two Bartonella bacilliformis Strains

    Science.gov (United States)

    Guillen, Yolanda; Casadellà, Maria; García-de-la-Guarda, Ruth; Espinoza-Culupú, Abraham; Paredes, Roger; Ruiz, Joaquim

    2016-01-01

    Bartonella bacilliformis is the causative agent of Carrion’s disease, a highly endemic human bartonellosis in Peru. We performed a whole-genome assembly of two B. bacilliformis strains isolated from the blood of infected patients in the acute phase of Carrion’s disease from the Cusco and Piura regions in Peru. PMID:27389274

  18. Whole-genome bisulfite DNA sequencing of a DNMT3B mutant patient

    Science.gov (United States)

    Heyn, Holger; Vidal, Enrique; Sayols, Sergi; Sanchez-Mut, Jose V.; Moran, Sebastian; Medina, Ignacio; Sandoval, Juan; Simó-Riudalbas, Laia; Szczesna, Karolina; Huertas, Dori; Gatto, Sole; Matarazzo, Maria R.; Dopazo, Joaquin; Esteller, Manel

    2012-01-01

    The immunodeficiency, centromere instability and facial anomalies (ICF) syndrome is associated with mutations of the DNA methyltransferase DNMT3B that reduce enzyme activity. Aberrant expression of immune system genes and hypomethylation of pericentromeric regions, accompanied by chromosomal instability, were identified as alterations driving the disease phenotype. So far, however, only technologies capable of analyzing single loci have been applied to determine epigenetic alterations in ICF patients. In the current study, we performed whole-genome bisulfite sequencing to assess alterations in DNA methylation at base-pair resolution. Genome-wide, we detected a 42% decrease in methylation level, with the most profound changes occurring in inactive heterochromatic regions, satellite repeats and transposons. Interestingly, transcriptionally active loci and ribosomal RNA repeats escaped global hypomethylation. Despite the genome-wide loss of DNA methylation, the epigenetic landscape and crucial regulatory structures were conserved. Remarkably, we revealed a mislocated activity of mutant DNMT3B at H3K4me1 loci, resulting in hypermethylation of active promoters. Functionally, we could associate alterations in promoter methylation with the immunodeficient phenotype of ICF syndrome by detecting changes in genes related to the B-cell receptor-mediated maturation pathway. PMID:22595875
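
    To make the notion of a genome-wide methylation level concrete, here is a minimal sketch that averages per-CpG methylation (methylated reads divided by total reads) and compares two samples. The function names and the read counts are hypothetical, and treating the reported decrease as a relative change is an assumption; the abstract does not say whether the 42% is relative or absolute.

```python
def methylation_level(sites):
    """Mean per-site methylation level.

    sites: iterable of (methylated_reads, total_reads) per CpG.
    Sites with zero coverage are skipped.
    """
    levels = [m / t for m, t in sites if t > 0]
    if not levels:
        raise ValueError("no covered sites")
    return sum(levels) / len(levels)


def relative_decrease(control_level, patient_level):
    """Fractional loss of methylation relative to the control sample."""
    return (control_level - patient_level) / control_level


# Hypothetical read counts for two CpG sites in each sample.
control = methylation_level([(8, 10), (8, 10)])
patient = methylation_level([(4, 10), (4, 10), (0, 0)])
print(control, patient, relative_decrease(control, patient))
```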

  19. Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila

    Science.gov (United States)

    Kontur, Cassandra; Kumar, Santosh; Lan, Xun; Pritchard, Jonathan K.; Turkewitz, Aaron P.

    2016-01-01

    Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded to a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies, in part, on ancestral lysosomal sorting machinery, but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis, and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation—a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wild-type copy of MMA1, and disrupting MMA1 in an otherwise wild-type strain phenocopies UC620. The product of MMA1, characterized as a CFP-tagged copy, encodes a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation. PMID:27317773

  20. Fish-oil supplementation induces antiinflammatory gene expression profiles in human blood mononuclear cells

    OpenAIRE

    Bouwens, M.; Rest, van de, O.; Dellschaft, N.; Grootte Bromhaar, M.M.; Groot, de, W.T.; Geleijnse, J M; Müller, M.R.; Afman, L.A.

    2009-01-01

    Background: Polyunsaturated fatty acids can have beneficial effects on human immune cells, such as peripheral blood mononuclear cells (PBMCs). However, the mechanisms of action of polyunsaturated fatty acids on immune cells are still largely unknown. Objective: The objective was to examine the effects of supplementation with the polyunsaturated fatty acids eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) on whole-genome PBMC gene expression profiles, in healthy Dutch elderly subject...

  1. Post-Fragmentation Whole Genome Amplification-Based Method

    Science.gov (United States)

    Benardini, James; LaDuc, Myron T.; Langmore, John

    2011-01-01

    This innovation is derived from a proprietary amplification scheme that is based upon random fragmentation of the genome into a series of short, overlapping templates. The resulting shorter DNA strands (genomic hybridization microarray, SNP analysis, and sequencing. The standard reaction can be performed with minimal hands-on time, and can produce amplified DNA in as little as three hours. Post-fragmentation whole genome amplification-based technology provides a robust and accurate method of amplifying femtogram levels of starting material into microgram yields with no detectable allele bias. The amplified DNA also facilitates the preservation of samples (spacecraft samples) by amplifying scarce amounts of template DNA into microgram concentrations in just a few hours. Based on further optimization of this technology, this could be a feasible technology to use in sample preservation for potential future sample return missions. The research and technology development described here can be pivotal in dealing with backward/forward biological contamination from planetary missions. Such efforts rely heavily on an increasing understanding of the burden and diversity of microorganisms present on spacecraft surfaces throughout assembly and testing. The development and implementation of these technologies could significantly improve the comprehensiveness and resolving power of spacecraft-associated microbial population censuses, and are important to the continued evolution and advancement of planetary protection capabilities. Current molecular procedures for assaying spacecraft-associated microbial burden and diversity have inherent sample loss issues at practically every step, particularly nucleic acid extraction. In engineering a molecular means of amplifying nucleic acids directly from single cells in their native state within the sample matrix, this innovation has circumvented entirely the need for DNA extraction regimes in the sample processing scheme.

  2. Assessment of whole genome amplification for sequence capture and massively parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Johanna Hasmats

    Full Text Available Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information.

  3. Assessment of whole genome amplification for sequence capture and massively parallel sequencing.

    Science.gov (United States)

    Hasmats, Johanna; Gréen, Henrik; Orear, Cedric; Validire, Pierre; Huss, Mikael; Käller, Max; Lundeberg, Joakim

    2014-01-01

    Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information.
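
    The two percentages quoted above use different denominators, which a small set-based sketch can make explicit. The variant tuples and the function name are hypothetical, and the choice of denominators (all variants seen in either sample versus variants seen in the amplified sample) is an assumption about how such figures are typically computed, not a statement of the authors' exact procedure.

```python
def overlap_summary(amplified, unamplified):
    """Summarize how many variant calls two samples share.

    Each variant is any hashable key, e.g. (chrom, pos, ref, alt).
    """
    amplified, unamplified = set(amplified), set(unamplified)
    if not amplified or not unamplified:
        raise ValueError("both call sets must be non-empty")
    shared = amplified & unamplified
    union = amplified | unamplified
    return {
        # fraction of all observed variants that appear in both samples
        "shared_of_all": len(shared) / len(union),
        # fraction of amplified-sample variants also seen without amplification
        "shared_of_amplified": len(shared) / len(amplified),
    }


# Hypothetical calls: three shared variants, one private to each sample.
wga = [("chr1", 100, "A", "G"), ("chr2", 50, "C", "T"),
       ("chr3", 75, "G", "A"), ("chr4", 10, "T", "C")]
native = [("chr1", 100, "A", "G"), ("chr2", 50, "C", "T"),
          ("chr3", 75, "G", "A"), ("chr5", 20, "A", "T")]
print(overlap_summary(wga, native))
```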

  4. Construction and Evaluation of Desulfovibrio vulgaris Whole-Genome Oligonucleotide Microarrays

    Energy Technology Data Exchange (ETDEWEB)

    Z. He; Q. He; L. Wu; M.E. Clark; J.D. Wall; Jizhong Zhou; Matthew W. Fields

    2004-03-17

    ,4-cyclodiphosphate (MECDP) synthase. Spermidines are polyamines that are typically abundant in rapidly dividing cells and are essential growth factors in eukaryotic organisms. Polyamines are thought to stabilize DNA by the association of the amino groups with the phosphate residues of DNA and can also enhance tRNA and ribosome stability. The MECDP synthase enzyme is essential in Escherichia coli and participates in the nonmevalonate pathway of isoprenoid biosynthesis, a critical pathway present in some bacteria and apicomplexans but distinct from that used by mammals. Several of the highly up-regulated ORFs were annotated as conserved hypothetical proteins. Interestingly, an ORF that was predicted to contain a flocculin repeat domain was almost 9-fold up-regulated in stationary phase cells compared to logarithmically growing cells. The flocculin domain is commonly observed in fungi, and is thought to play a role during flocculation (non-sexual aggregation of single-cell microorganisms). These preliminary results have identified possible responses of D. vulgaris cells to stationary phase growth and suggest that polyamine production as well as cell aggregation and/or extracellular polymer production are responses of D. vulgaris during stationary phase. The initial microarray results indicate that the recently produced oligonucleotide microarrays are functional. We are currently optimizing growth conditions in order to culture D. vulgaris cells in the presence of uranium(VI) and to monitor whole-genome expression levels.

  5. When aging meets microgravity: whole genome promoters and enhancers transcription landscape in zebrafish onboard ISS

    Science.gov (United States)

    Arshanovskii, Kirill; Gusev, Oleg; Sychev, Vladimir; Poddubko, Svetlana; Deviatiiarov, Ruslan

    2016-07-01

    To gain new insights into changes in gene regulation under real spaceflight conditions, we conducted a whole-genome analysis of the dynamics of promoter and enhancer transcriptional changes in zebrafish during prolonged exposure to real spaceflight. Within the framework of the Russia-Japan joint experiments "Aquatic Habitat"-"Aquarium", we performed a Cap Analysis of Gene Expression (CAGE) assay of zebrafish over the range of 7 to 40 days of real spaceflight onboard the ISS. The analysis showed that both gene expression patterns and the architecture of promoter shapes and types are affected by the spaceflight environment.

  6. Whole genome amplification - Review of applications and advances

    Energy Technology Data Exchange (ETDEWEB)

    Hawkins, Trevor L.; Detter, J.C.; Richardson, Paul

    2001-11-15

    The concept of whole genome amplification has arisen in the past few years as modifications to the polymerase chain reaction (PCR) have been adapted to replicate regions of genomes that are of biological interest. The applications are many: forensics, embryonic disease diagnosis, bioterrorism genome detection, "immortalization" of clinical samples, microbial diversity, and genotyping. The key question is whether DNA can be replicated a genome at a time without bias or non-random distribution of the target. Several papers published in the last year and currently in preparation may lead to the conclusion that whole genome amplification may indeed be possible and may therefore open up a new avenue in molecular biology.

  7. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  8. Physical map-assisted whole-genome shotgun sequence assemblies

    OpenAIRE

    Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.

    2006-01-01

    We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the...

  9. Whole genome amplification of DNA for genotyping pharmacogenetics candidate genes.

    Directory of Open Access Journals (Sweden)

    Santosh Philips

    2012-03-01

    Full Text Available Whole genome amplification (WGA) technologies can be used to amplify genomic DNA when only small amounts of DNA are available. Multiple displacement amplification with Phi polymerase has been shown to accurately amplify DNA for a variety of genotyping assays; however, it has not been tested for genotyping many of the clinically relevant genes important for pharmacogenetic studies, such as the cytochrome P450 genes, which are typically difficult to genotype because of multiple pseudogenes, copy number variations, and high similarity to other related genes. We evaluated whole genome amplified samples for Taqman™ genotyping of SNPs in a variety of pharmacogenetic genes. In 24 DNA samples from the Coriell human diversity panel, the call rates and the concordance between amplified (~200-fold amplification) and unamplified samples were 100% for two SNPs in CYP2D6 and one in ESR1. In samples from a breast cancer clinical trial (Trial 1), we compared the genotyping results before and after WGA for four SNPs in CYP2D6, one SNP in CYP2C19, one SNP in CYP19A1, two SNPs in ESR1, and two SNPs in ESR2. The concordance rates were all >97%. Finally, we compared the allele frequencies of 143 SNPs determined in Trial 1 (whole genome amplified DNA) to the allele frequencies determined in unamplified DNA samples from a separate trial (Trial 2) that enrolled a similar population. The call rates and allele frequencies between the two trials were 98% and 99.7%, respectively. We conclude that whole genome amplified DNA is suitable for Taqman™ genotyping of a wide variety of pharmacogenetically relevant SNPs.
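
    Call rate and concordance are simple ratios, and the sketch below shows one common way to compute them for paired genotype calls. The function names, the use of None for a failed call, and the example genotypes are assumptions for illustration; the study's own definitions may differ in detail.

```python
def call_rate(genotypes):
    """Fraction of samples with a genotype call (None marks a failed call)."""
    if not genotypes:
        raise ValueError("no genotypes given")
    return sum(g is not None for g in genotypes) / len(genotypes)


def concordance(before, after):
    """Fraction of samples with identical calls before and after WGA.

    Only samples called in both conditions are compared.
    """
    pairs = [(b, a) for b, a in zip(before, after)
             if b is not None and a is not None]
    if not pairs:
        raise ValueError("no samples called in both conditions")
    return sum(b == a for b, a in pairs) / len(pairs)


# Hypothetical calls for one SNP in four samples.
unamplified = ["AA", "AG", "GG", None]
amplified = ["AA", "AG", "AG", "GG"]
print(call_rate(unamplified), concordance(unamplified, amplified))
```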

  10. Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?

    OpenAIRE

    De Cristofaro, Emiliano

    2012-01-01

    Over the past several years, DNA sequencing has emerged as one of the driving forces in life-sciences, paving the way for affordable and accurate whole genome sequencing. As genomes represent the entirety of an organism's hereditary information, the availability of complete human genomes prompts a wide range of revolutionary applications. The hope for improving modern healthcare and better understanding the human genome propels many interesting and challenging research frontiers. Unfortunatel...

  11. A phylogenetic strategy based on a legume-specific whole genome duplication yields symbiotic cytokinin type-A Response Regulators

    NARCIS (Netherlands)

    Camp, Op den R.; Mita, De S.; Lillo, A.; Cao, Q.; Limpens, E.H.M.; Bisseling, T.; Geurts, R.

    2011-01-01

    Legumes host their rhizobium symbiont in novel root organs, called nodules. Nodules originate from differentiated root cortical cells that de-differentiate and subsequently form nodule primordia, a process controlled by cytokinin. A whole genome duplication (WGD) has occurred at the root of the legu

  12. Epigenetic regulation of subgenome dominance following whole genome triplication in Brassica rapa.

    Science.gov (United States)

    Cheng, Feng; Sun, Chao; Wu, Jian; Schnable, James; Woodhouse, Margaret R; Liang, Jianli; Cai, Chengcheng; Freeling, Michael; Wang, Xiaowu

    2016-07-01

    Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species. PMID:26871271

  13. The potential of whole genome NGS for infectious disease diagnosis.

    Science.gov (United States)

    Lecuit, Marc; Eloit, Marc

    2015-01-01

    Non-targeted identification of microbes is now possible directly in biological samples, based on whole-genome-NGS (WG-NGS) techniques that allow deep sequencing of nucleic acids, data mining and sorting out of sequences of pathogens without any a priori hypothesis. WG-NGS was first only used as a research tool due to its cost, complexity and lack of standardization. Recent improvements in sample preparation and bioinformatics pipelines and decrease in cost now allow actionable diagnostics in patients. The potency and limits of WG-NGS and possible future indications are discussed here. WG-NGS will likely soon become a standard procedure in microbiological diagnosis.

  14. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Energy Technology Data Exchange (ETDEWEB)

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  15. Whole genome amplification and its impact on CGH array profiles

    Directory of Open Access Journals (Sweden)

    Meldrum Cliff

    2008-07-01

    Full Text Available Abstract Background Some array comparative genomic hybridisation (array CGH) platforms require a minimum of micrograms of DNA for the generation of reliable and reproducible data. For studies where there are limited amounts of genetic material, whole genome amplification (WGA) is an attractive method for generating sufficient quantities of genomic material from minuscule amounts of starting material. A range of WGA methods are available, and the multiple displacement amplification (MDA) approach has been shown to be highly accurate, although amplification bias has been reported. In the current study, WGA was used to amplify DNA extracted from whole blood. In total, six array CGH experiments were performed to investigate whether the use of whole genome amplified DNA (wgaDNA) produces reliable and reproducible results. Four experiments compared amplified DNA with unamplified DNA, and two compared unamplified DNA with unamplified DNA. Findings All the experiments involving wgaDNA resulted in a high proportion of losses and gains of genomic material. Previously, amplification bias has been overcome by using amplified DNA for both the test and the reference DNA. Our data suggest that this approach may not be effective, as the gains and losses introduced by WGA appear to be random and are not reproducible between different experiments using the same DNA. Conclusion In light of these findings, the use of both amplified test and reference DNA on CGH arrays may not provide an accurate representation of copy number variation in the DNA.
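
    Gains and losses on a CGH array are usually read from the log2 ratio of test to reference intensity at each probe. The following sketch classifies probes with a symmetric threshold; the function name, the 0.3 cutoff and the intensities are illustrative assumptions, not values from the study.

```python
import math


def classify_probes(test, reference, threshold=0.3):
    """Label each probe as 'gain', 'loss' or 'normal' from its log2 ratio."""
    calls = []
    for t, r in zip(test, reference):
        ratio = math.log2(t / r)
        if ratio > threshold:
            calls.append("gain")
        elif ratio < -threshold:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


# Hypothetical probe intensities for test and reference DNA.
test_intensity = [100.0, 200.0, 50.0, 110.0]
ref_intensity = [100.0, 100.0, 100.0, 100.0]
print(classify_probes(test_intensity, ref_intensity))
```

    Comparing the calls from repeated hybridisations of the same DNA is one way to check whether apparent gains and losses are reproducible.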

  16. Whole Genome Re-Sequencing of Three Domesticated Chicken Breeds.

    Science.gov (United States)

    Oh, Dongyep; Son, Bongjun; Mun, Seyoung; Oh, Man Hwan; Oh, Sejong; Ha, Jaejung; Yi, Junkoo; Lee, Seunguk; Han, Kyudong

    2016-02-01

    Chicken is one of the most popular domesticated species worldwide, serving an important role in agricultural as well as biomedical research. Because it inhabits almost every continent and presents diverse morphology and traits, the need for genetic markers to distinguish each breed for various purposes has increased. Whole genome sequencing of three different breeds (White Leghorn, Korean domestic, and Araucana) that show similar coloring patterns, with the exception of the White Leghorn breed, confirmed previously reported genomic alterations and identified many novel variants. Additionally, the Whole Genome Re-Sequencing (WGRS) approach identified an approximately 4 kb insert within SLCO1B3 responsible for blue egg shell color. Targeted investigation of pigment-related genes corroborated previously reported non-synonymous mutations and provided deeper insight into chicken coloring: not a single non-synonymous mutation but a combination of them in the MC1R gene is likely to be responsible for altered feather coloring. PMID:26853871

  17. Whole-genome haplotyping approaches and genomic medicine.

    Science.gov (United States)

    Glusman, Gustavo; Cox, Hannah C; Roach, Jared C

    2014-01-01

    Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that is rarely resolved into its constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are enabling longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are enabling the sequencing of whole-chromosome genetic haplotypes. Hybrid approaches combining high-throughput short-read assembly with strategic approaches that enable physical or virtual binning of reads into haplotypes are enabling multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease. PMID:25473435
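
    The clinical value of phasing is easiest to see with compound heterozygosity: two damaging heterozygous variants in one gene matter most when they sit on different parental haplotypes. The sketch below assumes variants are already phased and labeled with haplotype 0 or 1; the function name and data layout are hypothetical.

```python
def is_compound_heterozygous(variants):
    """Check whether damaging variants hit both copies of a gene.

    variants: list of (position, haplotype) for heterozygous damaging
    variants in one gene, with haplotype equal to 0 or 1.
    """
    haplotypes = {hap for _, hap in variants}
    return haplotypes == {0, 1}


# Two variants in trans (different haplotypes): both gene copies affected.
print(is_compound_heterozygous([(100, 0), (250, 1)]))
# Two variants in cis (same haplotype): one intact copy remains.
print(is_compound_heterozygous([(100, 0), (250, 0)]))
```

    Unphased genotypes cannot distinguish these two cases, which is why the same pair of variants can carry different clinical meaning.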

  18. Integration of transcriptome and whole genomic resequencing data to identify key genes affecting swine fat deposition.

    Directory of Open Access Journals (Sweden)

    Kai Xing

    Full Text Available Fat deposition is highly correlated with the growth, meat quality, reproductive performance and immunity of pigs. Fatty acid synthesis takes place mainly in the adipose tissue of pigs; therefore, in this study, a high-throughput massively parallel sequencing approach was used to generate adipose tissue transcriptomes from two groups of Songliao black pigs that had opposite backfat thickness phenotypes. The total number of paired-end reads produced for each sample was in the range of 39.29-49.36 million. Approximately 188 genes were differentially expressed in adipose tissue and were enriched for immunity and for metabolic processes such as fatty acid biosynthesis, lipid synthesis, and the metabolism of fatty acids, retinol, caffeine and arachidonic acid. Additionally, many genetic variations were detected between the two groups through pooled whole-genome resequencing. Integration of transcriptome and whole-genome resequencing data revealed important genomic variations among the differentially expressed genes for fat deposition, for example, the lipogenic genes. Further studies are required to investigate the roles of candidate genes in fat deposition to improve pig breeding programs.

  19. Influence of Aconiti Lateralis Radix Praeparata on whole-genome liver gene expression in rats with asthenia cold syndrome, studied by gene chip technique

    Institute of Scientific and Technical Information of China (English)

    韩冰冰; 王世军; 张发艳; 赵海军; 王成岗

    2012-01-01

    Objective: To study the influence of Aconiti Lateralis Radix Praeparata on whole-genome liver gene expression in rats with asthenia cold syndrome, using gene chip technology. Method: Rat models of asthenia cold syndrome were established by administering the traditional Chinese medicines raw Gypsum Fibrosum, Gentianae Radix, Phellodendri Chinensis Cortex and Anemarrhenae Rhizoma. After treatment with Aconiti Lateralis Radix Praeparata, liver gene expression was measured by gene chip. Differentially expressed genes were screened and functionally annotated, and selected genes were used to verify the chip results by fluorescence quantitative PCR (RT-PCR). Result: Compared with the asthenia cold model group, the asthenia cold treatment group showed 212 differentially expressed genes, mainly related to immune response and oxidoreductase activity. Conclusion: Aconiti Lateralis Radix Praeparata up-regulates immune response-related genes and oxidoreductase activity-related genes in rats with asthenia cold syndrome, which may be a molecular mechanism by which this classical warm-natured medicine warms the meridians and dissipates cold.

  20. Computel: computation of mean telomere length from whole-genome next-generation sequencing data.

    Science.gov (United States)

    Nersisyan, Lilit; Arakelyan, Arsen

    2015-01-01

    Telomeres are the ends of eukaryotic chromosomes, consisting of consecutive short repeats that protect chromosome ends from degradation. Telomeres shorten with each cell division, leading to replicative cell senescence. Deregulation of telomere length homeostasis is associated with the development of various age-related diseases and cancers. A number of experimental techniques exist for telomere length measurement; however, until recently, the absence of tools for extracting telomere lengths from high-throughput sequencing data has significantly obscured the association of telomere length with molecular processes in normal and diseased conditions. We have developed Computel, a program in R for computing mean telomere length from whole-genome next-generation sequencing data. Computel is open source, and is freely available at https://github.com/lilit-nersisyan/computel. It utilizes a short-read alignment-based approach and integrates various popular tools for sequencing data analysis. We validated it with synthetic and experimental data, and compared its performance with the previously available software. The results have shown that Computel outperforms existing software in accuracy, independence of results from sequencing conditions, stability against inherent sequencing errors, and better ability to distinguish pure telomeric sequences from interstitial telomeric repeats. By providing a highly reliable methodology for determining telomere lengths from whole-genome sequencing data, Computel should help to elucidate the role of telomeres in cellular health and disease.
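
    For orientation only, the sketch below shows a simple repeat-counting heuristic followed by a coverage-ratio estimate. This is not Computel's alignment-based algorithm and makes no attempt to separate terminal from interstitial repeats; the function names, the repeat cutoff, and the definition of depth (average coverage relative to a haploid reference) and of the number of chromosome ends are all assumptions.

```python
def is_telomeric(read, min_repeats=3):
    """True if the read contains a run of the canonical telomeric hexamer.

    Checks TTAGGG and its reverse complement CCCTAA.
    """
    return ("TTAGGG" * min_repeats in read) or ("CCCTAA" * min_repeats in read)


def count_telomeric_reads(reads, min_repeats=3):
    """Number of reads classified as telomeric by the heuristic above."""
    return sum(is_telomeric(r, min_repeats) for r in reads)


def mean_telomere_length(telomeric_bases, depth, n_ends):
    """Coverage-ratio estimate: telomeric bases / (depth * chromosome ends)."""
    return telomeric_bases / (depth * n_ends)


# Hypothetical reads and summary numbers.
reads = ["TTAGGG" * 4, "ACGT" * 6, "CCCTAA" * 4]
print(count_telomeric_reads(reads))
print(mean_telomere_length(telomeric_bases=1_380_000, depth=30.0, n_ends=46))
```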

  1. Lysis of a Single Cyanobacterium for Whole Genome Amplification

    Directory of Open Access Journals (Sweden)

    Richard N. Zare

    2013-08-01

    Bacterial species from natural environments, exhibiting a great degree of genetic diversity that has yet to be characterized, pose a specific challenge to whole genome amplification (WGA) from single cells. A major challenge is establishing an effective, compatible, and controlled lysis protocol. We present a novel lysis protocol that can be used to extract genomic information from a single cyanobacterium of Synechocystis sp. PCC 6803, known to have multilayer cell wall structures that resist conventional lysis methods. Simple but effective strategies for releasing genomic DNA from captured cells while retaining cellular identities for single-cell analysis are presented. Successful sequencing of genetic elements from single-cell amplicons prepared by multiple displacement amplification (MDA) is demonstrated for selected genes (15 loci nearly equally spaced throughout the main chromosome).

  2. Comparative whole genome sequence analysis of wild-type and cidofovir-resistant monkeypoxvirus

    Directory of Open Access Journals (Sweden)

    Huggins John

    2010-05-01

    We performed whole genome sequencing of a cidofovir {[(S)-1-(3-hydroxy-2-phosphonylmethoxy-propyl) cytosine] [HPMPC]}-resistant (CDV-R) strain of Monkeypoxvirus (MPV). Whole-genome comparison with the wild-type (WT) strain revealed 55 single-nucleotide polymorphisms (SNPs) and one tandem-repeat contraction. Over one-third of all identified SNPs were located within genes comprising the poxvirus replication complex, including the DNA polymerase, RNA polymerase, mRNA capping methyltransferase, DNA processivity factor, and poly-A polymerase. Four polymorphic sites were found within the DNA polymerase gene. DNA polymerase mutations observed at positions 314 and 684 in MPV were consistent with CDV-R loci previously identified in Vaccinia virus (VACV). These data suggest the mechanism of CDV resistance may be highly conserved across Orthopoxvirus (OPV) species. SNPs were also identified within virulence genes such as the A-type inclusion protein, serine protease inhibitor-like protein SPI-3, Schlafen ATPase and thymidylate kinase, among others. Aberrant chain extension induced by CDV may lead to diverse alterations in gene expression and viral replication that may result in both adaptive and attenuating mutations. Defining the potential contribution of substitutions in the replication complex and RNA processing machinery reported here may yield further insight into CDV resistance and may augment current therapeutic development strategies.

  3. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing

    OpenAIRE

    Southey, Bruce R.; Ping Zhu; Carr-Markell, Morgan K.; Liang, Zhengzheng S.; Amro Zayed; Ruiqiang Li; Robinson, Gene E.; Rodriguez-Zas, Sandra L.

    2016-01-01

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruit...

  4. Divergent Whole-Genome Methylation Maps of Human and Chimpanzee Brains Reveal Epigenetic Basis of Human Regulatory Evolution

    OpenAIRE

    Zeng, Jia; Konopka, Genevieve; Hunt, Brendan G.; Preuss, Todd M.; Geschwind, Dan; Yi, Soojin V.

    2012-01-01

    DNA methylation is a pervasive epigenetic DNA modification that strongly affects chromatin regulation and gene expression. To date, it remains largely unknown how patterns of DNA methylation differ between closely related species and whether such differences contribute to species-specific phenotypes. To investigate these questions, we generated nucleotide-resolution whole-genome methylation maps of the prefrontal cortex of multiple humans and chimpanzees. Levels and patterns of DNA methylatio...

  5. Whole genome sequencing in clinical and public health microbiology.

    Science.gov (United States)

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.

  6. SNP annotation-based whole genomic prediction and selection

    DEFF Research Database (Denmark)

    Do, Duy Ngoc; Janss, Luc; Jensen, Just;

    2015-01-01

    into a training (968 pigs) and a validation dataset (304 pigs) by assigning records as before and after January 1, 2012, respectively. SNPs were annotated into 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO...... SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost...... effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy for the traits considered here, but this may be different for other traits. It is the first study to provide useful insights into biological classes of SNP driving the whole genomic prediction for complex traits in...

  7. Low-pass whole-genome sequencing in clinical cytogenetics

    DEFF Research Database (Denmark)

    Dong, Zirui; Zhang, Jun; Hu, Ping;

    2016-01-01

    Purpose: Chromosomal microarray analysis is the gold standard for copy-number variant (CNV) detection in prenatal and postnatal diagnosis. We aimed to determine whether next-generation sequencing (NGS) technology could be an alternative method for CNV detection in routine clinical application....... Methods: Genome-wide CNV analysis (>50 kb) was performed on a multicenter group of 570 patients using a low-coverage whole-genome sequencing pipeline. These samples were referred for chromosomal analysis; CNVs (i.e., pathogenic CNVs, pCNVs) were classified according to the American College of Medical...... Genetics and Genomics guidelines. Results: Overall, a total of 198 abortuses, 37 stillbirths, 149 prenatal, and 186 postnatal samples were tested. Our approach yielded results in 549 samples (96.3%). In addition to 119 subjects with aneuploidies, 103 pCNVs (74 losses and 29 gains) were identified in 82...

  8. Whole-genome characterization of chemoresistant ovarian cancer.

    Science.gov (United States)

    Patch, Ann-Marie; Christie, Elizabeth L; Etemadmoghadam, Dariush; Garsed, Dale W; George, Joshy; Fereday, Sian; Nones, Katia; Cowin, Prue; Alsop, Kathryn; Bailey, Peter J; Kassahn, Karin S; Newell, Felicity; Quinn, Michael C J; Kazakoff, Stephen; Quek, Kelly; Wilhelm-Benartzi, Charlotte; Curry, Ed; Leong, Huei San; Hamilton, Anne; Mileshkin, Linda; Au-Yeung, George; Kennedy, Catherine; Hung, Jillian; Chiew, Yoke-Eng; Harnett, Paul; Friedlander, Michael; Quinn, Michael; Pyman, Jan; Cordner, Stephen; O'Brien, Patricia; Leditschke, Jodie; Young, Greg; Strachan, Kate; Waring, Paul; Azar, Walid; Mitchell, Chris; Traficante, Nadia; Hendley, Joy; Thorne, Heather; Shackleton, Mark; Miller, David K; Arnau, Gisela Mir; Tothill, Richard W; Holloway, Timothy P; Semple, Timothy; Harliwong, Ivon; Nourse, Craig; Nourbakhsh, Ehsan; Manning, Suzanne; Idrisoglu, Senel; Bruxner, Timothy J C; Christ, Angelika N; Poudel, Barsha; Holmes, Oliver; Anderson, Matthew; Leonard, Conrad; Lonie, Andrew; Hall, Nathan; Wood, Scott; Taylor, Darrin F; Xu, Qinying; Fink, J Lynn; Waddell, Nick; Drapkin, Ronny; Stronach, Euan; Gabra, Hani; Brown, Robert; Jewell, Andrea; Nagaraj, Shivashankar H; Markham, Emma; Wilson, Peter J; Ellul, Jason; McNally, Orla; Doyle, Maria A; Vedururu, Ravikiran; Stewart, Collin; Lengyel, Ernst; Pearson, John V; Waddell, Nicola; deFazio, Anna; Grimmond, Sean M; Bowtell, David D L

    2015-05-28

    Patients with high-grade serous ovarian cancer (HGSC) have experienced little improvement in overall survival, and standard treatment has not advanced beyond platinum-based combination chemotherapy, during the past 30 years. To understand the drivers of clinical phenotypes better, here we use whole-genome sequencing of tumour and germline DNA samples from 92 patients with primary refractory, resistant, sensitive and matched acquired resistant disease. We show that gene breakage commonly inactivates the tumour suppressors RB1, NF1, RAD51B and PTEN in HGSC, and contributes to acquired chemotherapy resistance. CCNE1 amplification was common in primary resistant and refractory disease. We observed several molecular events associated with acquired resistance, including multiple independent reversions of germline BRCA1 or BRCA2 mutations in individual patients, loss of BRCA1 promoter methylation, an alteration in molecular subtype, and recurrent promoter fusion associated with overexpression of the drug efflux pump MDR1. PMID:26017449

  9. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc;

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using either...... a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model, results showed increases in accuracy of up to two percentage points for production traits in both Holstein and Jersey animals by including the extra variants in the analysis, and an extra 1.5 percentage points...... for fertility in Jersey animals. When using a Bayesian model accuracies were generally higher, but only small increases in accuracy of up to 0.6 percentage points were observed for the Holstein animals when including the extra markers, while both increases and decreases were observed for Jersey...

  10. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly
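
    The abstract does not name the individual assembly metrics. As a hedged illustration of the kind of sequence statistic commonly reported for assemblies, the sketch below computes N50: the contig length at which the running total of contigs, taken longest first, reaches half of the assembly size. The function name `n50` and the example contig lengths are hypothetical and are not taken from Plantagora.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bases).

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        # The first contig that pushes the running total to half the
        # assembly size defines N50.
        if running_total >= half_total:
            return length


# Hypothetical contig lengths for two competing assemblies of the same reads.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [40, 40, 40, 40, 40, 40, 30, 20]
print("N50 of assembly A:", n50(assembly_a))
print("N50 of assembly B:", n50(assembly_b))
```

    N50 rewards contiguity but says nothing about correctness, which is presumably why the study pairs sequence statistics with fidelity measures obtained by aligning assemblies back to the source genome.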

  11. Multiple mutations in heterogeneous miltefosine-resistant Leishmania major population as determined by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Adriano C Coelho

    BACKGROUND: Miltefosine (MF) is the first oral compound used in the chemotherapy against leishmaniasis. Since the mechanism of action of this drug and the targets of MF in Leishmania are unclear, we generated in a step-by-step manner Leishmania major promastigote mutants highly resistant to MF. Two of the mutants were submitted to short-read whole genome sequencing to identify potential genes associated with MF resistance. METHODS/PRINCIPAL FINDINGS: Analysis of the genome assemblies revealed several independent point mutations in a P-type ATPase involved in phospholipid translocation. Mutations in two other proteins (pyridoxal kinase and an α-adaptin-like protein) were also observed in independent mutants. The role of these proteins in MF resistance was evaluated by gene transfection and gene disruption, and both the P-type ATPase and pyridoxal kinase were implicated in MF susceptibility. The study also highlighted that resistance can be highly heterogeneous at the population level, with individual clones derived from this population differing in both genotype and susceptibility phenotype. CONCLUSIONS/SIGNIFICANCE: Whole genome sequencing was used to pinpoint known and new resistance markers associated with MF resistance in the protozoan parasite Leishmania. The study also demonstrated the polyclonal nature of a resistant population, with individual cells of varying susceptibilities and genotypes.

  12. Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing

    Directory of Open Access Journals (Sweden)

    Plant Ramona N

    2006-08-01

    Background: Whole genome amplification is an increasingly common technique through which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis. Questions of amplification-induced error and template bias generated by these methods have previously been addressed through either small scale (SNPs) or large scale (CGH array, FISH) methodologies. Here we utilized whole genome sequencing to assess amplification-induced bias in both coding and non-coding regions of two bacterial genomes. Halobacterium species NRC-1 DNA and Campylobacter jejuni were amplified by several common, commercially available protocols: multiple displacement amplification, primer extension pre-amplification and degenerate oligonucleotide primed PCR. The amplification-induced bias of each method was assessed by sequencing both genomes in their entirety using the 454 Sequencing System technology and comparing the results with those obtained from unamplified controls. Results: All amplification methodologies induced statistically significant bias relative to the unamplified control. For the Halobacterium species NRC-1 genome, assessed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 119 times greater than those from unamplified material, 164.0 times greater for Repli-G, 165.0 times greater for PEP-PCR and 252.0 times greater than the unamplified controls for DOP-PCR. For Campylobacter jejuni, also analyzed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 15 times greater than those from unamplified material, 19.8 times greater for Repli-G, 61.8 times greater for PEP-PCR and 220.5 times greater than the unamplified controls for DOP-PCR. Conclusion: Of the amplification methodologies examined in this paper, the multiple displacement amplification products generated the least bias, and produced significantly higher yields of amplified DNA.

  13. A whole genome screen for HIV restriction factors

    Directory of Open Access Journals (Sweden)

    Liu Li

    2011-11-01

    Background: Upon cellular entry, retroviruses must avoid innate restriction factors produced by the host cell. For human immunodeficiency virus (HIV), the human restriction factors APOBEC3 (apolipoprotein-B-mRNA-editing-enzyme), p21 and tetherin are well characterised. Results: To identify intrinsic resistance factors to HIV-1 replication, we screened 19,121 human genes and identified 114 factors with significant inhibition of infection. Those with a known function are involved in a broad spectrum of cellular processes including receptor signalling, vesicle trafficking, transcription, apoptosis, cross-nuclear membrane transport, meiosis, DNA damage repair, ubiquitination and RNA processing. We focused on the PAF1 complex, which has been previously implicated in gene transcription, cell cycle control and mRNA surveillance. Knockdown of all members of the PAF1 family of proteins enhanced HIV-1 reverse transcription and integration of provirus. Over-expression of PAF1 in host cells renders them refractory to HIV-1. Simian Immunodeficiency Viruses and HIV-2 are also restricted in PAF1-expressing cells. PAF1 is expressed in primary monocytes, macrophages and T-lymphocytes, and we demonstrate strong activity in MonoMac1, a monocyte cell line. Conclusions: We propose that the PAF1c establishes an anti-viral state to prevent infection by incoming retroviruses. This previously unrecognised mechanism of restriction could have implications for invasion of cells by any pathogen.

  14. Whole-genome sequencing of individuals from a founder population identifies candidate genes for asthma.

    Science.gov (United States)

    Campbell, Catarina D; Mohajeri, Kiana; Malig, Maika; Hormozdiari, Fereydoun; Nelson, Benjamin; Du, Gaixin; Patterson, Kristen M; Eng, Celeste; Torgerson, Dara G; Hu, Donglei; Herman, Catherine; Chong, Jessica X; Ko, Arthur; O'Roak, Brian J; Krumm, Niklas; Vives, Laura; Lee, Choli; Roth, Lindsey A; Rodriguez-Cintron, William; Rodriguez-Santana, Jose; Brigino-Buenaventura, Emerita; Davis, Adam; Meade, Kelley; LeNoir, Michael A; Thyne, Shannon; Jackson, Daniel J; Gern, James E; Lemanske, Robert F; Shendure, Jay; Abney, Mark; Burchard, Esteban G; Ober, Carole; Eichler, Evan E

    2014-01-01

    Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS. PMID:25116239

  15. Whole-genome sequencing of individuals from a founder population identifies candidate genes for asthma.

    Directory of Open Access Journals (Sweden)

    Catarina D Campbell

    Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.

  17. Use of metaphors about exome and whole genome sequencing.

    Science.gov (United States)

    Nelson, Sarah C; Crouch, Julia M; Bamshad, Michael J; Tabor, Holly K; Yu, Joon-Ho

    2016-05-01

    Clinical and research uses of exome and whole genome sequencing (ES/WGS) are growing rapidly. An enhanced understanding of how individuals conceptualize and communicate about sequencing results is needed to ensure effective, mutual exchange of information between care providers and patients and between researchers and participants. Focus group and interview participants were recruited to discuss their attitudes and preferences for receiving hypothetical results from ES/WGS. African Americans were intentionally oversampled. We qualitatively analyzed participants' speech to identify unsolicited metaphorical language pertaining to genes and health, and grouped these occurrences into metaphorical concepts. Participants compared genetic information to physical objects including tools, weapons, contents of boxes, and formal documents or reports. These metaphorical concepts centered on several key themes, including locus of control; containment versus release of information; and desirability, usability, interpretability, and ownership of genetic results. Metaphorical language is often used intentionally or unintentionally in discussions about receiving results from ES/WGS in both clinical and research settings. Awareness of the use of metaphorical language and attention to its varied meanings facilitates effective communication about return of ES/WGS results. In turn, both should foster shared and informed decision-making and improve the translation of genetic information by clinicians and researchers. © 2016 Wiley Periodicals, Inc. PMID:26822973

  18. Whole genomes redefine the mutational landscape of pancreatic cancer

    Science.gov (United States)

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K.; Kassahn, Karin S.; Bailey, Peter; Johns, Amber L.; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C. J.; Robertson, Alan J.; Fadlullah, Muhammad Z. H.; Bruxner, Tim J. C.; Christ, Angelika N.; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J.; Fink, J. Lynn; Holmes, Oliver; Kazakoff, Stephen H.; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J.; Lee, Hong C.; Jones, Marc D.; Nagrial, Adnan M.; Humphris, Jeremy; Chantrill, Lorraine A.; Chin, Venessa; Steinmann, Angela M.; Mawson, Amanda; Humphrey, Emily S.; Colvin, Emily K.; Chou, Angela; Scarlett, Christopher J.; Pinho, Andreia V.; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S.; Kench, James G.; Pettitt, Jessica A.; Merrett, Neil D.; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q.; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B.; Graham, Janet S.; Niclou, Simone P.; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H.; Maitra, Anirban; Iacobuzio-Donahue, Christine A.; Wolfgang, Christopher L.; Morgan, Richard A.; Lawlor, Rita T.; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A.; Gill, Anthony J.; Eshleman, James R.; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A.; Pearson, John V.; Biankin, Andrew V.; Grimmond, Sean M.

    2015-01-01

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded. PMID:25719666

  19. Whole genome shotgun assembly in theory and practice

    Science.gov (United States)

    Chapman, Jarrod Andrew

    The subject of this dissertation is the development of novel analytical and algorithmic approaches to the fragment assembly problem in the context of the Whole Genome Shotgun (WGS) DNA sequencing strategy. A collection of analyses and methods centered on the computational reconstruction of genomic DNA sequence from randomly sampled genome fragments, with particular focus on applications to large, polymorphic, and inhomogeneous datasets, is presented. Several novel pre-assembly WGS data analyses are described, including assessment of genome size, sequence uniformity, and repetitive element content, with particular emphasis on the establishment of standardized quality assurance metrics for large WGS sequencing projects. A theoretical framework for understanding the statistical properties of WGS assemblies in the presence of paired-end sequence data is discussed, and the algorithmic sub-problems of quality-based sequence trimming, global pairwise alignment detection, and consensus sequence generation are treated. Finally, as a novel application of these analyses and methods, the results of a collaboration to produce the first WGS sequence reconstruction of a community sample from a natural environment are presented.

  20. Cryptococcus gattii in the Age of Whole-Genome Sequencing.

    Science.gov (United States)

    Meyer, Wieland

    2015-11-17

    Cryptococcus gattii, the sister species of Cryptococcus neoformans, is an emerging pathogen which gained importance in connection with the ongoing cryptococcosis outbreak on Vancouver Island. Many molecular studies have divided this species into four major lineages: VGI, VGII, VGIII, and VGIV. This commentary summarizes the whole-genome sequencing (WGS) studies that have been carried out with this species, re-emphasizing the phylogenetic relationships, showing chromosomal rearrangements between those four groups, and identifying VGII as the ancestral population within C. gattii. In addition, WGS specific to VGII, containing the Vancouver Island outbreak genotypes and those from the Pacific Northwest region of the United States, has placed the origin of this lineage within South America and identified specific genes responsible for either brain or lung infection. It also showed that many genotypes are spread across a number of different continents, as has been previously shown by multilocus sequence typing (MLST). In addition, it showed that recombination occurs more frequently between mitochondrial than nuclear genomes.

  1. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    The Genome of the Netherlands Consortium; Marschall, T.; Schoenhuth, A.

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  2. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    Science.gov (United States)

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  3. Environmental Whole-Genome Amplification to Access Microbial Diversity in Contaminated Sediments

    Energy Technology Data Exchange (ETDEWEB)

    Abulencia, C.B.; Wyborski, D.L.; Garcia, J.; Podar, M.; Chen, W.; Chang, S.H.; Chang, H.W.; Watson, D.; Brodie,E.I.; Hazen, T.C.; Keller, M.

    2005-12-10

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2 percent genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9 percent of the sequences had significant similarities to known proteins, and "clusters of orthologous groups" (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

  4. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  5. A whole-genome association study for pig reproductive traits.

    Science.gov (United States)

    Onteru, S K; Fan, B; Du, Z-Q; Garrick, D J; Stalder, K J; Rothschild, M F

    2012-02-01

    A whole-genome association study was performed for reproductive traits in commercial sows using the PorcineSNP60 BeadChip and Bayesian statistical methods. The traits included total number born (TNB), number born alive (NBA), number of stillborn (SB), number of mummified foetuses at birth (MUM) and gestation length (GL) in each of the first three parities. We report the associations of informative QTL and the genes within the QTL for each reproductive trait in different parities. These results provide evidence of gene effects having temporal impacts on reproductive traits in different parities. Many QTL identified in this study are new for pig reproductive traits. Around 48% of total genes located in the identified QTL regions were predicted to be involved in placental functions. The genomic regions containing genes important for foetal developmental (e.g. MEF2C) and uterine functions (e.g. PLSCR4) were associated with TNB and NBA in the first two parities. Similarly, QTL in other foetal developmental (e.g. HNRNPD and AHR) and placental (e.g. RELL1 and CD96) genes were associated with SB and MUM in different parities. The QTL with genes related to utero-placental blood flow (e.g. VEGFA) and hematopoiesis (e.g. MAFB) were associated with GL differences among sows in this population. Pathway analyses using genes within QTL identified some modest underlying biological pathways, which are interesting candidates (e.g. the nucleotide metabolism pathway for SB) for pig reproductive traits in different parities. Further validation studies on large populations are warranted to improve our understanding of the complex genetic architecture for pig reproductive traits.

  6. A whole genome association study on meat palatability in hanwoo.

    Science.gov (United States)

    Hyeong, K-E; Lee, Y-M; Kim, Y-S; Nam, K C; Jo, C; Lee, K-H; Lee, J-E; Kim, J-J

    2014-09-01

    A whole genome association (WGA) study was carried out to find quantitative trait loci (QTL) for sensory evaluation traits in Hanwoo. Carcass samples of 250 Hanwoo steers were collected from the National Agricultural Cooperative Livestock Research Institute, Ansung, Gyeonggi province, Korea, between 2011 and 2012 and genotyped with the Affymetrix Bovine Axiom Array 640K single nucleotide polymorphism (SNP) chip. Among the SNPs in the chip, a total of 322,160 SNPs were chosen after quality control tests. After adjusting for the effects of age, slaughter-year-season, and polygenic effects using a genome relationship matrix, the corrected phenotypes for the sensory evaluation measurements were regressed on each SNP using a simple linear regression additive-based model. A total of 1,631 SNPs were detected for color, aroma, tenderness, juiciness and palatability at the 0.1% comparison-wise level. Among the significant SNPs, the best set of 52 SNP markers was chosen using a forward regression procedure at the 0.05 level, among which sets of 8, 14, 11, 10, and 9 SNPs were determined for the respective sensory evaluation traits. The sets of significant SNPs explained 18% to 31% of phenotypic variance. Three SNPs were pleiotropic, i.e. AX-26703353 and AX-26742891, located at 101 and 110 Mb of BTA6, respectively, influenced tenderness, juiciness and palatability, while AX-18624743 at 3 Mb of BTA10 affected tenderness and palatability. Our results suggest that some QTL for sensory measures are segregating in a Hanwoo steer population. Additional WGA studies on fatty acid and nutritional components as well as the sensory panels are in process to characterize the genetic architecture of meat quality and palatability in Hanwoo. PMID:25178363
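
    The per-SNP test described in this abstract, regressing a corrected phenotype on each SNP under an additive model, can be sketched as an ordinary least-squares fit of the phenotype on allele count (0, 1 or 2). The sketch below is illustrative only: the function name and the toy genotype and phenotype values are assumptions, and it omits the prior adjustment for age, slaughter-year-season and polygenic effects as well as any significance testing.

```python
from typing import List, Tuple


def snp_regression(genotypes: List[int], phenotypes: List[float]) -> Tuple[float, float]:
    """Least-squares fit of phenotype on allele count for one SNP.

    Returns (slope, r_squared): the additive effect per allele copy and the
    fraction of phenotypic variance explained. Hypothetical helper.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        # Monomorphic SNP or constant phenotype: nothing to estimate.
        return 0.0, 0.0
    slope = sxy / sxx
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, r_squared


# Toy data (made up): each extra allele copy adds one phenotype unit.
genotypes = [0, 1, 2, 0, 1, 2]
phenotypes = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope, r_squared = snp_regression(genotypes, phenotypes)
print(slope, r_squared)  # 1.0 1.0
```

    In a genome-wide scan the same fit would be repeated for every SNP that passed quality control, which is why a closed-form single-predictor solution is convenient.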

  7. Complete sequence of the first chimera genome constructed by cloning the whole genome of Synechocystis strain PCC6803 into the Bacillus subtilis 168 genome.

    Science.gov (United States)

    Watanabe, Satoru; Shiwa, Yuh; Itaya, Mitsuhiro; Yoshikawa, Hirofumi

    2012-12-01

    Genome synthesis of existing or designed genomes is made feasible by the first successful cloning of a cyanobacterium, Synechocystis PCC6803, in Gram-positive, endospore-forming Bacillus subtilis. Whole-genome sequence analysis of the isolate and parental B. subtilis strains provides clues for identifying single nucleotide polymorphisms (SNPs) in the 2 complete bacterial genomes in one cell.

  8. The Influence of Whole-Genome Triplication (WGT) on the Candidate Genes of Pollen Specific Expression in Brassica rapa

    Institute of Scientific and Technical Information of China (English)

    王晓波; 马原; 程锋; 武剑; 梁建丽; 王晓武

    2015-01-01

    [Objective] The objective of this study is to investigate the evolution of candidate genes with pollen-specific expression in Brassica rapa after whole-genome triplication (WGT). The study provides a theoretical basis for future research on pollen-specific expressed genes in B. rapa. [Method] The genes with pollen-specific expression in B. rapa were identified with SynOrths by reference to Arabidopsis thaliana, based on their syntenic relationship. InterProScan was used to obtain their Gene Ontology (GO) terms, which were grouped into 13 pollen-related categories. To compare the three duplicated categories, the ratio of the number of genes in each GO category to the total number of genes in these categories was calculated. Furthermore, these genes were assigned to the three subgenomes based on the explicit dataset of B. rapa genes. Based on the syntenic relationship between tandem genes with pollen-specific expression in B. rapa and genes in A. thaliana, it was determined whether these genes arose before or after WGT. [Result] In total, 1962 candidate genes with pollen-specific expression in B. rapa were verified via their syntenic relationship with 1651 genes in A. thaliana. There are 182 tandem genes in A. thaliana and 137 tandem genes in B. rapa. The number of pollen-specific expressed genes in A. thaliana and B. rapa was almost equivalent, so it can be inferred that most gene copies in B. rapa were lost after the WGT event. A total of 549 genes with pollen-specific expression in A. thaliana had no syntenic counterparts in B. rapa, so these genes might also have been lost after WGT. There are 898 genes with pollen-specific expression in A. thaliana that correspond to single-copy, double-copy or triple-copy genes in B. rapa. A total of 480 A. thaliana genes have a single copy in B. rapa, accounting for 53.5%, the largest proportion, while 322 genes with double copies account for about 35.8%, and only 96 genes with triple copies account
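
    The retention counts quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the three counts given above (480, 322 and 96 genes); the variable names are assumptions, and the computed shares may differ from the quoted percentages in the last digit depending on rounding or truncation.

```python
# Counts of A. thaliana pollen-specific genes by copy number retained in
# B. rapa, as quoted in the abstract above.
retained = {"single-copy": 480, "double-copy": 322, "triple-copy": 96}

# The abstract gives 898 genes with retained counterparts in total.
total = sum(retained.values())

# Share of each class as a percentage of the total, to one decimal place.
shares = {kind: round(100 * count / total, 1) for kind, count in retained.items()}

print(total)
for kind, share in shares.items():
    print(kind, share)
```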

  9. Gene expression profile of renal cell carcinoma clear cell type

    Directory of Open Access Journals (Sweden)

    Marcos F. Dall’Oglio

    2010-08-01

    PURPOSE: The determination of prognosis in patients with renal cell carcinoma (RCC) is based, classically, on stage and histopathological aspects. Metastatic disease develops in one third of patients after surgery, even in localized tumors. There are few options for treating those patients, and even the new target-designed drugs have shown low rates of success in controlling disease progression. Few studies have used high-throughput genomic analysis in renal cell carcinoma for determination of prognosis. This study is focused on the identification of gene expression signatures in tissues of low-risk, high-risk and metastatic RCC clear cell type (RCC-CCT). MATERIALS AND METHODS: We analyzed the expression of approximately 55,000 distinct transcripts using the Whole Genome microarray platform hybridized with RNA extracted from 19 patients who underwent surgery to treat RCC-CCT with different clinical outcomes. They were divided into three groups: (1) low risk, characterized by pT1, Fuhrman grade 1 or 2, no microvascular invasion RCC; (2) high risk, pT2-3, Fuhrman grade 3 or 4, with necrosis and microvascular invasion present; and (3) metastatic RCC-CCT. Normal renal tissue was used as control. RESULTS: After comparison of differentially expressed genes among low-risk, high-risk and metastatic groups, we identified a group of common genes characterizing metastatic disease. Among them, Interleukin-8 and Heat shock protein 70 were over-expressed in metastasis and validated by real-time polymerase chain reaction. CONCLUSION: These findings can be used as a starting point to generate molecular markers of RCC-CCT as well as a target for the development of innovative therapies.

  10. Whole-Genome Shotgun Sequence of Arthrospira platensis Strain Paraca, a Cultivated and Edible Cyanobacterium

    OpenAIRE

    Lefort, Francois; Calmin, Gautier; Crovadore, Julien; Falquet, Jacques; Hurni, Jean-Pierre; Osteras, Magne; Haldemann, Francois; Farinelli, Laurent

    2014-01-01

    Here we report the whole-genome shotgun sequence of a Peruvian strain of Arthrospira platensis (Paraca), a cultivated and edible haloalkaliphilic cyanobacterium of great scientific, technical, and economic potential.

  11. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification

    NARCIS (Netherlands)

    S.O.L. Direito; E. Zaura; M. Little; P. Ehrenfreund; W.F.M. Röling

    2014-01-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement amplific

  12. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    J.M. Bryant (Josephine); A. Schürch (Anita); H. van Deutekom (Henk); S.R. Harris (Simon); J.L. de Beer (Jessica); V. de Jager (Victor); K. Kremer (Kristin); S.A.F.T. van Hijum (Sacha); R.J. Siezen (Roland); M.W. Borgdorff (Martien ); S.D. Bentley (Stephen); J. Parkhill (Julian); D. van Soolingen (Dick)

    2013-01-01

    Background: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate kno

  13. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    Bryant, J.M.; Schürch, A.C.; Deutekom, van H.; Harris, S.R.; Beer, de J.L.; Jager, de V.C.L.; Kremer, K.; Hijum, van S.A.F.T.; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, van D.

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  14. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.

    NARCIS (Netherlands)

    Bryant, J.M.; Schurch, A.C.; Deutekom, H. van; Harris, S.R.; Beer, J.L. de; Jager, V. de; Kremer, K.; Hijum, S.A.F.T. van; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, D. van

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  15. New perspectives on microbial community distortion after whole-genome amplification

    Science.gov (United States)

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass; however, potential selective biases during the amplification processes are poorly understood. Here, we describe the e...

  16. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping.

    Science.gov (United States)

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-04-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins, are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  17. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping

    Science.gov (United States)

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-01-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins, are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  18. Whole-Genome Sequence of the Nitrogen-Fixing Symbiotic Rhizobium Mesorhizobium loti Strain TONO

    Science.gov (United States)

    Hirakawa, Hideki; Sato, Shusei; Saeki, Kazuhiko; Hayashi, Makoto

    2016-01-01

    Mesorhizobium loti is the nitrogen-fixing microsymbiont for legumes of the genus Lotus. Here, we report the whole-genome sequence of a Mesorhizobium loti strain, TONO, which is used as a symbiont for the model legume Lotus japonicus. The whole-genome sequence of the strain TONO will be a solid platform for comparative genomics analyses and for the identification of genes responsible for the symbiotic properties of Mesorhizobium species.

  19. Clinical Diagnosis by Whole-Genome Sequencing of a Prenatal Sample

    OpenAIRE

    Talkowski, Michael E.; Ordulu, Zehra; Pillalamarri, Vamsee; Benson, Carol B.; Blumenthal, Ian; Connolly, Susan; Hanscom, Carrie; Hussain, Naveed; Pereira, Shahrin; Picker, Jonathan; Rosenfeld, Jill A.; Shaffer, Lisa G.; Wilkins-Haug, Louise E.; Gusella, James F.; Morton, Cynthia C.

    2012-01-01

    Conventional cytogenetic testing offers low-resolution detection of balanced karyotypic abnormalities but cannot provide the precise, gene-level knowledge required to predict outcomes. The use of high-resolution whole-genome deep sequencing is currently impractical for the purpose of routine clinical care. We show here that whole-genome “jumping libraries” can offer an immediately applicable, nucleotide-level complement to conventional genetic diagnostics within a time frame that allows for c...

  20. A Whole Genome Pairwise Comparative and Functional Analysis of Geobacter sulfurreducens PCA

    OpenAIRE

    Selvaraj, Ashok; Thankaswamy Kosalai, Subazini; Chinnasamy Perumal, Rajadurai; Pitchai, Subhashini; Kumar, Gopal Ramesh

    2013-01-01

    Geobacter species are involved in electricity production, bioremediations, and various environmental friendly activities. Whole genome comparative analyses of Geobacter sulfurreducens PCA, Geobacter bemidjiensis Bem, Geobacter sp. FRC-32, Geobacter lovleyi SZ, Geobacter sp. M21, Geobacter metallireducens GS-15, Geobacter uraniireducens Rf4 have been made to find out similarities and dissimilarities among them. For whole genome comparison of Geobacter species, an in-house tool, Geobacter Compa...

  1. SOX4 expression in bladder carcinoma

    DEFF Research Database (Denmark)

    Aaboe, Mads; Birkenkamp-Demtroder, Karin; Wiuf, Carsten;

    2006-01-01

    The human transcription factor SOX4 was 5-fold up-regulated in bladder tumors compared with normal tissue based on whole-genome expression profiling of 166 clinical bladder tumor samples and 27 normal urothelium samples. Using a SOX4-specific antibody, we found that the cancer cells expressed the...

  2. Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection.

    Science.gov (United States)

    Kuroda, Makoto; Yamashita, Atsushi; Hirakawa, Hideki; Kumano, Miyuki; Morikawa, Kazuya; Higashide, Masato; Maruyama, Atsushi; Inose, Yumiko; Matoba, Kimio; Toh, Hidehiro; Kuhara, Satoru; Hattori, Masahira; Ohta, Toshiko

    2005-09-13

    Staphylococcus saprophyticus is a uropathogenic Staphylococcus frequently isolated from young female outpatients presenting with uncomplicated urinary tract infections. We sequenced the whole genome of S. saprophyticus type strain ATCC 15305, which harbors a circular chromosome of 2,516,575 bp with 2,446 ORFs and two plasmids. Comparative genomic analyses with the strains of two other species, Staphylococcus aureus and Staphylococcus epidermidis, as well as experimental data, revealed the following characteristics of the S. saprophyticus genome. S. saprophyticus does not possess any virulence factors found in S. aureus, such as coagulase, enterotoxins, exoenzymes, and extracellular matrix-binding proteins, although it does have a remarkable paralog expansion of transport systems related to highly variable ion contents in the urinary environment. A further unique feature is that only a single ORF is predictable as a cell wall-anchored protein, and it shows positive hemagglutination and adherence to human bladder cell associated with initial colonization in the urinary tract. It also shows significantly high urease activity in S. saprophyticus. The uropathogenicity of S. saprophyticus can be attributed to its genome that is needed for its survival in the human urinary tract by means of novel cell wall-anchored adhesin and redundant uro-adaptive transport systems, together with urease.

  3. Whole genome sequencing of Saccharomyces cerevisiae: from genotype to phenotype for improved metabolic engineering applications

    Directory of Open Access Journals (Sweden)

    Asadollahi Mohammad A

    2010-12-01

    Background: Rapid and efficient microbial cell factory design and construction are possible through the enabling technology of metabolic engineering, which is now being facilitated by systems biology approaches. Metabolic engineering is often complemented by directed evolution, where selective pressure is applied to a partially genetically engineered strain to confer a desirable phenotype. The exact genetic modification or resulting genotype that leads to the improved phenotype is often not identified or understood well enough to enable further metabolic engineering. Results: In this work we performed whole genome high-throughput sequencing and annotation to identify single nucleotide polymorphisms (SNPs) between Saccharomyces cerevisiae strains S288c and CEN.PK113-7D. The yeast strain S288c was the first eukaryote sequenced, serving as the reference genome for the Saccharomyces Genome Database, while CEN.PK113-7D is a preferred laboratory strain for industrial biotechnology research. A total of 13,787 high-quality SNPs were detected between the two strains (reference strain: S288c). Considering only metabolic genes (782 of 5,596 annotated genes), a total of 219 metabolism-specific SNPs are distributed across 158 metabolic genes, with 85 of the SNPs being nonsynonymous (e.g., encoding amino acid modifications). Amongst the metabolic SNPs detected, there was pathway enrichment in the galactose uptake pathway (GAL1, GAL10) and the ergosterol biosynthetic pathway (ERG8, ERG9). Physiological characterization confirmed a strong deficiency in galactose uptake and metabolism in S288c compared to CEN.PK113-7D, and similarly, ergosterol content in CEN.PK113-7D was significantly higher in both glucose and galactose supplemented cultivations compared to S288c. Furthermore, DNA microarray profiling of S288c and CEN.PK113-7D in both glucose and galactose batch cultures did not provide a clear hypothesis for major phenotypes observed, suggesting that

  4. Whole-genome transcriptional analysis of heavy metal stresses in Caulobacter crescentus

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Ping; Brodie, Eoin L.; Suzuki, Yohey; McAdams, Harley H.; Andersen, Gary L.

    2005-09-21

    The bacterium Caulobacter crescentus and related stalk bacterial species are known for their distinctive ability to live in low nutrient environments, a characteristic of most heavy metal contaminated sites. Caulobacter crescentus is a model organism for studying cell cycle regulation with well developed genetics. We have identified the pathways responding to heavy metal toxicity in C. crescentus to provide insights for possible application of Caulobacter to environmental restoration. We exposed C. crescentus cells to four heavy metals (chromium, cadmium, selenium and uranium) and analyzed genome wide transcriptional activities post exposure using an Affymetrix GeneChip microarray. C. crescentus showed surprisingly high tolerance to uranium, a possible mechanism for which may be formation of extracellular calcium-uranium-phosphate precipitates. The principal response to these metals was protection against oxidative stress (up-regulation of manganese-dependent superoxide dismutase, sodA). Glutathione S-transferase, thioredoxin, glutaredoxins and DNA repair enzymes responded most strongly to cadmium and chromate. The cadmium and chromium stress response also focused on reducing the intracellular metal concentration, with multiple efflux pumps employed to remove cadmium while a sulfate transporter was down-regulated to reduce non-specific uptake of chromium. Membrane proteins were also up-regulated in response to most of the metals tested. A two-component signal transduction system involved in the uranium response was identified. Several differentially regulated transcripts from regions previously not known to encode proteins were identified, demonstrating the advantage of evaluating the transcriptome using whole genome microarrays.

  5. Mining metagenomic whole genome sequences revealed subdominant but constant Lactobacillus population in the human gut microbiota.

    Science.gov (United States)

    Rossi, Maddalena; Martínez-Martínez, Daniel; Amaretti, Alberto; Ulrici, Alessandro; Raimondi, Stefano; Moya, Andrés

    2016-06-01

    The genus Lactobacillus includes over 215 species that colonize plants, foods, sewage and the gastrointestinal tract (GIT) of humans and animals. In the GIT, the Lactobacillus population can be made up of true inhabitants or of bacteria occasionally ingested with fermented or spoiled foods, or with probiotics. This study longitudinally surveyed Lactobacillus species and strains in the feces of a healthy subject through whole genome sequencing (WGS) data-mining, in order to identify members of the permanent or transient populations. At three time-points (0, 670 and 700 d), 58 different species were identified, 16 of them being retrieved for the first time in human feces. L. rhamnosus, L. ruminis, L. delbrueckii, L. plantarum, L. casei and L. acidophilus were the most represented, with estimated amounts ranging between 6 and 8 Log (cells g(-1)), while the others were detected at 4 or 5 Log (cells g(-1)). 86 Lactobacillus strains belonging to 52 species were identified. 43 seemingly occupied the GIT as true residents, since they were detected over a time span of almost 2 years in all three samples or in 2 samples separated by 670 or 700 d. As a whole, a stable community of lactobacilli was disclosed, with wide and understudied biodiversity. PMID:27043715

  6. Whole genome sequencing and complete genetic analysis reveals novel pathways to glycopeptide resistance in Staphylococcus aureus.

    Directory of Open Access Journals (Sweden)

    Adriana Renzoni

    Full Text Available The precise mechanisms leading to the emergence of low-level glycopeptide resistance in Staphylococcus aureus are poorly understood. In this study, we used whole genome deep sequencing to detect differences between two isogenic strains: a parental strain and a stable derivative selected stepwise for survival on 4 µg/ml teicoplanin, but which grows at higher drug concentrations (MIC 8 µg/ml). We uncovered only three single nucleotide changes in the selected strain. Nonsense mutations occurred in stp1, encoding a serine/threonine phosphatase, and in yjbH, encoding a post-transcriptional negative regulator of the redox/thiol stress sensor and global transcriptional regulator, Spx. A missense mutation (G45R) occurred in the histidine kinase sensor of cell wall stress, VraS. Using genetic methods, all single, pairwise combinations, and a fully reconstructed triple mutant were evaluated for their contribution to low-level glycopeptide resistance. We found a synergistic cooperation between dual phospho-signalling systems and a subtle contribution from YjbH, suggesting the activation of oxidative stress defences via Spx. To our knowledge, this is the first genetic demonstration of multiple sensor and stress pathways contributing simultaneously to glycopeptide resistance development. The multifactorial nature of glycopeptide resistance in this strain suggests a complex reprogramming of cell physiology to survive in the face of drug challenge.

  7. The "most wanted" taxa from the human microbiome for whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Anthony A Fodor

    Full Text Available The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP's 16S data sets to several reference 16S collections to create a 'most wanted' list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, few novel taxa are represented by these OTUs and many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the 'most wanted', and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the 'most wanted' organisms are poorly represented among culture collections, suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.

  8. Sensitive and specific KRAS somatic mutation analysis on whole-genome amplified DNA from archival tissues.

    Science.gov (United States)

    van Eijk, Ronald; van Puijenbroek, Marjo; Chhatta, Amiet R; Gupta, Nisha; Vossen, Rolf H A M; Lips, Esther H; Cleton-Jansen, Anne-Marie; Morreau, Hans; van Wezel, Tom

    2010-01-01

    Kirsten RAS (KRAS) is a small GTPase that plays a key role in Ras/mitogen-activated protein kinase signaling; somatic mutations in KRAS are frequently found in many cancers. The most common KRAS mutations result in a constitutively active protein. Accurate detection of KRAS mutations is pivotal to the molecular diagnosis of cancer and may guide proper treatment selection. Here, we describe a two-step KRAS mutation screening protocol that combines whole-genome amplification (WGA), high-resolution melting analysis (HRM) as a prescreen method for mutation carrying samples, and direct Sanger sequencing of DNA from formalin-fixed, paraffin-embedded (FFPE) tissue, from which limited amounts of DNA are available. We developed target-specific primers, thereby avoiding amplification of homologous KRAS sequences. The addition of herring sperm DNA facilitated WGA in DNA samples isolated from as few as 100 cells. KRAS mutation screening using high-resolution melting analysis on wgaDNA from formalin-fixed, paraffin-embedded tissue is highly sensitive and specific; additionally, this method is feasible for screening of clinical specimens, as illustrated by our analysis of pancreatic cancers. Furthermore, PCR on wgaDNA does not introduce genotypic changes, as opposed to unamplified genomic DNA. This method can, after validation, be applied to virtually any potentially mutated region in the genome.

  9. Whole-genome shotgun optical mapping of Rhodobacter sphaeroides strain 2.4.1 and its use for whole-genome shotgun sequence assembly

    Energy Technology Data Exchange (ETDEWEB)

    Shou, S. [Univ. Wisc.-Madison]; Kvikstad, E. [Univ. Wisc.-Madison]; Kile, A. [Univ. Wisc.-Madison]; Severin, J.; Forrest, D. [Univ. Wisc.-Madison]; Runnheim, R. [Univ. Wisc.-Madison]; Churas, C. [Univ. Wisc.-Madison]; Hickman, J. W. [Univ. Wisc.-Madison]; Mackenzie, C. [University of Texas–Houston Medical School]; Choudhary, M. [University of Texas–Houston Medical School]; Donohue, T. [Univ. Wisc.-Madison]; Kaplan, S. [University of Texas–Houston Medical School]; Schwartz, D. C. [Univ. Wisc.-Madison]

    2003-09-01

    Rhodobacter sphaeroides 2.4.1 is a facultative photoheterotrophic bacterium with tremendous metabolic diversity, which has significantly contributed to our understanding of the molecular genetics of photosynthesis, photoheterotrophy, nitrogen fixation, hydrogen metabolism, carbon dioxide fixation, taxis, and tetrapyrrole biosynthesis. To further understand this remarkable bacterium, and to accelerate an ongoing sequencing project, two whole-genome restriction maps (EcoRI and HindIII) of R. sphaeroides strain 2.4.1 were constructed using shotgun optical mapping. The approach directly mapped genomic DNA by the random mapping of single molecules. The two maps were used to facilitate sequence assembly by providing an optical scaffold for high-resolution alignment and verification of sequence contigs. Our results show that such maps facilitated the closure of sequence gaps by the early detection of nascent sequence contigs during the course of the whole-genome shotgun sequencing process.

  10. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.

  11. swDMR: A Sliding Window Approach to Identify Differentially Methylated Regions Based on Whole Genome Bisulfite Sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen Wang

    Full Text Available DNA methylation is a widespread epigenetic modification that plays an essential role in gene expression through transcriptional regulation and chromatin remodeling. The emergence of whole genome bisulfite sequencing (WGBS) represents an important milestone in the detection of DNA methylation. Characterization of differentially methylated regions (DMRs) is likewise fundamental for further functional analysis. In this study, we present swDMR (http://sourceforge.net/projects/swDMR/) for the comprehensive analysis of DMRs from whole genome methylation profiles by a sliding window approach. It is an integrated tool designed for WGBS data, which not only implements accessible statistical methods to perform hypothesis tests adapted to two or more samples without replicates, but also controls the false discovery rate by multiple test correction. Downstream analysis tools are also provided, including cluster, annotation and visualization modules. In summary, based on WGBS data, swDMR can produce abundant information on differentially methylated regions. As a convenient and flexible tool, we believe swDMR will bring us closer to unveiling the potential functional regions involved in epigenetic regulation.
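
    The general idea of a sliding-window DMR scan with multiple test correction can be illustrated with a minimal sketch. This is not swDMR's code: the function names, the window and step sizes, the input layout (position mapped to methylated and unmethylated read counts) and the choice of Fisher's exact test with Benjamini-Hochberg correction are all assumptions made for illustration of a two-sample comparison without replicates.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def sliding_window_dmrs(sites_a, sites_b, window=1000, step=500, fdr=0.05):
    """Scan two samples and return (start, end, q_value) for significant windows.

    sites_a / sites_b: dict mapping CpG position -> (methylated, unmethylated) reads.
    window, step and fdr are hypothetical defaults, not values from the abstract.
    """
    shared = sorted(set(sites_a) & set(sites_b))
    if not shared:
        return []
    windows, pvals = [], []
    start = shared[0]
    while start <= shared[-1]:
        end = start + window
        in_win = [p for p in shared if start <= p < end]
        if in_win:
            # Pool read counts over all CpG sites in the window, per sample.
            meth_a = sum(sites_a[p][0] for p in in_win)
            unmeth_a = sum(sites_a[p][1] for p in in_win)
            meth_b = sum(sites_b[p][0] for p in in_win)
            unmeth_b = sum(sites_b[p][1] for p in in_win)
            windows.append((start, end))
            pvals.append(fisher_exact_two_sided(meth_a, unmeth_a, meth_b, unmeth_b))
        start += step
    qvals = benjamini_hochberg(pvals)
    return [(s, e, q) for (s, e), q in zip(windows, qvals) if q <= fdr]
```

    Pooling counts per window keeps the test usable when individual CpG sites have low coverage, at the cost of resolution; overlapping significant windows would normally be merged in a later step, which this sketch leaves out.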

  12. Whole genome protein microarrays for serum profiling of immunodominant antigens of Bacillus anthracis

    Directory of Open Access Journals (Sweden)

    Karen Elizabeth Kempsell

    2015-08-01

    Full Text Available A commercial Bacillus anthracis (Anthrax) whole genome protein microarray has been used to identify immunogenic Anthrax proteins using sera from groups of donors with (a) confirmed B. anthracis naturally acquired cutaneous infection, (b) confirmed B. anthracis intravenous drug use-acquired infection, (c) occupational exposure in a wool-sorters factory and (d) humans and rabbits vaccinated with the UK Anthrax protein vaccine, compared to naïve unexposed controls. Anti-IAP responses were observed for both IgG and IgA in the challenged groups; however the anti-IAP IgG response was more evident in the vaccinated group and the anti-IAP IgA response more evident in the B. anthracis-infected groups. Infected individuals appeared somewhat suppressed for their general IgG response, compared with other challenged groups. Immunogenic protein antigens were identified in all groups, some of which were shared between groups whilst others were specific for individual groups. The toxin proteins were immunodominant in all vaccinated, infected or other challenged groups. However a number of other chromosomally-located and plasmid encoded open reading frames were also recognised by infected or exposed groups in comparison to controls. Some of these antigens e.g. BA4182 are not recognised by vaccinated individuals, suggesting that there are proteins more specifically expressed by live Anthrax spores in vivo and are not currently found in the UK licensed Anthrax Vaccine (AVP). These may perhaps be preferentially expressed during infection and represent expression of alternative pathways in the B. anthracis ‘infectome’. These may make highly attractive candidates for diagnostic and vaccine biomarker development as they may be more specifically associated with the infectious phase of the pathogen. A number of B. anthracis small hypothetical protein targets have been synthesised, tested in mouse immunogenicity studies and validated in parallel using human sera from the

  13. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities.

  14. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    DEFF Research Database (Denmark)

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave;

    2011-01-01

    The development in sequencing methods has made it possible to perform whole genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge...... (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The amount of generated sequences amounts to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced...... comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a beginning of generating the whole genome sequence for these species. The continuation of establishing the genome...

  15. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. PMID:27006240

  16. Whole-Genome Saliva and Blood DNA Methylation Profiling in Individuals with a Respiratory Allergy.

    Science.gov (United States)

    Langie, Sabine A S; Szarc Vel Szic, Katarzyna; Declerck, Ken; Traen, Sophie; Koppen, Gudrun; Van Camp, Guy; Schoeters, Greet; Vanden Berghe, Wim; De Boever, Patrick

    2016-01-01

    The etiology of respiratory allergies (RA) can be partly explained by DNA methylation changes caused by adverse environmental and lifestyle factors experienced early in life. Longitudinal, prospective studies can aid in the unravelment of the epigenetic mechanisms involved in the disease development. High compliance rates can be expected in these studies when data is collected using non-invasive and convenient procedures. Saliva is an attractive biofluid to analyze changes in DNA methylation patterns. We investigated in a pilot study the differential methylation in saliva of RA (n = 5) compared to healthy controls (n = 5) using the Illumina Methylation 450K BeadChip platform. We evaluated the results against the results obtained in mononuclear blood cells from the same individuals. Differences in methylation patterns from saliva and mononuclear blood cells were clearly distinguishable (PAdj0.2), though the methylation status of about 96% of the cg-sites was comparable between peripheral blood mononuclear cells and saliva. When comparing RA cases with healthy controls, the number of differentially methylated sites (DMS) in saliva and blood were 485 and 437 (P0.1), respectively, of which 216 were in common. The methylation levels of these sites were significantly correlated between blood and saliva. The absolute levels of methylation in blood and saliva were confirmed for 3 selected DMS in the PM20D1, STK32C, and FGFR2 genes using pyrosequencing analysis. The differential methylation could only be confirmed for DMS in PM20D1 and STK32C genes in saliva. We show that saliva can be used for genome-wide methylation analysis and that it is possible to identify DMS when comparing RA cases and healthy controls. The results were replicated in blood cells of the same individuals and confirmed by pyrosequencing analysis. This study provides proof-of-concept for the applicability of saliva-based whole-genome methylation analysis in the field of respiratory allergy.

  17. An efficient procedure for plant organellar genome assembly, based on whole genome data from the 454 GS FLX sequencing platform

    Directory of Open Access Journals (Sweden)

    Zhang Tongwu

    2011-11-01

    Full Text Available Abstract Motivation Complete organellar genome sequences (chloroplasts and mitochondria) provide valuable resources and information for studying plant molecular ecology and evolution. As high-throughput sequencing technology advances, it becomes the norm that a shotgun approach is used to obtain complete genome sequences. Therefore, to assemble organellar sequences from the whole genome, shotgun reads are inevitable. However, associated techniques are often cumbersome, time-consuming, and difficult, because true organellar DNA is difficult to separate efficiently from nuclear copies, which have been transferred to the nucleus through the course of evolution. Results We report a new, rapid procedure for plant chloroplast and mitochondrial genome sequencing and assembly using the Roche/454 GS FLX platform. Plant cells can contain multiple copies of the organellar genomes, and there is a significant correlation between the depth of sequence reads in contigs and the number of copies of the genome. Without isolating organellar DNA from the mixture of nuclear and organellar DNA for sequencing, we retrospectively extracted assembled contigs of either chloroplast or mitochondrial sequences from the whole genome shotgun data. Moreover, the contig connection graph property of Newbler (a platform-specific sequence assembler) ensures an efficient final assembly. Using this procedure, we assembled both chloroplast and mitochondrial genomes of a resurrection plant, Boea hygrometrica, with high fidelity. We also present information and a minimal sequence dataset as a reference for the assembly of other plant organellar genomes.
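
    The depth-based reasoning in this abstract (organellar genomes are present in many copies per cell, so their contigs should show higher read depth than nuclear contigs) can be sketched as a simple filter. This is only an illustration, not the authors' procedure: the input layout, the function name, the fold threshold and the minimum contig length are hypothetical, and the sketch assumes that most contigs are nuclear so that the median depth approximates single-copy nuclear coverage.

```python
from statistics import median


def split_contigs_by_depth(contigs, min_fold=5.0, min_length=500):
    """Separate candidate organellar contigs from the rest by read depth.

    contigs: list of (name, length, mean_read_depth) tuples.
    min_fold and min_length are illustrative values, not taken from the abstract.
    Returns (candidates, others); contigs shorter than min_length are ignored.
    """
    usable = [c for c in contigs if c[1] >= min_length]
    if not usable:
        return [], []
    # Assumption: most contigs are nuclear, so the median depth is a rough
    # estimate of nuclear coverage.
    nuclear_depth = median(depth for _, _, depth in usable)
    threshold = min_fold * nuclear_depth
    candidates = [c for c in usable if c[2] >= threshold]
    others = [c for c in usable if c[2] < threshold]
    return candidates, others
```

    High depth alone cannot tell chloroplast from mitochondrial contigs, or either from nuclear repeats, so candidates selected this way would still need a sequence similarity check before assembly.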

  18. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    OpenAIRE

    Huang, August Y.; Xu, Xiaojing; Ye, Adam Y.; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Han-qing ZHAO; Wang, Meng; Gao, Hua; Gao, Ge

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of eff...

  19. Whole-Genome Sequences of Two Borrelia afzelii and Two Borrelia garinii Lyme Disease Agent Isolates

    Energy Technology Data Exchange (ETDEWEB)

    Casjens, S.R.; Dunn, J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Fraser-Liggett, C. M.; Schutzer, S. E.

    2011-12-01

    Human Lyme disease is commonly caused by several species of spirochetes in the Borrelia genus. In Eurasia these species are largely Borrelia afzelii, B. garinii, B. burgdorferi, and B. bavariensis sp. nov. Whole-genome sequencing is an excellent tool for investigating and understanding the influence of bacterial diversity on the pathogenesis and etiology of Lyme disease. We report here the whole-genome sequences of four isolates from two of the Borrelia species that cause human Lyme disease, B. afzelii isolates ACA-1 and PKo and B. garinii isolates PBr and Far04.

  20. Rice-arsenate interactions in hydroponics: whole genome transcriptional analysis.

    Science.gov (United States)

    Norton, Gareth J; Lou-Hing, Daniel E; Meharg, Andrew A; Price, Adam H

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 µM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala x Azucena mapping population.

  1. Identification of innate immunodeficiencies by whole genome sequencing

    DEFF Research Database (Denmark)

    Mogensen, Trine; Christiansen, Mette; Veirum, Jens Erik;

    2014-01-01

    transfection of human cell lines in order to examine for the ability to induce IFN in response to relevant stimuli (i.e the dsRNA mimic Poly(IC) and HSV). Results: In this small number of patients we identified several interesting mutations in molecules involved in antiviral IFN responses. First, in a 6 year...... encephalitis or other herpes simplex virus (HSV) disease manifestations. The goal is to identify host factors in innate immunity which may explain the hitherto unknown mechanism underlying differential susceptibility to HSV infections between individuals. Such knowledge may have clinical and therapeutical...

  2. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    Science.gov (United States)

    Shin, Younhee; Jung, Ho-jin; Jung, Myunghee; Yoo, Seungil; Subramaniyam, Sathiyamoorthy; Markkandan, Kesavan; Kang, Jun-Mo; Rai, Rajani; Park, Junhyung; Kim, Jong-Joo

    2016-01-01

    Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large amount of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 SNPs (Btau 4.6.1) that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding. PMID:26954201

  3. Whole genome transcript profiling from fingerstick blood samples: a comparison and feasibility study

    Directory of Open Access Journals (Sweden)

    Williams Adam R

    2009-12-01

    Full Text Available Abstract Background Whole genome gene expression profiling has revolutionized research in the past decade especially with the advent of microarrays. Recently, there have been significant improvements in whole blood RNA isolation techniques which, through stabilization of RNA at the time of sample collection, avoid bias and artifacts introduced during sample handling. Despite these improvements, current human whole blood RNA stabilization/isolation kits are limited by the requirement of a venous blood sample of at least 2.5 mL. While fingerstick blood collection has been used for many different assays, there has yet to be a kit developed to isolate high quality RNA for use in gene expression studies from such small human samples. The clinical and field testing advantages of obtaining reliable and reproducible gene expression data from a fingerstick are many; it is less invasive, time saving, more mobile, and eliminates the need of a trained phlebotomist. Furthermore, this method could also be employed in small animal studies, i.e. mice, where larger sample collections often require sacrificing the animal. In this study, we offer a rapid and simple method to extract sufficient amounts of high quality total RNA from approximately 70 µL of whole blood collected via a fingerstick using a modified protocol of the commercially available Qiagen PAXgene RNA Blood Kit. Results From two sets of fingerstick collections, about 70 µL whole blood collected via finger lancet and capillary tube, we recovered an average of 252.6 ng total RNA with an average RIN of 9.3. The post-amplification yields for 50 ng of total RNA averaged at 7.0 µg cDNA. The cDNA hybridized to Affymetrix HG-U133 Plus 2.0 GeneChips had an average % Present call of 52.5%. Both fingerstick collections were highly correlated with r² values ranging from 0.94 to 0.97. Similarly both fingerstick collections were highly correlated to the venous collection with r² values ranging from 0.88 to 0

  4. Capsular Typing Method for Streptococcus agalactiae Using Whole-Genome Sequence Data

    OpenAIRE

    Sheppard, AE; Vaughan, A; Jones, N.; Turner, P; Turner, C.; Efstratiou, A.; Patel, D.; MMM Informatics Group; Walker, AS; Berkley, J.; Crook, DW; Seale, AC

    2016-01-01

    Group B streptococcus (GBS) capsular serotype is a major determinant of virulence, and affects potential vaccine coverage. Here we report a whole genome sequencing-based method for GBS serotype assignment. This shows high agreement (kappa=0.92) with conventional methods, and increased serotype assignment (100%) to all ten capsular types.

  5. Capsular Typing Method for Streptococcus agalactiae Using Whole-Genome Sequence Data.

    Science.gov (United States)

    Sheppard, Anna E; Vaughan, Alison; Jones, Nicola; Turner, Paul; Turner, Claudia; Efstratiou, Androulla; Patel, Darshana; Walker, A Sarah; Berkley, James A; Crook, Derrick W; Seale, Anna C

    2016-05-01

    Group B streptococcus (GBS) capsular serotypes are major determinants of virulence and affect potential vaccine coverage. Here we report a whole-genome-sequencing-based method for GBS serotype assignment. This method shows strong agreement (kappa of 0.92) with conventional methods and increased serotype assignment (100%) to all 10 capsular types. PMID:26962081
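
    The agreement statistic quoted in this record, Cohen's kappa, compares the observed agreement p_o between two methods with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following minimal sketch shows the calculation for two lists of serotype calls. The function name and the example labels are made up for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical calls."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of isolates given the same call by both methods.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over labels.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods assign one identical label to every isolate
    return (observed - expected) / (1 - expected)


# Hypothetical serotype calls for ten isolates (not data from the study).
wgs_calls = ["Ia", "Ia", "III", "III", "V", "V", "II", "II", "Ib", "Ib"]
lab_calls = ["Ia", "Ia", "III", "III", "V", "V", "II", "II", "Ib", "Ia"]
kappa = cohens_kappa(wgs_calls, lab_calls)
```

    In this made-up example nine of ten calls agree (p_o = 0.9) and the chance agreement is p_e = 0.2, so kappa = 0.7 / 0.8 = 0.875.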

  6. Whole-Genome Sequence of the Cheese Isolate Streptococcus macedonicus 679.

    Science.gov (United States)

    Papadimitriou, Konstantinos; Mavrogonatou, Eleni; Bolotin, Alexander; Tsakalidou, Effie; Renault, Pierre

    2016-01-01

    It is well recognized that Streptococcus macedonicus can populate artisanal fermented foods, especially those of dairy origin. However, the safety of S. macedonicus remains to be established. Here, we present the whole-genome sequence of strain 679, which was isolated from a French uncooked semihard cheese made with cow milk. PMID:27660795

  7. Whole-genome shotgun sequencing of Lactobacillus rhamnosus MTCC 5462, a strain with probiotic potential.

    Science.gov (United States)

    Prajapati, J B; Khedkar, C D; Chitra, J; Suja, Senan; Mishra, V; Sreeja, V; Patel, R K; Ahir, V B; Bhatt, V D; Sajnani, M R; Jakhesara, S J; Koringa, P G; Joshi, C G

    2012-03-01

    Lactobacillus rhamnosus MTCC 5462 was isolated from infant gastrointestinal flora. The strain exhibited an ability to reduce cholesterol and stimulate immunity. The strain has exhibited positive results in alleviating gastrointestinal discomfort and good potential as a probiotic. We sequenced the whole genome of the strain and compared it to the published genome sequence of Lactobacillus rhamnosus GG (ATCC 53103). PMID:22328760

  8. Whole-genome characterization and genotyping of global WU polyomavirus strains

    NARCIS (Netherlands)

    Bialasiewicz, Seweryn; Rockett, Rebecca; Whiley, David W.; Abed, Yacine; Allander, Tobias; Binks, Michael; Boivin, Guy; Cheng, Allen C.; Chung, Ju-Young; Ferguson, Patricia E.; Gilroy, Nicole M.; Leach, Amanda J.; Lindau, Cecilia; Rossen, John W.; Sorrell, Tania C.; Nissen, Michael D.; Sloots, Theo P.

    2010-01-01

    Exploration of the genetic diversity of WU polyomavirus (WUV) has been limited in terms of the specimen numbers and particularly the sizes of the genomic fragments analyzed. Using whole-genome sequencing of 48 WUV strains collected in four continents over a 5-year period and 16 publicly available wh

  9. Whole Genome Selection Project Involving 2,000 Industry AI Sires

    Science.gov (United States)

    Whole genome selection (WGS) uses markers spanning the genome to predict genetic merit for economically important traits. WGS may increase the rate of genetic progress through improved accuracy and reduced generation interval especially for traits that cannot be measured on breeding animals. In cont...

  10. Whole-Genome Sequence of the Cheese Isolate Streptococcus macedonicus 679

    Science.gov (United States)

    Mavrogonatou, Eleni; Bolotin, Alexander; Tsakalidou, Effie

    2016-01-01

    It is well recognized that Streptococcus macedonicus can populate artisanal fermented foods, especially those of dairy origin. However, the safety of S. macedonicus remains to be established. Here, we present the whole-genome sequence of strain 679, which was isolated from a French uncooked semihard cheese made with cow milk. PMID:27660795

  11. Whole-Genome Sequence of Aeromonas hydrophila Strain AH-1 (Serotype O11).

    Science.gov (United States)

    Forn-Cuní, Gabriel; Tomás, Juan M; Merino, Susana

    2016-01-01

    Aeromonas hydrophila is an emerging pathogen of aquatic and terrestrial animals, including humans. Here, we report the whole-genome sequence of the septicemic A. hydrophila AH-1 strain, belonging to the serotype O11, and the first mesophilic Aeromonas with surface layer (S-layer) to be sequenced. PMID:27587829

  12. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    Directory of Open Access Journals (Sweden)

    Kok-Gan Chan

    2016-03-01

    Full Text Available Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  13. Whole genome scan to detect quantitative trait loci for bovine milk protein composition

    NARCIS (Netherlands)

    Schopen, G.C.B.; Koks, P.D.; Arendonk, van J.A.M.; Bovenhuis, H.; Visker, M.H.P.W.

    2009-01-01

    The objective of this study was to perform a whole genome scan to detect quantitative trait loci (QTL) for milk protein composition in 849 Holstein–Friesian cows originating from seven sires. One morning milk sample was analysed for the major milk proteins using capillary zone electrophoresis. A gen

  14. Clinical Application of Whole Genome Sequencing In Patients with Primary Immunodeficiency

    Science.gov (United States)

    Mousallem, Talal; Urban, Thomas J.; McSweeney, K. Melodi; Kleinstein, Sarah E.; Zhu, Mingfu; Adeli, Mehdi; Parrott, Roberta E.; Roberts, Joseph L.; Krueger, Brian; Buckley, Rebecca H.; Goldstein, David B

    2016-01-01

    Summary This report illustrates the value of whole genome sequencing (WGS) in elucidating the genetic cause of disease in patients with primary immunodeficiency (PID). As sequencing costs decline, we predict that utilization of next generation sequencing (NGS) in the clinical setting will increase. PMID:25981738

  15. Diagnosis of Capnocytophaga canimorsus Sepsis by Whole-Genome Next-Generation Sequencing

    Science.gov (United States)

    Abril, Maria K.; Barnett, Adam S.; Wegermann, Kara; Fountain, Eric; Strand, Andrew; Heyman, Benjamin M.; Blough, Britton A.; Swaminathan, Aparna C.; Sharma-Kuinkel, Batu; Ruffin, Felicia; Alexander, Barbara D.; McCall, Chad M.; Costa, Sylvia F.; Arcasoy, Murat O.; Hong, David K.; Blauwkamp, Timothy A.; Kertesz, Michael; Fowler, Vance G.; Kraft, Bryan D.

    2016-01-01

    We report the case of a 60-year-old man with septic shock due to Capnocytophaga canimorsus that was diagnosed in 24 hours by a novel whole-genome next-generation sequencing assay. This technology shows great promise in identifying fastidious pathogens, and, if validated, it has profound implications for infectious disease diagnosis.

  16. Whole-Genome Shotgun Sequence of Pseudomonas viridiflava, a Bacterium Species Pathogenic to Arabidopsis thaliana

    OpenAIRE

    Lefort, Francois; Calmin, Gautier; Crovadore, Julien; Osteras, Magne; Farinelli, Laurent

    2013-01-01

    We report here the first whole-genome shotgun sequence of Pseudomonas viridiflava strain UASWS38, a bacterium species pathogenic to the biological model plant Arabidopsis thaliana but also usable as a biological control agent and thus of great scientific interest for understanding the genetics of plant-microbe interactions.

  17. Whole-Genome Shotgun Sequence of Rhodococcus Species Strain JVH1

    OpenAIRE

    Brooks, Shannon L.; Van Hamme, Jonathan D.

    2012-01-01

    Here we present a whole-genome shotgun sequence of Rhodococcus species strain JVH1, an organism capable of degrading a variety of organosulfur compounds. In particular, JVH1 is able to selectively cleave carbon-sulfur bonds within alkyl chains. A large number of oxygenases were identified, consistent with other members of the genus.

  18. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity.

    Science.gov (United States)

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. PMID:26981378

  19. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc;

    2015-01-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected wi...

  20. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    OpenAIRE

    Kok-Gan Chan; Wai-Fong Yin; Xin-Yue Chan

    2015-01-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  1. Draft Whole-Genome Sequence of the Type Strain Bacillus horikoshii DSM 8719

    Science.gov (United States)

    Hernández-González, Ismael L.

    2016-01-01

    Members of the Bacillus genus have been extensively studied because of their ability to produce enzymes with high biotechnological value. Here, we report the draft of the whole-genome sequence of the type strain Bacillus horikoshii DSM 8719, an alkali-tolerant strain. PMID:27417833

  2. WIDE-CROSS WHOLE-GENOME RADIATION HYBRID MAPPING OF THE COTTON (GOSSYPIUM BARBADENSE L.) GENOME

    Science.gov (United States)

    Whole-genome radiation hybrid mapping has been applied extensively to human and certain animal species but little to plants. We recently demonstrated an alternative mapping approach in cotton (Gossypium hirsutum L.) based on segmentation by 5-krad gamma-irradiation and derivation of wild-cross whol...

  3. A whole-genome assembly of the domestic cow, Bos taurus

    Science.gov (United States)

    Background: The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods. Results: We have assembled the 35 million sequence reads and applied a variety of assembly improvement techniques, creating an assembly of 2.86 billion b...

  4. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin;

    2013-01-01

    BACKGROUND:Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  5. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014

    Science.gov (United States)

    Gosciminski, Michael; Miller, Adam

    2016-01-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  6. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity.

    Science.gov (United States)

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  7. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014.

    Science.gov (United States)

    Barkley, Jonathan S; Gosciminski, Michael; Miller, Adam

    2016-08-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  8. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    OpenAIRE

    Singh, Pallavi; Springman, A. Cody; Davies, H Dele; Manning, Shannon D.

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources.

  9. A high-resolution whole genome radiation hybrid map of human chromosome 17q22-q25.3 across the genes for GH and TK

    Energy Technology Data Exchange (ETDEWEB)

    Foster, J.W.; Schafer, A.J.; Critcher, R. [Univ. of Cambridge (United Kingdom)] [and others]

    1996-04-15

    We have constructed a whole genome radiation hybrid (WG-RH) map across a region of human chromosome 17q, from growth hormone (GH) to thymidine kinase (TK). A panel of 128 WG-RH hybrid cell lines generated by X-irradiation and fusion has been tested for the retention of 39 sequence-tagged site (STS) markers by the polymerase chain reaction. This genome mapping technique has allowed the integration of existing VNTR and microsatellite markers with additional new markers and existing STS markers previously mapped to this region by other means. The WG-RH map includes eight expressed sequence tag (EST) and three anonymous markers developed for this study, together with 23 anonymous microsatellites and five existing ESTs. Analysis of these data resulted in a high-density comprehensive map across this region of the genome. A subset of these markers has been used to produce a framework map consisting of 20 loci ordered with odds greater than 1000:1. The markers are of sufficient density to build a YAC contig across this region based on marker content. We have developed sequence tags for both ends of a 2.1-Mb YAC and mapped these using the WG-RH panel, allowing a direct comparison of cRay₆₀₀₀ to physical distance. 31 refs., 3 figs., 2 tabs.

  10. Generation of physical map contig-specific sequences useful for whole genome sequence scaffolding.

    Directory of Open Access Journals (Sweden)

    Yanliang Jiang

    Full Text Available Along with the rapid advances of the nextgen sequencing technologies, more and more species are added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences along with the currently available BAC end sequences were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link the physical map, genetic linkage map and draft whole genome sequences, and consequently have the capability to improve whole genome sequence assembly and scaffolding, and to improve genome-wide comparative analysis as well. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge.

  11. Whole-Genome Resequencing and Transcriptomic Analysis to Identify Genes Involved in Leaf-Color Diversity in Ornamental Rice Plants

    Science.gov (United States)

    Shin, Younhee; Lim, Hye-Min; Lee, Gang-Seob; Kim, A-Ram; Lee, Tae-Ho; Lee, Jae-Hee; Park, Dong-Suk; Yoo, Seungil; Kim, Yong-Hwan; Kim, Yong-Kab

    2015-01-01

    Rice field art is a large-scale art form in which people design rice fields using various kinds of ornamental rice plants with different leaf colors. Leaf color-related genes play an important role in the study of chlorophyll biosynthesis, chloroplast structure and function, and anthocyanin biosynthesis. Despite the role of different metabolites in the traditional relationship between leaf and color, comprehensive color-specific metabolite studies of ornamental rice have been limited. We performed whole-genome resequencing and transcriptomic analysis of regulatory patterns and genetic diversity among different rice cultivars to discover new genetic mechanisms that promote enhanced levels of various leaf colors. We resequenced the genomes of 10 rice leaf-color accessions to an average of 40× read depth and >95% coverage and performed 30 RNA-seq experiments using the 10 rice accessions sampled at three developmental stages. The sequencing results yielded a total of 1,814 × 10⁶ reads and identified an average of 713,114 SNPs per rice accession. Based on our analysis of the DNA variation and gene expression, we selected 47 candidate genes. We used an integrated analysis of the whole-genome resequencing data and the RNA-seq data to divide the candidate genes into two groups: genes related to macronutrient (i.e., magnesium and sulfur) transport and genes related to flavonoid pathways, including anthocyanidin biosynthesis. We verified the candidate genes with quantitative real-time PCR using transgenic T-DNA insertion mutants. Our study demonstrates the potential of integrated screening methods combined with genetic-variation and transcriptomic data to isolate genes involved in complex biosynthetic networks and pathways. PMID:25897514

  12. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    Science.gov (United States)

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells, sequencing data is available in the Gene Expression Omnibus (GEO) database under accession number GSE69471.

  13. Whole-genome sequencing reveals a link between β-lactam resistance and synthetases of the alarmone (p)ppGpp in Staphylococcus aureus.

    Science.gov (United States)

    Mwangi, Michael M; Kim, Choonkeun; Chung, Marilyn; Tsai, Jennifer; Vijayadamodar, Govindan; Benitez, Michelle; Jarvie, Thomas P; Du, Lei; Tomasz, Alexander

    2013-06-01

    The overwhelming majority of methicillin-resistant Staphylococcus aureus (MRSA) clinical isolates exhibit a peculiar heterogeneous resistance to β-lactam antibiotics: in cultures of such strains, the majority of cells display only a low level of methicillin resistance, often close to the MIC breakpoint of susceptible strains. Yet, in the same cultures, subpopulations of bacteria exhibiting very high levels of resistance are also present with variable frequencies, which are characteristic of the particular MRSA lineage. The mechanism of heterogeneous resistance is not understood. We describe here an experimental system for exploring the mechanism of heterogeneous resistance. Copies of the resistance gene mecA cloned into a temperature-sensitive plasmid were introduced into the fully sequenced methicillin-susceptible clinical isolate S. aureus strain 476. Transductants of strain 476 expressed methicillin resistance in a heterogeneous fashion: the great majority of cells showed only low MIC (0.75 μg/ml) for the antibiotic, but a minority population of highly resistant bacteria (MIC >300 μg/ml) was also present with a frequency of ~10⁻⁴. The genetic backgrounds of the majority and minority cells were compared by whole-genome sequencing: the only differences detectable were two point mutations in relA of the highly resistant minority population of bacteria. The relA gene codes for the synthesis of (p)ppGpp, an effector of the stringent stress response. Titration of (p)ppGpp showed increased amounts of this effector in the highly resistant cells. Involvement of (p)ppGpp synthesis genes may explain some of the perplexing aspects of β-lactam resistance in MRSA, since many environmental and genetic changes can modulate cellular levels of (p)ppGpp.

  14. Increased frequency of single base substitutions in a population of transcripts expressed in cancer cells

    Directory of Open Access Journals (Sweden)

    Bianchetti Laurent

    2012-11-01

    Full Text Available Abstract Background Single Base Substitutions (SBS) that alter transcripts expressed in cancer originate from somatic mutations. However, recent studies report SBS in transcripts that are not supported by the genomic DNA of tumor cells. Methods We used sequence based whole genome expression profiling, namely Long-SAGE (L-SAGE) and Tag-seq (a combination of L-SAGE and deep sequencing), and computational methods to identify transcripts with greater SBS frequencies in cancer. Millions of tags produced by 40 healthy and 47 cancer L-SAGE experiments were compared to 1,959 Reference Tags (RT), i.e. tags matching the human genome exactly once. Similarly, tens of millions of tags produced by 7 healthy and 8 cancer Tag-seq experiments were compared to 8,572 RT. For each transcript, SBS frequencies in healthy and cancer cells were statistically tested for equality. Results In the L-SAGE and Tag-seq experiments, 372 and 4,289 transcripts, respectively, showed greater SBS frequencies in cancer. Increased SBS frequencies could not be attributed to known Single Nucleotide Polymorphisms (SNP), catalogued somatic mutations or RNA-editing enzymes. Hypothesizing that Single Tags (ST), i.e. tags sequenced only once, were indicators of SBS, we observed that ST proportions were heterogeneously distributed across Embryonic Stem Cells (ESC), healthy differentiated and cancer cells. ESC had the lowest ST proportions, whereas cancer cells had the greatest. Finally, in a series of experiments carried out on a single patient at 1 healthy and 3 consecutive tumor stages, we could show that SBS frequencies increased during cancer progression. Conclusion If the mechanisms generating the base substitutions could be known, increased SBS frequency in transcripts would be a new useful biomarker of cancer. With the reduction of sequencing cost, sequence based whole genome expression profiling could be used to characterize increased SBS frequency in a patient’s tumor and aid diagnosis.

  15. Applications of the double-barreled data in whole-genome shotgun sequence assembly and analysis

    Institute of Scientific and Technical Information of China (English)

    HAN Yujun; WANG Jing; GU Xiaocheng; YU Jun; LI Songgang; NI Peixiang; LÜ Hong; YE Jia; HU Jianfei; CHEN Chen; HUANG Xiangang; CONG Lijuan; LI Guangyuan

    2005-01-01

    Double-barreled (DB) data have been widely used for the assembly of large genomes. Based on the experience of building the whole-genome working draft of Oryza sativa L. ssp. indica, we present here the prevailing and improved uses of DB data in the assembly procedure and report on novel applications during the subsequent data-mining processes, such as acquiring precise insert fragment information of each clone across the genome, and a new kind of low-cost whole-genome microarray. With the increasing number of organisms being sequenced, we believe that DB data will play an important role both in other assembly procedures and in future genomic studies.

  16. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D;

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls......, for imputing sequence variant genotypes into reference sets for genomic prediction. Run 3.0 included 429 sequences, with 31.8 million variants detected. BayesRC, a new method for genomic prediction, addresses some challenges associated with using the sequence data, and takes advantage of biological...... information. In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant...

  17. Whole genome sequencing of Mycobacterium tuberculosis SB24 isolated from Sabah, Malaysia.

    Science.gov (United States)

    Philip, Noraini; Rodrigues, Kenneth Francis; William, Timothy; John, Daisy Vanitha

    2016-09-01

    Mycobacterium tuberculosis (M. tuberculosis) is the causative agent of tuberculosis (TB), which causes millions of deaths every year. We have sequenced the genome of M. tuberculosis isolated from cerebrospinal fluid (CSF) of a patient diagnosed with tuberculous meningitis (TBM). The isolated strain was referred to as M. tuberculosis SB24. Genomic DNA of M. tuberculosis SB24 was extracted and subjected to whole genome sequencing using the PacBio platform. The draft genome size of M. tuberculosis SB24 was determined to be 4,452,489 bp with a G + C content of 65.6%. The whole genome shotgun project has been deposited in NCBI SRA under the accession number SRP076503. PMID:27556011

  18. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum.

    Directory of Open Access Journals (Sweden)

    Gerda Saxer

    Full Text Available Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76×10⁻⁹, with a Poisson confidence interval of 4.1×10⁻⁹ to 9.5×10⁻⁹, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9×10⁻¹¹, with a Poisson confidence interval ranging from 7.4×10⁻¹³ to 1.6×10⁻¹⁰, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.

  19. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    OpenAIRE

    Hiroshi Katoh; Shin-Ichi Miyata; Hiromitsu Inoue; Toru Iwanami

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative 'C...

  20. Microarray-based whole-genome hybridization as a tool for determining procaryotic species relatedness

    Energy Technology Data Exchange (ETDEWEB)

    Wu, L.; Liu, X.; Fields, M.W.; Thompson, D.K.; Bagwell, C.E.; Tiedje, J. M.; Hazen, T.C.; Zhou, J.

    2008-01-15

    The definition and delineation of microbial species are of great importance and challenge due to the extent of evolution and diversity. Whole-genome DNA-DNA hybridization is the cornerstone for defining procaryotic species relatedness, but obtaining pairwise DNA-DNA reassociation values for a comprehensive phylogenetic analysis of procaryotes is tedious and time consuming. A previously described microarray format containing whole-genomic DNA (the community genome array or CGA) was rigorously evaluated as a high-throughput alternative to the traditional DNA-DNA reassociation approach for delineating procaryotic species relationships. DNA similarities for multiple bacterial strains obtained with the CGA-based hybridization were comparable to those obtained with various traditional whole-genome hybridization methods (r=0.87, P<0.01). Significant linear relationships were also observed between the CGA-based genome similarities and those derived from small subunit (SSU) rRNA gene sequences (r=0.79, P<0.0001), gyrB sequences (r=0.95, P<0.0001) or REP- and BOX-PCR fingerprinting profiles (r=0.82, P<0.0001). The CGA hybridization-revealed species relationships in several representative genera, including Pseudomonas, Azoarcus and Shewanella, were largely congruent with previous classifications based on various conventional whole-genome DNA-DNA reassociation, SSU rRNA and/or gyrB analyses. These results suggest that CGA-based DNA-DNA hybridization could serve as a powerful, high-throughput format for determining species relatedness among microorganisms.

  1. Prospective Whole-Genome Sequencing Enhances National Surveillance of Listeria monocytogenes

    OpenAIRE

    Kwong, Jason C.; Mercoulia, Karolina; Tomita, Takehiro; Easton, Marion; Li, Hua Y.; Bulach, Dieter M.; Stinear, Timothy P.; Seemann, Torsten; Benjamin P Howden

    2016-01-01

    Whole-genome sequencing (WGS) has emerged as a powerful tool for comparing bacterial isolates in outbreak detection and investigation. Here we demonstrate that WGS performed prospectively for national epidemiologic surveillance of Listeria monocytogenes has the capacity to be superior to our current approaches using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable-number tandem-repeat analysis (MLVA), binary typing, and serotyping. Initially 423 ...

  2. Heritability of pulmonary function estimated from pedigree and whole-genome markers

    OpenAIRE

    Klimentidis, Yann C.; Vazquez, Ana I; de los Campos, Gustavo; Allison, David B.; Dransfield, Mark T.; Thannickal, Victor J.

    2013-01-01

    Asthma and chronic obstructive pulmonary disease (COPD) are major worldwide health problems. Pulmonary function testing is a useful diagnostic tool for these diseases, and is known to be influenced by genetic and environmental factors. Previous studies have demonstrated that a substantial proportion of the variation in pulmonary function phenotypes can be explained by familial relationships. The availability of whole-genome single nucleotide polymorphism (SNP) data enables us to further evalu...

  3. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    OpenAIRE

    Chun-nan Dong; Ya-dong Yang; Shu-jin Li; Ya-ran Yang; Xiao-jing Zhang; Xiang-dong Fang; Jiang-wei Yan; Bin Cong

    2016-01-01

    In the case of mass disasters, missing persons and forensic caseworks, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and ident...

  4. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples

    OpenAIRE

    Minsoung Rhee; Yooli K Light; Meagher, Robert J.; Anup K. Singh

    2016-01-01

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template D...

  5. Whole-Genome Sequence of Chlamydia gallinacea Type Strain 08-1274/3

    Science.gov (United States)

    Hölzer, Martin; Laroucau, Karine; Creasy, Heather Huot; Ott, Sandra; Vorimore, Fabien; Bavoil, Patrik M.; Marz, Manja

    2016-01-01

    The recently introduced bacterial species Chlamydia gallinacea is known to occur in domestic poultry and other birds. Its potential as an avian pathogen and zoonotic agent is under investigation. The whole-genome sequence of its type strain, 08-1274/3, consists of a 1,059,583-bp chromosome with 914 protein-coding sequences (CDSs) and a plasmid (p1274) comprising 7,619 bp with 9 CDSs. PMID:27445388

  6. Self-organizing Approach for Automated Gene Identification in Whole Genomes

    OpenAIRE

    Gorban, Alexander N; Zinovyev, Andrey Yu.; Popova, Tatyana G.

    2001-01-01

    An approach that explicitly uses the idea of a distinguished coding phase to identify protein-coding regions (exons) in whole genomes has been proposed. For several genomes, an optimal window length for averaging the GC-content function and calculating codon frequencies has been found. A self-training procedure based on clustering in the multidimensional space of triplet frequencies is proposed. For visualization of data in the space of triplet frequencies, the method of elastic maps was ap...

  7. Colorectal cancer and the human gut microbiome: reproducibility with whole-genome shotgun sequencing

    OpenAIRE

    Emily Vogtmann; Xing Hua; Georg Zeller; Shinichi Sunagawa; Voigt, Anita Y.; Rajna Hercog; Goedert, James J.; Jianxin Shi; Peer Bork; Rashmi Sinha

    2016-01-01

    Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously pub...

  8. Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    OpenAIRE

    Alkan, Can; Eichler, Evan E.; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk

    2007-01-01

    Author Summary Centromeric DNA has been described as the last frontier of genomic sequencing; such regions are typically poorly assembled during the whole-genome shotgun sequence assembly process due to their repetitive complexity. This paper develops a computational algorithm to systematically extract data regarding primate centromeric DNA structure and organization from that ∼5% of sequence that is not included as part of standard genome sequence assemblies. Using this computational approac...

  9. Comparison of Whole-Genome Sequencing and Molecular-Epidemiological Techniques for Clostridium difficile Strain Typing.

    Science.gov (United States)

    Dominguez, Samuel R; Anderson, Lydia J; Kotter, Cassandra V; Littlehorn, Cynthia A; Arms, Lesley E; Dowell, Elaine; Todd, James K; Frank, Daniel N

    2016-09-01

    We analyzed in parallel 27 pediatric Clostridium difficile isolates by repetitive sequence-based polymerase chain reaction (RepPCR), pulsed-field gel electrophoresis (PFGE), and whole-genome next-generation sequencing. Next-generation sequencing distinguished 3 groups of isolates that were indistinguishable by RepPCR and 1 isolate that clustered in the same PFGE group as other isolates. PMID:26407257

  10. Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

    OpenAIRE

    Rieber, Nora; Zapatka, Marc; Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland

    2013-01-01

    The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms...

  11. Analytical validation of whole exome and whole genome sequencing for clinical applications

    OpenAIRE

    Linderman, Michael D.; Brandt, Tracy; Edelmann, Lisa; Jabado, Omar; Kasai, Yumi; Kornreich, Ruth; Mahajan, Milind; Shah, Hardik; Kasarskis, Andrew; Eric E Schadt

    2014-01-01

    Background Whole exome and genome sequencing (WES/WGS) is now routinely offered as a clinical test by a growing number of laboratories. As part of the test design process each laboratory must determine the performance characteristics of the platform, test and informatics pipeline. This report documents one such characterization of WES/WGS. Methods Whole exome and whole genome sequencing was performed on multiple technical replicates of five reference samples using the Illumina HiSeq 2000/2500...

  12. Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

    OpenAIRE

    Nora Rieber; Marc Zapatka; Bärbel Lasitschka; David Jones; Paul Northcott; Barbara Hutter; Natalie Jäger; Marcel Kool; Michael Taylor; Peter Lichter; Stefan Pfister; Stephan Wolf; Benedikt Brors; Roland Eils

    2013-01-01

    The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics' technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms...

  13. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    OpenAIRE

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-01-01

    Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from la...

  14. A whole-genome, radiation hybrid mapping resource of hexaploid wheat.

    Science.gov (United States)

    Tiwari, Vijay K; Heesacker, Adam; Riera-Lizarazu, Oscar; Gunn, Hilary; Wang, Shichen; Wang, Yi; Gu, Young Q; Paux, Etienne; Koo, Dal-Hoe; Kumar, Ajay; Luo, Ming-Cheng; Lazo, Gerard; Zemetra, Robert; Akhunov, Eduard; Friebe, Bernd; Poland, Jesse; Gill, Bikram S; Kianian, Shahryar; Leonard, Jeffrey M

    2016-04-01

    Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 Gb) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole-genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics-based approaches. However, comparative genomics approaches are limited to syntenic inference, and recombination is suppressed within the pericentromeric regions of wheat chromosomes; thus, precise ordering of physical maps and sequenced contigs across the whole genome using these approaches is nearly impossible. We developed a whole-genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high-density single nucleotide polymorphism (SNP) array. At the whole-genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500. The 7296 unique mapping bins provided a five- to eight-fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low-cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS-WGRH) will be useful for anchoring and ordering sequenced BAC and NGS-based contigs for assembling a high-quality reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species. PMID:26945524

  15. Using Mendelian inheritance errors as quality control criteria in whole genome sequencing data set

    OpenAIRE

    Pilipenko, Valentina V; He, Hua; Kurowski, Brad G.; Alexander, Eileen S.; Zhang, Xue; Ding, Lili; Mersha, Tesfaye B.; Kottyan, Leah; Fardo, David W.; Martin, Lisa J.

    2014-01-01

    Although the technical and analytic complexity of whole genome sequencing is generally appreciated, best practices for data cleaning and quality control have not been defined. Family based data can be used to guide the standardization of specific quality control metrics in nonfamily based data. Given the low mutation rate, Mendelian inheritance errors are likely as a result of erroneous genotype calls. Thus, our goal was to identify the characteristics that determine Mendelian inheritance err...

  16. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity

    OpenAIRE

    Lee, Kyung-Tai; Chung, Won-Hyong; Lee, Sung-Yeoun; Choi, Jung-Woo; Kim, Jiwoong; Lim, Dajeong; Lee, Seunghwan; Jang, Gul-Won; Kim, Bumsoo; Choy, Yun Ho; Liao, Xiaoping; Stothard, Paul; Moore, Stephen S; Lee, Sang-Heon; Ahn, Sungmin

    2013-01-01

    Background Hanwoo (Korean cattle), which originated from natural crossbreeding between taurine and zebu cattle, migrated to the Korean peninsula through North China. Hanwoo were raised as draft animals until the 1970s without the introduction of foreign germplasm. Since 1979, Hanwoo has been bred as beef cattle. Genetic variation was analyzed by whole-genome deep resequencing of a Hanwoo bull. The Hanwoo genome was compared to that of two other breeds, Black Angus and Holstein, and genes with...

  17. Screening of whole genome sequences identified high-impact variants for stallion fertility

    OpenAIRE

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-01-01

    Background Stallion fertility is an economically important trait due to the increase of artificial insemination in horses. The availability of whole genome sequence data facilitates identification of rare high-impact variants contributing to stallion fertility. The aim of our study was to genotype rare high-impact variants retrieved from next-generation sequencing (NGS)-data of 11 horses in order to unravel harmful genetic variants in large samples of stallions. Methods Gene ontology (GO) ter...

  18. Multiple Mutations in Heterogeneous Miltefosine-Resistant Leishmania major Population as Determined by Whole Genome Sequencing

    OpenAIRE

    Adriano C Coelho; Sébastien Boisvert; Angana Mukherjee; Philippe Leprohon; Jacques Corbeil; Marc Ouellette

    2012-01-01

    BACKGROUND: Miltefosine (MF) is the first oral compound used in the chemotherapy against leishmaniasis. Since the mechanism of action of this drug and the targets of MF in Leishmania are unclear, we generated in a step-by-step manner Leishmania major promastigote mutants highly resistant to MF. Two of the mutants were submitted to a short-read whole genome sequencing for identifying potential genes associated with MF resistance. METHODS/PRINCIPAL FINDINGS: Analysis of the genome assemblies re...

  19. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    Science.gov (United States)

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  20. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    Directory of Open Access Journals (Sweden)

    Huajing Teng

    2016-07-01

    Full Text Available Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  1. Analysis on n-gram statistics and linguistic features of whole genome protein sequences

    Institute of Scientific and Technical Information of China (English)

    DONG Qi-wen; WANG Xiao-long; LIN Lei

    2008-01-01

    To obtain a statistical sequence analysis of the large number of genomic and proteomic sequences available for different organisms, the n-grams of whole genome protein sequences from 20 organisms were extracted. Their linguistic features were analyzed by two tests, the Zipf power law and Shannon entropy, developed for the analysis of natural languages and symbolic sequences. The natural genome proteins and the artificial genome proteins were compared with each other and some statistical features of n-grams were discovered. The results show that: the n-grams of whole genome protein sequences approximately follow the Zipf law when n is larger than 4; the Shannon n-gram entropy of natural genome proteins is lower than that of artificial proteins; a simple unigram model can distinguish different organisms; and there exist organism-specific usages of "phrases" in protein sequences. It is suggested that further detailed analysis of the n-grams of whole genome protein sequences will result in a powerful model for mapping the relationship between protein sequence, structure and function.
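
    The two tests named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (ngram_counts, shannon_entropy, zipf_slope) and the toy sequences are hypothetical, and the Zipf exponent is estimated here by an ordinary least-squares fit on log rank versus log frequency, which is only one of several possible estimators.

```python
import math
from collections import Counter


def ngram_counts(sequences, n):
    """Count overlapping n-grams across a list of protein sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the empirical n-gram distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is what an ideal Zipf distribution would give.
    Assumption: at least two distinct n-grams are present.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    toy = ["MKVLAAGIVGLLLA", "MKKLLAAGAVLLSA"]
    bigrams = ngram_counts(toy, 2)
    print(shannon_entropy(bigrams), zipf_slope(bigrams))
```

    On real proteomes the same functions would be applied per organism and per value of n; comparing natural with shuffled ("artificial") sequences would follow the comparison described in the abstract.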

  2. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data.

    Science.gov (United States)

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. Method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  3. Development and evaluation of whole-genome oligonucleotide array for Acidithiobacillus ferrooxidans ATCC 23270

    Institute of Scientific and Technical Information of China (English)

    LI Qian; SHEN Li; LUO Hai-lang; YIN Hua-qun; LIAO Li-qin; QIU Guan-zhou; LIU Xue-duan

    2008-01-01

    To effectively monitor the characteristics of Acidithiobacillus ferrooxidans ATCC 23270 at the whole-genome level, a whole-genome 50-mer-based oligonucleotide microarray was developed based on the 3,217 ORFs of the A. ferrooxidans ATCC 23270 genome. Based on artificial oligonucleotide probes, the results showed that the optimal hybridization temperature was 45 °C. Specificity tests with purified PCR amplification products of 5 genes (sulfide-quinone reductase, cytochrome C, iron oxidase, mercuric resistance protein, nitrogenase iron protein) of A. ferrooxidans ATCC 23270 indicated that the probes on the arrays appeared to be specific to their corresponding target genes. Based on WGA hybridization to assess global transcriptional differences between A. ferrooxidans ATCC 23270 strains cultured with Fe(II) and S(0), the developed 50-mer WGA could be used for global transcriptome analysis of A. ferrooxidans ATCC 23270. The detection limit was estimated to be approximately 5 ng of genomic DNA, and at 100 ng of DNA, all of the signals reached saturation. In addition, strong linear relationships were observed between hybridization signal intensity and the target DNA concentrations (r² = 0.977 and 0.992). The results indicated that this technology had potential as a specific, sensitive and quantitative tool for detection and identification of the strain A. ferrooxidans ATCC 23270 at the whole-genome level.

  4. Whole-genome sequencing of uropathogenic Escherichia coli reveals long evolutionary history of diversity and virulence.

    Science.gov (United States)

    Lo, Yancy; Zhang, Lixin; Foxman, Betsy; Zöllner, Sebastian

    2015-08-01

    Uropathogenic Escherichia coli (UPEC) are phenotypically and genotypically very diverse. This diversity makes it challenging to understand the evolution of UPEC adaptations responsible for causing urinary tract infections (UTI). To gain insight into the relationship between evolutionary divergence and adaptive paths to uropathogenicity, we sequenced at deep coverage (190×) the genomes of 19 E. coli strains from urinary tract infection patients from the same geographic area. Our sample consisted of 14 UPEC isolates and 5 non-UTI-causing (commensal) rectal E. coli isolates. After identifying strain variants using de novo assembly-based methods, we clustered the strains based on pairwise sequence differences using a neighbor-joining algorithm. We examined evolutionary signals on the whole-genome phylogeny and contrasted these signals with those found on gene trees constructed based on specific uropathogenic virulence factors. The whole-genome phylogeny showed that the divergence between UPEC and commensal E. coli strains without known UPEC virulence factors happened over 32 million generations ago. Pairwise diversity between any two strains was also high, suggesting multiple genetic origins of uropathogenic strains in a small geographic region. Contrasting the whole-genome phylogeny with three gene trees constructed from common uropathogenic virulence factors, we detected no selective advantage of these virulence genes over other genomic regions. These results suggest that UPEC acquired uropathogenicity a long time ago and used it opportunistically to cause extraintestinal infections.

  5. Divergent whole-genome methylation maps of human and chimpanzee brains reveal epigenetic basis of human regulatory evolution.

    Science.gov (United States)

    Zeng, Jia; Konopka, Genevieve; Hunt, Brendan G; Preuss, Todd M; Geschwind, Dan; Yi, Soojin V

    2012-09-01

    DNA methylation is a pervasive epigenetic DNA modification that strongly affects chromatin regulation and gene expression. To date, it remains largely unknown how patterns of DNA methylation differ between closely related species and whether such differences contribute to species-specific phenotypes. To investigate these questions, we generated nucleotide-resolution whole-genome methylation maps of the prefrontal cortex of multiple humans and chimpanzees. Levels and patterns of DNA methylation vary across individuals within species according to the age and the sex of the individuals. We also found extensive species-level divergence in patterns of DNA methylation and that hundreds of genes exhibit significantly lower levels of promoter methylation in the human brain than in the chimpanzee brain. Furthermore, we investigated the functional consequences of methylation differences in humans and chimpanzees by integrating data on gene expression generated with next-generation sequencing methods, and we found a strong relationship between differential methylation and gene expression. Finally, we found that differentially methylated genes are strikingly enriched with loci associated with neurological disorders, psychological disorders, and cancers. Our results demonstrate that differential DNA methylation might be an important molecular mechanism driving gene-expression divergence between human and chimpanzee brains and might potentially contribute to the evolution of disease vulnerabilities. Thus, comparative studies of humans and chimpanzees stand to identify key epigenomic modifications underlying the evolution of human-specific traits. PMID:22922032

  6. Coverage bias and sensitivity of variant calling for four whole-genome sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Nora Rieber

    Full Text Available The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics' technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies' platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. 
    It indicates application areas that call for a specific sequencing platform and disallow other

  7. Combination of Whole Genome Sequencing, Linkage and Functional Studies Implicates a Missense Mutation in Titin as a Cause of Autosomal Dominant Cardiomyopathy with Features of Left Ventricular Non-Compaction

    Science.gov (United States)

    Hooper, Charlotte; Ormondroyd, Liz; Pagnamenta, Alistair; Lise, Stefano; Salatino, Silvia; Knight, Samantha JL; Taylor, Jenny C.; Thomson, Kate L.; Arnold, Linda; Chatziefthimiou, Spyros D.; Konarev, Petr V.; Wilmanns, Matthias; Ehler, Elisabeth; Ghisleni, Andrea; Gautel, Mathias; Blair, Edward; Watkins, Hugh; Gehmlich, Katja

    2016-01-01

    Background High throughput next generation sequencing techniques have made whole genome sequencing accessible in clinical practice, however, the abundance of variation in the human genomes makes the identification of a disease-causing mutation on a background of benign rare variants challenging. Methods and Results Here we combine whole genome sequencing with linkage analysis in a three-generation family affected by cardiomyopathy with features of autosomal dominant left-ventricular non-compaction cardiomyopathy. A missense mutation in the giant protein titin is the only plausible disease-causing variant that segregates with disease amongst the eight surviving affected individuals, with interrogation of the entire genome excluding other potential causes. This A178D missense mutation, affecting a conserved residue in the second immunoglobulin-like domain of titin, was introduced in a bacterially expressed recombinant protein fragment and biophysically characterised in comparison to its wild-type counterpart. Multiple experiments, including size exclusion chromatography, small angle X-ray scattering and circular dichroism spectroscopy suggest partial unfolding and domain destabilisation in the presence of the mutation. Moreover, binding experiments in mammalian cells show that the mutation markedly impairs binding to the titin ligand telethonin. Conclusions Here we present genetic and functional evidence implicating the novel A178D missense mutation in titin as the cause of a highly penetrant familial cardiomyopathy with features of left-ventricular non-compaction. This expands the spectrum of titin’s roles in cardiomyopathies. It furthermore highlights that rare titin missense variants, currently often ignored or left un-interpreted, should be considered to be relevant for cardiomyopathies and can be identified by the approach presented here. PMID:27625337

  8. Increased Proportion of Variance Explained and Prediction Accuracy of Survival of Breast Cancer Patients with Use of Whole-Genome Multiomic Profiles.

    Science.gov (United States)

    Vazquez, Ana I; Veturi, Yogasudha; Behring, Michael; Shrestha, Sadeep; Kirst, Matias; Resende, Marcio F R; de Los Campos, Gustavo

    2016-07-01

    Whole-genome multiomic profiles hold valuable information for the analysis and prediction of disease risk and progression. However, integrating high-dimensional multilayer omic data into risk-assessment models is statistically and computationally challenging. We describe a statistical framework, the Bayesian generalized additive model (BGAM), and present software for integrating multilayer high-dimensional inputs into risk-assessment models. We used BGAM and data from The Cancer Genome Atlas for the analysis and prediction of survival after diagnosis of breast cancer. We developed a sequence of studies to (1) compare predictions based on single omics with those based on clinical covariates commonly used for the assessment of breast cancer patients (COV), (2) evaluate the benefits of combining COV and omics, (3) compare models based on (a) COV and gene expression profiles from oncogenes with (b) COV and whole-genome gene expression (WGGE) profiles, and (4) evaluate the impacts of combining multiple omics and their interactions. We report that (1) WGGE profiles and whole-genome methylation (METH) profiles offer more predictive power than any of the COV commonly used in clinical practice (e.g., subtype and stage), (2) adding WGGE or METH profiles to COV increases prediction accuracy, (3) the predictive power of WGGE profiles is considerably higher than that based on expression from large-effect oncogenes, and (4) the gain in prediction accuracy when combining multiple omics is consistent. Our results show the feasibility of omic integration and highlight the importance of WGGE and METH profiles in breast cancer, achieving gains of up to 7 points in area under the curve (AUC) over the COV in some cases. PMID:27129736

  9. Whole-Genome Sequencing Allows for Improved Identification of Persistent Listeria monocytogenes in Food-Associated Environments

    OpenAIRE

    Stasiewicz, Matthew J.; Oliver, Haley F; Wiedmann, Martin; den Bakker, Henk C

    2015-01-01

    While the food-borne pathogen Listeria monocytogenes can persist in food associated environments, there are no whole-genome sequence (WGS) based methods to differentiate persistent from sporadic strains. Whole-genome sequencing of 188 isolates from a longitudinal study of L. monocytogenes in retail delis was used to (i) apply single-nucleotide polymorphism (SNP)-based phylogenetics for subtyping of L. monocytogenes, (ii) use SNP counts to differentiate persistent from repeatedly reintroduced ...

  10. Sequence determination from overlapping fragments: a simple model of whole-genome shotgun sequencing.

    Science.gov (United States)

    Derrida, Bernard; Fink, Thomas M A

    2002-02-11

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments. PMID:11863859
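
    The recovery question in this abstract can be explored numerically. The sketch below is a simplified Monte Carlo illustration, not the authors' exact solution: it assumes a finite alphabet, fragments of fixed length with a constant overlap, and a chaining assembler that gives up whenever an overlap is ambiguous. All function names and parameters are hypothetical.

```python
import random


def fragment(seq, length, overlap):
    """Cut seq into fixed-length fragments that overlap by a constant amount."""
    step = length - overlap
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


def assemble(fragments, overlap):
    """Rebuild the sequence by chaining unique suffix-to-prefix matches.

    Returns None when an overlap is ambiguous or the chain does not use
    every fragment exactly once.
    """
    by_prefix = {}
    for f in fragments:
        by_prefix.setdefault(f[:overlap], []).append(f)
    suffixes = [f[-overlap:] for f in fragments]
    # The first fragment is the only one whose prefix is no fragment's suffix.
    starts = [f for f in fragments if f[:overlap] not in suffixes]
    if len(starts) != 1:
        return None
    current, result, used = starts[0], starts[0], 1
    while True:
        candidates = by_prefix.get(current[-overlap:], [])
        if not candidates:
            break
        if len(candidates) > 1 or used >= len(fragments):
            return None  # ambiguous overlap, or the chain loops back on itself
        current = candidates[0]
        result += current[overlap:]
        used += 1
    return result if used == len(fragments) else None


def recovery_rate(alphabet_size, n_fragments, length, overlap, trials, seed=0):
    """Fraction of random sequences that the chaining assembler recovers."""
    rng = random.Random(seed)
    seq_len = length + (n_fragments - 1) * (length - overlap)
    recovered = 0
    for _ in range(trials):
        seq = tuple(rng.randrange(alphabet_size) for _ in range(seq_len))
        frags = fragment(seq, length, overlap)
        rng.shuffle(frags)  # the assembler must not rely on fragment order
        recovered += assemble(frags, overlap) == seq
    return recovered / trials
```

    Under these assumptions, increasing `alphabet_size` makes accidental repeats of an overlap rarer, so the recovery rate should rise toward 1, which is the intuition behind treating the infinite-alphabet case separately.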

  11. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma

    OpenAIRE

    Gartner, Jared J; Stephen C. J. Parker; Prickett, Todd D.; Dutton-Regester, Ken; Stitzel, Michael L.; Lin, Jimmy C.; Davis, Sean; Simhadri, Vijaya L.; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Jamie K. Teer; Wei, Xiaomu; Morken, Mario A; Umesh K Bhanot

    2013-01-01

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683–691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This muta...

  12. Two Methods of Whole-Genome Amplification Enable Accurate Genotyping Across a 2320-SNP Linkage Panel

    OpenAIRE

    Barker, David L.; Hansen, Mark S. T.; Faruqi, A. Fawad; Giannola, Diane; Irsula, Orlando R.; Lasken, Roger S; Latterich, Martin; Makarov, Vladimir; Oliphant, Arnold; Pinter, Jonathon H.; Shen, Richard; Sleptsova, Irina; Ziehler, William; Lai, Eric

    2004-01-01

    Comprehensive genome scans involving many thousands of SNP assays will require significant amounts of genomic DNA from each sample. We report two successful methods for amplifying whole-genomic DNA prior to SNP analysis, multiple displacement amplification, and OmniPlex technology. We determined the coverage of amplification by analyzing a SNP linkage marker set that contained 2320 SNP markers spread across the genome at an average distance of 2.5 cM. We observed a concordance of >99.8% in ge...

  13. The First Kazakh Whole Genomes: The First Report of NGS Data

    Directory of Open Access Journals (Sweden)

    Ainur Akilzhanova

    2014-12-01

    Full Text Available Introduction: The human genome sequence will underpin human biology and medicine in the next century, providing a single, essential reference to all genetic information. Extraordinary technological advances and decreases in the cost of DNA sequencing have made whole genome sequencing (WGS) feasible as a highly accessible test for numerous indications. The international project “Genetic architecture of Kazakh population” is well underway to determine the complete DNA. Next generation sequencing is a powerful tool for genetic analysis, which will enable us to uncover the association of loci at specific sites in the genome associated with disease. The aim of this study was to introduce the first data on WGS of 6 Kazakh individuals. Methods: This pilot study is among the first WGS performed on 6 healthy Kazakh individuals, using the next generation sequencing platform HiSeq2000 (Illumina) according to the manufacturer’s protocols. All generated *.bcl files were simultaneously converted and demultiplexed using the bcl2fasta application. Alignment of sequence reads was performed using bwa-mem against the human hg19 reference genome. Sorting, removal of intermediate files, assembly of *.bam files, and duplicate marking were performed using the PicardTools package. The GATK haplotype caller tool was used for variant calling. ClinVar, SNPedia, and Cosmic databases were processed to identify clinical genomic variants in 6 Kazakh whole genomes. Java Runtime Environment and R Bioconductor packages were installed to perform raw data processing and run program scripts. Results: Sequence alignment and mapping to the hg19 reference genome were completed for each of the 6 healthy Kazakh individuals. Between 87,308,581,400 and 107,526,741,301 total base pairs were sequenced, with an average coverage of 29.85x. Between 98.85% and 99.58% of base pairs were mapped, and on average 96.07% were properly paired. Het/Hom and Ti/Tv ratios for each whole genome ranged from 1.35 to 1.52 and

  14. Sequence Determination from Overlapping Fragments: A Simple Model of Whole-Genome Shotgun Sequencing

    Science.gov (United States)

    Derrida, Bernard; Fink, Thomas M.

    2002-02-01

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments.

  15. Reflections on the cost of "low-cost" whole genome sequencing: framing the health policy debate.

    Directory of Open Access Journals (Sweden)

    Timothy Caulfield

    2013-11-01

    Full Text Available The cost of whole genome sequencing is dropping rapidly. There has been a great deal of enthusiasm about the potential for this technological advance to transform clinical care. Given the interest and significant investment in genomics, this seems an ideal time to consider what the evidence tells us about potential benefits and harms, particularly in the context of health care policy. The scale and pace of adoption of this powerful new technology should be driven by clinical need, clinical evidence, and a commitment to put patients at the centre of health care policy.

  16. A green-cotyledon/stay-green mutant exemplifies the ancient whole-genome duplications in soybean.

    Science.gov (United States)

    Nakano, Michiharu; Yamada, Tetsuya; Masuda, Yu; Sato, Yutaka; Kobayashi, Hideki; Ueda, Hiroaki; Morita, Ryouhei; Nishimura, Minoru; Kitamura, Keisuke; Kusaba, Makoto

    2014-10-01

    The recent whole-genome sequencing of soybean (Glycine max) revealed that soybean experienced whole-genome duplications 59 million and 13 million years ago, and it has an octoploid-like genome in spite of its diploid nature. We analyzed a natural green-cotyledon mutant line, Tenshin-daiseitou. The physiological analysis revealed that Tenshin-daiseitou shows a non-functional stay-green phenotype in senescent leaves, which is similar to that of the mutant of Mendel's green-cotyledon gene I, the ortholog of SGR in pea. The identification of gene mutations and genetic segregation analysis suggested that defects in GmSGR1 and GmSGR2 were responsible for the green-cotyledon/stay-green phenotype of Tenshin-daiseitou, which was confirmed by RNA interference (RNAi) transgenic soybean experiments using GmSGR genes. The characterized green-cotyledon double mutant d1d2 was found to have the same mutations, suggesting that GmSGR1 and GmSGR2 are D1 and D2. Among the examined d1d2 strains, the d1d2 strain K144a showed a lower Chl a/b ratio in mature seeds than other strains but not in senescent leaves, suggesting a seed-specific genetic factor of the Chl composition in K144a. Analysis of the soybean genome sequence revealed four genomic regions with microsynteny to the Arabidopsis SGR1 region, which included the GmSGR1 and GmSGR2 regions. The other two regions contained GmSGR3a/GmSGR3b and GmSGR4, respectively, which might be pseudogenes or genes with a function that is unrelated to Chl degradation during seed maturation and leaf senescence. These GmSGR genes were thought to be produced by the two whole-genome duplications, and they provide a good example of such whole-genome duplication events in the evolution of the soybean genome.

  17. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango

    Directory of Open Access Journals (Sweden)

    Purvi M. Rakhashiya

    2015-12-01

    Full Text Available Actinobacteria, Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with G + C content of 69.80% and contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000.
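
    For readers unfamiliar with the G + C statistic quoted above, here is a minimal Python sketch of how it is commonly computed from a nucleotide string. The function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions, and the example strings are not from the deposited genome.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases in sequence")
    # Ambiguity codes (e.g. N) are excluded from the denominator.
    return 100.0 * gc / acgt
```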

  18. Whole genome shotgun sequence of Bacillus amyloliquefaciens TF28, a biocontrol entophytic bacterium.

    Science.gov (United States)

    Zhang, Shumei; Jiang, Wei; Li, Jing; Meng, Liqiang; Cao, Xu; Hu, Jihua; Liu, Yushuai; Chen, Jingyu; Sha, Changqing

    2016-01-01

    Bacillus amyloliquefaciens TF28 is a biocontrol endophytic bacterium that is capable of inhibition of a broad range of plant pathogenic fungi. The strain has the potential to be developed into a biocontrol agent for use in agriculture. Here we report the whole-genome shotgun sequence of the strain. The genome size of B. amyloliquefaciens TF28 is 3,987,635 bp which consists of 3754 protein-coding genes, 65 tandem repeat sequences, 47 minisatellite DNA, 2 microsatellite DNA, 63 tRNA, 7 rRNA, 6 sRNA, 3 prophage and CRISPR domains. PMID:27688836

  19. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    OpenAIRE

    Katoh, Hiroshi; Miyata, Shin-ichi; Inoue, Hiromitsu; Iwanami, Toru

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, ‘Candidatus Liberibacter asiaticus’, ‘Ca. L. americanus’, and ‘Ca. L. africanus’. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative ‘C...

  20. Whole genome sequencing in drug discovery research: a one fits all solution?

    OpenAIRE

    Marc Sultan

    2015-01-01

    With the recent availability of Illumina's HiSeq X ten sequencing platform, the cost of whole genome sequencing (WGS) has dropped to nearly $1,000 per genome. The affordability of WGS now has the potential to replace other genotyping platforms such as whole exome sequencing (WES) and array based genotyping for (smaller) clinical study cohorts. In a recent pilot study, we compared the performance and genotyping quality of the HiSeq X WGS approach against WES and array based genotyping with r...

  1. The effect of whole genome amplification on samples originating from more than one donor

    DEFF Research Database (Denmark)

    Thacker, C.R.; Balogh, M.K.; Børsting, Claus;

    2006-01-01

    In this study, the GenomiPhi(TM) DNA Amplification Kit (Amersham Biosciences) was used to investigate the potential of whole genome amplification (WGA) when considering samples originating from more than one donor. DNA was extracted from blood samples, quantified and normalised before being mixed...... found to match the expected peak ratios regardless of the starting concentration of DNA. With samples mixed in the ratio of 1:7 and 1:15, and when the concentration of starting material was at the manufacturer's lower limit, too few minor component peaks were found to allow for statistical analysis...

  2. Refining QTL with high-density SNP genotyping and whole genome sequence in three cattle breeds

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Lund, Mogens Sandø

    2012-01-01

    method. Principal components were used to account for population structure. The QTL segregating in all three breeds were selected and a few of the most significant ones were followed in further analyses. The polymorphisms in the identified QTL regions were imputed using 90 whole genome sequences...... available from these three breeds. Imputations were done using IMPUTE v2.2. Association analyses with imputed polymorphisms were repeated for the targeted regions. The QTL genotypes of the sires with more than 20 sons were determined by an a posteriori granddaughter design. The concordance of sires...

  3. Estrogen Receptor-Mediated Effects of Isoflavone Supplementation Were Not Observed in Whole-Genome Gene Expression Profiles of Peripheral Blood Mononuclear Cells in Postmenopausal, Equol-Producing Women

    NARCIS (Netherlands)

    Velpen, van der V.; Geelen, A.; Schouten, E.G.; Hollman, P.C.H.; Afman, L.A.; Veer, van 't P.

    2013-01-01

    Isoflavones (genistein, daidzein, and glycitein) are suggested to have benefits as well as risks for human health. Approximately one-third of the Western population is able to metabolize daidzein into the more potent metabolite equol. Having little endogenous estradiol, equol-producing postmenopausa

  4. Kernel-based whole-genome prediction of complex traits: a review

    Science.gov (United States)

    Morota, Gota; Gianola, Daniel

    2014-01-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics. PMID:25360145
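
    As a concrete illustration of kernel-based whole-genome regression, the following is a minimal, dependency-free Python sketch of Gaussian-kernel ridge regression on marker genotypes. It is one common instance of the family of methods the review surveys, not a method taken from it: the function names, the kernel parametrisation exp(-bandwidth * squared distance), and the toy genotypes coded 0/1/2 are all assumptions.

```python
import math


def gaussian_kernel(x, z, bandwidth):
    """exp(-bandwidth * squared Euclidean distance) between two marker vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-bandwidth * d2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(markers, phenotypes, bandwidth, ridge):
    """Return the phenotype mean and dual coefficients (K + ridge*I)^-1 (y - mean)."""
    n = len(markers)
    mean = sum(phenotypes) / n
    k = [[gaussian_kernel(markers[i], markers[j], bandwidth)
          + (ridge if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(k, [y - mean for y in phenotypes])
    return mean, alpha


def predict(new, markers, mean, alpha, bandwidth):
    """Predicted genetic value of a new individual from its marker vector."""
    return mean + sum(a * gaussian_kernel(new, x, bandwidth)
                      for a, x in zip(alpha, markers))


# Hypothetical toy data: four individuals, two markers coded 0/1/2.
markers = [[0, 0], [2, 2], [0, 2], [1, 1]]
phenotypes = [1.0, 3.0, 2.5, 1.5]
mean, alpha = fit(markers, phenotypes, bandwidth=1.0, ridge=1e-8)
```

    In practice the bandwidth and ridge parameters would be tuned by cross-validation; a near-zero ridge, as in the toy call, simply interpolates the training phenotypes and is used here only to keep the example checkable.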

  5. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775

  6. Kernel-based whole-genome prediction of complex traits: a review

    Directory of Open Access Journals (Sweden)

    Gota Morota

    2014-10-01

    Full Text Available Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

  7. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    Science.gov (United States)

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  8. Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing.

    Science.gov (United States)

    Ranjan, Ravi; Rani, Asha; Metwally, Ahmed; McGee, Halvor S; Perkins, David L

    2016-01-22

    The human microbiome has emerged as a major player in regulating human health and disease. Translational studies of the microbiome have the potential to indicate clinical applications such as fecal transplants and probiotics. However, one major issue is accurate identification of microbes constituting the microbiota. Studies of the microbiome have frequently utilized sequencing of the conserved 16S ribosomal RNA (rRNA) gene. We present a comparative study of an alternative approach using whole genome shotgun sequencing (WGS). In the present study, we analyzed the human fecal microbiome compiling a total of 194.1 × 10^6 reads from a single sample using multiple sequencing methods and platforms. Specifically, after establishing the reproducibility of our methods with extensive multiplexing, we compared: 1) The 16S rRNA amplicon versus the WGS method, 2) the Illumina HiSeq versus MiSeq platforms, 3) the analysis of reads versus de novo assembled contigs, and 4) the effect of shorter versus longer reads. Our study demonstrates that whole genome shotgun sequencing has multiple advantages compared with the 16S amplicon method including enhanced detection of bacterial species, increased detection of diversity and increased prediction of genes. In addition, increased length, either due to longer reads or the assembly of contigs, improved the accuracy of species detection.

  9. Whole-Genome Mapping as a Novel High-Resolution Typing Tool for Legionella pneumophila

    Science.gov (United States)

    Euser, Sjoerd M.; Landman, Fabian; Bruin, Jacob P.; IJzerman, Ed P.; den Boer, Jeroen W.; Schouls, Leo M.

    2015-01-01

    Legionella is the causative agent for Legionnaires' disease (LD) and is responsible for several large outbreaks in the world. More than 90% of LD cases are caused by Legionella pneumophila, and studies on the origin and transmission routes of this pathogen rely on adequate molecular characterization of isolates. Current typing of L. pneumophila mainly depends on sequence-based typing (SBT). However, studies have shown that in some outbreak situations, SBT does not have sufficient discriminatory power to distinguish between related and nonrelated L. pneumophila isolates. In this study, we used a novel high-resolution typing technique, called whole-genome mapping (WGM), to differentiate between epidemiologically related and nonrelated L. pneumophila isolates. Assessment of the method by various validation experiments showed highly reproducible results, and WGM was able to confirm two well-documented Dutch L. pneumophila outbreaks. Comparison of whole-genome maps of the two outbreaks together with WGMs of epidemiologically nonrelated L. pneumophila isolates showed major differences between the maps, and WGM yielded a higher discriminatory power than SBT. In conclusion, WGM can be a valuable alternative to perform outbreak investigations of L. pneumophila in real time since the turnaround time from culture to comparison of the L. pneumophila maps is less than 24 h. PMID:26202110

  10. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution1,2. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes3,4. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  11. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  12. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Directory of Open Access Journals (Sweden)

    Can Alkan

    2007-09-01

    Full Text Available The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  13. From days to hours: reporting clinically actionable variants from whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Sumit Middha

    Full Text Available As the cost of whole genome sequencing (WGS) decreases, clinical laboratories will be looking at broadly adopting this technology to screen for variants of clinical significance. To fully leverage this technology in a clinical setting, results need to be reported quickly, as the turnaround rate could potentially impact patient care. The latest sequencers can sequence a whole human genome in about 24 hours. However, depending on the computing infrastructure available, the processing of data can take several days, with the majority of computing time devoted to aligning reads to genomic regions that are to date not clinically interpretable. In an attempt to accelerate the reporting of clinically actionable variants, we have investigated the utility of a multi-step alignment algorithm focused on aligning reads and calling variants in genomic regions of clinical relevance prior to processing the remaining reads on the whole genome. This iterative workflow significantly accelerates the reporting of clinically actionable variants with no loss of accuracy when compared to genotypes obtained with the OMNI SNP platform or to variants detected with a standard workflow that combines Novoalign and GATK.

  14. Phased whole-genome genetic risk in a family quartet using a major allele reference sequence.

    Directory of Open Access Journals (Sweden)

    Frederick E Dewey

    2011-09-01

    Full Text Available Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.

  15. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Full Text Available Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  16. Whole genome duplication affects evolvability of flowering time in an autotetraploid plant.

    Directory of Open Access Journals (Sweden)

    Sara L Martin

    Full Text Available Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids ($\hat{b}_T$ = 0.31) than diploids ($\hat{b}_T$ = 0.40). Neotetraploids exhibited the highest evolutionary response ($\hat{b}_T$ = 0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes.
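
    The abstract above reports evolvability as realized heritability. As a rough illustration only, realized heritability is commonly estimated as the slope of cumulative response to selection on cumulative selection differential; whether the authors used exactly this regression is an assumption here, and the function name and the sample values below are hypothetical.

```python
def realized_heritability(cum_selection_differential, cum_response):
    """Least-squares slope of cumulative response on cumulative selection differential.

    Illustrative sketch: the inputs are hypothetical per-generation cumulative
    values, not data from the study.
    """
    n = len(cum_selection_differential)
    if n != len(cum_response) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_s = sum(cum_selection_differential) / n
    mean_r = sum(cum_response) / n
    # Sum of squares of the predictor and cross-products with the response.
    sxx = sum((s - mean_s) ** 2 for s in cum_selection_differential)
    if sxx == 0:
        raise ValueError("selection differentials must vary")
    sxy = sum(
        (s - mean_s) * (r - mean_r)
        for s, r in zip(cum_selection_differential, cum_response)
    )
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical values (days earlier flowering, expressed as positive numbers).
    s_values = [0.0, 1.0, 2.0, 3.0, 4.0]
    r_values = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(realized_heritability(s_values, r_values))
```

    A slope close to 1 would mean that nearly all of the applied selection differential is recovered as response in the next generation; a slope close to 0 would mean almost none is.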

  17. High-resolution Whole-Genome Analysis of Skull Base Chordomas Implicates FHIT Loss in Chordoma Pathogenesis

    Directory of Open Access Journals (Sweden)

    Roberto Jose Diaz

    2012-09-01

    Full Text Available Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm.

  18. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing.

    Science.gov (United States)

    Southey, Bruce R; Zhu, Ping; Carr-Markell, Morgan K; Liang, Zhengzheng S; Zayed, Amro; Li, Ruiqiang; Robinson, Gene E; Rodriguez-Zas, Sandra L

    2016-01-01

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruits and scouts was 10.01 and 10.7 X, respectively. Representation of bacterial species among the unmapped reads reflected a more diverse microbiome in scouts than recruits. Overall, 1,412,705 polymorphic positions were analyzed for associations with scouting behavior, and 212 significant (p-value [...] 1000 bp apart from each other. A number of these variants were mapped to ncRNA LOC100578102, solute carrier family 12 member 6-like gene, and LOC100576965 (meprin and TRAF-C homology domain containing gene). Functional categories represented among the genes corresponding to significant variants included: neuronal function, exoskeleton, immune response, salivary gland development, and enzymatic food processing. These categories offer a glimpse into the molecular support to the behaviors of scouts and recruits. The level of association between genomic variants and scouting behavior observed in this study may be linked to the honey bee's genomic plasticity and fluidity of transition between castes. PMID:26784945

  19. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Bruce R Southey

    Full Text Available Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruits and scouts was 10.01 and 10.7 X, respectively. Representation of bacterial species among the unmapped reads reflected a more diverse microbiome in scouts than recruits. Overall, 1,412,705 polymorphic positions were analyzed for associations with scouting behavior, and 212 significant (p-value [...] 1000 bp apart from each other. A number of these variants were mapped to ncRNA LOC100578102, solute carrier family 12 member 6-like gene, and LOC100576965 (meprin and TRAF-C homology domain containing gene). Functional categories represented among the genes corresponding to significant variants included: neuronal function, exoskeleton, immune response, salivary gland development, and enzymatic food processing. These categories offer a glimpse into the molecular support to the behaviors of scouts and recruits. The level of association between genomic variants and scouting behavior observed in this study may be linked to the honey bee's genomic plasticity and fluidity of transition between castes.

  1. Single site suppressors of a fission yeast temperature-sensitive mutant in cdc48 identified by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Irina N Marinova

    Full Text Available The protein called p97 in mammals and Cdc48 in budding and fission yeast is a homo-hexameric, ring-shaped, ubiquitin-dependent ATPase complex involved in a range of cellular functions, including protein degradation, vesicle fusion, DNA repair, and cell division. The cdc48+ gene is essential for viability in fission yeast, and point mutations in the human orthologue have been linked to disease. To analyze the function of p97/Cdc48 further, we performed a screen for cold-sensitive suppressors of the temperature-sensitive cdc48-353 fission yeast strain. In total, 29 independent pseudo revertants that had lost the temperature-sensitive growth defect of the cdc48-353 strain were isolated. Of these, 28 had instead acquired a cold-sensitive phenotype. Since the suppressors were all spontaneous mutants, and not the result of mutagenesis induced by chemicals or UV irradiation, we reasoned that the genome sequences of the 29 independent cdc48-353 suppressors were most likely identical with the exception of the acquired suppressor mutations. This prompted us to test if a whole genome sequencing approach would allow us to map the mutations. Indeed, genome sequencing unambiguously revealed that the cold-sensitive suppressors were all second site intragenic cdc48 mutants. Projecting these onto the Cdc48 structure revealed that while the original temperature-sensitive G338D mutation is positioned near the central pore in the hexameric ring, the suppressor mutations locate to subunit-subunit and inter-domain boundaries. This suggests that Cdc48-353 is structurally compromised at the restrictive temperature, but re-established in the suppressor mutants. The last suppressor was an extragenic frame shift mutation in the ufd1 gene, which encodes a known Cdc48 co-factor. In conclusion, we show, using a novel whole genome sequencing approach, that Cdc48-353 is structurally compromised at the restrictive temperature, but stabilized in the suppressors.

  2. Human embryonic stem cells as a model for cardiac gene discovery : from chip to chap

    NARCIS (Netherlands)

    Beqqali, A.

    2008-01-01

    Here we described the use of human embryonic stem cells (hESCs) as a model to obtain insights into commitment to the mesoderm and endoderm lineages and the early steps in human cardiac cell differentiation by means of whole-genome temporal expression profiling. Furthermore, we used it as an approach

  3. Folic acid supplementation dysregulates gene expression in lymphoblastoid cells--implications in nutrition.

    Science.gov (United States)

    Junaid, Mohammed A; Kuizon, Salomon; Cardona, Juan; Azher, Tayaba; Murakami, Noriko; Pullarkat, Raju K; Brown, W Ted

    2011-09-01

    For over a decade, folic acid (FA) supplementation has been widely prescribed to pregnant women to prevent neural tube closure defects in newborns. Although neural tube closure occurs within the first trimester, high doses of FA are given throughout pregnancy, the physiological consequences of which are unknown. FA can cause epigenetic modification of the cytosine residues in the CpG dinucleotide, thereby affecting gene expression. Dysregulation of crucial gene expression during gestational development may have lifelong adverse effects or lead to neurodevelopmental defects, such as autism. We have investigated the effect of FA supplementation on gene expression in lymphoblastoid cells by whole-genome expression microarrays. The results showed that high FA caused dysregulation, by ≥ four-fold up or down, of more than 1000 genes, including many imprinted genes. The aberrant expression of three genes (FMR1, GPR37L1, TSSK3) was confirmed by Western blot analyses. The level of altered gene expression changed in an FA concentration-dependent manner. We found significant dysregulation in gene expression at concentrations as low as 15 ng/ml, a level that is lower than what has been achieved in the blood through FA fortification guidelines. We found evidence of aberrant promoter methylation in the CpG island of the TSSK3 gene. Excessive FA supplementation may require careful monitoring in women who are planning for, or are in the early stages of, pregnancy. Aberrant expression of genes during early brain development may have an impact on behavioural characteristics. PMID:21867686

  4. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    Science.gov (United States)

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spread across the entire human genome, making the application of whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  5. Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data

    Directory of Open Access Journals (Sweden)

    Fujiyama Asao

    2010-04-01

    Full Text Available Abstract Background Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. Results We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for γ-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. Conclusions The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B

  6. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow-up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30% to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
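
    The read frequency cut-off described in the record above can be pictured with a small filter. This is a minimal sketch, not the authors' in-house pipeline: the `VariantCall` type, its field names, and the choice to treat the 30% threshold as inclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one candidate variant at a genome position."""
    position: int
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position

    @property
    def frequency(self) -> float:
        # Fraction of reads supporting the variant; 0.0 when there is no coverage.
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def filter_by_read_frequency(calls, cutoff=0.30):
    """Keep calls whose variant read frequency is at or above the cutoff."""
    return [call for call in calls if call.frequency >= cutoff]


if __name__ == "__main__":
    calls = [
        VariantCall(position=100, alt_reads=20, total_reads=40),  # 50% of reads
        VariantCall(position=200, alt_reads=5, total_reads=50),   # 10% of reads
        VariantCall(position=300, alt_reads=0, total_reads=0),    # no coverage
    ]
    for call in filter_by_read_frequency(calls):
        print(call.position, call.frequency)
```

    In practice the counts would come from an alignment or a variant-call file; the sketch only shows where a frequency threshold sits in such a filter.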

  7. Whole-Genome Transcriptional Analysis of Chemolithoautotrophic Thiosulfate Oxidation by Thiobacillus denitrificans Under Aerobic vs. Denitrifying Conditions

    Energy Technology Data Exchange (ETDEWEB)

    Beller, H R; Letain, T E; Chakicherla, A; Kane, S R; Legler, T C; Coleman, M A

    2006-04-22

    Thiobacillus denitrificans is one of the few known obligate chemolithoautotrophic bacteria capable of energetically coupling thiosulfate oxidation to denitrification as well as aerobic respiration. As very little is known about the differential expression of genes associated with key chemolithoautotrophic functions (such as sulfur-compound oxidation and CO2 fixation) under aerobic versus denitrifying conditions, we conducted whole-genome, cDNA microarray studies to explore this topic systematically. The microarrays identified 277 genes (approximately ten percent of the genome) as differentially expressed using Robust Multi-array Average statistical analysis and a 2-fold cutoff. Genes upregulated (ca. 6- to 150-fold) under aerobic conditions included a cluster of genes associated with iron acquisition (e.g., siderophore-related genes), a cluster of cytochrome cbb3 oxidase genes, cbbL and cbbS (encoding the large and small subunits of form I ribulose 1,5-bisphosphate carboxylase/oxygenase, or RubisCO), and multiple molecular chaperone genes. Genes upregulated (ca. 4- to 95-fold) under denitrifying conditions included nar, nir, and nor genes (associated respectively with nitrate reductase, nitrite reductase, and nitric oxide reductase, which catalyze successive steps of denitrification), cbbM (encoding form II RubisCO), and genes involved with sulfur-compound oxidation (including two physically separated but highly similar copies of sulfide:quinone oxidoreductase and of dsrC, associated with dissimilatory sulfite reductase). Among genes associated with denitrification, relative expression levels (i.e., degree of upregulation with nitrate) tended to decrease in the order nar > nir > nor > nos. Reverse transcription, quantitative PCR analysis was used to validate these trends.
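
    The 2-fold cutoff mentioned in the record above can be illustrated with a short classifier. This sketch covers only the fold-change step and assumes already-normalized, linear-scale expression values; it does not reproduce the Robust Multi-array Average processing, and the function names and example intensities are hypothetical.

```python
import math


def log2_fold_change(aerobic, denitrifying):
    """log2 ratio of aerobic to denitrifying expression (both must be positive)."""
    if aerobic <= 0 or denitrifying <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(aerobic / denitrifying)


def classify_gene(aerobic, denitrifying, fold_cutoff=2.0):
    """Label a gene by the condition in which it is upregulated, if any."""
    lfc = log2_fold_change(aerobic, denitrifying)
    threshold = math.log2(fold_cutoff)  # a 2-fold cutoff is 1.0 on the log2 scale
    if lfc >= threshold:
        return "up_aerobic"
    if lfc <= -threshold:
        return "up_denitrifying"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical normalized intensities: (aerobic, denitrifying).
    examples = {"gene_a": (800.0, 100.0), "gene_b": (100.0, 400.0), "gene_c": (150.0, 100.0)}
    for name, (aer, den) in examples.items():
        print(name, classify_gene(aer, den))
```

    Working on the log2 scale makes the cutoff symmetric: an 8-fold increase and an 8-fold decrease sit at +3 and -3, equally far from zero.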

  8. Draft whole genome sequence of the cyanide-degrading bacterium Pseudomonas pseudoalcaligenes CECT5344.

    Science.gov (United States)

    Luque-Almagro, Víctor M; Acera, Felipe; Igeño, Ma Isabel; Wibberg, Daniel; Roldán, Ma Dolores; Sáez, Lara P; Hennig, Magdalena; Quesada, Alberto; Huertas, Ma José; Blom, Jochen; Merchán, Faustino; Escribano, Ma Paz; Jaenicke, Sebastian; Estepa, Jessica; Guijo, Ma Isabel; Martínez-Luque, Manuel; Macías, Daniel; Szczepanowski, Rafael; Becerra, Gracia; Ramirez, Silvia; Carmona, Ma Isabel; Gutiérrez, Oscar; Manso, Isabel; Pühler, Alfred; Castillo, Francisco; Moreno-Vivián, Conrado; Schlüter, Andreas; Blasco, Rafael

    2013-01-01

    Pseudomonas pseudoalcaligenes CECT5344 is a Gram-negative bacterium able to tolerate cyanide and to use it as the sole nitrogen source. We report here the first draft of the whole genome sequence of a P. pseudoalcaligenes strain that assimilates cyanide. Three aspects are especially emphasized in this manuscript. First, some generalities of the genome are shown and discussed in the context of other Pseudomonadaceae genomes, including genome size, G + C content, core genome and singletons among other features. Second, the genome is analysed in the context of cyanide metabolism, describing genes probably involved in cyanide assimilation, like those encoding nitrilases, and genes related to cyanide resistance, like the cio genes encoding the cyanide insensitive oxidases. Finally, the presence of genes probably involved in other processes with a great biotechnological potential, like production of bioplastics and biodegradation of pollutants, is also discussed. PMID:22998548

  9. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine

    Directory of Open Access Journals (Sweden)

    Ellen A. Tsai

    2016-02-01

    Full Text Available Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient’s genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS.

  10. Hepatitis C virus whole genome sequencing: Current methods/issues and future challenges.

    Science.gov (United States)

    Trémeaux, Pauline; Caporossi, Alban; Thélu, Marie-Ange; Blum, Michael; Leroy, Vincent; Morand, Patrice; Larrat, Sylvie

    2016-10-01

    Therapy for hepatitis C is currently undergoing a revolution. The arrival of new antiviral agents targeting viral proteins reinforces the need for a better knowledge of the viral strains infecting each patient. Hepatitis C virus (HCV) whole genome sequencing provides essential information for precise typing, study of the viral natural history or identification of resistance-associated variants. HCV whole genome sequencing was first performed with Sanger sequencing; the arrival of next-generation sequencing (NGS) has since simplified the technical process and provided more detailed data on the nature and evolution of viral quasi-species. We will review the different techniques used for HCV complete genome sequencing and their applications, both before and after the advent of NGS. The progress brought by new and future technologies will also be discussed, as well as the remaining difficulties, largely due to the genomic variability. PMID:27068766

  11. Multiplex SNP analysis on whole genome amplified DNA from archived dried bloodspots, a validation study

    DEFF Research Database (Denmark)

    Tvedegaard, Kristine C.; Parner, Erik; Hooper, Craig W.;

    Multiplex SNP analysis on whole genome amplified DNA from archived dried bloodspots, a validation study Kristine C. Tvedegaard,1 Erik Parner,1 Craig W. Hooper,2 Jørn Atterman,1 Niels Gregersen3, Poul Thorsen,1 1Institute of Public Health, NANEA at Department of Epidemiology, University of Aarhus...... further development of allele specific primer extension (ASPE) for multiplex SNP analysis based on the Luminex 100 IS platform. It uses isobases (isoC and isoG) and the software MultiCode-PLx platform for data analysis and data handling. We validate the EraGen multicode system in two 6-plex assays used on.......3-100%, repeatability ranged from 99.2-99.7% and robustness ranged from 94.1-99.3%. CONCLUSION: The Multi-Code System is a highly sensitive and specific method for multiplex SNP analysis on WGA DNA from archived dried bloodspots....

  12. Clinical decision support for whole genome sequence information leveraging a service-oriented architecture: a prototype.

    Science.gov (United States)

    Welch, Brandon M; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time. PMID:25954430

  13. Real time application of whole genome sequencing for outbreak investigation - What is an achievable turnaround time?

    Science.gov (United States)

    McGann, Patrick; Bunin, Jessica L; Snesrud, Erik; Singh, Seema; Maybank, Rosslyn; Ong, Ana C; Kwak, Yoon I; Seronello, Scott; Clifford, Robert J; Hinkle, Mary; Yamada, Stephen; Barnhill, Jason; Lesho, Emil

    2016-07-01

    Whole genome sequencing (WGS) is increasingly employed in clinical settings, though few assessments of turnaround times (TAT) have been performed in real-time. In this study, WGS was used to investigate an unfolding outbreak of vancomycin-resistant Enterococcus faecium (VRE) among 3 patients in the ICU of a tertiary care hospital. Including overnight culturing, a TAT of just 48.5 h for a comprehensive report was achievable using an Illumina MiSeq benchtop sequencer. WGS revealed that isolates from patients 2 and 3 differed from that of patient 1 by a single nucleotide polymorphism (SNP), indicating nosocomial transmission. However, the unparalleled resolution provided by WGS suggested that nosocomial transmission involved two separate events from patient 1 to patients 2 and 3, and not a linear transmission suspected by the time line. Rapid TATs are achievable using WGS in the clinical setting and can provide an unprecedented level of resolution for outbreak investigations. PMID:27185645
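
    The record above compares isolates by the number of SNPs separating them. A simple way to picture that comparison is to count differing positions between two aligned sequences; the sketch below assumes equal-length, pre-aligned sequences and skips ambiguous or gap characters, which is an illustrative simplification and not the method used in the study. The function name and sequences are hypothetical.

```python
def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        # Positions with N, gaps, or other ambiguity codes are ignored.
        if a in unambiguous and b in unambiguous and a != b
    )


if __name__ == "__main__":
    # Hypothetical aligned fragments from two isolates.
    isolate_1 = "ACGTACGTAC"
    isolate_2 = "ACGTACGAAC"
    print(snp_distance(isolate_1, isolate_2))
```

    Computing this distance for every pair of isolates gives a matrix that can then be read alongside the epidemiological time line.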

  14. Whole genome sequencing as a tool for phylogenetic analysis of clinical strains of Mitis group streptococci.

    Science.gov (United States)

    Rasmussen, L H; Dargis, R; Højholt, K; Christensen, J J; Skovgaard, O; Justesen, U S; Rosenvinge, F S; Moser, C; Lukjancenko, O; Rasmussen, S; Nielsen, X C

    2016-10-01

    Identification of Mitis group streptococci (MGS) to the species level is challenging for routine microbiology laboratories. Correct identification is crucial for the diagnosis of infective endocarditis, identification of treatment failure, and/or infection relapse. Eighty MGS from Danish patients with infective endocarditis were whole genome sequenced. We compared the phylogenetic analyses based on single genes (recA, sodA, gdh), multigene (MLSA), SNPs, and core-genome sequences. The six phylogenetic analyses generally showed a similar pattern of six monophyletic clusters, though a few differences were observed in single gene analyses. Species identification based on single gene analysis showed its limitations when more strains were included. In contrast, analyses incorporating more sequence data, like MLSA, SNPs and core-genome analyses, provided more distinct clustering. The core-genome tree showed the most distinct clustering.

  15. Whole-Genome Scans Provide Evidence of Adaptive Evolution in Malawian Plasmodium falciparum Isolates

    DEFF Research Database (Denmark)

    Ocholla, Harold; Preston, Mark D; Mipando, Mwapatsa;

    2014-01-01

    BACKGROUND: Selection by host immunity and antimalarial drugs has driven extensive adaptive evolution in Plasmodium falciparum and continues to produce ever-changing landscapes of genetic variation. METHODS: We performed whole-genome sequencing of 69 P. falciparum isolates from Malawi and used population genetics approaches to investigate genetic diversity and population structure and identify loci under selection. RESULTS: High genetic diversity (π = 2.4 × 10⁻⁴), moderately high multiplicity of infection (2.7), and low linkage disequilibrium (500-bp) were observed in Chikhwawa District, Malawi, an area of high malaria transmission. Allele frequency-based tests provided evidence of recent population growth in Malawi and detected potential targets of host immunity and candidate vaccine antigens. Comparison of the sequence variation between isolates from Malawi and those from 5 geographically...
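    The nucleotide diversity (π) reported in the abstract above is conventionally the average number of pairwise differences per site between sequences. The following is a minimal sketch of that textbook definition on made-up aligned sequences. It is not the authors' pipeline; the function name and the toy data are assumptions for illustration only.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment (not data from the study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

    Real analyses would also need to handle missing calls and mixed infections, which this sketch ignores.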

  16. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    Science.gov (United States)

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L). PMID:27617098

  17. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences

    KAUST Repository

    Coll, Francesc

    2015-05-27

    Mycobacterium tuberculosis drug resistance (DR) challenges effective tuberculosis disease control. Current molecular tests examine limited numbers of mutations, and although whole genome sequencing approaches could fully characterise DR, data complexity has restricted their clinical application. A library (1,325 mutations) predictive of DR for 15 anti-tuberculosis drugs was compiled and validated for 11 of them using genomic-phenotypic data from 792 strains. A rapid online ‘TB-Profiler’ tool was developed to report DR and strain-type profiles directly from raw sequences. Using our DR mutation library, in silico diagnostic accuracy was superior to some commercial diagnostics and alternative databases. The library will facilitate sequence-based drug-susceptibility testing.

  18. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation

    DEFF Research Database (Denmark)

    Zhao, Shancen; Zheng, Pingping; Dong, Shanshan;

    2013-01-01

    The panda lineage dates back to the late Miocene and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography, their contribution to panda population dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two population expansions, two bottlenecks and two divergences. Evidence indicated that, whereas global changes in climate were the primary drivers of population fluctuation for millions of years, human activities likely underlie recent population divergence and serious decline. We identified three distinct...

  19. A strategic stakeholder approach for addressing further analysis requests in whole genome sequencing research.

    Science.gov (United States)

    Thornock, Bradley Steven O

    2016-01-01

    Whole genome sequencing (WGS) can be a cost-effective and efficient means of diagnosis for some children, but it also raises a number of ethical concerns. One such concern is how researchers derive and communicate results from WGS, including future requests for further analysis of stored sequences. The purpose of this paper is to think about what is at stake, and for whom, in any solution that is developed to deal with such requests. To accomplish this task, this paper will utilize stakeholder theory, a common method used in business ethics. Several scenarios that connect stakeholder concerns and WGS will also be posited and analyzed. This paper concludes by developing criteria composed of a series of questions that researchers can answer in order to more effectively address requests for further analysis of stored sequences. PMID:27091475

  20. Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Nielsen, Eva M.; Kaas, Rolf Sommer;

    2014-01-01

    Salmonella enterica is a common cause of minor and large foodborne outbreaks. To achieve successful and nearly ‘real-time’ monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise for use as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby were also sequenced and used for comparison. A number of different bioinformatics approaches were applied to the data, including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association...

  1. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

    Science.gov (United States)

    Alioto, Tyler S.; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D.; Hovig, Eivind; Heisler, Lawrence E.; Beck, Timothy A.; Simpson, Jared T.; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S.; Butler, Adam P.; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W.; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C.; Gut, Marta; Denroche, Robert E.; Harding, Nicholas J.; Yamaguchi, Takafumi N.; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G.; Anderson, Charlotte L.; Waddell, Nicola; Pearson, John V.; Grimmond, Sean M.; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A.; López-Otín, Carlos; Campo, Elías; Campbell, Peter J.; Boutros, Paul C.; Puente, Xose S.; Gerhard, Daniela S.; Pfister, Stefan M.; McPherson, John D.; Hudson, Thomas J.; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T. W.; Gut, Ivo G.

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970

  3. Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database.

    Science.gov (United States)

    Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth

    2016-08-01

    The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. PMID:27008877

  4. Clinical Decision Support for Whole Genome Sequence Information Leveraging a Service-Oriented Architecture: a Prototype

    Science.gov (United States)

    Welch, Brandon M.; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time. PMID:25954430

  5. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes ("MLST+").

    Directory of Open Access Journals (Sweden)

    Markus H Antwerpen

    Full Text Available The zoonotic disease tularemia is caused by the bacterium Francisella tularensis. This pathogen is considered a category A select agent with potential to be misused in bioterrorism. Molecular typing based on DNA sequence, such as canSNP-typing or MLVA, has become the accepted standard for this organism. Due to the organism's highly clonal nature, the current typing methods have reached their limit of discrimination for classifying closely related subpopulations within the subspecies F. tularensis ssp. holarctica. We introduce a new gene-by-gene approach, MLST+, based on whole genome data of 15 sequenced F. tularensis ssp. holarctica strains and apply this approach to investigate an epidemic of lethal tularemia among non-human primates in two animal facilities in Germany. Due to the high resolution of MLST+ we are able to demonstrate that three independent clones of this highly infectious pathogen were responsible for these spatially and temporally restricted outbreaks.

  6. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine.

    Science.gov (United States)

    Tsai, Ellen A; Shakbatyan, Rimma; Evans, Jason; Rossetti, Peter; Graham, Chet; Sharma, Himanshu; Lin, Chiao-Feng; Lebo, Matthew S

    2016-01-01

    Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient's genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS. PMID:26927186

  7. Rapid whole genome sequencing for the detection and characterization of microorganisms directly from clinical samples

    DEFF Research Database (Denmark)

    Hasman, Henrik; Saputra, Dhany; Sicheritz-Pontén, Thomas;

    2014-01-01

    Whole genome sequencing (WGS) is becoming available as a routine tool for clinical microbiology. If applied directly on clinical samples this could further reduce diagnostic time and thereby improve control and treatment. A major bottleneck is the availability of fast and reliable bioinformatics tools. This study was conducted to evaluate the applicability of WGS directly on clinical samples and to develop easy-to-use bioinformatics tools for analysis of the sequencing data. Thirty-five random urine samples from patients with suspected urinary tract infections were examined using conventional... information and drastically reduce diagnostic time. This may prove very useful, but the need for data analysis is still a hurdle to clinical implementation. To overcome this problem a publicly available bioinformatics tool was developed in this study.

  8. Overview of HBV whole genome data in public repositories and the Chinese HBV reference sequences

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The number of Hepatitis B virus (HBV) whole genomic sequences in public nucleotide databases (GenBank, EMBL, and DDBJ) had reached 866 by January 1, 2007. Coming from 46 countries and regions, these sequences were categorized as eight genotypes (A-H). Using statistical and phylogenetic analysis of all available complete genomic data of HBV, we here present an overview of HBV sequences in public databases. From all 229 registered HBV genomes from Chinese regions, as well as 59 sequences from our research group, we report the establishment of reference sequences of HBV strains prevailing in China. These analyses provide clues to the effects of HBV genotypes on clinical progression in the host, the geographic distribution of the infection, and the viral evolutionary history. Moreover, the viral sequence reference would be helpful in the identification of various HBV mutations. Based on the analysis of various public databases, we suggest that a Chinese HBV database with clinical information should be constructed.

  10. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis;

    2016-01-01

    ...web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the... platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely...

  11. Use of Whole Genome Sequencing and Patient Interviews To Link a Case of Sporadic Listeriosis to Consumption of Prepackaged Lettuce.

    Science.gov (United States)

    Jackson, K A; Stroika, S; Katz, L S; Beal, J; Brandt, E; Nadon, C; Reimer, A; Major, B; Conrad, A; Tarr, C; Jackson, B R; Mody, R K

    2016-05-01

    We report on a case of listeriosis in a patient who probably consumed a prepackaged romaine lettuce-containing product recalled for Listeria monocytogenes contamination. Although definitive epidemiological information demonstrating exposure to the specific recalled product was lacking, the patient reported consumption of a prepackaged romaine lettuce-containing product of either the recalled brand or a different brand. A multinational investigation found that patient and food isolates from the recalled product were indistinguishable by pulsed-field gel electrophoresis and were highly related by whole genome sequencing, differing by four alleles by whole genome multilocus sequence typing and by five high-quality single nucleotide polymorphisms, suggesting a common source. To our knowledge, this is the first time prepackaged lettuce has been identified as a likely source for listeriosis. This investigation highlights the power of whole genome sequencing, as well as the continued need for timely and thorough epidemiological exposure data to identify sources of foodborne infections. PMID:27296429
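    The abstract above compares isolates by allele differences (whole genome multilocus sequence typing) and by SNP differences. As a minimal sketch of how such pairwise distances can be counted, the example below uses hypothetical allele profiles and toy aligned sequences. The function names, locus names and data are assumptions for illustration and do not reproduce the investigators' workflow or any specific typing scheme.

```python
def allele_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ, skipping loci missing in either profile."""
    shared = [
        locus for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    return sum(profile_a[locus] != profile_b[locus] for locus in shared)


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions in two aligned sequences, ignoring ambiguous or gap sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(
        a != b
        for a, b in zip(seq_a, seq_b)
        if a not in ignore and b not in ignore
    )


# Hypothetical profiles: locus name -> allele number (None = locus not called).
patient_profile = {"locus1": 3, "locus2": 7, "locus3": 1, "locus4": None}
food_profile = {"locus1": 3, "locus2": 8, "locus3": 1, "locus4": 2}

print(allele_distance(patient_profile, food_profile))
print(snp_distance("ACGTNACGT", "ACCTAACGA"))
```

    Whether a given distance suggests a common source depends on the organism and on the epidemiological context, so thresholds are deliberately left out of this sketch.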

  12. Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing.

    Science.gov (United States)

    Ronholm, J; Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco

    2016-10-01

    The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques. PMID:27559074

  13. Whole genome investigation of a divergent clade of the pathogen Streptococcus suis

    Directory of Open Access Journals (Sweden)

    Abiyad Baig

    2015-11-01

    Full Text Available Streptococcus suis is a major porcine and zoonotic pathogen responsible for significant economic losses in the pig industry and an increasing number of human cases. Multiple isolates of S. suis show marked genomic diversity. Here we report the analysis of whole genome sequences of nine pig isolates that caused disease typical of S. suis and had phenotypic characteristics of S. suis, but whose genomes were divergent from those of many other S. suis isolates. Comparison of protein sequences predicted from divergent genomes with those from normal S. suis reduced the size of the core genome from 793 to only 397 genes. Divergence was clear if phylogenetic analysis was performed on reduced core genes and MLST alleles. Phylogenies based on certain other genes (16S rRNA, sodA, recN and cpn60) did not show divergence for all isolates, suggesting recombination between some divergent isolates and normal S. suis for these genes. Indeed, there is evidence of recent recombination between the divergent and normal S. suis genomes for 249 of 397 core genes. In addition, phylogenetic analysis based on the 16S rRNA gene and 132 genes that were conserved between the divergent isolates and representatives of the broader Streptococcus genus showed that divergent isolates were more closely related to S. suis. Six out of nine divergent isolates possessed a S. suis-like capsule region with variation in capsular gene sequences, but the remaining three did not have a discrete capsule locus. The majority (40/70) of virulence-associated genes in normal S. suis were present in the divergent genomes. Overall, the divergent isolates extend the current diversity of the S. suis species, but the phenotypic similarities and the large amount of gene exchange with normal S. suis give insufficient evidence to assign these isolates to a new species or subspecies. Further sampling and whole genome analysis of more isolates is warranted to understand the diversity of the species.

  14. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and...

  15. Light whole genome sequence for SNP discovery across domestic cat breeds

    Directory of Open Access Journals (Sweden)

    Driscoll Carlos

    2010-06-01

    Full Text Available Background: The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus, FeLV; feline coronavirus, FECV; feline immunodeficiency virus, FIV) that are homologues to human scourges (cancer, SARS, and AIDS, respectively). However, to realize this biomedical potential, a high-density single nucleotide polymorphism (SNP) map is required in order to accomplish disease and phenotype association discovery. Description: To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper limit was placed on sequence coverage and a high lower limit on the quality of the discrepant bases at a potential variant site. In all, domestic cats of different breeds (female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese and male Ragdoll) and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cat breeds. An empirical sampling of 94 discovered SNPs was tested in the sequenced cats, resulting in a SNP validation rate of 99%. Conclusions: These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.

  17. Comparison of whole genome sequences from human and non-human Escherichia coli O26 strains

    Directory of Open Access Journals (Sweden)

    Keri N Norman

    2015-03-01

    Shiga toxin-producing Escherichia coli (STEC) O26 is the second leading E. coli serogroup responsible for human illness outbreaks, behind E. coli O157:H7. Recent outbreaks have been linked to emerging pathogenic O26:H11 strains harboring stx2 only. Cattle have been recognized as an important reservoir of O26 strains harboring stx1; however, the reservoir of these emerging stx2 strains is unknown. The objective of this study was to identify nucleotide polymorphisms in human and cattle-derived strains in order to compare differences in polymorphism-derived genotypes and virulence gene profiles between the two host species. Whole genome sequencing was performed on 182 epidemiologically unrelated O26 strains, including 109 human-derived strains and 73 non-human-derived strains. A panel of 289 O26 strains (241 STEC and 48 non-STEC) was subsequently genotyped using a set of 283 polymorphisms identified by whole genome sequencing, resulting in 64 unique genotypes. Phylogenetic analyses identified seven clusters within the O26 strains. The seven clusters did not distinguish between isolates originating from humans or cattle; however, clusters did correspond with particular virulence gene profiles. Human and non-human-derived strains harboring stx1 clustered separately from strains harboring stx2, strains harboring eae, and non-STEC strains. Strains harboring stx2 were more closely related to non-STEC strains and strains harboring eae than to strains harboring stx1. The finding of human and cattle-derived strains with the same polymorphism-derived genotypes and similar virulence gene profiles provides evidence that similar strains are found in cattle and humans and that transmission between the two species may occur.
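
    As an illustration of how a strain panel typed at a fixed set of polymorphisms collapses into unique genotypes (289 strains and 283 polymorphisms giving 64 genotypes in the abstract above), here is a minimal sketch. The strain names, the three-position profiles, and the function `group_by_genotype` are hypothetical and are not taken from the study; the authors' actual genotyping procedure is not described in this record.

```python
from collections import defaultdict


def group_by_genotype(profiles):
    """Map each distinct SNP profile to the list of strains that carry it.

    `profiles` maps a strain name to its alleles, given in the same
    SNP order for every strain (a string or any sequence of alleles).
    """
    groups = defaultdict(list)
    for strain, alleles in profiles.items():
        # Strains with identical alleles at every position share a genotype.
        groups[tuple(alleles)].append(strain)
    return dict(groups)


# Hypothetical toy panel: 4 strains typed at 3 SNP positions.
profiles = {
    "strain_A": "ACG",
    "strain_B": "ACG",
    "strain_C": "ATG",
    "strain_D": "GTG",
}

genotypes = group_by_genotype(profiles)
for alleles, strains in genotypes.items():
    print("".join(alleles), strains)
print("unique genotypes:", len(genotypes))
```

    Strains that share a genotype under this grouping are indistinguishable at the typed positions, which is why the number of genotypes can be far smaller than the number of strains.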

  18. Whole genome sequencing of sugarbeet and identification of differentially expressed genes regulating beet curly top resistance

    Science.gov (United States)

    The genome of the KDH13 doubled haploid line has been sequenced using the Illumina HiSeq2000 NGS platform. This line (PI663862) was released by USDA-ARS as a genetic stock resistant to beet curly top. Sequencing of standard paired-end and 2 kb-insert mate-pair genomic libraries, constructed from a leaf ...

  19. The impact of PPARalpha activation on whole genome gene expression in human precision cut liver slices

    NARCIS (Netherlands)

    Janssen, A.W.H.; Betzel, B; Stoopen, G.; Berends, F.J.; Janssen, I.M.C.; Peijnenburg, A.A.; Kersten, S.

    2015-01-01

    BACKGROUND: Studies in mice have shown that PPARalpha is an important regulator of lipid metabolism in the liver and a key transcription factor involved in the adaptive response to fasting. However, much less is known about the role of PPARalpha in human liver. METHODS: Here we set out to study the functi

  20. Insights on cryoprotectant toxicity from gene expression profiling of endothelial cells exposed to ethylene glycol.

    Science.gov (United States)

    Cordeiro, Rui Martins; Stirling, Soren; Fahy, Gregory M; de Magalhães, João Pedro

    2015-12-01

    Cryopreservation consists of preserving living cells or tissues generally at -80 °C or below and has many current applications in cell and tissue banking, and future potential for organ banking. Cryoprotective agents such as ethylene glycol (EG) are required for successful cryopreservation of most living systems, but have toxic side effects whose mechanisms remain largely unknown. In this work, we investigated the mechanisms of toxicity of ethylene glycol in human umbilical vein endothelial cells (HUVECs) as a model of the vascular endothelium in perfused organs. Exposing cells to 60% v/v EG for 2 h at 4 °C resulted in only a slight decrease in subsequent cell growth, suggesting only modest toxicity of EG for this cell type. Gene expression analysis with whole genome microarrays revealed signatures indicative of a generalized stress response at 24 h after EG exposure and a trend toward partial recovery at 72 h. The observed changes involved signalling pathways, glycoproteins, and genes involved in extracellular and transmembrane functions, the latter suggesting potential effects of ethylene glycol on membranes. These results continue to develop a new paradigm for understanding cryoprotectant toxicity and reveal molecular signatures helpful for future experiments in more completely elucidating the toxic effects of ethylene glycol in vascular endothelial cells and other cell types. PMID:26471925

  1. Regulatory divergence of homeologous Atlantic salmon elovl5 genes following the salmonid-specific whole-genome duplication.

    Science.gov (United States)

    Carmona-Antoñanzas, Greta; Zheng, Xiaozhong; Tocher, Douglas R; Leaver, Michael J

    2016-10-10

    Fatty acyl elongase 5 (elovl5) is a critical enzyme in the vertebrate biosynthetic pathway which produces the physiologically essential long-chain polyunsaturated fatty acids (LC-PUFA), docosahexaenoic acid (DHA), and eicosapentaenoic acid (EPA) from 18-carbon fatty acid precursors. In contrast to most other vertebrates, Atlantic salmon possess two copies of elovl5 (elovl5a and elovl5b) as a result of a whole-genome duplication (WGD) which occurred at the base of the salmonid lineage. WGDs have had a major influence on vertebrate evolution, providing extra genetic material and enabling neofunctionalization to accelerate adaptation and speciation. However, little is known about the mechanisms by which such duplicated homeologous genes diverge. Here we show that homeologous Atlantic salmon elovl5a and elovl5b genes have been asymmetrically colonised by transposon-like elements. Identical locations and identities of insertions are also present in the rainbow trout duplicate elovl5 genes, but not in the nearest extant representative preduplicated teleost, the northern pike. Both elovl5 salmon duplicates possessed conserved regulatory elements that promoted Srebp1- and Srebp2-dependent transcription, and differences in the magnitude of Srebp response between promoters could be attributed to a tandem duplication of SRE and NF-Y cofactor binding sites in elovl5b. Furthermore, an insertion in the promoter region of elovl5a confers responsiveness to Lxr/Rxr transcriptional activation. Our results indicate that most, but not all, transposon mobilisation into elovl5 genes occurred after the split from the common ancestor of pike and salmon, but before more recent salmonid speciations, and that divergence of elovl5 regulatory regions has enabled neofunctionalization by promoting differential expression of these homeologous genes. PMID:27374149

  2. Whole Genome Sequencing and Phylogenetic Analysis of a Historical Collection of Bacillus anthracis Strains from Danish Cattle

    DEFF Research Database (Denmark)

    Derzelle, Sylviane; Girault, Guillaume; Kokotovic, Branko;

    2015-01-01

    Bacillus anthracis, the causative agent of anthrax, is known as one of the most genetically monomorphic species. Canonical single-nucleotide polymorphism (SNP) typing and whole-genome sequencing were used to investigate the molecular diversity of eleven B. anthracis strains isolated from cattle...

  3. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette;

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  4. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  5. Whole-genome amplified DNA from stored dried blood spots is reliable in high resolution melting curve and sequencing analysis

    DEFF Research Database (Denmark)

    Winkel, Bo G; Hollegaard, Mads Vilhelm; Olesen, Morten S;

    2011-01-01

    The use of dried blood spot (DBS) samples in genomic workup has been limited by the relatively low amounts of genomic DNA (gDNA) they contain. It remains to be proven that whole genome amplified DNA (wgaDNA) from stored DBS samples constitutes a reliable alternative to gDNA. We wanted to compare m...

  6. Whole-Genome Shotgun Sequencing of an Indian-Origin Lactobacillus helveticus Strain, MTCC 5463, with Probiotic Potential

    Science.gov (United States)

    Prajapati, J. B.; Khedkar, C. D.; Chitra, J.; Suja, Senan; Mishra, V.; Sreeja, V.; Patel, R. K.; Ahir, V. B.; Bhatt, V. D.; Sajnani, M. R.; Jakhesara, S. J.; Koringa, P. G.; Joshi, C. G.

    2011-01-01

    Lactobacillus helveticus MTCC 5463 was isolated from a vaginal swab from a healthy adult female. The strain exhibited potential probiotic properties, with their beneficial role in the gastrointestinal tract and their ability to reduce cholesterol and stimulate immunity. We sequenced the whole genome and compared it with the published genome sequence of Lactobacillus helveticus DPC4571. PMID:21705605

  7. Whole-genome shotgun sequencing of an Indian-origin Lactobacillus helveticus strain, MTCC 5463, with probiotic potential.

    Science.gov (United States)

    Prajapati, J B; Khedkar, C D; Chitra, J; Suja, Senan; Mishra, V; Sreeja, V; Patel, R K; Ahir, V B; Bhatt, V D; Sajnani, M R; Jakhesara, S J; Koringa, P G; Joshi, C G

    2011-08-01

    Lactobacillus helveticus MTCC 5463 was isolated from a vaginal swab from a healthy adult female. The strain exhibited potential probiotic properties, with their beneficial role in the gastrointestinal tract and their ability to reduce cholesterol and stimulate immunity. We sequenced the whole genome and compared it with the published genome sequence of Lactobacillus helveticus DPC4571. PMID:21705605

  8. Efficient genomic prediction based on whole-genome sequence data using split-and-merge Bayesian variable selection

    NARCIS (Netherlands)

    Calus, Mario P.L.; Bouwman, Aniek C.; Schrooten, Chris; Veerkamp, Roel F.

    2016-01-01

    Background: Use of whole-genome sequence data is expected to increase persistency of genomic prediction across generations and breeds but affects model performance and requires increased computing time. In this study, we investigated whether the split-and-merge Bayesian stochastic search variable

  9. Whole-genome pyrosequencing of an epidemic multidrug-resistant Acinetobacter baumannii strain belonging to the European clone II group

    DEFF Research Database (Denmark)

    Iacono, M.; Villa, L.; Fortini, D.;

    2008-01-01

    The whole-genome sequence of an epidemic, multidrug-resistant Acinetobacter baumannii strain (strain ACICU) belonging to the European clone II group and carrying the plasmid-mediated bla(OXA-58) carbapenem resistance gene was determined. The A. baumannii ACICU genome was compared with the genomes...

  10. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang;

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their preci...

  11. Understanding the Quorum-Sensing Bacterium Pantoea stewartii Strain M009 with Whole-Genome Sequencing Analysis.

    Science.gov (United States)

    Tan, Wen-Si; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Pantoea stewartii is known to be the causative agent of Stewart's wilt, which usually affects sweet corn (Zea mays) with the corn flea beetle as the transmission vector. In this work, we present the whole-genome sequence of Pantoea stewartii strain M009, isolated from a Malaysian tropical rainforest waterfall. PMID:25635007

  12. Understanding the Quorum-Sensing Bacterium Pantoea stewartii Strain M009 with Whole-Genome Sequencing Analysis

    OpenAIRE

    Tan, Wen-Si; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Pantoea stewartii is known to be the causative agent of Stewart’s wilt, which usually affects sweet corn (Zea mays) with the corn flea beetle as the transmission vector. In this work, we present the whole-genome sequence of Pantoea stewartii strain M009, isolated from a Malaysian tropical rainforest waterfall.

  13. Direct DNA Extraction from Mycobacterium tuberculosis Frozen Stocks as a Reculture-Independent Approach to Whole-Genome Sequencing

    DEFF Research Database (Denmark)

    Bjorn-Mortensen, K; Zallet, J; Lillebaek, T;

    2015-01-01

    Culturing before DNA extraction represents a major time-consuming step in whole-genome sequencing of slow-growing bacteria, such as Mycobacterium tuberculosis. We report a workflow to extract DNA from frozen isolates without reculturing. Prepared libraries and sequence data were comparable...

  14. Whole-Genome Sequence of Pseudomonas graminis Strain UASWS1507, a Potential Biological Control Agent and Biofertilizer Isolated in Switzerland

    Science.gov (United States)

    Crovadore, Julien; Calmin, Gautier; Chablais, Romain; Cochard, Bastien; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequence of the strain UASWS1507 of the species Pseudomonas graminis, isolated in Switzerland from an apple tree. This is the first genome registered for this species, which is considered as a potential and valuable resource of biological control agents and biofertilizers for agriculture.

  15. Draft Whole-Genome Sequence of a Haemophilus quentini Strain Isolated from an Infant in the United Kingdom

    Science.gov (United States)

    Baxter, Laura; Thompson, Sarah; Collery, Mark M.; Hand, Daniel C.; Fink, Colin G.

    2016-01-01

    Haemophilus quentini is a rare and distinct genospecies of Haemophilus that has been suggested as a cause of neonatal bacteremia and urinary tract infections in men. We present the draft whole-genome sequence of H. quentini MP1 isolated from an infant in the United Kingdom, aiding future identification and detection of this pathogen.

  16. Whole-Genome Shotgun Sequence of Bacillus amyloliquefaciens Strain UASWS BA1, a Bacterium Antagonistic to Plant Pathogenic Fungi

    OpenAIRE

    Lefort, F; Calmin, G.; Pelleteret, P.; Farinelli, L.; Osteras, M; Crovadore, J.

    2014-01-01

    We report here the whole-genome shotgun sequence of Bacillus amyloliquefaciens strain UASWS BA1, isolated from inner wood tissues of a decaying Platanus × acerifolia tree. This strain proved to be antagonistic to several plant pathogenic fungi and oomycetes and can be developed as a biological control agent in agriculture.

  17. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    Science.gov (United States)

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  18. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    Science.gov (United States)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T.; Rosenqvist Lund, Birthe S.; Ameh, James A.; Ambali, Abdul G.; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M.; Hendriksen, Rene S.

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  19. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Science.gov (United States)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T; Rosenqvist Lund, Birthe S; Ameh, James A; Ambali, Abdul G; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M; Hendriksen, Rene S

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  20. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    OpenAIRE

    Yookyung Lee; Sooyeon Lim; Moon-Soo Rhee; Dong-Ho Chang; Byoung-Chan Kim

    2016-01-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  1. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder;

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an...

  2. Whole-Genome Sequence of Fish-Pathogenic Mycobacterium sp. Strain 012931, Isolated from Yellowtail (Seriola quinqueradiata).

    Science.gov (United States)

    Kurokawa, Satoru; Kabayama, Jun; Nho, Seong Won; Hwang, Seong Don; Hikima, Jun-Ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Aoki, Takashi

    2013-01-01

    The genus Mycobacterium comprises a large number of well-characterized species, several of which are human and animal pathogens. Here, we report the whole-genome sequence of Mycobacterium sp. strain 012931, a fish pathogen responsible for huge losses in aquaculture farms in Japan. The strain was isolated from a marine fish, yellowtail (Seriola quinqueradiata). PMID:23929466

  3. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    Science.gov (United States)

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  4. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Science.gov (United States)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T; Rosenqvist Lund, Birthe S; Ameh, James A; Ambali, Abdul G; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M; Hendriksen, Rene S

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  5. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    Science.gov (United States)

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. 
Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome

  6. Whole Genome Sequencing of Mycobacterium africanum Strains from Mali Provides Insights into the Mechanisms of Geographic Restriction

    Science.gov (United States)

    Maiga, Mamoudou; Abeel, Thomas; Shea, Terrance; Desjardins, Christopher A.; Diarra, Bassirou; Baya, Bocar; Sanogo, Moumine; Diallo, Souleymane; Earl, Ashlee M.; Bishai, William R.

    2016-01-01

    Background: Mycobacterium africanum, made up of lineages 5 and 6 within the Mycobacterium tuberculosis complex (MTC), causes up to half of all tuberculosis cases in West Africa, but is rarely found outside of this region. The reasons for this geographical restriction remain unknown. Possible reasons include a geographically restricted animal reservoir, a unique preference for hosts of West African ethnicity, and an inability to compete with other lineages outside of West Africa. These latter two hypotheses could be caused by loss of fitness or altered interactions with the host immune system. Methodology/Principal Findings: We sequenced 92 MTC clinical isolates from Mali, including two lineage 5 and 24 lineage 6 strains. Our genome sequencing assembly, alignment, phylogeny and average nucleotide identity analyses enabled us to identify features that typify lineages 5 and 6 and made clear that these lineages do not constitute a distinct species within the MTC. We found that in Mali, lineage 6 and lineage 4 strains have similar levels of diversity and evolve drug resistance through similar mechanisms. In the process, we identified a putative novel streptomycin resistance mutation. In addition, we found evidence of person-to-person transmission of lineage 6 isolates and showed that lineage 6 is not enriched for mutations in virulence-associated genes. Conclusions: This is the largest collection of lineage 5 and 6 whole genome sequences to date, and our assembly and alignment data provide valuable insights into what distinguishes these lineages from other MTC lineages. Lineages 5 and 6 do not appear to be geographically restricted due to an inability to transmit between West African hosts or to an elevated number of mutations in virulence-associated genes. However, lineage-specific mutations, such as mutations in cell wall structure, secretion systems and cofactor biosynthesis, provide alternative mechanisms that may lead to host specificity. PMID:26751217

  7. Genetic characterization of 2006-2008 isolates of Chikungunya virus from Kerala, South India, by whole genome sequence analysis.

    Science.gov (United States)

    Sreekumar, E; Issac, Aneesh; Nair, Sajith; Hariharan, Ramkumar; Janki, M B; Arathy, D S; Regu, R; Mathew, Thomas; Anoop, M; Niyas, K P; Pillai, M R

    2010-02-01

    Chikungunya virus (CHIKV), a positive-stranded alphavirus, causes epidemic febrile infections characterized by severe and prolonged arthralgia. In the present study, six CHIKV isolates (2006 RGCB03, RGCB05; 2007 RGCB80, RGCB120; 2008 RGCB355, RGCB356) from three consecutive Chikungunya outbreaks in Kerala, South India, were analyzed for genetic variations by sequencing the 11798 bp whole genome of the virus. A total of 37 novel mutations were identified, predominantly in the 2007 and 2008 isolates among the six isolates studied. The previously identified E1 A226V critical mutation, which enhances mosquito adaptability, was present in the 2007 and 2008 samples. An important observation was the presence of two coding region substitutions, leading to the nsP2 L539S and E2 K252Q changes. These were identified in three isolates (2007 RGCB80 and RGCB120; 2008 RGCB355) by full-genome analysis, and also in 13 of the 31 additional samples (42%), obtained from various parts of the state, by sequencing the corresponding genomic regions. These mutations showed 100% co-occurrence in all these samples. In phylogenetic analysis, formation of a new genetic clade by these isolates within the East, Central and South African (ECSA) genotypes was observed. Homology modeling followed by mapping revealed that at least 20 of the identified mutations fall into functionally significant domains of the viral proteins and are predicted to affect protein structure. Eighteen of the identified mutations in structural proteins, including the E2 K252Q change, are predicted to disrupt T-cell epitope immunogenicity. Our study reveals that CHIKV strains with novel genetic changes were present in the severe Chikungunya outbreaks in 2007 and 2008 in South India. PMID:19851853
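
    The substitution labels used in the abstract above (for example "E1 A226V", read as: in protein E1, alanine at position 226 replaced by valine) follow a compact reference-position-alternate pattern. The sketch below parses such labels and rechecks the reported proportion of 13 of 31 samples. The class `Substitution`, the function `parse_substitution`, and the regular expression are illustrative assumptions, not part of the study's methods.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    protein: str   # e.g. "E1"
    ref: str       # one-letter code of the original amino acid
    position: int  # residue number within the protein
    alt: str       # one-letter code of the replacement amino acid


# Assumed label shape: "<protein> <ref><position><alt>", e.g. "nsP2 L539S".
_LABEL = re.compile(r"^(\w+)\s+([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'E2 K252Q' into its four components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised substitution label: {label!r}")
    protein, ref, position, alt = match.groups()
    return Substitution(protein, ref, int(position), alt)


# Labels quoted in the abstract.
for label in ("E1 A226V", "nsP2 L539S", "E2 K252Q"):
    print(parse_substitution(label))

# 13 of the 31 additional samples carried both substitutions.
share = 100 * 13 / 31
print(f"{share:.1f}%")
```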

  8. A whole-genome massively parallel sequencing analysis of BRCA1 mutant oestrogen receptor negative and positive breast cancers

    Science.gov (United States)

    Weigelt, Britta; Wilkerson, Paul M; Manie, Elodie; Grigoriadis, Anita; A’Hern, Roger; van der Groep, Petra; Kozarewa, Iwanka; Popova, Tatiana; Mariani, Odette; Turaljic, Samra; Furney, Simon J; Marais, Richard; Rodruigues, Daniel-Nava; Flora, Adriana C; Wai, Patty; Pawar, Vidya; McDade, Simon; Carroll, Jason; Stoppa-Lyonnet, Dominique; Green, Andrew R; Ellis, Ian O; Swanton, Charles; van Diest, Paul; Delattre, Olivier; Lord, Christopher J; Foulkes, William D; Vincent-Salomon, Anne; Ashworth, Alan; Stern, Marc Henri; Reis-Filho, Jorge S

    2016-01-01

    BRCA1 encodes a tumour suppressor protein that plays pivotal roles in homologous recombination (HR) DNA repair, cell-cycle checkpoints, and transcriptional regulation. BRCA1 germline mutations confer a high risk of early-onset breast and ovarian cancer. In >80% of cases, tumours arising in BRCA1 germline mutation carriers are oestrogen receptor (ER)-negative; however, up to 15% are ER-positive. It has been suggested that BRCA1 ER-positive breast cancers constitute sporadic cancers arising in the context of a BRCA1 germline mutation rather than being causally related to BRCA1 loss-of-function. Whole-genome massively parallel sequencing of ER-positive and ER-negative BRCA1 breast cancers, and their respective germline DNAs, was used to characterise the genetic landscape of BRCA1 cancers at base-pair resolution. Only BRCA1 germline mutations, somatic loss of the wild-type allele, and TP53 somatic mutations were recurrently found in the index cases. BRCA1 breast cancers displayed a mutational signature consistent with that caused by lack of HR DNA repair in both ER-positive and ER-negative cases. Sequencing analysis of independent cohorts of hereditary BRCA1 and sporadic non-BRCA1 breast cancers for the presence of recurrent pathogenic mutations and/or homozygous deletions found in the index cases revealed that DAPK3, TMEM135, KIAA1797, PDE4D and GATA4 are potential additional drivers of breast cancers. This study demonstrates that BRCA1 pathogenic germline mutations coupled with somatic loss of the wild-type allele are not sufficient for hereditary breast cancers to display an ER-negative phenotype, and has led to the identification of three potential novel breast cancer genes (i.e. DAPK3, TMEM135 and GATA4). PMID:22362584

  9. Novel degenerate PCR method for whole genome amplification applied to Peru Margin (ODP Leg 201) subsurface samples

    Directory of Open Access Journals (Sweden)

    Amanda Martino

    2012-01-01

    A degenerate PCR-based method of whole-genome amplification, designed to work fluidly with 454 sequencing technology, was developed and tested for use on deep marine subsurface DNA samples. The method, which we have called Random Amplification Metagenomic PCR (RAMP), involves the use of specific primers from Roche 454 amplicon sequencing, modified by the addition of a degenerate region at the 3’ end. It utilizes a PCR reaction, which resulted in no amplification from blanks, even after 50 cycles of PCR. After efforts to optimize experimental conditions, the method was tested with DNA extracted from cultured E. coli cells, and genome coverage was estimated after sequencing on three different occasions. Coverage did not vary greatly with the different experimental conditions tested, and was around 62% with a sequencing effort equivalent to a theoretical genome coverage of 14.10X. The GC content of the sequenced amplification product was within 2% of the predicted values for this strain of E. coli. The method was also applied to DNA extracted from marine subsurface samples from ODP Leg 201 site 1229 (Peru Margin), and results of a taxonomic analysis revealed microbial communities dominated by Proteobacteria, Chloroflexi, Firmicutes, Euryarchaeota, and Crenarchaeota, among others. These results were similar to those obtained previously for those samples; however, variations in the proportions of taxa show that community analysis can be sensitive to both the amplification technique used and the method of assigning sequences to taxonomic groups. Overall, we find that RAMP represents a valid methodology for amplifying metagenomes from low biomass samples.
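
    To put the reported figures in context (about 62% of the genome covered at a sequencing effort of 14.10X), the sketch below uses the idealised Lander-Waterman approximation, in which reads are assumed to land uniformly at random and the expected covered fraction at depth c is 1 - exp(-c). This model is an assumption brought in here for illustration; the abstract does not say how the authors estimated coverage, and the function names are hypothetical.

```python
import math


def expected_covered_fraction(depth):
    """Expected fraction of the genome covered by at least one read,
    assuming uniformly random read placement (Lander-Waterman model)."""
    return 1.0 - math.exp(-depth)


def depth_for_fraction(fraction):
    """Depth at which the same idealised model predicts `fraction` covered."""
    return -math.log(1.0 - fraction)


reported_depth = 14.10     # theoretical coverage stated in the abstract
reported_fraction = 0.62   # observed genome coverage stated in the abstract

ideal = expected_covered_fraction(reported_depth)
equivalent = depth_for_fraction(reported_fraction)

print(f"ideal covered fraction at {reported_depth}X: {ideal:.6f}")
print(f"unbiased depth giving {reported_fraction:.0%} coverage: {equivalent:.2f}X")
```

    Under these assumptions, uniform sampling at 14.10X would cover essentially the whole genome, so the gap to 62% is one way to quantify how unevenly the amplification samples the template.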

  10. Whole Genome Sequencing of Newly Established Pancreatic Cancer Lines Identifies Novel Somatic Mutation (c.2587G>A) in Axon Guidance Receptor Plexin A1 as Enhancer of Proliferation and Invasion.

    Directory of Open Access Journals (Sweden)

    Rebecca Sorber

    The genetic profile of human pancreatic cancers harbors considerable heterogeneity, which suggests a possible explanation for the pronounced inefficacy of single therapies in this disease. This observation has led to a belief that custom therapies based on individual tumor profiles are necessary to treat pancreatic cancer more effectively. It has recently been discovered that axon guidance genes are affected by somatic structural variants in up to 25% of human pancreatic cancers. Thus far, however, some of these mutations have only been correlated with survival probability, and no function has been assigned to these observed axon guidance gene mutations in pancreatic cancer. In this study we established three novel pancreatic cancer cell lines and performed whole genome sequencing to discover novel mutations in axon guidance genes that may contribute to the cancer phenotype of these cells. We discovered, among other novel somatic variants in axon guidance pathway genes, a novel mutation in the PLXNA1 receptor (c.2587G>A) in the newly established cell line SB.06 that mediates oncogenic cues of increased invasion and proliferation in SB.06 cells and increased invasion in 293T cells upon stimulation with the receptor's natural ligand semaphorin 3A, compared to wild-type PLXNA1 cells. Mutant PLXNA1 signaling was associated with increased Rho-GTPase and p42/p44 MAPK signaling activity and cytoskeletal expansion, but not with changes in E-cadherin, vimentin, or metalloproteinase 9 expression levels. Pharmacologic inhibition of the Rho-GTPase family member CDC42 selectively abrogated PLXNA1 c.2587G>A-mediated increased invasion. These findings provide in vitro confirmation that somatic mutations in axon guidance genes can provide oncogenic gain-of-function signals and may contribute to pancreatic cancer progression.

  11. Whole-Genome Sequencing of Native Sheep Provides Insights into Rapid Adaptations to Extreme Environments.

    Science.gov (United States)

    Yang, Ji; Li, Wen-Rong; Lv, Feng-Hua; He, San-Gang; Tian, Shi-Lin; Peng, Wei-Feng; Sun, Ya-Wei; Zhao, Yong-Xin; Tu, Xiao-Long; Zhang, Min; Xie, Xing-Long; Wang, Yu-Tao; Li, Jin-Quan; Liu, Yong-Gang; Shen, Zhi-Qiang; Wang, Feng; Liu, Guang-Jian; Lu, Hong-Feng; Kantanen, Juha; Han, Jian-Lin; Li, Meng-Hua; Liu, Ming-Jun

    2016-10-01

    Global climate change has a significant effect on extreme environments and a profound influence on species survival. However, little is known of the genome-wide pattern of livestock adaptations to extreme environments over a short time frame following domestication. Sheep (Ovis aries) have become well adapted to a diverse range of agroecological zones, including certain extreme environments (e.g., plateaus and deserts), during their post-domestication (approximately 8-9 kya) migration and differentiation. Here, we generated whole-genome sequences from 77 native sheep, with an average effective sequencing depth of ∼5× for 75 samples and ∼42× for 2 samples. Comparative genomic analyses among sheep in contrasting environments, that is, plateau (>4,000 m above sea level) versus lowland (1500 m) versus low-altitude region (600 mm), and arid zone (400 mm), detected a novel set of candidate genes as well as pathways and GO categories that are putatively associated with hypoxia responses at high altitudes and water reabsorption in arid environments. In addition, candidate genes and GO terms functionally related to energy metabolism and body size variations were identified. This study offers novel insights into rapid genomic adaptations to extreme environments in sheep and other animals, and provides a valuable resource for future research on livestock breeding in response to climate change. PMID:27401233

  12. Impacts of Whole-Genome Triplication on MIRNA Evolution in Brassica rapa.

    Science.gov (United States)

    Sun, Chao; Wu, Jian; Liang, Jianli; Schnable, James C; Yang, Wencai; Cheng, Feng; Wang, Xiaowu

    2015-11-01

    MicroRNAs (miRNAs) are a class of short, non-coding, endogenous RNAs that play essential roles in eukaryotes. Although the influence of whole-genome triplication (WGT) on protein-coding genes has been well documented in Brassica rapa, little is known about its impacts on MIRNAs. In this study, by generating a comprehensive annotation of 680 MIRNAs for B. rapa, we analyzed the evolutionary characteristics of these MIRNAs from several aspects. First, while MIRNAs and genes show similar patterns of biased distribution among subgenomes of B. rapa, we found that MIRNAs are much more overretained than genes following fractionation after WGT. Second, multiple-copy MIRNAs show significantly higher sequence conservation than single-copy MIRNAs, which is the opposite of the pattern seen for genes. This indicates that increased purifying selection is acting upon these highly retained multiple-copy MIRNAs and points to their functional importance over singleton MIRNAs. Furthermore, we found extensive divergence between pairs of miRNAs and their target genes following the WGT in B. rapa. In summary, our study provides a valuable resource for exploring MIRNAs in B. rapa and highlights the impacts of WGT on the evolution of MIRNAs. PMID:26527651

  13. Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis.

    Science.gov (United States)

    Xu, Yinghua; Liu, Bin; Gröndahl-Yli-Hannuksila, Kirsi; Tan, Yajun; Feng, Lu; Kallonen, Teemu; Wang, Lichan; Peng, Ding; He, Qiushui; Wang, Lei; Zhang, Shumin

    2015-08-18

    Herd immunity can potentially induce a change in circulating viruses. However, it remains largely unknown how bacterial pathogens adapt to vaccination. In this study, Bordetella pertussis, the causative agent of whooping cough, was selected as an example to explore the possible effect of vaccination on a bacterial pathogen. We sequenced and analysed the complete genomes of 40 B. pertussis strains from Finland and China, as well as 11 previously sequenced strains from the Netherlands, where different vaccination strategies have been used over the past 50 years. The results showed that the molecular clock moved at different rates in these countries and in distinct periods, which suggested that evolution of the B. pertussis population was closely associated with each country's vaccination coverage. Comparative whole-genome analyses indicated that evolution in this human-restricted pathogen was mainly characterised by ongoing genetic shift and gene loss. Furthermore, 116 SNPs were specifically detected in currently circulating ptxP3-containing strains. This finding might explain the successful emergence of this lineage and its spread worldwide. Collectively, our results suggest that the immune pressure of vaccination is one major driving force for the evolution of B. pertussis, which facilitates further exploration of the pathogenicity of B. pertussis.

  14. Whole genome sequencing provides insights into the genetic determinants of invasiveness in Salmonella Dublin.

    Science.gov (United States)

    Mohammed, M; Cormican, M

    2016-08-01

    Salmonella enterica subsp. enterica serovar Dublin (S. Dublin) is one of the non-typhoidal Salmonella (NTS); however, a relatively high proportion of human infections are associated with invasive disease. We applied whole genome sequencing to representative invasive and non-invasive clinical isolates of S. Dublin to determine the genomic variations among them and to investigate the underlying genetic determinants associated with invasiveness in S. Dublin. Although no particular genomic variation was found to differentiate invasive from non-invasive isolates, four virulence factors were detected within the genomes of all isolates: two different type VI secretion systems (T6SS) encoded on two Salmonella pathogenicity islands (SPI), namely SPI-6 (T6SS_SPI-6) and SPI-19 (T6SS_SPI-19); an intact lambdoid prophage (Gifsy-2-like prophage) that contributes significantly to the virulence and pathogenesis of Salmonella serotypes; and a virulence plasmid. These four virulence factors may all contribute to the potential of S. Dublin to cause invasive disease in humans. PMID:26996313

  15. Prospective Whole-Genome Sequencing Enhances National Surveillance of Listeria monocytogenes.

    Science.gov (United States)

    Kwong, Jason C; Mercoulia, Karolina; Tomita, Takehiro; Easton, Marion; Li, Hua Y; Bulach, Dieter M; Stinear, Timothy P; Seemann, Torsten; Howden, Benjamin P

    2016-02-01

    Whole-genome sequencing (WGS) has emerged as a powerful tool for comparing bacterial isolates in outbreak detection and investigation. Here we demonstrate that WGS performed prospectively for national epidemiologic surveillance of Listeria monocytogenes has the capacity to be superior to our current approaches using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable-number tandem-repeat analysis (MLVA), binary typing, and serotyping. Initially 423 L. monocytogenes isolates underwent WGS, and comparisons uncovered a diverse genetic population structure derived from three distinct lineages. MLST, binary typing, and serotyping results inferred in silico from the WGS data were highly concordant (>99%) with laboratory typing performed in parallel. However, WGS was able to identify distinct nested clusters within groups of isolates that were otherwise indistinguishable using our current typing methods. Routine WGS was then used for prospective epidemiologic surveillance on a further 97 L. monocytogenes isolates over a 12-month period, which provided a greater level of discrimination than that of conventional typing for inferring linkage to point source outbreaks. A risk-based alert system based on WGS similarity was used to inform epidemiologists required to act on the data. Our experience shows that WGS can be adopted for prospective L. monocytogenes surveillance and investigated for other pathogens relevant to public health. PMID:26607978

  16. RepARK--de novo creation of repeat libraries from whole-genome NGS reads.

    Science.gov (United States)

    Koch, Philipp; Platzer, Matthias; Downie, Bryan R

    2014-05-01

    Generation of repeat libraries is a critical step for analysis of complex genomes. In the era of next-generation sequencing (NGS), such libraries are usually produced using a whole-genome shotgun (WGS) derived reference sequence whose completeness greatly influences the quality of derived repeat libraries. We describe here a de novo repeat assembly method--RepARK (Repetitive motif detection by Assembly of Repetitive K-mers)--which avoids potential biases by using abundant k-mers of NGS WGS reads without requiring a reference genome. For validation, repeat consensuses derived from simulated and real Drosophila melanogaster NGS WGS reads were compared to repeat libraries generated by four established methods. RepARK is orders of magnitude faster than the other methods and generates libraries that are: (i) composed almost entirely of repetitive motifs, (ii) more comprehensive and (iii) almost completely annotated by TEclass. Additionally, we show that the RepARK method is applicable to complex genomes like human and can even serve as a diagnostic tool to identify repetitive sequences contaminating NGS datasets.
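
    The core idea in the abstract, selecting abundant k-mers directly from reads without a reference genome, can be sketched as below. This is a minimal illustration only: the function names, the toy reads, and the threshold are hypothetical, reverse complements are ignored, and the subsequent assembly of the abundant k-mers into repeat consensuses that the abstract describes is not shown.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def abundant_kmers(reads, k, min_count):
    """Return the k-mers seen at least min_count times (hypothetical threshold)."""
    return {kmer for kmer, n in count_kmers(reads, k).items() if n >= min_count}


# Toy reads invented for illustration; three of them share the motif ACGTAC.
reads = ["ACGTACGTAC", "TTACGTACGG", "GGGACGTACA", "CCCCCTTTTT"]
repeats = abundant_kmers(reads, k=5, min_count=3)
print(sorted(repeats))
```

    In real whole-genome shotgun data the abundance threshold would have to be chosen relative to sequencing depth, since unique sequence is itself seen many times at high coverage.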

  17. Identification of hallmarks of lung adenocarcinoma prognosis using whole genome sequencing.

    Science.gov (United States)

    Liu, Li; Huang, Jiao; Wang, Ke; Li, Li; Li, Yangkai; Yuan, Jingsong; Wei, Sheng

    2015-11-10

    In conjunction with clinical characteristics, prognostic biomarkers are essential for choosing optimal therapies to lower the mortality of lung adenocarcinoma. Whole genome sequencing (WGS) of 7 cancerous-noncancerous tissue pairs was performed to explore the comparative copy number variations (CNVs) associated with lung adenocarcinoma. The frequencies of the top-ranked CNVs were verified in an independent set of 114 patients, and the roles of target CNVs in disease prognosis were then assessed in 313 patients. The WGS yielded 2604 CNVs. After frequency validation and biological function screening of the top 10 CNVs, 9 mutant driver genes from 7 CNVs were further analyzed for an association with survival. Compared with carriers of an amplified PBXIP1 copy number, unamplified carriers had a 0.62-fold (95% CI = 0.43-0.91) decreased risk of death. Compared with those with an amplified TERT, those with an unamplified TERT had a 35% reduction (95% CI = 3%-56%) in risk of lung adenocarcinoma progression. Cases with both unamplified PBXIP1 and TERT had a median 34.32-month extension of overall survival and a 34.55-month delay in disease progression when compared with cases in which both CNVs were amplified. This study demonstrates that CNVs of TERT and PBXIP1 have the potential to translate into the clinic and be used to improve outcomes for patients with this fatal disease.

  18. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma

    Science.gov (United States)

    Gartner, Jared J.; Parker, Stephen C. J.; Prickett, Todd D.; Dutton-Regester, Ken; Stitzel, Michael L.; Lin, Jimmy C.; Davis, Sean; Simhadri, Vijaya L.; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Teer, Jamie K.; Morken, Mario A.; Bhanot, Umesh K.; Chen, Guo; Elnitski, Laura L.; Davies, Michael A.; Gershenwald, Jeffrey E.; Carter, Hannah; Karchin, Rachel; Robinson, William; Robinson, Steven; Rosenberg, Steven A.; Collins, Francis S.; Parmigiani, Giovanni; Komar, Anton A.; Kimchi-Sarfaty, Chava; Hayward, Nicholas K.; Margulies, Elliott H.; Samuels, Yardena

    2013-01-01

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683–691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671–5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies. PMID:23901115

  19. Use of Whole Genome Sequencing for Diagnosis and Discovery in the Cancer Genetics Clinic

    Directory of Open Access Journals (Sweden)

    Samantha B. Foley

    2015-01-01

    Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks.
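
    The PPV definition in the abstract (a nonsynonymous variant with allele frequency below 1% in a reference panel, located in a gene of interest) amounts to a three-part filter, sketched below. Everything concrete here is an assumption made for illustration: the record layout, the consequence labels treated as nonsynonymous, the gene names, and the example calls are all hypothetical and do not come from the study.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated as nonsynonymous in this sketch.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    panel_allele_freq: float  # 0.0 when the variant is absent from the panel


def is_ppv(variant, gene_panel, max_freq=0.01):
    """Apply the three criteria: gene of interest, nonsynonymous, rare."""
    return (
        variant.gene in gene_panel
        and variant.consequence in NONSYNONYMOUS
        and variant.panel_allele_freq < max_freq
    )


gene_panel = {"GENE_A", "GENE_B"}  # placeholder for a clinically relevant gene list
calls = [
    Variant("GENE_A", "missense", 0.0),     # novel and nonsynonymous: kept
    Variant("GENE_A", "synonymous", 0.0),   # synonymous: dropped
    Variant("GENE_B", "frameshift", 0.02),  # too common: dropped
    Variant("GENE_C", "nonsense", 0.001),   # gene not on the panel: dropped
    Variant("GENE_B", "nonsense", 0.004),   # rare and nonsynonymous: kept
]
ppvs = [v for v in calls if is_ppv(v, gene_panel)]
print([(v.gene, v.consequence) for v in ppvs])
```

    A filter of this kind only nominates candidates; as the abstract stresses, most such variants would still need interpretation and many remain of unknown significance.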

  20. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

    Science.gov (United States)

    Croucher, Nicholas J; Page, Andrew J; Connor, Thomas R; Delaney, Aidan J; Keane, Jacqueline A; Bentley, Stephen D; Parkhill, Julian; Harris, Simon R

    2015-02-18

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349
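
    The notion of scanning for loci with an elevated density of base substitutions can be illustrated with a much simpler stand-in than the algorithm the abstract describes. The sketch below splits a sequence into fixed, non-overlapping windows and flags those whose substitution count is improbably high under a binomial null with the genome-wide rate. This is not Gubbins' actual statistic or its iterative tree-building procedure; the window size, significance level, function names, and SNP positions are all assumptions made for illustration.

```python
import math


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def dense_windows(snp_positions, genome_length, window, alpha):
    """Flag non-overlapping windows with more SNPs than the global rate predicts."""
    rate = len(snp_positions) / genome_length  # genome-wide substitution rate
    flagged = []
    for start in range(0, genome_length - window + 1, window):
        count = sum(start <= pos < start + window for pos in snp_positions)
        if count and binomial_tail(window, count, rate) < alpha:
            flagged.append((start, start + window, count))
    return flagged


# Invented positions: five scattered SNPs plus a dense cluster between 200 and 300.
scattered = [50, 150, 450, 650, 850]
cluster = [205 + 5 * i for i in range(10)]
flagged = dense_windows(scattered + cluster, genome_length=1000, window=100, alpha=0.001)
print(flagged)
```

    In this toy case only the clustered region is flagged, which mirrors the intuition that substitutions imported by recombination arrive in spatial clumps while point mutations are spread thinly.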

  1. Using whole-genome sequencing to determine appropriate streptomycin epidemiological cutoffs for Salmonella and Escherichia coli.

    Science.gov (United States)

    Tyson, Gregory H; Li, Cong; Ayers, Sherry; McDermott, Patrick F; Zhao, Shaohua

    2016-02-01

    For Enterobacteriaceae such as Salmonella spp. and Escherichia coli, no unified interpretive resistance criteria exist for streptomycin, an epidemiologically important antibiotic. As part of the National Antimicrobial Resistance Monitoring System, we had previously used a minimum inhibitory concentration of ≥ 64 μg/mL as an epidemiological cutoff value (ECV) to define non-wild-type isolates. To identify whether this ECV correlated with genetic determinants of resistance, we performed whole-genome sequencing of 463 Salmonella and E. coli isolates to identify streptomycin resistance genotypes. From this analysis, we found that using a streptomycin resistance breakpoint of ≥ 64 μg/mL classified over 20% of strains possessing aadA or strA/strB resistance genes as wild-type. Therefore, to improve the concordance between genotypic and phenotypic data, we propose reducing the phenotypic cutoff values to ≥ 32 μg/mL for both Salmonella and E. coli, to be used widely as ECVs to categorize non-wild-type isolates.
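
    The comparison behind the proposed cutoff change can be sketched as a simple calculation: for each candidate cutoff, what fraction of isolates carrying a resistance gene have an MIC below it and would therefore be called wild-type? The isolate data below are invented purely to show the calculation and do not reproduce the study's measurements; the function name is hypothetical.

```python
def misclassified_fraction(isolates, cutoff):
    """Fraction of resistance-gene carriers whose MIC is below the cutoff."""
    carrier_mics = [mic for mic, has_gene in isolates if has_gene]
    if not carrier_mics:
        return 0.0
    return sum(mic < cutoff for mic in carrier_mics) / len(carrier_mics)


# (MIC in ug/mL, carries a resistance gene?) -- invented values for illustration.
isolates = [
    (4, False), (8, False), (16, False), (16, False),
    (32, True), (32, True), (64, True), (64, True),
    (128, True), (128, True), (256, True), (256, True),
]

for cutoff in (64, 32):
    fraction = misclassified_fraction(isolates, cutoff)
    print(f"cutoff >= {cutoff}: {fraction:.0%} of gene carriers called wild-type")
```

    Lowering the cutoff in this toy data set removes the discordant calls, which is the kind of genotype-phenotype agreement the abstract uses to motivate the lower ECV.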

  2. A proposed clinical decision support architecture capable of supporting whole genome sequence information.

    Science.gov (United States)

    Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen; Kawamoto, Kensaku

    2014-04-01

    Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine. PMID:25411644
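
    As a rough illustration of how the independently managed components named in the abstract might interact, the sketch below wires a genome database, a genome variant knowledge base, a CDS knowledge base, and an EHR together through a controller. All class names, method names, and data values are hypothetical placeholders; the abstract does not specify interfaces, and a real service-oriented deployment would place each component behind its own network service rather than in one process.

```python
class GenomeDatabase:
    """Stores genome data separately from the EHR, keyed by patient."""
    def __init__(self, variants_by_patient):
        self._variants = variants_by_patient

    def variants_for(self, patient_id):
        return self._variants.get(patient_id, [])


class VariantKnowledgeBase:
    """Maps a variant to a clinical interpretation, if one is known."""
    def __init__(self, interpretations):
        self._interpretations = interpretations

    def interpret(self, variant):
        return self._interpretations.get(variant)


class CDSKnowledgeBase:
    """Maps (interpretation, clinical context) to advice for the clinician."""
    def __init__(self, rules):
        self._rules = rules

    def advise(self, interpretation, medication):
        return self._rules.get((interpretation, medication))


class ElectronicHealthRecord:
    """Supplies the clinical context, here just active medication orders."""
    def __init__(self, orders_by_patient):
        self._orders = orders_by_patient

    def active_medications(self, patient_id):
        return self._orders.get(patient_id, [])


class CDSController:
    """Orchestrates the other components to produce alerts for one patient."""
    def __init__(self, genome_db, variant_kb, cds_kb, ehr):
        self.genome_db, self.variant_kb = genome_db, variant_kb
        self.cds_kb, self.ehr = cds_kb, ehr

    def alerts_for(self, patient_id):
        alerts = []
        for variant in self.genome_db.variants_for(patient_id):
            interpretation = self.variant_kb.interpret(variant)
            if interpretation is None:
                continue  # no curated knowledge about this variant
            for medication in self.ehr.active_medications(patient_id):
                advice = self.cds_kb.advise(interpretation, medication)
                if advice:
                    alerts.append(advice)
        return alerts


controller = CDSController(
    GenomeDatabase({"patient-1": ["variant-A", "variant-B"]}),
    VariantKnowledgeBase({"variant-A": "interpretation-X"}),
    CDSKnowledgeBase({("interpretation-X", "drug-1"): "Review order for drug-1"}),
    ElectronicHealthRecord({"patient-1": ["drug-1", "drug-2"]}),
)
alerts = controller.alerts_for("patient-1")
print(alerts)
```

    Keeping each store behind its own small interface is what lets the genome data live outside the EHR, the design feature the abstract highlights.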

  4. Whole genome sequence analysis of Cryptococcus gattii from the Pacific Northwest reveals unexpected diversity.

    Directory of Open Access Journals (Sweden)

    John D Gillece

    A recent emergence of Cryptococcus gattii in the Pacific Northwest involves strains that fall into three primarily clonal molecular subtypes: VGIIa, VGIIb and VGIIc. Multilocus sequence typing (MLST) and variable number tandem repeat analysis appear to identify little diversity within these molecular subtypes. Given the apparent expansion of these subtypes into new geographic areas and their ability to cause disease in immunocompetent individuals, differentiation of isolates belonging to these subtypes could be very important from a public health perspective. We used whole genome sequence typing (WGST) to perform fine-scale phylogenetic analysis on 20 C. gattii isolates, 18 of which are from the VGII molecular type largely responsible for the Pacific Northwest emergence. Analysis both including and excluding (289,586 SNPs and 56,845 SNPs, respectively) molecular types VGI and VGIII isolates resulted in phylogenetic reconstructions consistent, for the most part, with MLST analysis but with far greater resolution among isolates. The WGST analysis presented here resulted in identification of over 100 SNPs among eight VGIIc isolates as well as unique genotypes for each of the VGIIa, VGIIb and VGIIc isolates. Similar levels of genetic diversity were found within each of the molecular subtype isolates, despite the fact that the VGIIb clade is thought to have emerged much earlier. The analysis presented here is the first multi-genome WGST study to focus on the C. gattii molecular subtypes involved in the Pacific Northwest emergence and describes the tools that will further our understanding of this emerging pathogen.

  5. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples.

    Science.gov (United States)

    Dong, Chun-Nan; Yang, Ya-Dong; Li, Shu-Jin; Yang, Ya-Ran; Zhang, Xiao-Jing; Fang, Xiang-Dong; Yan, Jiang-Wei; Cong, Bin

    2016-01-01

    In cases of mass disasters, missing persons and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these "nucleosome protected STRs" (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  6. A Recent Whole-Genome Duplication Divides Populations of a Globally Distributed Microsporidian.

    Science.gov (United States)

    Williams, Tom A; Nakjang, Sirintra; Campbell, Scott E; Freeman, Mark A; Eydal, Matthías; Moore, Karen; Hirt, Robert P; Embley, T Martin; Williams, Bryony A P

    2016-08-01

    The Microsporidia are a major group of intracellular fungi and important parasites of animals including insects, fish, and immunocompromised humans. Microsporidian genomes have undergone extreme reductive evolution but there are major differences in genome size and structure within the group: some are prokaryote-like in size and organisation (marine microsporidian infecting goosefish worldwide. Our analysis revealed that population structure across the Atlantic Ocean is associated with a conserved difference in ploidy, with American and Canadian isolates sharing an ancestral whole genome duplication that was followed by widespread pseudogenisation and sorting-out of paralogue pairs. While past analyses have suggested de novo gene formation of microsporidian-specific genes, we found evidence for the origin of new genes from noncoding sequence since the divergence of these populations. Some of these genes experience selective constraint, suggesting the evolution of new functions and local host adaptation. Combining our data with published microsporidian genomes, we show that nucleotide composition across the phylum is shaped by a mutational bias favoring A and T nucleotides, which is opposed by an evolutionary force favoring an increase in genomic GC content. This study reveals ongoing dramatic reorganization of genome structure and the evolution of new gene functions in modern microsporidians despite extensive genomic streamlining in their common ancestor. PMID:27189558

  8. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir; Boore, Jeffrey L.

    2005-04-12

    The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.

  9. 2011 German Escherichia coli O104:H4 outbreak: whole-genome phylogeny without alignment

    Directory of Open Access Journals (Sweden)

    Cheung Man

    2011-12-01

    Background: A large-scale Escherichia coli O104:H4 outbreak occurred in Germany from May to July 2011, causing numerous cases of hemolytic-uremic syndrome (HUS) and deaths. Genomes of ten outbreak isolates and a historical O104:H4 strain isolated in 2001 were sequenced using different new-generation sequencing platforms. Phylogenetic analyses were performed using various approaches which either are not genome-wide or may be subject to errors due to poor sequence alignment. Also, detailed pathogenicity analyses on the 2001 strain were not available. Findings: We reconstructed the phylogeny of E. coli using the genome-wide and alignment-free feature frequency profile method and found the 2001 strain to be the closest relative of the 2011 outbreak strain among all E. coli strains available at present. This confirmed findings from previous alignment-based phylogenetic studies that the HUS-causing O104:H4 strains are more closely related to typical enteroaggregative E. coli (EAEC) than to enterohemorrhagic E. coli. Detailed re-examination of pathogenicity-related virulence factors and secreted proteins showed that the 2001 strain possesses virulence factors shared between typical EAEC and the 2011 outbreak strain. Conclusions: Our study represents the first attempt to elucidate the whole-genome phylogeny of the 2011 German outbreak using an alignment-free method, and suggests a direct line of ancestry leading from a putative EAEC-like ancestor through the 2001 strain to the 2011 outbreak strain.
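
    An alignment-free comparison of the kind the abstract refers to can be sketched by turning each sequence into a profile of k-mer ("feature") frequencies and measuring how far apart two profiles are. The sketch below uses the Jensen-Shannon divergence as the distance; that choice, the value of k, the function names, and the toy sequences are assumptions made for illustration, and the abstract does not state the parameters the authors used.

```python
import math
from collections import Counter


def kmer_profile(seq, k):
    """Relative frequency of every k-mer in a sequence."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles."""
    mix = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in set(p) | set(q)}

    def kl(a):
        # Kullback-Leibler divergence of profile a from the mixture.
        return sum(a[x] * math.log2(a[x] / mix[x]) for x in a)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy sequences invented for illustration: b differs from a by one base, c is unrelated.
a = kmer_profile("ACGTACGTACGT", 3)
b = kmer_profile("ACGTACGTACGA", 3)
c = kmer_profile("TTTTGGGGCCCC", 3)

print(f"a vs b: {jensen_shannon(a, b):.3f}")
print(f"a vs c: {jensen_shannon(a, c):.3f}")
```

    A matrix of such pairwise distances across genomes can then be passed to a standard distance-based tree-building method, which is what removes the dependence on a sequence alignment.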

  10. A Recent Whole-Genome Duplication Divides Populations of a Globally Distributed Microsporidian

    Science.gov (United States)

    Williams, Tom A.; Nakjang, Sirintra; Campbell, Scott E.; Freeman, Mark A.; Eydal, Matthías; Moore, Karen; Hirt, Robert P.; Embley, T. Martin; Williams, Bryony A. P.

    2016-01-01

    The Microsporidia are a major group of intracellular fungi and important parasites of animals including insects, fish, and immunocompromised humans. Microsporidian genomes have undergone extreme reductive evolution but there are major differences in genome size and structure within the group: some are prokaryote-like in size and organisation (difference in ploidy, with American and Canadian isolates sharing an ancestral whole genome duplication that was followed by widespread pseudogenisation and sorting-out of paralogue pairs. While past analyses have suggested de novo gene formation of microsporidian-specific genes, we found evidence for the origin of new genes from noncoding sequence since the divergence of these populations. Some of these genes experience selective constraint, suggesting the evolution of new functions and local host adaptation. Combining our data with published microsporidian genomes, we show that nucleotide composition across the phylum is shaped by a mutational bias favoring A and T nucleotides, which is opposed by an evolutionary force favoring an increase in genomic GC content. This study reveals ongoing dramatic reorganization of genome structure and the evolution of new gene functions in modern microsporidians despite extensive genomic streamlining in their common ancestor. PMID:27189558

  11. Whole Genome Sequencing Reveals a De Novo SHANK3 Mutation in Familial Autism Spectrum Disorder

    Science.gov (United States)

    Nemirovsky, Sergio I.; Córdoba, Marta; Zaiat, Jonathan J.; Completa, Sabrina P.; Vega, Patricia A.; González-Morón, Dolores; Medina, Nancy M.; Fabbro, Mónica; Romero, Soledad; Brun, Bianca; Revale, Santiago; Ogara, María Florencia; Pecci, Adali; Marti, Marcelo; Vazquez, Martin; Turjanski, Adrián; Kauffman, Marcelo A.

    2015-01-01

    Introduction Clinical genomics promise to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. Methods We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Results Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). Conclusions We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder. PMID:25646853

  12. Whole genome sequencing reveals a de novo SHANK3 mutation in familial autism spectrum disorder.

    Directory of Open Access Journals (Sweden)

    Sergio I Nemirovsky

    Full Text Available Clinical genomics promise to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder.

  13. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  14. Whole-Genome Screening of Newborns? The Constitutional Boundaries of State Newborn Screening Programs

    Science.gov (United States)

    King, Jaime S.; Smith, Monica E.

    2016-01-01

    State newborn screening (NBS) programs routinely screen nearly all of the 4 million newborns in the United States each year for ~30 primary conditions and a number of secondary conditions. NBS could be on the cusp of an unprecedented expansion as a result of advances in whole-genome sequencing (WGS). As WGS becomes cheaper and easier and as our knowledge and understanding of human genetics expand, the question of whether WGS has a role to play in state NBS programs becomes increasingly relevant and complex. As geneticists and state public health officials begin to contemplate the technical and procedural details of whether WGS could benefit existing NBS programs, this is an opportune time to revisit the legal framework of state NBS programs. In this article, we examine the constitutional underpinnings of state-mandated NBS and explore the range of current state statutes and regulations that govern the programs. We consider the legal refinements that will be needed to keep state NBS programs within constitutional bounds, focusing on 2 areas of concern: consent procedures and the criteria used to select new conditions for NBS panels. We conclude by providing options for states to consider when contemplating the use of WGS for NBS. PMID:26729704

  15. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples.

    Directory of Open Access Journals (Sweden)

    Minsoung Rhee

    Full Text Available Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.
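
    The figure of roughly 3/1000 of a genome per droplet at 0.1 pg/μL can be checked with a short back-of-the-envelope calculation. The sketch below assumes an E. coli genome of about 4.64 Mb and an average mass of 650 g/mol per base pair; neither value is stated in the abstract, and the function names are hypothetical. Under these assumptions the implied droplet volume is sub-nanoliter, which agrees with the abstract's description.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair
ECOLI_GENOME_BP = 4.64e6     # assumed genome size, not given in the abstract


def genome_mass_fg(genome_bp=ECOLI_GENOME_BP):
    """Mass of one genome copy in femtograms."""
    grams = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e15


def genomes_per_nanoliter(conc_pg_per_ul, genome_bp=ECOLI_GENOME_BP):
    """Genome copies per nanoliter at the given DNA concentration."""
    # 1 pg/uL = 1000 fg per 1000 nL = 1 fg/nL, so the number carries over.
    conc_fg_per_nl = conc_pg_per_ul
    return conc_fg_per_nl / genome_mass_fg(genome_bp)


def droplet_volume_nl(genomes_per_droplet, conc_pg_per_ul):
    """Droplet volume implied by a target genome fraction per droplet."""
    return genomes_per_droplet / genomes_per_nanoliter(conc_pg_per_ul)


if __name__ == "__main__":
    print("genome mass (fg):", round(genome_mass_fg(), 2))
    print("genomes per nL at 0.1 pg/uL:", round(genomes_per_nanoliter(0.1), 4))
    print("implied droplet volume (nL):", round(droplet_volume_nl(3 / 1000, 0.1), 3))
```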

  16. Landscape of somatic mutations in 560 breast cancer whole-genome sequences.

    Science.gov (United States)

    Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; Ramakrishna, Manasa; Glodzik, Dominik; Zou, Xueqing; Martincorena, Inigo; Alexandrov, Ludmil B; Martin, Sancha; Wedge, David C; Van Loo, Peter; Ju, Young Seok; Smid, Marcel; Brinkman, Arie B; Morganella, Sandro; Aure, Miriam R; Lingjærde, Ole Christian; Langerød, Anita; Ringnér, Markus; Ahn, Sung-Min; Boyault, Sandrine; Brock, Jane E; Broeks, Annegien; Butler, Adam; Desmedt, Christine; Dirix, Luc; Dronov, Serge; Fatima, Aquila; Foekens, John A; Gerstung, Moritz; Hooijer, Gerrit K J; Jang, Se Jin; Jones, David R; Kim, Hyung-Yong; King, Tari A; Krishnamurthy, Savitri; Lee, Hee Jin; Lee, Jeong-Yeon; Li, Yilong; McLaren, Stuart; Menzies, Andrew; Mustonen, Ville; O'Meara, Sarah; Pauporté, Iris; Pivot, Xavier; Purdie, Colin A; Raine, Keiran; Ramakrishnan, Kamna; Rodríguez-González, F Germán; Romieu, Gilles; Sieuwerts, Anieta M; Simpson, Peter T; Shepherd, Rebecca; Stebbings, Lucy; Stefansson, Olafur A; Teague, Jon; Tommasi, Stefania; Treilleux, Isabelle; Van den Eynden, Gert G; Vermeulen, Peter; Vincent-Salomon, Anne; Yates, Lucy; Caldas, Carlos; van't Veer, Laura; Tutt, Andrew; Knappskog, Stian; Tan, Benita Kiat Tee; Jonkers, Jos; Borg, Åke; Ueno, Naoto T; Sotiriou, Christos; Viari, Alain; Futreal, P Andrew; Campbell, Peter J; Span, Paul N; Van Laere, Steven; Lakhani, Sunil R; Eyfjord, Jorunn E; Thompson, Alastair M; Birney, Ewan; Stunnenberg, Hendrik G; van de Vijver, Marc J; Martens, John W M; Børresen-Dale, Anne-Lise; Richardson, Andrea L; Kong, Gu; Thomas, Gilles; Stratton, Michael R

    2016-06-01

    We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer. PMID:27135926

  17. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia.

    Science.gov (United States)

    Winter, David J; Pacheco, M Andreína; Vallejo, Andres F; Schwartz, Rachel S; Arevalo-Herrera, Myriam; Herrera, Socrates; Cartwright, Reed A; Escalante, Ananias A

    2015-12-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy to score microsatellite loci that can be used in epidemiological investigations in South America.

  18. Kuwaiti population subgroup of nomadic Bedouin ancestry—Whole genome sequence and analysis

    Directory of Open Access Journals (Sweden)

    Sumi Elsa John

    2015-03-01

    Full Text Available The Kuwaiti native population comprises three distinct genetic subgroups of Persian, “city-dwelling” Saudi Arabian tribe, and nomadic “tent-dwelling” Bedouin ancestry. The Bedouin subgroup is characterized by the presence of 17% African ancestry; it owes its origin to nomadic tribes of the deserts of the Arabian Peninsula and North Africa. By sequencing the whole genome of a Kuwaiti male from this subgroup at 41X coverage, we report 3,752,878 SNPs, 411,839 indels, and 8451 structural variations. A neighbor-joining tree, based on shared variant positions carrying disease-risk alleles between the Bedouin and other continental genomes, places the Bedouin genome at the nexus of African, Asian, and European genomes in concordance with the geographical location of Kuwait and the Peninsula. In congruence with the participant's medical history of morbid obesity and bronchial asthma, risk alleles are seen at deleterious SNPs associated with obesity and asthma. Many of the observed deleterious ‘novel’ variants lie in genes associated with autosomal recessive disorders characteristic of the region.

  19. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    2014-06-01

    Full Text Available This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA’s speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
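
    The relation between read count and fold coverage quoted in this record can be reproduced with simple arithmetic: mean depth is the total number of sequenced bases divided by the size of the target. The sketch below assumes that "750 million paired-end reads" means 750 million read pairs and that the human genome is about 3.0 Gb; both are assumptions, as the abstract states neither, and the function names are hypothetical.

```python
HUMAN_GENOME_BP = 3.0e9  # assumed approximate genome size, not given in the abstract


def fold_coverage(n_read_pairs, read_length_bp, target_size_bp):
    """Mean sequencing depth: total sequenced bases divided by target size."""
    total_bases = 2 * n_read_pairs * read_length_bp  # two reads per pair
    return total_bases / target_size_bp


def bases_per_second(n_read_pairs, read_length_bp, hours):
    """Average processing throughput over the whole run."""
    return 2 * n_read_pairs * read_length_bp / (hours * 3600.0)


if __name__ == "__main__":
    depth = fold_coverage(750e6, 100, HUMAN_GENOME_BP)
    rate = bases_per_second(750e6, 100, 5.5)
    print("mean depth (fold):", depth)
    print("throughput (Mbases/s):", round(rate / 1e6, 2))
```

    Under these assumptions the 50-fold figure follows directly, which supports reading the quoted read count as pairs rather than individual reads.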

  20. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing

    Science.gov (United States)

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-01-01

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors. PMID:25331151

  1. Bonus Organisms in High-Throughput Eukaryotic Whole-Genome Shotgun Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank; Platt, Darren

    2006-02-06

    The DOE Joint Genome Institute has sequenced over 50 eukaryotic genomes, ranging in size from 15 MB to 1.6 GB, over a wide range of organism types. In the course of doing so, it has become clear that a substantial fraction of these data sets contains bonus organisms, usually prokaryotes, in addition to the desired genome. While some of these additional organisms are extraneous contamination, they are sometimes symbionts, and so can be of biological interest. Therefore, it is desirable to assemble the bonus organisms along with the main genome. This transforms the problem into one of metagenomic assembly, which is considerably more challenging than traditional whole-genome shotgun (WGS) assembly. The different organisms will usually be present at different sequence depths, which is difficult to handle in most WGS assemblers. In addition, with multiple distinct genomes present, chimerism can produce cross-organism combinations. Finally, there is no guarantee that only a single bonus organism will be present. For example, one JGI project contained at least two different prokaryotic contaminants, plus a 145 KB plasmid of unknown origin. We have developed techniques to routinely identify and handle such bonus organisms in a high-throughput sequencing environment. Approaches include screening and partitioning the unassembled data, and iterative subassemblies. These methods are applicable not only to bonus organisms, but also to desired components such as organelles. These procedures have the additional benefit of identifying, and allowing for the removal of, cloning artifacts such as E.coli and spurious vector inclusions.

  2. Unique features of a Japanese 'Candidatus Liberibacter asiaticus' strain revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Hiroshi Katoh

    Full Text Available Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative 'Ca. L. asiaticus' Japanese isolate Ishi-1 was determined by metagenomic analysis of DNA extracted from 'Ca. L. asiaticus'-infected psyllids and leaf midribs. The 1.19-Mb genome has an average 36.32% GC content. Annotation revealed 13 operons encoding rRNA and 44 tRNA genes, but no typical bacterial pathogenesis-related genes were located within the genome, similar to the Floridian psy62 and Chinese gxpsy. In contrast to other 'Ca. L. asiaticus' strains, the genome of the Japanese Ishi-1 strain lacks a prophage-related region.
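
    GC content, as reported for the Ishi-1 genome, is the percentage of G and C bases among the unambiguous bases of a sequence. The sketch below shows one simple way to compute it; skipping ambiguous symbols such as N is a design assumption, the function names are hypothetical, and the toy sequence is not real data.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def expected_gc_bases(genome_bp, gc_percent):
    """Number of G or C bases implied by a genome size and a GC percentage."""
    return genome_bp * gc_percent / 100.0


if __name__ == "__main__":
    # Toy sequence (hypothetical); real input would be read from a FASTA file.
    print(gc_content("ATGCATTTAANNGC"))
    # G+C bases implied by the figures quoted in the abstract.
    print(round(expected_gc_bases(1.19e6, 36.32)))
```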

  3. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs.

    Science.gov (United States)

    Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L

    2016-02-01

    Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336

  4. Whole genome analyses of marine fish pathogenic isolate, Mycobacterium sp. 012931.

    Science.gov (United States)

    Kurokawa, Satoru; Kabayama, Jun; Hwang, Seong Don; Nho, Seong Won; Hikima, Jun-ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Mori, Tetsushi; Aoki, Takashi

    2014-10-01

    Mycobacterium is a genus within the order Actinomycetales that comprises a large number of well-characterized species, several of which include pathogens known to cause serious disease in humans and animals. Here, we report the whole genome sequence of Mycobacterium sp. strain 012931 isolated from the marine fish, yellowtail (Seriola quinqueradiata). Mycobacterium sp. 012931 is a fish pathogen causing serious damage to aquaculture farms in Japan. DNA dot plot analysis showed that Mycobacterium sp. 012931 was more closely related to Mycobacterium marinum when compared across several Mycobacterium species. However, little conservation of the gene order was observed between the Mycobacterium sp. 012931 and M. marinum genomes. The 5,464 annotated genes of Mycobacterium sp. 012931 were classified into 26 subsystems. The insertion/deletion gene analysis shows that Mycobacterium sp. 012931 had 643 unique genes that were not found in the M. marinum strains. In the virulence, disease, and defense subsystem, both insertion and deletion genes of Mycobacterium sp. 012931 were associated with the PPE gene cluster of Mycobacteria. Of seven plcB genes in Mycobacterium sp. 012931, plcB_2 and plcB_3 showed low identities with those of M. marinum strains. Therefore, Mycobacterium sp. 012931 differs from M. marinum in genetics and virulence and may induce different interaction mechanisms between host and pathogen. PMID:24879010

  5. Nested radiations and the pulse of angiosperm diversification: increased diversification rates often follow whole genome duplications.

    Science.gov (United States)

    Tank, David C; Eastman, Jonathan M; Pennell, Matthew W; Soltis, Pamela S; Soltis, Douglas E; Hinchliff, Cody E; Brown, Joseph W; Sessa, Emily B; Harmon, Luke J

    2015-07-01

    Our growing understanding of the plant tree of life provides a novel opportunity to uncover the major drivers of angiosperm diversity. Using a time-calibrated phylogeny, we characterized hot and cold spots of lineage diversification across the angiosperm tree of life by modeling evolutionary diversification using stepwise AIC (MEDUSA). We also tested the whole-genome duplication (WGD) radiation lag-time model, which postulates that increases in diversification tend to lag behind established WGD events. Diversification rates have been incredibly heterogeneous throughout the evolutionary history of angiosperms and reveal a pattern of 'nested radiations' - increases in net diversification nested within other radiations. This pattern in turn generates a negative relationship between clade age and diversity across both families and orders. We suggest that stochastically changing diversification rates across the phylogeny explain these patterns. Finally, we demonstrate significant statistical support for the WGD radiation lag-time model. Across angiosperms, nested shifts in diversification led to an overall increasing rate of net diversification and declining relative extinction rates through time. These diversification shifts are only rarely perfectly associated with WGD events, but commonly follow them after a lag period.

  6. Examining phylogenetic relationships of Erwinia and Pantoea species using whole genome sequence data.

    Science.gov (United States)

    Zhang, Yucheng; Qiu, Sai

    2015-11-01

    The genera Erwinia and Pantoea contain species that are devastating plant pathogens, non-pathogen epiphytes, and opportunistic human pathogens. However, some controversies persist in the taxonomic classification of these two closely related genera. The phylogenomic analysis of these two genera was investigated via a comprehensive analysis of 25 Erwinia genomes and 23 Pantoea genomes. Single-copy orthologs could be extracted from the Erwinia/Pantoea core-genome to reconstruct the Erwinia/Pantoea phylogeny. This tree has strong bootstrap support for almost all branches. We also estimated the in silico DNA-DNA hybridization (isDDH) and the average nucleotide identity (ANI) values between each genome; strains from the same species showed ANI values ≥96% and isDDH values >70%. These data confirm that whole genome sequence data provides a powerful tool to resolve the complex taxonomic questions of Erwinia/Pantoea, e.g. Pantoea agglomerans 299R was not clustered into a single group with other P. agglomerans strains, and the ANI values and isDDH values between them were Erwinia/Pantoea phylogeny.

  7. Whole-genome copy number variation analysis in anophthalmia and microphthalmia.

    Science.gov (United States)

    Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V

    2013-11-01

    Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases.

  8. Whole-genome sequencing of multidrug-resistant Mycobacterium tuberculosis isolates from Myanmar.

    Science.gov (United States)

    Aung, Htin Lin; Tun, Thanda; Moradigaravand, Danesh; Köser, Claudio U; Nyunt, Wint Wint; Aung, Si Thu; Lwin, Thandar; Thinn, Kyi Kyi; Crump, John A; Parkhill, Julian; Peacock, Sharon J; Cook, Gregory M; Hill, Philip C

    2016-09-01

    Drug-resistant tuberculosis (TB) is a major health threat in Myanmar. An initial study was conducted to explore the potential utility of whole-genome sequencing (WGS) for the diagnosis and management of drug-resistant TB in Myanmar. Fourteen multidrug-resistant Mycobacterium tuberculosis isolates were sequenced. Known resistance genes for a total of nine antibiotics commonly used in the treatment of drug-susceptible and multidrug-resistant TB (MDR-TB) in Myanmar were interrogated through WGS. All 14 isolates were MDR-TB, consistent with the results of phenotypic drug susceptibility testing (DST), and the Beijing lineage predominated. Based on the results of WGS, 9 of the 14 isolates were potentially resistant to at least one of the drugs used in the standard MDR-TB regimen but for which phenotypic DST is not conducted in Myanmar. This study highlights a need for the introduction of second-line DST as part of routine TB diagnosis in Myanmar as well as new classes of TB drugs to construct effective regimens. PMID:27530852

  9. Use of whole genome shotgun metagenomics: a practical guide for the microbiome-minded physician scientist.

    Science.gov (United States)

    Ma, Jun; Prince, Amanda; Aagaard, Kjersti M

    2014-01-01

    Whole genome shotgun sequencing (WGS) has been increasingly recognized as the most comprehensive and robust approach for metagenomics research. When compared with 16S-based metagenomics, it offers the advantage of identification of species level taxonomy and the estimation of metabolic pathway activities from human and environmental samples. Several large-scale metagenomic projects have been recently conducted or are currently underway utilizing WGS. With the generation of vast amounts of data, the bioinformatics and computational analysis of WGS results become vital for the success of a metagenomics study. However, each step in the WGS data analysis, including metagenome assembly, gene prediction, taxonomy identification, function annotation, and pathway analysis, is complicated by the sheer amount of data. Algorithms and tools have been developed specifically to handle WGS-generated metagenomics data with the hope of reducing the requirement on computational time and storage space. Here, we present an overview of the current state of metagenomics through WGS sequencing, challenges frequently encountered, and up-to-date solutions. Several applications that are uniquely applicable to microbiome studies in reproductive and perinatal medicine are also discussed.

  10. Rapid Whole-Genome Sequencing for Genetic Disease Diagnosis in Neonatal Intensive Care Units

    Science.gov (United States)

    Saunders, Carol Jean; Miller, Neil Andrew; Soden, Sarah Elizabeth; Dinwiddie, Darrell Lee; Noll, Aaron; Alnadi, Noor Abu; Andraws, Nevene; Patterson, Melanie LeAnn; Krivohlavek, Lisa Ann; Fellis, Joel; Humphray, Sean; Saffrey, Peter; Kingsbury, Zoya; Weir, Jacqueline Claire; Betley, Jason; Grocock, Russell James; Margulies, Elliott Harrison; Farrow, Emily Gwendolyn; Artman, Michael; Safina, Nicole Pauline; Petrikin, Joshua Erin; Hall, Kevin Peter; Kingsmore, Stephen Francis

    2014-01-01

    Monogenic diseases are frequent causes of neonatal morbidity and mortality, and disease presentations are often undifferentiated at birth. More than 3500 monogenic diseases have been characterized, but clinical testing is available for only some of them and many feature clinical and genetic heterogeneity. Hence, an immense unmet need exists for improved molecular diagnosis in infants. Because disease progression is extremely rapid, albeit heterogeneous, in newborns, molecular diagnoses must occur quickly to be relevant for clinical decision-making. We describe 50-hour differential diagnosis of genetic disorders by whole-genome sequencing (WGS) that features automated bioinformatic analysis and is intended to be a prototype for use in neonatal intensive care units. Retrospective 50-hour WGS identified known molecular diagnoses in two children. Prospective WGS disclosed potential molecular diagnosis of a severe GJB2-related skin disease in one neonate; BRAT1-related lethal neonatal rigidity and multifocal seizure syndrome in another infant; identified BCL9L as a novel, recessive visceral heterotaxy gene (HTX6) in a pedigree; and ruled out known candidate genes in one infant. Sequencing of parents or affected siblings expedited the identification of disease genes in prospective cases. Thus, rapid WGS can potentially broaden and foreshorten differential diagnosis, resulting in fewer empirical treatments and faster progression to genetic and prognostic counseling. PMID:23035047

  11. Isolation and whole genome sequencing of a Ruminococcus-like bacterium, associated with irritable bowel syndrome.

    Science.gov (United States)

    Hynönen, Ulla; Rasinkangas, Pia; Satokari, Reetta; Paulin, Lars; de Vos, Willem M; Pietilä, Taija E; Kant, Ravi; Palva, Airi

    2016-06-01

    In our previous studies on the intestinal microbiota in irritable bowel syndrome (IBS), we identified a bacterial phylotype with higher abundance in patients suffering from diarrhea than in healthy controls. In the present work, we have isolated in pure culture strain RT94, belonging to this phylotype, determined its whole genome sequence and performed an extensive genomic analysis and phenotypical testing. This revealed strain RT94 to be a strict anaerobe apparently belonging to a novel species with only 94% similarity in the 16S rRNA gene sequence to the closest relatives Ruminococcus torques and Ruminococcus lactaris. The G + C content of strain RT94 is 45.2 mol% and the major long-chain cellular fatty acids are C16:0, C18:0 and C14:0. The isolate is metabolically versatile but not a mucus or cellulose utilizer. It produces acetate, ethanol, succinate, lactate and formate, but very little butyrate, as end products of glucose metabolism. The mechanisms underlying the association of strain RT94 with diarrhea-type IBS are discussed. PMID:26946362

  12. Whole genome data for omics-based research on the self-fertilizing fish Kryptolebias marmoratus.

    Science.gov (United States)

    Rhee, Jae-Sung; Lee, Jae-Seong

    2014-08-30

    Genome resources have advantages for understanding diverse areas such as biological patterns and functioning of organisms. Omics platforms are useful approaches for the study of organs and organisms. These approaches can be powerful screening tools for whole genome, proteome, and metabolome profiling, and can be used to understand molecular changes in response to internal and external stimuli. This methodology has been applied successfully in freshwater model fish such as the zebrafish Danio rerio and the Japanese medaka Oryzias latipes in research areas such as basic physiology, developmental biology, genetics, and environmental biology. However, information is still scarce about model fish that inhabit brackish water or seawater. To develop the self-fertilizing killifish Kryptolebias marmoratus as a potential model species with unique characteristics and research merits, we obtained genomic information about K. marmoratus. We address ways to use these data for genome-based molecular mechanistic studies. We review the current state of genome information on K. marmoratus to initiate omics approaches. We evaluate the potential applications of integrated omics platforms for future studies in environmental science, developmental biology, and biomedical research. We conclude that information about the K. marmoratus genome will provide a better understanding of the molecular functions of genes, proteins, and metabolites that are involved in the biological functions of this species. Omics platforms, particularly combined technologies that make effective use of bioinformatics, will provide powerful tools for hypothesis-driven investigations and discovery-driven discussions on diverse aspects of this species and on fish and vertebrates in general.

  13. Whole genome sequencing of Gir cattle for identifying polymorphisms and loci under selection.

    Science.gov (United States)

    Liao, Xiaoping; Peng, Fred; Forni, Selma; McLaren, David; Plastow, Graham; Stothard, Paul

    2013-10-01

    Genetic variation in Gir cattle (Bos indicus) has so far not been well characterized. In this study, we used whole genome sequencing of three Gir bulls and a pooled sample from another 11 bulls to identify polymorphisms and loci under selection. A total of 9 990 733 single nucleotide polymorphisms (SNPs) and 604 308 insertion/deletions (indels) were discovered in Gir samples, of which 62.34% and 83.62%, respectively, are previously unknown. Moreover, we detected 79 putative selective sweeps using the sequence data of the pooled sample. One of the most striking sweeps harbours several genes belonging to the cathelicidin gene family, such as CAMP, CATHL1, CATHL2, and CATHL3, which are related to pathogen- and parasite-resistance. Another interesting region harbours genes encoding mitogen-activated protein kinases, which are involved in directing cellular responses to a variety of stimuli, such as osmotic stress and heat shock. These findings are particularly interesting because Gir is resistant to hot temperatures and tropical diseases. This initial selective sweep analysis of Gir cattle has revealed a number of loci that could be important for their adaptation to tropical climates.

  14. Archived neonatal dried blood spot samples can be used for accurate whole genome and exome-targeted next-generation sequencing

    DEFF Research Database (Denmark)

    Hollegaard, Mads Vilhelm; Grauholm, Jonas; Nielsen, Ronni;

    2013-01-01

    Dried blood spot samples (DBSS) have been collected and stored for decades as part of newborn screening programmes worldwide. Representing almost an entire population under a certain age and collected with virtually no bias, the Newborn Screening Biobanks are of immense value in medical studies...... can be used for accurate whole genome sequencing (WGS) and exome sequencing (WES). This study examined two individuals represented by three different types of samples each: whole-blood (reference samples), 3-year-old DBSS spotted with reference material (refDBSS), and 27- to 29-year-old archived...... neonatal DBSS (neoDBSS) stored at -20°C in the Danish Newborn Screening Biobank. The reference samples were genotyped using an Illumina Omni2.5M array, and all samples were sequenced on a HiSeq 2000 paired-end flow cell. First, we compared the array single nucleotide polymorphism (SNP) genotype data...

  15. De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome

    Directory of Open Access Journals (Sweden)

    Iorizzo Massimo

    2012-05-01

    Full Text Available Abstract Background Sequence analysis of organelle genomes has revealed important aspects of plant cell evolution. The scope of this study was to develop an approach for de novo assembly of the carrot mitochondrial genome using next generation sequence data from total genomic DNA. Results Sequencing data from a carrot 454 whole genome library were used to develop a de novo assembly of the mitochondrial genome. Development of a new bioinformatic tool allowed visualizing contig connections and elucidation of the de novo assembly. Southern hybridization demonstrated recombination across two large repeats. Genome annotation allowed identification of 44 protein coding genes, three rRNA and 17 tRNA. Identification of the plastid genome sequence allowed organelle genome comparison. Mitochondrial intergenic sequence analysis allowed detection of a fragment of DNA specific to the carrot plastid genome. PCR amplification and sequence analysis across different Apiaceae species revealed consistent conservation of this fragment in the mitochondrial genomes and an insertion in Daucus plastid genomes, giving evidence of a mitochondrial to plastid transfer of DNA. Sequence similarity with a retrotransposon element suggests a possibility that a transposon-like event transferred this sequence into the plastid genome. Conclusions This study confirmed that whole genome sequencing is a practical approach for de novo assembly of higher plant mitochondrial genomes. In addition, a new aspect of intercompartmental genome interaction was reported providing the first evidence for DNA transfer into an angiosperm plastid genome. The approach used here could be used more broadly to sequence and assemble mitochondrial genomes of diverse species. This information will allow us to better understand intercompartmental interactions and cell evolution.

  16. Whole-genome approach implicates CD44 in cellular resistance to carboplatin

    Directory of Open Access Journals (Sweden)

    Shukla Sunita J

    2009-01-01

    Full Text Available Abstract Carboplatin is a chemotherapeutic agent used in the management of many cancers, yet treatment is limited by resistance and toxicities. To achieve a better understanding of the genetic contribution to carboplatin resistance or toxicities, lymphoblastoid cell lines from 34 large Centre d'Etude du Polymorphisme Humain pedigrees were utilised to evaluate interindividual variation in carboplatin cytotoxicity. Significant heritability, ranging from 0.17 to 0.36 (p = 1 × 10⁻⁷ to 9 × 10⁻⁴), was found for cell growth inhibition following 72-hour treatment at each carboplatin concentration (10, 20, 40 and 80 μM) and IC50 (concentration for 50 per cent cell growth inhibition). Linkage analysis revealed 11 regions with logarithm of odds (LOD) scores greater than 1.5. The highest LOD score on chromosome 11 (LOD = 3.36, p = 4.2 × 10⁻⁵) encompasses 65 genes within the 1 LOD confidence interval for the carboplatin IC50. We further analysed the IC50 phenotype with a linkage-directed association analysis using 71 unrelated HapMap and Perlegen cell lines and identified 18 single nucleotide polymorphisms within eight genes that were significantly associated with the carboplatin IC50 (p -5; false discovery rate 50 values of the eight associated genes, which identified the most significant correlation between CD44 expression and IC50 (r² = 0.20; p = 6 × 10⁻⁴). The quantitative real-time polymerase chain reaction further confirmed a statistically significant difference in CD44 expression levels between carboplatin-resistant and -sensitive cell lines (p = 5.9 × 10⁻³). Knockdown of CD44 expression through small interfering RNA resulted in increased cellular sensitivity to carboplatin (p CD44 as being important in conferring cellular resistance to carboplatin.

  17. Economic evidence on identifying clinically actionable findings with whole-genome sequencing: a scoping review.

    Science.gov (United States)

    Douglas, Michael P; Ladabaum, Uri; Pletcher, Mark J; Marshall, Deborah A; Phillips, Kathryn A

    2016-02-01

    The American College of Medical Genetics and Genomics (ACMG) recommends that mutations in 56 genes for 24 conditions are clinically actionable and should be reported as secondary findings after whole-genome sequencing (WGS). Our aim was to identify published economic evaluations of detecting mutations in these genes among the general population or among targeted/high-risk populations and conditions and identify gaps in knowledge. A targeted PubMed search from 1994 through November 2014 was performed, and we included original, English-language articles reporting cost-effectiveness or a cost-to-utility ratio or net benefits/benefit-cost focused on screening (not treatment) for conditions and genes listed by the ACMG. Articles were screened, classified as targeting a high-risk or general population, and abstracted by two reviewers. General population studies were evaluated for actual cost-effectiveness measures (e.g., incremental cost-effectiveness ratios (ICER)), whereas studies of targeted populations were evaluated for whether at least one scenario proposed was cost-effective (e.g., ICER of ≤$100,000 per life-year or quality-adjusted life-year gained). A total of 607 studies were identified, and 32 relevant studies were included. Identified studies addressed fewer than one-third (7 of 24; 29%) of the ACMG conditions. The cost-effectiveness of screening in the general population was examined for only 2 of 24 conditions (8%). The cost-effectiveness of most genetic findings that the ACMG recommends for return has not been evaluated in economic studies or in the context of screening in the general population. The individual studies do not directly address the cost-effectiveness of WGS. PMID:25996638
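
    The incremental cost-effectiveness ratio (ICER) and the $100,000 threshold mentioned above can be illustrated with a minimal sketch. The function names and the example figures are hypothetical and are not taken from the reviewed studies; the sketch only covers the common case where the new strategy costs more and yields more health effect.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per unit of health effect gained (e.g. per QALY)."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both strategies have equal effect")
    return (cost_new - cost_old) / delta_effect


def is_cost_effective(ratio, threshold=100_000):
    """Compare an ICER against a willingness-to-pay threshold (assumed in dollars)."""
    return ratio <= threshold


# Hypothetical example: screening costs $30,000 more and gains 0.5 QALY.
example_ratio = icer(50_000, 20_000, 10.5, 10.0)
print(example_ratio, is_cost_effective(example_ratio))
```

    A negative ratio needs separate interpretation (the new strategy may dominate or be dominated), which this sketch does not attempt.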

  18. Whole genome scan to detect quantitative trait loci for bovine milk protein composition.

    Science.gov (United States)

    Schopen, G C B; Koks, P D; van Arendonk, J A M; Bovenhuis, H; Visker, M H P W

    2009-08-01

    The objective of this study was to perform a whole genome scan to detect quantitative trait loci (QTL) for milk protein composition in 849 Holstein-Friesian cows originating from seven sires. One morning milk sample was analysed for the major milk proteins using capillary zone electrophoresis. A genetic map was constructed with 1341 single nucleotide polymorphisms, covering 2829 centimorgans (cM) and 95% of the cattle genome. The chromosomal regions most significantly related to milk protein composition (P(genome) casein, alpha(S2)-casein, beta-casein and kappa-casein. The QTL on BTA11 was found at 124 cM, and affected beta-lactoglobulin, and the QTL on BTA14 was found at 0 cM, and affected protein percentage. The proportion of phenotypic variance explained by the QTL was 3.6% for beta-casein and 7.9% for kappa-casein on BTA6, 28.3% for beta-lactoglobulin on BTA11, and 8.6% for protein percentage on BTA14. The QTL affecting alpha(S2)-casein on BTA6 and 17 showed a significant interaction. We investigated the extent to which the detected QTL affecting milk protein composition could be explained by known polymorphisms in beta-casein, kappa-casein, beta-lactoglobulin and DGAT1 genes. Correction for these polymorphisms decreased the proportion of phenotypic variance explained by the QTL previously found on BTA6, 11 and 14. Thus, several significant QTL affecting milk protein composition were found, of which some QTL could partially be explained by polymorphisms in milk protein genes.

  19. Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing.

    Directory of Open Access Journals (Sweden)

    Emily Vogtmann

    Full Text Available Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. Future studies should have larger sample sizes or pool data across studies to have sufficient

  20. Whole genome evaluation of horizontal transfers in the pathogenic fungus Aspergillus fumigatus

    Directory of Open Access Journals (Sweden)

    Deschavanne Patrick

    2010-03-01

    Full Text Available Abstract Background Numerous cases of horizontal transfers (HTs) have been described for eukaryote genomes, but in contrast to prokaryote genomes, no whole genome evaluation of HTs has been carried out. This is mainly due to a lack of parametric methods specially designed to take the intrinsic heterogeneity of eukaryote genomes into account. We applied a simple and tested method based on local variations of genomic signatures to analyze the genome of the pathogenic fungus Aspergillus fumigatus. Results We detected 189 atypical regions containing 214 genes, accounting for about 1 Mb of DNA sequences. However, the fraction of atypical DNA detected was smaller than the average amount detected in the same conditions in prokaryote genomes (3.1% vs 5.6%). It appeared that about one third of these regions contained no annotated genes, a proportion far greater than in prokaryote genomes. When analyzing the origin of these HTs by comparing their signatures to a homemade database of species signatures, 3 groups of donor species emerged: bacteria (40%), fungi (25%), and viruses (22%). Notably, although inter-domain exchanges are confirmed, we found evidence of only very few exchanges between eukaryotic kingdoms. Conclusions We demonstrated that HTs are not negligible in eukaryote genomes, although they are less extensive than in prokaryote genomes; under our stringent conditions, the amount detected is a floor value. The biological mechanisms underlying those transfers remain to be elucidated, as well as the biological functions of the transferred genes.
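
    The idea of detecting atypical regions from local variations of a genomic signature can be sketched as follows. This is an illustration only: the abstract does not specify the signature or the distance, so the use of k-mer frequencies, Euclidean distance, and the window sizes below are all assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_signature(seq, k=4):
    """Relative frequencies of all 4**k DNA words of length k (assumed signature)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)  # words containing N etc. are ignored
    return [counts[m] / total if total else 0.0 for m in kmers]


def window_distances(genome, window=5000, step=2500, k=4):
    """Distance of each window's signature from the whole-genome signature."""
    reference = kmer_signature(genome, k)
    results = []
    for start in range(0, len(genome) - window + 1, step):
        local = kmer_signature(genome[start:start + window], k)
        results.append((start, math.dist(local, reference)))
    return results


# Toy sequence with one compositionally atypical stretch starting at position 200.
toy = "ACGT" * 50 + "G" * 40 + "ACGT" * 50
scores = window_distances(toy, window=40, step=40, k=2)
print(max(scores, key=lambda item: item[1]))
```

    Windows with the largest distances would be the candidates for atypical regions; choosing a cut-off is a separate decision that this sketch leaves open.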

  1. An assessment of population structure in eight breeds of cattle using a whole genome SNP panel

    Directory of Open Access Journals (Sweden)

    Gao Chuan

    2008-05-01

    Full Text Available Abstract Background Analyses of population structure and breed diversity have provided insight into the origin and evolution of cattle. Previously, these studies have used a low density of microsatellite markers; however, with the large number of single nucleotide polymorphism markers that are now available, it is possible to perform genome-wide population genetic analyses in cattle. In this study, we used a high-density panel of SNP markers to examine population structure and diversity among eight cattle breeds sampled from Bos indicus and Bos taurus. Results Two thousand six hundred and forty-one single nucleotide polymorphisms (SNPs) spanning all of the bovine autosomal genome were genotyped in Angus, Brahman, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black, Limousin and Nelore cattle. Population structure was examined using the linkage model in the program STRUCTURE, and Fst estimates were used to construct a neighbor-joining tree to represent the phylogenetic relationship among these breeds. Conclusion The whole-genome SNP panel identified several levels of population substructure in the set of examined cattle breeds. The greatest level of genetic differentiation was detected between the Bos taurus and Bos indicus breeds. When the Bos indicus breeds were excluded from the analysis, genetic differences among beef versus dairy and European versus Asian breeds were detected among the Bos taurus breeds. Exploration of the number of SNP loci required to differentiate between breeds showed that with 100 SNP loci, individuals could be correctly clustered into breeds only 50% of the time; thus a large number of SNP markers is required to replace the 30 microsatellite markers that are currently commonly used in genetic diversity studies.

  2. Microbiota present in cystic fibrosis lungs as revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Philippe M Hauser

    Full Text Available Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture). So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS) using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1) or Staphylococcus spp. plus Streptococcus spp. (patient 2) together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium) and aerobic bacteria (Gemella, Moraxella, Granulicatella). WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.

  3. Whole genome analysis of Leptospira licerasiae provides insight into leptospiral evolution and pathogenicity.

    Directory of Open Access Journals (Sweden)

    Jessica N Ricaldi

    Full Text Available The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggest that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010(T) and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present, also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for

  4. Whole-genome SNP association analysis of reproduction traits in the Finnish Landrace pig breed

    Directory of Open Access Journals (Sweden)

    Uimari Pekka

    2011-12-01

    Full Text Available Abstract Background Good genetic progress for pig reproduction traits has been achieved using a quantitative genetics-based multi-trait BLUP evaluation system. At present, whole-genome single nucleotide polymorphism (SNP) panels provide a new tool for pig selection. The purpose of this study was to identify SNP associated with reproduction traits in the Finnish Landrace pig breed using the Illumina PorcineSNP60 BeadChip. Methods Association of each SNP with different traits was tested with a weighted linear model, using SNP genotype as a covariate and animal as a random variable. Deregressed estimated breeding values of the progeny tested boars were used as the dependent variable and weights were based on their reliabilities. Statistical significance of the associations was based on Bonferroni-corrected P-values. Results Deregressed estimated breeding values were available for 328 genotyped boars. Of the 62 163 SNP in the chip, 57 868 SNP had a call rate > 0.9 and 7 632 SNP were monomorphic. Statistically significant results (P-value P-value P-value = 1.69E-08 more than unfavourable double homozygote animals. A region on chromosome 9 (66 Mb) was statistically significant for piglet mortality between birth and weaning in later parity (0.44 piglets between homozygotes, P-value = 6.94E-08). Conclusions Three separate regions on chromosome 9 gave significant results for litter size and pig mortality. The frequencies of favourable alleles of the significant SNP are moderate in the Finnish Landrace population and these SNP are thus valuable candidates for possible marker-assisted selection.
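
    The Bonferroni correction named in the Methods can be illustrated with a short sketch. The number of tests used here (57 868) and the 0.05 family-wise error rate are assumptions for illustration; the abstract does not state how many SNP were actually tested, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed number of tested SNP; the P-values are the two reported in the abstract.
threshold = bonferroni_threshold(0.05, 57_868)
for p_value in (1.69e-08, 6.94e-08):
    print(p_value, p_value < threshold)
```

    Equivalently, each raw P-value can be multiplied by the number of tests and compared with alpha directly.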

  5. Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds.

    Science.gov (United States)

    Zhang, Qianqian; Guldbrandtsen, Bernt; Thomasen, Jørn Rind; Lund, Mogens Sandø; Sahana, Goutam

    2016-09-01

    Longevity is an important economic trait in dairy production. Improvements in longevity could increase the average number of lactations per cow, thereby affecting the profitability of the dairy cattle industry. Improved longevity for cows reduces the replacement cost of stock and enables animals to achieve the highest production period. Moreover, longevity is an indirect indicator of animal welfare. Using whole-genome sequencing variants in 3 dairy cattle breeds, we carried out an association study and identified 7 genomic regions in Holstein and 5 regions in Red Dairy Cattle that were associated with longevity. Meta-analyses of 3 breeds revealed 2 significant genomic regions, located on chromosomes 6 (META-CHR6-88MB) and 18 (META-CHR18-58MB). META-CHR6-88MB overlaps with 2 known genes: neuropeptide G-protein coupled receptor (NPFFR2; 89,052,210-89,059,348 bp) and vitamin D-binding protein precursor (GC; 88,695,940-88,739,180 bp). The NPFFR2 gene was previously identified as a candidate gene for mastitis resistance. META-CHR18-58MB overlaps with zinc finger protein 717 (ZNF717; 58,130,465-58,141,877 bp) and zinc finger protein 613 (ZNF613; 58,115,782-58,117,110 bp), which have been associated with calving difficulties. Information on longevity-associated genomic regions could be used to find causal genes/variants influencing longevity and exploited to improve the reliability of genomic prediction. PMID:27289149

  6. Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata.

    Directory of Open Access Journals (Sweden)

    Marco Fracassetti

    Full Text Available Sequencing pooled DNA of multiple individuals from a population instead of sequencing individuals separately has become popular due to its cost-effectiveness and simple wet-lab protocol, although some criticism of this approach remains. Here we validated a protocol for pooled whole-genome re-sequencing (Pool-seq) of Arabidopsis lyrata libraries prepared with low amounts of DNA (1.6 ng per individual). The validation was based on comparing single nucleotide polymorphism (SNP) frequencies obtained by pooling with those obtained by individual-based Genotyping By Sequencing (GBS). Furthermore, we investigated the effect of sample number, sequencing depth per individual and variant caller on population SNP frequency estimates. For Pool-seq data, we compared frequency estimates from two SNP callers, VarScan and Snape; the former employs a frequentist SNP calling approach while the latter uses a Bayesian approach. Results revealed concordance correlation coefficients well above 0.8, confirming that Pool-seq is a valid method for acquiring population-level SNP frequency data. Higher accuracy was achieved by pooling more samples (25 compared to 14) and working with higher sequencing depth (4.1× per individual compared to 1.4× per individual), which increased the concordance correlation coefficient to 0.955. The Bayesian-based SNP caller produced somewhat higher concordance correlation coefficients, particularly at low sequencing depth. We recommend pooling at least 25 individuals combined with sequencing at a depth of 100× to produce satisfactory frequency estimates for common SNPs (minor allele frequency above 0.05).
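
    A concordance correlation coefficient between two sets of allele frequency estimates can be computed as in the sketch below. It assumes Lin's concordance correlation coefficient with population (1/n) moments; the abstract does not say which estimator the authors used, and the function name and example frequencies are hypothetical.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, the denominator penalises a shift between the means.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical SNP frequencies from pooled and individual-based sequencing.
pool_freq = [0.10, 0.25, 0.40, 0.55, 0.80]
gbs_freq = [0.12, 0.22, 0.43, 0.50, 0.78]
print(concordance_correlation(pool_freq, gbs_freq))
```

    Two estimates that are perfectly correlated but systematically offset score below 1, which is why this coefficient suits agreement checks between methods.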

  7. Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing.

    Science.gov (United States)

    Vogtmann, Emily; Hua, Xing; Zeller, Georg; Sunagawa, Shinichi; Voigt, Anita Y; Hercog, Rajna; Goedert, James J; Shi, Jianxin; Bork, Peer; Sinha, Rashmi

    2016-01-01

    Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect

  8. Whole-Genome Sequencing Analysis Accurately Predicts Antimicrobial Resistance Phenotypes in Campylobacter spp.

    Science.gov (United States)

    Zhao, S; Tyson, G H; Chen, Y; Li, C; Mukherjee, S; Young, S; Lam, C; Folster, J P; Whichard, J M; McDermott, P F

    2016-01-01

    The objectives of this study were to identify antimicrobial resistance genotypes for Campylobacter and to evaluate the correlation between resistance phenotypes and genotypes using in vitro antimicrobial susceptibility testing and whole-genome sequencing (WGS). A total of 114 Campylobacter species isolates (82 C. coli and 32 C. jejuni) obtained from 2000 to 2013 from humans, retail meats, and cecal samples from food production animals in the United States as part of the National Antimicrobial Resistance Monitoring System were selected for study. Resistance phenotypes were determined using broth microdilution of nine antimicrobials. Genomic DNA was sequenced using the Illumina MiSeq platform, and resistance genotypes were identified using assembled WGS sequences through blastx analysis. Eighteen resistance genes, including tet(O), blaOXA-61, catA, lnu(C), aph(2″)-Ib, aph(2″)-Ic, aph(2')-If, aph(2″)-Ig, aph(2″)-Ih, aac(6')-Ie-aph(2″)-Ia, aac(6')-Ie-aph(2″)-If, aac(6')-Im, aadE, sat4, ant(6'), aad9, aph(3')-Ic, and aph(3')-IIIa, and mutations in two housekeeping genes (gyrA and 23S rRNA) were identified. There was a high degree of correlation between phenotypic resistance to a given drug and the presence of one or more corresponding resistance genes. Phenotypic and genotypic correlation was 100% for tetracycline, ciprofloxacin/nalidixic acid, and erythromycin, and correlations ranged from 95.4% to 98.7% for gentamicin, azithromycin, clindamycin, and telithromycin. All isolates were susceptible to florfenicol, and no genes associated with florfenicol resistance were detected. There was a strong correlation (99.2%) between resistance genotypes and phenotypes, suggesting that WGS is a reliable indicator of resistance to the nine antimicrobial agents assayed in this study. WGS has the potential to be a powerful tool for antimicrobial resistance surveillance programs. PMID:26519386
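
    The phenotype/genotype agreement figures quoted in this abstract are, in essence, percentages of isolates for which the WGS-based prediction matches the susceptibility test. A minimal sketch of that calculation follows; the function name and the toy isolate calls are hypothetical illustrations and are not data from the study.

```python
from typing import Iterable, Tuple


def concordance(calls: Iterable[Tuple[bool, bool]]) -> float:
    """Return the percentage of isolates whose genotype call matches the phenotype.

    Each item is (phenotype_resistant, genotype_resistant) for one isolate/drug pair.
    """
    calls = list(calls)
    if not calls:
        raise ValueError("no isolates supplied")
    agree = sum(1 for phenotype, genotype in calls if phenotype == genotype)
    return 100.0 * agree / len(calls)


# Hypothetical toy data: four isolates tested against one drug.
toy_calls = [(True, True), (False, False), (True, False), (False, False)]
print(f"{concordance(toy_calls):.1f}% concordance")
```

    Treating susceptible/susceptible pairs as agreement is a design choice of this sketch; a study may instead report sensitivity and specificity separately.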

  9. Whole-genome resequencing uncovers molecular signatures of natural and sexual selection in wild bighorn sheep.

    Science.gov (United States)

    Kardos, Marty; Luikart, Gordon; Bunch, Rowan; Dewey, Sarah; Edwards, William; McWilliam, Sean; Stephenson, John; Allendorf, Fred W; Hogg, John T; Kijas, James

    2015-11-01

    The identification of genes influencing fitness is central to our understanding of the genetic basis of adaptation and how it shapes phenotypic variation in wild populations. Here, we used whole-genome resequencing of wild Rocky Mountain bighorn sheep (Ovis canadensis) to >50-fold coverage to identify 2.8 million single nucleotide polymorphisms (SNPs) and genomic regions bearing signatures of directional selection (i.e. selective sweeps). A comparison of SNP diversity between the X chromosome and the autosomes indicated that bighorn males had a dramatically reduced long-term effective population size compared to females. This probably reflects a long history of intense sexual selection mediated by male-male competition for mates. Selective sweep scans based on heterozygosity and nucleotide diversity revealed evidence for a selective sweep shared across multiple populations at RXFP2, a gene that strongly affects horn size in domestic ungulates. The massive horns carried by bighorn rams appear to have evolved in part via strong positive selection at RXFP2. We identified evidence for selection within individual populations at genes affecting early body growth and cellular response to hypoxia; however, these must be interpreted more cautiously as genetic drift is strong within local populations and may have caused false positives. These results represent a rare example of strong genomic signatures of selection identified at genes with known function in wild populations of a nonmodel species. Our results also showcase the value of reference genome assemblies from agricultural or model species for studies of the genomic basis of adaptation in closely related wild taxa. PMID:26454263
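
    The X-versus-autosome comparison mentioned above can be illustrated with the standard textbook expressions for effective population size under unequal numbers of breeding males and females. The sketch below is only an illustration of that reasoning, not the authors' analysis; the function names and the example breeder counts are assumptions.

```python
def ne_autosomal(n_males: float, n_females: float) -> float:
    """Textbook effective size for autosomal loci with unequal sex ratio."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_x_linked(n_males: float, n_females: float) -> float:
    """Textbook effective size for X-linked loci with unequal sex ratio."""
    return 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)


def expected_x_to_autosome_ratio(n_males: float, n_females: float) -> float:
    """Expected ratio of neutral X-linked to autosomal diversity (equal mutation rates assumed)."""
    return ne_x_linked(n_males, n_females) / ne_autosomal(n_males, n_females)


# Equal numbers of breeding males and females give the familiar 3/4 ratio.
print(expected_x_to_autosome_ratio(100, 100))
# Few breeding males (hypothetical numbers) push the ratio above 3/4.
print(expected_x_to_autosome_ratio(10, 100))
```

    Under these assumptions, an observed X/autosome diversity ratio well above 0.75 is the kind of signal that points to a smaller effective number of males than females.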

  10. Genome-Wide Association Study of HIV Whole Genome Sequences Validated using Drug Resistance

    Science.gov (United States)

    Power, Robert A.; Davaniah, Siva; Derache, Anne; Wilkinson, Eduan; Tanser, Frank; Pillay, Deenan; de Oliveira, Tulio

    2016-01-01

    Background Genome-wide association studies (GWAS) have considerably advanced our understanding of human traits and diseases. With the increasing availability of whole genome sequences (WGS) for pathogens, it is important to establish whether GWAS of viral genomes could reveal important biological insights. Here we perform the first proof-of-concept viral GWAS examining drug resistance (DR), a phenotype with well understood genetics. Method We performed a GWAS of DR in a sample of 343 HIV subtype C patients failing 1st line antiretroviral treatment in rural KwaZulu-Natal, South Africa. The majority and minority variants within each sequence were called using PILON, and GWAS was performed within PLINK. HIV WGS from patients failing on different antiretroviral treatments were compared to sequences derived from individuals naïve to the respective treatment. Results GWAS methodology was validated by identifying five associations on a genetic level that led to amino acid changes known to cause DR. Further, we highlighted the ability of GWAS to identify epistatic effects, identifying two replicable variants within amino acid 68 of the reverse transcriptase protein previously described as potential fitness compensatory mutations. A possible additional DR variant within amino acid 91 of the matrix region of the Gag protein was associated with tenofovir failure, highlighting GWAS’s ability to identify variants outside classical candidate genes. Our results also suggest a polygenic component to DR. Conclusions These results validate the applicability of GWAS to HIV WGS data even in relatively small samples, and emphasise how high throughput sequencing can provide novel and clinically relevant insights. Further, they suggest that for viruses like HIV, population structure is only a minor concern compared to that seen in bacterial or parasite GWAS. Given the small genome length and reduced burden for multiple testing, this makes HIV an ideal candidate for GWAS. PMID:27677172

  11. Implementation of Nationwide Real-time Whole-genome Sequencing to Enhance Listeriosis Outbreak Detection and Investigation.

    Science.gov (United States)

    Jackson, Brendan R; Tarr, Cheryl; Strain, Errol; Jackson, Kelly A; Conrad, Amanda; Carleton, Heather; Katz, Lee S; Stroika, Steven; Gould, L Hannah; Mody, Rajal K; Silk, Benjamin J; Beal, Jennifer; Chen, Yi; Timme, Ruth; Doyle, Matthew; Fields, Angela; Wise, Matthew; Tillman, Glenn; Defibaugh-Chavez, Stephanie; Kucerova, Zuzana; Sabol, Ashley; Roache, Katie; Trees, Eija; Simmons, Mustafa; Wasilenko, Jamie; Kubota, Kristy; Pouseele, Hannes; Klimke, William; Besser, John; Brown, Eric; Allard, Marc; Gerner-Smidt, Peter

    2016-08-01

    Listeria monocytogenes (Lm) causes severe foodborne illness (listeriosis). Previous molecular subtyping methods, such as pulsed-field gel electrophoresis (PFGE), were critical in detecting outbreaks that led to food safety improvements and declining incidence, but PFGE provides limited genetic resolution. A multiagency collaboration began performing real-time, whole-genome sequencing (WGS) on all US Lm isolates from patients, food, and the environment in September 2013, posting sequencing data into a public repository. Compared with the year before the project began, WGS, combined with epidemiologic and product trace-back data, detected more listeriosis clusters and solved more outbreaks (2 outbreaks in pre-WGS year, 5 in WGS year 1, and 9 in year 2). Whole-genome multilocus sequence typing and single nucleotide polymorphism analyses provided equivalent phylogenetic relationships relevant to investigations; results were most useful when interpreted in context of epidemiological data. WGS has transformed listeriosis outbreak surveillance and is being implemented for other foodborne pathogens. PMID:27090985

  12. Beyond the Whole-Genome Duplication: Phylogenetic Evidence for an Ancient Interspecies Hybridization in the Baker's Yeast Lineage.

    Directory of Open Access Journals (Sweden)

    Marina Marcet-Houben

    2015-08-01

    Full Text Available Whole-genome duplications have shaped the genomes of several vertebrate, plant, and fungal lineages. Earlier studies have focused on establishing when these events occurred and on elucidating their functional and evolutionary consequences, but we still lack sufficient understanding of how genome duplications first originated. We used phylogenomics to study the ancient genome duplication that occurred in the yeast Saccharomyces cerevisiae lineage and found compelling evidence for the existence of a contemporaneous interspecies hybridization. We propose that the genome doubling was a direct consequence of this hybridization and that it served to provide stability to the recently formed allopolyploid. This scenario provides a mechanism for the origin of this ancient duplication and the lineage that originated from it and brings a new perspective to the interpretation of the origin and consequences of whole-genome duplications.

  13. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea

    Directory of Open Access Journals (Sweden)

    Joon-Hee Han

    2016-06-01

    Full Text Available Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000.

  14. Genome-wide association analysis of milk yield traits in Nordic Red Cattle using imputed whole genome sequence variants

    DEFF Research Database (Denmark)

    Iso-Touru, T; Sahana, G; Guldbrandtsen, B;

    2016-01-01

    BACKGROUND: The Nordic Red Cattle consisting of three different populations from Finland, Sweden and Denmark are under a joint breeding value estimation system. The long history of recording of production and health traits offers a great opportunity to study production traits and identify causal variants behind them. In this study, we used whole genome sequence level data from 4280 progeny tested Nordic Red Cattle bulls to scan the genome for loci affecting milk, fat and protein yields. RESULTS: Using a genome-wise significance threshold, regions on Bos taurus chromosomes 5, 14, 23, 25 and 26 were... traits via biological networks. CONCLUSION: This is the first time when whole genome sequence data is utilized to study genomic regions affecting milk production in the Nordic Red Cattle population. Sequence level data offers the possibility to study quantitative traits in detail but still cannot...

  15. Whole genome sequences of three Treponema pallidum ssp. pertenue strains: yaws and syphilis treponemes differ in less than 0.2% of the genome sequence.

    Directory of Open Access Journals (Sweden)

    Darina Cejková

    2012-01-01

    Full Text Available BACKGROUND: The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis-causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. METHODOLOGY/PRINCIPAL FINDINGS: To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (d(A)) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function. CONCLUSIONS/SIGNIFICANCE: Hypothetical genes, with genetic differences, consistently found between TPE and TPA strains are candidates for syphilitic treponemes virulence factors. Seventeen TPE genes were predicted under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics.

  16. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    Science.gov (United States)

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000. PMID:27437183

  17. Coverage tradeoffs and power estimation in the design of whole-genome sequencing experiments for detecting association

    OpenAIRE

    Shen, Yufeng; Song, Ruijie; Pe'er, Itsik

    2011-01-01

    Motivation: Whole-genome sequencing (WGS) allows direct interrogation of previously undetected uncommon or rare variants, which potentially contribute to the missing heritability of human disease. However, cost of sequencing large numbers of samples limits its application in case–control association studies. Here, we describe theoretical and empirical design considerations for such sequencing studies, aimed at maximizing the power of detecting association under the constraint of study-wide co...

  18. “We don’t know her history, her background”: Adoptive parents’ perspectives on whole genome sequencing results

    OpenAIRE

    Crouch, Julia; Yu, Joon-Ho; Shankar, Aditi G.; Tabor, Holly K.

    2014-01-01

    Exome sequencing and whole genome sequencing (ES/WGS) can provide parents with a wide range of genetic information about their children, and adoptive parents may have unique issues to consider regarding possible access to this information. The few papers published on adoption and genetics have focused on targeted genetic testing of children in the pre-adoption context. There are no data on adoptive parent perspectives about pediatric ES/WGS, including their preferences about different kinds o...

  19. Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing

    DEFF Research Database (Denmark)

    Zankari, Ea; Hasman, Henrik; Kaas, Rolf Sommer;

    2013-01-01

    Objectives: Antimicrobial susceptibility testing of bacterial isolates is essential for clinical diagnosis, to detect emerging problems and to guide empirical treatment. Current phenotypic procedures are sometimes associated with mistakes and may require further genetic testing. Whole-genome sequ...

  20. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    Directory of Open Access Journals (Sweden)

    Sooyeon Lim

    2016-09-01

    Full Text Available Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  1. What are people willing to pay for whole-genome sequencing information, and who decides what they receive?

    OpenAIRE

    Marshall, DA; Gonzalez, JM; Johnson, FR; Macdonald, KV; Pugh, A; Douglas, MP; Phillips, KA

    2016-01-01

    Whole-genome sequencing (WGS) can be used as a powerful diagnostic tool as well as for screening, but it may lead to anxiety, unnecessary testing, and overtreatment. Current guidelines suggest reporting clinically actionable secondary findings when diagnostic testing is performed. We examined preferences for receiving WGS results. A US nationally representative survey (n = 410 adults) was used to rank preferences for who decides (an expert panel, your doctor, you) which WGS results are reporte...

  2. Implications of using whole genome sequencing to test unselected populations for high risk breast cancer genes: a modelling study

    OpenAIRE

    Warren-Gash, Charlotte; Kroese, Mark; Burton, Hilary; Pharoah, Paul

    2016-01-01

    Background The decision to test for high risk breast cancer gene mutations is traditionally based on risk scores derived from age, family and personal cancer history. Next generation sequencing technologies such as whole genome sequencing (WGS) make wider population testing more feasible. In the UK’s 100,000 Genomes Project, mutations in 16 genes including BRCA1 and BRCA2 are to be actively sought regardless of clinical presentation. The implications of deploying this approach at scale for pa...

  3. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    OpenAIRE

    Sooyeon Lim; Dong-Ho Chang; Byoung-Chan Kim

    2016-01-01

    Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  4. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India

    OpenAIRE

    Jigna H. Patel; Thaker, Vrinda S.

    2015-01-01

    A salt-tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod-shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine ...

  5. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India.

    Science.gov (United States)

    Patel, Jigna H; Thaker, Vrinda S

    2015-12-01

    A salt-tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod-shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine synthesis. PMID:26697321

  6. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India

    Directory of Open Access Journals (Sweden)

    Jigna H. Patel

    2015-12-01

    Full Text Available A salt-tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod-shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine synthesis.

  7. PhyResSE: a Web Tool Delineating Mycobacterium tuberculosis Antibiotic Resistance and Lineage from Whole-Genome Sequencing Data

    OpenAIRE

    Feuerriegel, Silke; Schleusener, Viola; Beckert, Patrick; Kohl, Thomas A.; Miotto, Paolo; Cirillo, Daniela M; Cabibbe, Andrea M.; Niemann, Stefan; Fellenberg, Kurt

    2015-01-01

    Antibiotic-resistant tuberculosis poses a global threat, causing the deaths of hundreds of thousands of people annually. While whole-genome sequencing (WGS), with its unprecedented level of detail, promises to play an increasingly important role in diagnosis, data analysis is a daunting challenge. Here, we present a simple-to-use web service (free for academic use at http://phyresse.org). Delineating both lineage and resistance, it provides state-of-the-art methodology to life scientists and ...

  8. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    Science.gov (United States)

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  9. Whole-Genome Resequencing Analysis of Hanwoo and Yanbian Cattle to Identify Genome-Wide SNPs and Signatures of Selection

    OpenAIRE

    Choi, Jung-Woo; Choi, Bong-Hwan; Lee, Seung-Hwan; Lee, Seung-Soo; Kim, Hyeong-Cheol; Yu, Dayeong; Chung, Won-Hyong; Lee, Kyung-Tai; Chai, Han-Ha; Cho, Yong-Min; Lim, Dajeong

    2015-01-01

    Over the last 30 years, Hanwoo has been selectively bred to improve economically important traits. Hanwoo is currently the representative Korean native beef cattle breed, and it is believed that it shared an ancestor with a Chinese breed, Yanbian cattle, until the last century. However, these two breeds have experienced different selection pressures during recent decades. Here, we whole-genome sequenced 10 animals each of Hanwoo and Yanbian cattle (20 total) using the Illumina HiSeq 2000 sequ...

  10. Identifying Gene Disruptions in Novel Balanced de novo Constitutional Translocations in Childhood Cancer Patients by Whole Genome Sequencing

    OpenAIRE

    Ritter, Deborah I.; Haines, Katherine; Cheung, Hannah; Davis, Caleb F.; Lau, Ching C.; Berg, Jonathan S.; Brown, Chester W.; Thompson, Patrick A.; Gibbs, Richard; Wheeler, David A.; Plon, Sharon E.

    2015-01-01

    Purpose We applied whole genome sequencing to children diagnosed with neoplasms and found to carry apparently balanced constitutional translocations, to discover novel genic disruptions. Methods We applied SV calling programs CREST, Break Dancer, SV-STAT and CGAP-CNV, and developed an annotative filtering strategy to achieve nucleotide resolution at the translocations. Results We identified the breakpoints for t(6;12) (p21.1;q24.31) disrupting HNF1A in a patient diagnosed with hepatic adenoma...

  11. Whole Genome Sequencing of Mycobacterium tuberculosis Reveals Slow Growth and Low Mutation Rates during Latent Infections in Humans

    OpenAIRE

    Roberto Colangeli; Vic L Arcus; Cursons, Ray T.; Ali Ruthe; Noel Karalus; Kathy Coley; Manning, Shannon D.; Soyeon Kim; Emily Marchiano; David Alland

    2014-01-01

    Very little is known about the growth and mutation rates of Mycobacterium tuberculosis during latent infection in humans. However, studies in rhesus macaques have suggested that latent infections have mutation rates that are higher than that observed during active tuberculosis disease. Elevated mutation rates are presumed risk factors for the development of drug resistance. Therefore, the investigation of mutation rates during human latency is of high importance. We performed whole genome mut...

  12. Generation of whole genome sequences of new Cryptosporidium hominis and Cryptosporidium parvum isolates directly from stool samples

    OpenAIRE

    Hadfield, Stephen J.; Pachebat, Justin A; Swain, Martin T; Robinson, Guy; Cameron, Simon JS; Alexander, Jenna; Hegarty, Matthew J.; Elwin, Kristin; Chalmers, Rachel M.

    2015-01-01

    Background Whole genome sequencing (WGS) of Cryptosporidium spp. has previously relied on propagation of the parasite in animals to generate enough oocysts from which to extract DNA of sufficient quantity and purity for analysis. We have developed and validated a method for preparation of genomic Cryptosporidium DNA suitable for WGS directly from human stool samples and used it to generate 10 high-quality whole Cryptosporidium genome assemblies. Our method uses a combination of salt flotation...

  13. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak

    OpenAIRE

    Saelens, Joseph W.; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M.; Ana M Xet-Mull; Stout, Jason E.; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M.

    2015-01-01

    Summary Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from in...

  14. Transcriptional control in embryonic Drosophila midline guidance assessed through a whole genome approach

    Directory of Open Access Journals (Sweden)

    Tomancak Pavel

    2007-07-01

    Full Text Available Abstract Background During the development of the Drosophila central nervous system the process of midline crossing is orchestrated by a number of guidance receptors and ligands. Many key axon guidance molecules have been identified in both invertebrates and vertebrates, but the transcriptional regulation of growth cone guidance remains largely unknown. It is established that translational regulation plays a role in midline crossing, and there are indications that transcriptional regulation is also involved. To investigate this issue, we conducted a genome-wide study of transcription in Drosophila embryos using wild type and a number of well-characterized Drosophila guidance mutants and transgenics. We also analyzed a previously published microarray time course of Drosophila embryonic development with an axon guidance focus. Results Using hopach, a novel clustering method which is well suited to microarray data analysis, we identified groups of genes with similar expression patterns across guidance mutants and transgenics. We then systematically characterized the resulting clusters with respect to their relevance to axon guidance using two complementary controlled vocabularies: the Gene Ontology (GO) and anatomical annotations of the Atlas of Pattern of Gene Expression (APoGE) in situ hybridization database. The analysis indicates that regulation of gene expression does play a role in the process of axon guidance in Drosophila. We also find a strong link between axon guidance and hemocyte migration, a result that agrees with mounting evidence that axon guidance molecules are co-opted in vertebrate vascularization. Cell cyclin activity in the context of axon guidance is also suggested from our array data. RNA and protein expression patterns of cell cyclins in axon guidance mutants and transgenics support this possible link. Conclusion This study provides important insights into the regulation of axon guidance in vivo.

  15. Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

    Science.gov (United States)

    Vanet, A; Marsan, L; Labigne, A; Sagot, M F

    2000-03-24

    Helicobacter pylori is adapted to life in a unique niche, the gastric epithelium of primates. Its promoters may therefore be different from those of other bacteria. Here, we determine motifs possibly involved in the recognition of such promoter sequences by the RNA polymerase using a new motif identification method. An important feature of this method is that the motifs are sought with the least possible assumptions about what they may look like. The method starts by considering the whole genome of H. pylori and attempts to infer directly from it a description for a family of promoters. Thus, this approach differs from searching for such promoters with a previously established description. The two algorithms are based on the idea of inferring motifs by flexibly comparing words in the sequences with an external object, instead of between themselves. The first algorithm infers single motifs, the second a combination of two motifs separated from one another by strictly defined, sterically constrained distances. Besides independently finding motifs known to be present in other bacteria, such as the Shine-Dalgarno sequence and the TATA-box, this approach suggests the existence in H. pylori of a new, combined motif, TTAAGC, followed optimally 21 bp downstream by TATAAT. Between these two motifs, there is in some cases another, TTTTAA or, less frequently, a repetition of TTAAGC separated optimally from the TATA-box by 12 bp. The combined motif TTAAGCx(21+/-2)TATAAT is present with no errors immediately upstream from the only two copies of the ribosomal 23 S-5 S RNA genes in H. pylori, and with one error upstream from the only two copies of the ribosomal 16 S RNA genes. The operons of both ribosomal RNA molecules are strongly expressed, representing an encouraging sign of the pertinence of the motifs found by the algorithms. In 25 cases out of a possible 30, the combined motif is found with no more than three substitutions immediately upstream from ribosomal proteins, or
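
    Once a combined motif such as TTAAGCx(21+/-2)TATAAT has been inferred, locating its occurrences in a sequence is a simple scan. The sketch below shows only that scanning step, not the inference algorithms described in the abstract. It assumes the 21 bp figure is the gap between the end of the first box and the start of the second, and that substitutions are counted across both boxes together; the function names and the toy sequence are hypothetical.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_combined_motif(seq: str, box1: str = "TTAAGC", box2: str = "TATAAT",
                        spacer: int = 21, tolerance: int = 2,
                        max_subs: int = 0) -> List[Tuple[int, int, int]]:
    """Return (start of box1, gap length, total substitutions) for each hit."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(box1) + 1):
        d1 = hamming(seq[i:i + len(box1)], box1)
        if d1 > max_subs:
            continue
        # Try every allowed gap length between the two boxes.
        for gap in range(spacer - tolerance, spacer + tolerance + 1):
            j = i + len(box1) + gap
            window = seq[j:j + len(box2)]
            if len(window) < len(box2):
                break  # ran off the end; larger gaps cannot fit either
            total = d1 + hamming(window, box2)
            if total <= max_subs:
                hits.append((i, gap, total))
    return hits


# Hypothetical toy sequence: both boxes separated by a 21 bp filler.
toy = "GG" + "TTAAGC" + "C" * 21 + "TATAAT" + "GG"
print(find_combined_motif(toy))
```

    Raising max_subs (for example to 3) relaxes the match in the spirit of the "no more than three substitutions" criterion quoted above.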

  16. MethylC-seq library preparation for base-resolution whole-genome bisulfite sequencing.

    Science.gov (United States)

    Urich, Mark A; Nery, Joseph R; Lister, Ryan; Schmitz, Robert J; Ecker, Joseph R

    2015-03-01

    Current high-throughput DNA sequencing technologies enable acquisition of billions of data points through which myriad biological processes can be interrogated, including genetic variation, chromatin structure, gene expression patterns, small RNAs and protein-DNA interactions. Here we describe the MethylC-sequencing (MethylC-seq) library preparation method, a 2-d protocol that enables the genome-wide identification of cytosine DNA methylation states at single-base resolution. The technique involves fragmentation of genomic DNA followed by adapter ligation, bisulfite conversion and limited amplification using adapter-specific PCR primers in preparation for sequencing. To date, this protocol has been successfully applied to genomic DNA isolated from primary cell culture, sorted cells and fresh tissue from over a thousand plant and animal samples.

  17. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    KAUST Repository

    Ali, Asho

    2015-02-26

    Improved molecular diagnostic methods for detection of drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore, local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation, followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA 91-94 codons in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycoside (78%) and ethambutol (62%); while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded

  18. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    Directory of Open Access Journals (Sweden)

    Aslam Muhammad L

    2012-08-01

    whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.

  19. A comparison of alternative random regression and reaction norm models for whole genome predictions.

    Science.gov (United States)

    Yang, W; Chen, C; Steibel, J P; Ernst, C W; Bates, R O; Zhou, L; Tempelman, R J

    2015-06-01

    Whole genome prediction (WGP) based on high density SNP marker panels is known to improve the accuracy of breeding value (BV) prediction in livestock. However, these accuracies can be compromised when genotype by environment interaction (G×E) exists but is not accounted for. Reaction norm (RN) and random regression (RR) models have proven to be useful in accounting for G×E in pre-WGP evaluations by modeling BV as linear or higher order functions of environmental or temporal covariates. We extend these RR/RN models based on several alternative specifications for SNP-specific intercepts and linear slopes on environmental covariates. One specification is based on bivariate normality (BVN) of SNP-specific intercepts and slopes, whereas 2 others, IW-BayesA and IW-BayesB, based on inverted Wishart (IW) extensions, are, respectively, bivariate Student t extensions of currently popular models without (BayesA) or with (BayesB) variable selection. We highlight alternative specifications based on the square root free Cholesky decomposition (CD) of SNP-specific variance-covariance (VCV) matrices in an attempt to better differentially model environmentally sensitive from environmentally robust QTL. Two CD specifications were considered with (CD-BayesB) or without (CD-BayesA) any variable selection on intercept and slope effects. We compared each of the 5 models based on an RN simulation study. Six scenarios were considered based on differences in overall genetic correlations between SNP-specific intercept and slope effects as well as on heritabilities and numbers of environmentally robust versus sensitive QTL. In most scenarios, IW-BayesA had the greatest accuracy, whereas CD-BayesB exhibited the greatest accuracy in low complexity architectures (i.e., low number of QTL). In an RR application of a Duroc × Pietrain resource population at Michigan State University, 5,271 SNP markers and 928 F2 animals with known pedigree were analyzed for backfat thickness at wk 10, 13, 16, 19
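
    The linear reaction norm idea in this abstract (a breeding value modeled through SNP-specific intercepts and slopes on an environmental covariate) can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the paper: $z_{ij}$ is the genotype code of animal $i$ at SNP $j$, $x$ the environmental covariate, and $a_j$, $b_j$ the SNP-specific intercept and slope, shown here under the bivariate normal (BVN) specification.

```latex
g_i(x) = \sum_{j=1}^{m} z_{ij}\,\bigl(a_j + b_j\,x\bigr),
\qquad
\begin{pmatrix} a_j \\ b_j \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}
\right)
```

    In this sketch, a SNP with $b_j$ close to zero would correspond to an environmentally robust QTL, and a SNP with a large $|b_j|$ to an environmentally sensitive one.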

  20. Whole Genome Association Studies of Residual Feed Intake and Related Traits in the Pig.

    Directory of Open Access Journals (Sweden)

    Suneel K Onteru

    Residual feed intake (RFI), a measure of feed efficiency, is the difference between observed feed intake and the expected feed requirement predicted from growth and maintenance. Pigs with low RFI have reduced feed costs without compromising their growth. Identification of genes or genetic markers associated with RFI will be useful for marker-assisted selection at an early age of animals with improved feed efficiency. Whole genome association studies (WGAS) for RFI, average daily feed intake (ADFI), average daily gain (ADG), back fat (BF) and loin muscle area (LMA) were performed on 1,400 pigs from the divergently selected ISU-RFI lines, using the Illumina PorcineSNP60 BeadChip. Various statistical methods were applied to find SNPs and genomic regions associated with the traits, including a Bayesian approach using GenSel software, and frequentist approaches such as allele frequency differences between lines, single SNP and haplotype analyses using PLINK software. Single SNP and haplotype analyses showed no significant associations (except for LMA) after genomic control and FDR. Bayesian analyses found at least 2 associations for each trait at a false positive probability of 0.5. At generation 8, the RFI selection lines mainly differed in allele frequencies for SNPs near (<0.05 Mb) genes that regulate insulin release and leptin functions. The Bayesian approach identified associations of genomic regions containing insulin release genes (e.g., GLP1R, CDKAL, SGMS1) with RFI and ADFI, of regions with energy homeostasis (e.g., MC4R, PGM1, GPR81) and muscle growth related genes (e.g., TGFB1) with ADG, and of fat metabolism genes (e.g., ACOXL, AEBP1) with BF. Specifically, a very highly significantly associated QTL for LMA on SSC7 with skeletal myogenesis genes (e.g., KLHL31) was identified for subsequent fine mapping. Important genomic regions associated with RFI related traits were identified for future validation studies prior to their incorporation in marker
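
    The definition of RFI given above (observed feed intake minus the intake expected from growth and maintenance) amounts to taking the residuals of a regression. The sketch below is a deliberately simplified illustration, not the model used in the study: it assumes a single predictor (average daily gain) and ordinary least squares, whereas a real RFI model would also include maintenance-related terms such as body weight. All numbers are made-up toy values.

```python
def residual_feed_intake(adg, adfi):
    """Residuals of an ordinary least-squares fit of ADFI on ADG.

    adg  : average daily gain per animal (kg/day)
    adfi : observed average daily feed intake per animal (kg/day)
    Negative residuals mean the animal ate less than expected (more efficient).
    """
    if len(adg) != len(adfi) or len(adg) < 2:
        raise ValueError("need two equal-length lists with at least 2 animals")
    n = len(adg)
    mean_x = sum(adg) / n
    mean_y = sum(adfi) / n
    sxx = sum((x - mean_x) ** 2 for x in adg)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(adg, adfi))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # RFI = observed intake - intake expected from growth
    return [y - (intercept + slope * x) for x, y in zip(adg, adfi)]


# Hypothetical toy data for four pigs.
rfi = residual_feed_intake([0.6, 0.7, 0.8, 0.9], [1.9, 2.2, 2.3, 2.6])
print([round(r, 2) for r in rfi])
```

    Because these are least-squares residuals, they sum to zero across the animals; selection for low RFI favors the animals with the most negative values.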

  1. Whole genome duplications and expansion of the vertebrate GATA transcription factor gene family

    Directory of Open Access Journals (Sweden)

    Bowerman Bruce

    2009-08-01

    Background: GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456. Results: We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae) and a hemichordate (Saccoglossus kowalevskii). We also have confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, and molecular phylogenetic analyses of these deuterostome GATA genes support their origin from two ancestral deuterostome genes, one GATA123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons), providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events. Conclusion: From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication) invertebrate deuterostome ancestor also had two GATA genes, one of each class.
Synteny analyses identify duplications of paralogous chromosomal regions (paralogons), from single ancestral vertebrate GATA123 and GATA456

  2. Preparation of a phage DNA fragment library for whole genome shotgun sequencing.

    Science.gov (United States)

    Summer, Elizabeth J

    2009-01-01

    The most efficient method to determine the genomic sequence of a dsDNA phage is to use a whole genome shotgun approach (WGSA). Preparation of a library where each genomic fragment has an equal chance of being represented is critical to the success of the WGSA. For many phages, there are regions of the genome likely to be under-represented in the shotgun library, which results in more gaps in the shotgun assembly than predicted by the Poisson distribution. However, as phage genomes are relatively small, this increased number of gaps does not present an insurmountable impediment to using the WGSA. This chapter will focus on construction of a high-quality random library and sequence analysis of this library in a 96-well format. Techniques are described for the mechanical fragmentation of genomic DNA into 2 kb average size fragments, preparation of the fragmented DNA for shotgun cloning, and advice on the choice of cloning vector for library preparation. Protocols for deepwell block culture, plasmid isolation, and sequencing in 96-well format are given. The rationale for determining the total number of random clones from a library to sequence for a 50 and 150 kb genome is explained. The steps involved in going from hundreds of shotgun sequencing traces to generating contigs will be outlined as well as how to close gaps in the sequence by primer walking on phage DNA and PCR-generated templates. Finally, examples will be given of how biological information about the phage genomic termini can be derived by analysis of the organization of individual clones in the shotgun sequence assembly. Specific examples are given for the circularly permuted termini of pac type phages, the direct terminal repeats found in most T7-like phages, variable host DNA at either end as in the Mu-like phages, and the 5' and 3' overhanging ends of cos type phages. The end result of these steps is the entire DNA sequence of a novel phage, ready for gene prediction. PMID:19082550
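
    The reference above to gaps "predicted by the Poisson distribution" and to choosing how many random clones to sequence can be illustrated with the standard Lander-Waterman style approximation. This is a sketch under idealized assumptions (uniformly random fragments, no minimum overlap required to join reads), not the chapter's own calculation. The 700 bp read length and 8-fold target coverage are hypothetical values chosen only for the example.

```python
import math


def shotgun_stats(genome_bp, read_bp, n_reads):
    """Idealized Poisson estimates for a random shotgun library.

    Returns (coverage, fraction_of_genome_uncovered, expected_gaps).
    """
    coverage = n_reads * read_bp / genome_bp
    uncovered = math.exp(-coverage)      # P(a given base is hit by no read)
    expected_gaps = n_reads * uncovered  # approx. number of contigs/gaps
    return coverage, uncovered, expected_gaps


def reads_for_coverage(genome_bp, read_bp, coverage):
    """Smallest number of reads that reaches the requested fold coverage."""
    return math.ceil(coverage * genome_bp / read_bp)


# Hypothetical parameters: 700 bp reads, 8-fold coverage.
for genome in (50_000, 150_000):
    n = reads_for_coverage(genome, 700, 8)
    cov, unc, gaps = shotgun_stats(genome, 700, n)
    print(genome, n, round(cov, 2), f"{unc:.1e}", round(gaps, 2))
```

    As the abstract notes, real phage libraries tend to leave more gaps than this model predicts, because some genomic regions are under-represented among the clones.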

  3. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery

    Directory of Open Access Journals (Sweden)

    Stothard Paul

    2011-11-01

    Background: One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or genomic regions with phenotypes. The completion of the bovine genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of the genetic variations present in cattle. Here we describe the whole-genome resequencing of two Bos taurus bulls from distinct breeds for the purpose of identifying and annotating novel forms of genetic variation in cattle. Results: The genomes of a Black Angus bull and a Holstein bull were sequenced to 22-fold and 19-fold coverage, respectively, using the ABI SOLiD system. Comparisons of the sequences with the Btau4.0 reference assembly yielded 7 million single nucleotide polymorphisms (SNPs), 24% of which were identified in both animals. Of the total SNPs found in Holstein, Black Angus, and in both animals, 81%, 81%, and 75% respectively are novel. In-depth annotations of the data identified more than 16 thousand distinct non-synonymous SNPs (85% novel) between the two datasets. Alignments between the SNP-altered proteins and orthologues from numerous species indicate that many of the SNPs alter well-conserved amino acids. Several SNPs predicted to create or remove stop codons were also found. A comparison between the sequencing SNPs and genotyping results from the BovineHD high-density genotyping chip indicates a detection rate of 91% for homozygous SNPs and 81% for heterozygous SNPs. The false positive rate is estimated to be about 2% for both the Black Angus and Holstein SNP sets, based on follow-up genotyping of 422 and 427 SNPs, respectively. Comparisons of read depth between the two bulls along the reference assembly identified 790 putative copy-number variations (CNVs). Ten

  4. Coffee Polyphenols Change the Expression of STAT5B and ATF-2 Modifying Cyclin D1 Levels in Cancer Cells

    Directory of Open Access Journals (Sweden)

    Carlota Oleaga

    2012-01-01

    Background. Epidemiological studies suggest that coffee consumption reduces the risk of cancer, but the molecular mechanisms of its chemopreventive effects remain unknown. Objective. To identify differentially expressed genes upon incubation of HT29 colon cancer cells with instant caffeinated coffee (ICC) or caffeic acid (CA) using whole-genome microarrays. Results. ICC incubation of HT29 cells caused the overexpression of 57 genes and the underexpression of 161, while CA incubation induced the overexpression of 12 genes and the underexpression of 32. Using Venn diagrams, we built a list of five overexpressed genes and twelve underexpressed genes in common between the two experimental conditions. This list was used to generate a biological association network in which STAT5B and ATF-2 appeared as highly interconnected nodes. STAT5B overexpression was confirmed at the mRNA and protein levels. For ATF-2, the changes in mRNA levels were confirmed for both ICC and CA, whereas the decrease in protein levels was only observed in CA-treated cells. The levels of cyclin D1, a target gene for both STAT5B and ATF-2, were downregulated by CA in colon cancer cells and by ICC and CA in breast cancer cells. Conclusions. Coffee polyphenols are able to affect cyclin D1 expression in cancer cells through the modulation of STAT5B and ATF-2.

  5. Whole-genome association study identifies STK39 as a hypertension susceptibility gene

    Science.gov (United States)

    Wang, Ying; O'Connell, Jeffrey R.; McArdle, Patrick F.; Wade, James B.; Dorff, Sarah E.; Shah, Sanjiv J.; Shi, Xiaolian; Pan, Lin; Rampersaud, Evadnie; Shen, Haiqing; Kim, James D.; Subramanya, Arohan R.; Steinle, Nanette I.; Parsa, Afshin; Ober, Carole C.; Welling, Paul A.; Chakravarti, Aravinda; Weder, Alan B.; Cooper, Richard S.; Mitchell, Braxton D.; Shuldiner, Alan R.; Chang, Yen-Pei C.

    2009-01-01

    Hypertension places a major burden on individual and public health, but the genetic basis of this complex disorder is poorly understood. We conducted a genome-wide association study of systolic and diastolic blood pressure (SBP and DBP) in Amish subjects and found strong association signals with common variants in a serine/threonine kinase gene, STK39. We confirmed this association in an independent Amish and 4 non-Amish Caucasian samples including the Diabetes Genetics Initiative, Framingham Heart Study, GenNet, and Hutterites (meta-analysis combining all studies: n = 7,125, P 0.09 and were associated with increases of 3.3/1.3 mm Hg in SBP/DBP, respectively, in the Amish subjects and with smaller but consistent effects across the non-Amish studies. Cell-based functional studies showed that STK39 interacts with WNK kinases and cation-chloride cotransporters, mutations in which cause monogenic forms of BP dysregulation. We demonstrate that in vivo, STK39 is expressed in the distal nephron, where it may interact with these proteins. Although none of the associated SNPs alter protein structure, we identified and experimentally confirmed a highly conserved intronic element with allele-specific in vitro transcription activity as a functional candidate for this association. Thus, variants in STK39 may influence BP by increasing STK39 expression and consequently altering renal Na+ excretion, thus unifying rare and common BP-regulating alleles in the same physiological pathway. PMID:19114657

  6. Whole-genome sequences of four strains closely related to members of the Mycobacterium chelonae group, isolated from biofilms in a drinking water distribution system simulator

    Data.gov (United States)

    U.S. Environmental Protection Agency — Whole-genome sequences of four strains closely related to members of the Mycobacterium chelonae group, isolated from biofilms in a drinking water distribution...

  7. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

    Science.gov (United States)

    Wilson, Mark R; Brown, Eric; Keys, Chris; Strain, Errol; Luo, Yan; Muruvanda, Tim; Grim, Christopher; Jean-Gilles Beaubrun, Junia; Jarvis, Karen; Ewing, Laura; Gopinath, Gopal; Hanes, Darcy; Allard, Marc W; Musser, Steven

    2016-01-01

    Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shotgun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future
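
    The statement that outbreak isolates "had only a few SNP differences between them" rests on counting pairwise SNP differences across aligned genomes. The sketch below shows that counting step only; it is not the maximum likelihood phylogenetic analysis used in the study. It assumes the sequences are already aligned to equal length, and the isolate names and sequences are invented toy data.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has a non-ACGT character (e.g. N or a
    gap) are ignored rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(1 for x, y in zip(a.upper(), b.upper())
               if x != y and x in valid and y in valid)


def pairwise_snp_distances(isolates):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {(n1, n2): snp_distance(s1, s2)
            for (n1, s1), (n2, s2) in combinations(isolates.items(), 2)}


# Hypothetical aligned toy sequences.
toy = {"iso_A": "ACGTACGT", "iso_B": "ACGTACGA", "iso_C": "TCGTACGA"}
print(pairwise_snp_distances(toy))
```

    Isolates separated by only a handful of differences would cluster together, while an outgroup would differ at many more positions.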

  8. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

    Directory of Open Access Journals (Sweden)

    Mark R Wilson

    Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shotgun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts

  10. Expansion of banana (Musa acuminata) gene families involved in ethylene biosynthesis and signalling after lineage-specific whole-genome duplications.

    Science.gov (United States)

    Jourda, Cyril; Cardi, Céline; Mbéguié-A-Mbéguié, Didier; Bocs, Stéphanie; Garsmeur, Olivier; D'Hont, Angélique; Yahiaoui, Nabila

    2014-05-01

    Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening. Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed. Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them. We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling.

  11. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City.

  12. ReAS: Recovery of ancestral sequences for transposable elements from the unassembled reads of a whole genome shotgun.

    Directory of Open Access Journals (Sweden)

    Ruiqiang Li

    2005-09-01

    We describe an algorithm, ReAS, to recover ancestral sequences for transposable elements (TEs) from the unassembled reads of a whole genome shotgun. The main assumptions are that these TEs must exist at high copy numbers across the genome and must not be so old that they are no longer recognizable in comparison to their ancestral sequences. Tested on the japonica rice genome, ReAS was able to reconstruct all of the high copy sequences in the Repbase repository of known TEs, and increase the effectiveness of RepeatMasker in identifying TEs from genome sequences.

  13. Whole-Genome Sequence of a blaOXA-48-Harboring Raoultella ornithinolytica Clinical Isolate from Lebanon.

    Science.gov (United States)

    Al-Bayssari, Charbel; Olaitan, Abiola Olumuyiwa; Leangapichart, Thongpan; Okdah, Liliane; Dabboussi, Fouad; Hamze, Monzer; Rolain, Jean-Marc

    2016-04-01

    We analyzed the whole-genome sequence of a blaOXA-48-harboring Raoultella ornithinolytica clinical isolate from a patient in Lebanon. The size of the Raoultella ornithinolytica CMUL058 genome was 5,622,862 bp, with a G+C content of 55.7%. We deciphered all the molecular mechanisms of antibiotic resistance, and we compared our genome to other available R. ornithinolytica genomes in GenBank. The resistome consisted of 9 antibiotic resistance genes, including a plasmidic blaOXA-48 gene whose genetic organization is also described.
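
    The genome size and G+C content reported above are simple summary statistics of an assembled sequence. As a minimal sketch of how such figures are computed, the function below counts bases in a plain sequence string; it assumes ambiguous characters such as N should be excluded from the G+C denominator, which is one common convention but not necessarily the one used by the authors. The example string is toy data, not the CMUL058 genome.

```python
def genome_summary(seq):
    """Return (length_bp, gc_fraction) for a nucleotide sequence.

    length_bp counts every character; gc_fraction is (G + C) divided by the
    number of unambiguous bases (A, C, G, T), so N characters are excluded.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    called = sum(counts.values())
    if called == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc_fraction = (counts["G"] + counts["C"]) / called
    return len(seq), gc_fraction


length, gc = genome_summary("GGCCAATT")
print(length, f"{gc:.1%}")  # → 8 50.0%
```

    Applied to a full assembly, the same calculation would yield figures of the kind quoted in the abstract (5,622,862 bp and 55.7% G+C).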

  14. Fatal Cases of Influenza A(H3N2) in Children: Insights from Whole Genome Sequence Analysis

    OpenAIRE

    Monica Galiano; Johnson, Benjamin F.; Richard Myers; Joanna Ellis; Rod Daniels; Maria Zambon

    2012-01-01

    During the Northern Hemisphere winter of 2003-2004 the emergence of a novel influenza antigenic variant, A/Fujian/411/2002-like(H3N2), was associated with an unusually high number of fatalities in children. Seventeen fatal cases in the UK were laboratory confirmed for Fujian/411-like viruses. To look for phylogenetic patterns and genetic markers that might be associated with increased virulence, sequencing and phylogenetic analysis of the whole genomes of 63 viruses isolated from fatal cases ...

  15. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. PMID:26542222

  16. Interferon-beta induces distinct gene expression response patterns in human monocytes versus T cells.

    Directory of Open Access Journals (Sweden)

    Noa Henig

    Full Text Available BACKGROUND: Monocytes, which are key players in innate immunity, are outnumbered by neutrophils and lymphocytes among peripheral white blood cells. The cytokine interferon-β (IFN-β) is widely used as an immunomodulatory drug for multiple sclerosis and its functional pathways in peripheral blood mononuclear cells (PBMCs) have been previously described. The aim of the present study was to identify novel, cell-specific IFN-β functions and pathways in tumor necrosis factor (TNF)-α-activated monocytes that may have been missed in studies using PBMCs. METHODOLOGY/PRINCIPAL FINDINGS: Whole genome gene expression profiles of human monocytes and T cells were compared following in vitro priming with TNF-α and overnight exposure to IFN-β. Statistical analyses of the gene expression data revealed a cell-type-specific change of 699 transcripts: 667 monocyte-specific transcripts, 21 T cell-specific transcripts and 11 transcripts with either a difference in the response direction or a difference in the magnitude of response. RT-PCR revealed a set of differentially expressed genes (DEGs), such as RIPK2 and CD83, exhibiting responses to IFN-β that are modulated by TNF-α in monocytes but not in T cells or PBMCs. Known IFN-β promoter response elements, such as ISRE, were enriched in T cell DEGs but not in monocyte DEGs. The overall directionality of the gene expression regulation by IFN-β was different in T cells and monocytes, with up-regulation more prevalent in T cells, and a similar extent of up- and down-regulation recorded in monocytes. CONCLUSIONS: By focusing on the response of distinct cell types and by evaluating the combined effects of two cytokines with pro- and anti-inflammatory activities, we were able to present two new findings: first, new IFN-β response pathways and genes, some of which were monocyte-specific; second, a cell-specific modulation of the IFN-β response transcriptome by TNF-α.

  17. Therapeutics of Ebola hemorrhagic fever: whole-genome transcriptional analysis of successful disease mitigation.

    Science.gov (United States)

    Yen, Judy Y; Garamszegi, Sara; Geisbert, Joan B; Rubins, Kathleen H; Geisbert, Thomas W; Honko, Anna; Xia, Yu; Connor, John H; Hensley, Lisa E

    2011-11-01

    The mechanisms of Ebola (EBOV) pathogenesis are only partially understood, but the dysregulation of normal host immune responses (including destruction of lymphocytes, increases in circulating cytokine levels, and development of coagulation abnormalities) is thought to play a major role. Accumulating evidence suggests that much of the observed pathology is not the direct result of virus-induced structural damage but rather is due to the release of soluble immune mediators from EBOV-infected cells. It is therefore essential to understand how the candidate therapeutic may be interrupting the disease process and/or targeting the infectious agent. To identify genetic signatures that are correlates of protection, we used a DNA microarray-based approach to compare the host genome-wide responses of EBOV-infected nonhuman primates (NHPs) responding to candidate therapeutics. We observed that, although the overall circulating immune response was similar in the presence and absence of coagulation inhibitors, surviving NHPs clustered together. Noticeable differences in coagulation-associated genes appeared to correlate with survival, which revealed a subset of distinctly differentially expressed genes, including chemokine ligand 8 (CCL8/MCP-2), that may provide possible targets for early-stage diagnostics or future therapeutics. These analyses will assist us in understanding the pathogenic mechanisms of EBOV infection and in identifying improved therapeutic strategies.

  18. Construction of white spot syndrome virus (WSSV) whole genome phage display library

    Institute of Scientific and Technical Information of China (English)

    ZHU Yanbing; YANG Feng

    2007-01-01

    A rebuilt vector, pCANTAB 5 EE, was obtained by inserting a 34 bp double-stranded oligonucleotide containing an EcoRV recognition sequence into pCANTAB 5 E. White spot syndrome virus (WSSV) genome DNA was fragmented by sonication to isolate fragments mainly in the range of 0.8 ~ 2.0 kb; the fragments were then blunt-ended with T4 DNA polymerase and cloned into the EcoRV site of pCANTAB 5 EE. The primary recombinant clone of the library was 3.0 × 10^5. Colony PCR of randomly selected recombinants showed that the size of the inserts was 0.12 ~ 1.77 kb. After the whole library recombinant phages infected Escherichia coli HB2151 cells, the extracellular and periplasmic extracts were dropped on PVDF membranes to perform dot blot, using polyclonal mouse anti-VP24 serum, anti-WSV026 serum, anti-WSV063 serum, anti-WSV069 serum, anti-WSV112 serum, anti-WSV238 serum, anti-WSV303 serum and anti-VP26 serum as the primary antibody, respectively. The results showed that the display library could express the viral proteins.

  19. Construction and evaluation of a whole genome microarray of Chlamydomonas reinhardtii

    Directory of Open Access Journals (Sweden)

    Toepel Jörg

    2011-11-01

    Full Text Available Abstract Background Chlamydomonas reinhardtii is widely accepted as a model organism regarding photosynthesis, circadian rhythm, cell mobility, phototaxis, and biotechnology. The complete annotation of the genome allows transcriptomic studies; however, a new microarray platform was needed. Based on the completed annotation of Chlamydomonas reinhardtii, a new microarray on an Agilent platform was designed using an extended JGI 3.1 genome data set which included 15000 transcript models. Results In total 44000 probes were determined (3 independent probes per transcript model), covering 93% of the transcriptome. Alignment studies with the recently published AUGUSTUS 10.2 annotation confirmed 11000 transcript models, resulting in a very good coverage of 70% of the transcriptome (17000). Following the estimation of 10000 predicted genes in Chlamydomonas reinhardtii, our new microarray nevertheless covers the expected genome by 90-95%. Conclusions To demonstrate the capabilities of the new microarray, we analyzed transcript levels for cultures grown under nitrogen as well as sulfate limitation, and compared the results with recently published microarray and RNA-seq data. We could thereby confirm previous results derived from data on nutrient-starvation induced gene expression of a group of genes related to protein transport and adaptation of the metabolism as well as genes related to efficient light harvesting, light energy distribution and photosynthetic electron transport.

  20. Cell signaling and transcription factor genes expressed during whole body regeneration in a colonial chordate

    Directory of Open Access Journals (Sweden)

    Rinkevich Baruch

    2008-10-01

    Full Text Available Abstract Background The restoration of adults from fragments of blood vessels in botryllid ascidians (termed whole body regeneration [WBR]) represents an inimitable event in the chordates, which is poorly understood on the mechanistic level. Results To elucidate mechanisms underlying this phenomenon, a subtracted EST library for early WBR stages was previously assembled, revealing 76 putative genes belonging to major signaling pathways, including Notch/Delta, JAK/STAT, protein kinases, nuclear receptors, Ras oncogene family members, G-protein coupled receptor (GPCR) and transforming growth factor beta (TGF-β) signaling. RT-PCR on selected transcripts documented specific up-regulation only in regenerating fragments, pointing to a broad activation of these signaling pathways at the onset of WBR. The followed-up expression pattern of seven representative transcripts from JAK/STAT signaling (Bl-STAT), the Ras oncogene family (Bl-Rap1A, Bl-Rab-33), and the protein kinase family (Bl-Mnk), Bl-Cnot, Bl-Slit and Bl-Bax inhibitor, revealed systemic and site-specific activations during WBR in a sub-population of circulatory cells. Conclusion WBR in the non-vertebrate chordate Botrylloides leachi is a multifaceted phenomenon, presided over by a complex array of cell signaling and transcription factors. The above results provide a first insight into the whole genome molecular machinery of this unique regeneration process, and reveal the broad participation of cell signaling and transcription factors in the process. While regeneration involves the participation of specific cell populations, WBR signals are systemically expressed at the organism level.

  1. Evaluation of a Single-reaction Method for Whole Genome Sequencing of Influenza A Virus using Next Generation Sequencing

    Institute of Scientific and Technical Information of China (English)

    ZOU Xiao Hui; CHEN Wen Bing; ZHAO Xiang; ZHU Wen Fei; YANG Lei; WANG Da Yan; SHU Yue Long

    2016-01-01

    Objective: To evaluate a single-reaction genome amplification method, the multisegment reverse transcription-PCR (M-RTPCR), for its sensitivity in full genome sequencing of influenza A virus, and its ability to differentiate mixed-subtype virus, using the next generation sequencing (NGS) platform. Methods: Virus genome copy was quantified and serially diluted to different titers, followed by amplification with the M-RTPCR method and sequencing on the NGS platform. Furthermore, we manually mixed two subtype viruses at different titer ratios and amplified the mixed virus with the M-RTPCR protocol, followed by whole genome sequencing on the NGS platform. We also used clinical samples to test the method performance. Results: The M-RTPCR method obtained the complete genome of the test virus at 125 copies/reaction and determined the virus subtype at a titer of 25 copies/reaction. Moreover, with this amplification protocol the two subtypes in the mixed virus could be discriminated even when their copy numbers differed by 200-fold. The sensitivity of this protocol determined using virus RNA was also confirmed with clinical samples containing low-titer virus. Conclusion: The M-RTPCR is a robust and sensitive amplification method for whole genome sequencing of influenza A virus using the NGS platform.

  2. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Alexander C Outhred

    Full Text Available Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.
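
    The notion of a variant distance between isolates, used in this record to describe the cluster, can be illustrated with a minimal sketch. The isolate names and genotype strings are hypothetical toy data, not values from the study, and the sketch counts only single-position differences, whereas the study's pipelines also considered insertions, deletions and structural variants.

```python
from itertools import combinations

def variant_distance(a, b):
    """Count positions where two equal-length genotype strings differ.

    Positions where either isolate has a missing call ('N') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("genotype strings must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y and "N" not in (x, y))

def pairwise_distances(isolates):
    """Return {(name1, name2): distance} for every unordered pair of isolates."""
    return {
        (p, q): variant_distance(isolates[p], isolates[q])
        for p, q in combinations(sorted(isolates), 2)
    }

# Hypothetical calls at six variable sites for three isolates.
toy = {"iso1": "ACGTAC", "iso2": "ACGTAT", "iso3": "TCGNAT"}
distances = pairwise_distances(toy)
max_distance = max(distances.values())
```

    The largest value in such a table corresponds to the kind of figure quoted in the abstract for the most distant members of a cluster.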

  3. Construction of whole genome radiation hybrid panels and map of chromosome 5A of wheat using asymmetric somatic hybridization.

    Directory of Open Access Journals (Sweden)

    Chuanen Zhou

    Full Text Available To explore the feasibility of constructing a whole genome radiation hybrid (WGRH) map in plant species with large genomes, asymmetric somatic hybridization between wheat (Triticum aestivum L.) and Bupleurum scorzonerifolium Willd. was performed. The protoplasts of wheat were irradiated with ultraviolet light (UV) and gamma-ray and rescued by protoplast fusion using B. scorzonerifolium as the recipient. Assessment of SSR markers showed that the radiation hybrids had an average marker retention frequency of 15.5%. Two RH panels (RHPWI and RHPWII) that contained 92 and 184 radiation hybrids, respectively, were developed and used for mapping of 68 SSR markers in chromosome 5A of wheat. A total of 1557 and 2034 breaks were detected in the two panels, respectively. The RH map of chromosome 5A based on RHPWII was constructed. The distance of the comprehensive map was 2103 cR and the approximate resolution was estimated to be ∼501.6 kb/break. The RH panels evaluated in this study enabled us to order the ESTs in a single deletion bin or in multiple bins across the chromosome. These results demonstrated that RH mapping via protoplast fusion is feasible at the whole genome level for mapping purposes in wheat, and showed the potential value of this mapping approach for plant species with large genomes.
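
    Marker retention frequency and break counts are the two summary statistics quoted in this record. A minimal sketch of how such values could be computed from a presence/absence matrix follows; the panel is hypothetical toy data, and the break count uses the simple convention of counting present/absent transitions between adjacent ordered markers, which is an assumption rather than the authors' stated procedure.

```python
def retention_frequency(panel):
    """Fraction of scored (hybrid, marker) cells in which the marker is present (1).

    Unscored cells are given as None and are ignored.
    """
    scored = [call for hybrid in panel for call in hybrid if call is not None]
    return sum(scored) / len(scored)

def obligate_breaks(hybrid):
    """Count present/absent transitions between adjacent scored markers."""
    calls = [c for c in hybrid if c is not None]
    return sum(1 for a, b in zip(calls, calls[1:]) if a != b)

# Rows are hybrids, columns are markers in map order (hypothetical data).
panel = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, None, 0],
]

frequency = retention_frequency(panel)
breaks_per_hybrid = [obligate_breaks(h) for h in panel]
total_breaks = sum(breaks_per_hybrid)
```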

  4. Efficient Haplotype Inference Algorithms in One Whole Genome Scan for Pedigree Data with Non-genotyped Founders

    Institute of Scientific and Technical Information of China (English)

    Yongxi Cheng; Hadi Sabaa; Zhipeng Cai; Randy Goebel; Guohui Lin

    2009-01-01

    An efficient rule-based algorithm is presented for haplotype inference from general pedigree genotype data, with the assumption of no recombination. This algorithm generalizes previous algorithms to handle the cases where some pedigree founders are not genotyped, provided that for each nuclear family at least one parent is genotyped and each non-genotyped founder appears in exactly one nuclear family. The importance of this generalization lies in that such cases frequently happen in real data, because some founders may have passed away and their genotype data can no longer be collected. The algorithm runs in O(m^3 n^3) time, where m is the number of single nucleotide polymorphism (SNP) loci under consideration and n is the number of genotyped members in the pedigree. This zero-recombination haplotyping algorithm is extended to a maximum parsimoniously haplotyping algorithm in one whole genome scan to minimize the total number of breakpoint sites, or equivalently, the number of maximal zero-recombination chromosomal regions. We show that such a whole genome scan haplotyping algorithm can be implemented in O(m^3 n^3) time in a novel incremental fashion; here m denotes the total number of SNP loci along the chromosome.

  5. Contamination-controlled high-throughput whole genome sequencing for influenza A viruses using the MiSeq sequencer.

    Science.gov (United States)

    Lee, Hong Kai; Lee, Chun Kiat; Tang, Julian Wei-Tze; Loh, Tze Ping; Koay, Evelyn Siew-Chuan

    2016-01-01

    Accurate full-length genomic sequences are important for viral phylogenetic studies. We developed a targeted high-throughput whole genome sequencing (HT-WGS) method for influenza A viruses, which utilized an enzymatic cleavage-based approach, the Nextera XT DNA library preparation kit, for library preparation. The entire library preparation workflow was adapted for the Sentosa SX101, a liquid handling platform, to automate this labor-intensive step. As the enzymatic cleavage-based approach generates low coverage reads at both ends of the cleaved products, we corrected this loss of sequencing coverage at the termini by introducing modified primers during the targeted amplification step to generate full-length influenza A sequences with even coverage across the whole genome. Another challenge of targeted HTS is the risk of specimen-to-specimen cross-contamination during the library preparation step that results in the calling of false-positive minority variants. We included an in-run, negative system control to capture contamination reads that may be generated during the liquid handling procedures. The upper limits of 99.99% prediction intervals of the contamination rate were adopted as cut-off values of contamination reads. Here, 148 influenza A/H3N2 samples were sequenced using the HTS protocol and were compared against a Sanger-based sequencing method. Our data showed that the rate of specimen-to-specimen cross-contamination was highly significant in HTS. PMID:27624998

  6. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

    Science.gov (United States)

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes. PMID:27437173

  7. Validation of whole genome amplification for analysis of the p53 tumor suppressor gene in limited amounts of tumor samples.

    Science.gov (United States)

    Hasmats, Johanna; Green, Henrik; Solnestam, Beata Werne; Zajac, Pawel; Huss, Mikael; Orear, Cedric; Validire, Pierre; Bjursell, Magnus; Lundeberg, Joakim

    2012-08-24

    Personalized cancer treatment requires molecular characterization of individual tumor biopsies. These samples are frequently only available in limited quantities, hampering genomic analysis. Several whole genome amplification (WGA) protocols have been developed, with varying reported representation of genomic regions post amplification. In this study we investigate region dropout using a φ29 polymerase based WGA approach. DNA from 123 lung cancer specimens and corresponding normal tissue was used and evaluated by Sanger sequencing of the p53 exons 5-8. To enable comparative analysis of this scarce material, WGA samples were compared with unamplified material using a pooling strategy of the 123 samples. In addition, a more detailed analysis of exon 7 amplicons was performed, followed by extensive cloning and Sanger sequencing. Interestingly, by comparing data from the pooled samples to the individually sequenced exon 7, we demonstrate that mutations are more easily recovered from WGA pools, and this was also supported by simulations of different sequencing coverage. Overall, these data indicate a limited random loss of genomic regions, supporting the use of whole genome amplification for genomic analysis.

  9. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  10. Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing.

    Science.gov (United States)

    Crampton, Mollee; Sripathi, Venkateswara R; Hossain, Khwaja; Kalavacharla, Venu

    2016-01-01

    Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays methylation patterns similar to those of other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation; however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than in CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were higher in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar ("Sierra") using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation. PMID:27199997
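
    The three sequence contexts reported in this record (CG, CHG and CHH, where H stands for A, C or T) are defined by the two bases that follow a cytosine. A minimal sketch of that classification is given below; it scans the forward strand only, and the input sequence is a hypothetical example rather than common bean data.

```python
def cytosine_contexts(seq):
    """Count forward-strand cytosines in CG, CHG and CHH contexts.

    A cytosine without enough downstream bases to decide its context
    (for example the last base of the sequence) is not counted.
    """
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    h_bases = "ACT"  # IUPAC code H: any base except G
    for i in range(len(seq) - 1):
        if seq[i] != "C":
            continue
        nxt = seq[i + 1]
        if nxt == "G":
            counts["CG"] += 1
        elif nxt in h_bases and i + 2 < len(seq):
            nxt2 = seq[i + 2]
            if nxt2 == "G":
                counts["CHG"] += 1
            elif nxt2 in h_bases:
                counts["CHH"] += 1
    return counts

contexts = cytosine_contexts("CGCAGCTTCC")
```

    A per-context methylation level would then be the number of methylated calls divided by the total number of calls at cytosines of that context.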

  11. Cell Free DNA of Tumor Origin Induces a 'Metastatic' Expression Profile in HT-29 Cancer Cell Line.

    Directory of Open Access Journals (Sweden)

    István Fűri

    Full Text Available Epithelial cells in malignant conditions release DNA into the extracellular compartment. Cell free DNA of tumor origin may act as a ligand of DNA sensing mechanisms and mediate changes in epithelial-stromal interactions. To evaluate and compare the potential autocrine and paracrine regulatory effect of normal and malignant epithelial cell-related DNA on TLR9 and STING mediated pathways in HT-29 human colorectal adenocarcinoma cells and normal fibroblasts. DNA isolated from normal and tumorous colonic epithelia of fresh frozen surgically removed tissue samples was used for 24 and 6 hour treatment of HT-29 colon carcinoma and HDF-α fibroblast cells. Whole genome mRNA expression analysis and qRT-PCR were performed for the elements/members of the TLR9 signaling pathway. Immunocytochemistry was performed for epithelial markers (i.e. CK20 and E-cadherin), DNA methyltransferase 3a (DNMT3a) and NFκB (for treated HDFα cells). Administration of tumor derived DNA on HT29 cells resulted in significant (p<0.05) mRNA level alteration in 118 genes (logFc≥1, p≤0.05), including overexpression of metallothionein genes (i.e. MT1H, MT1X, MT1P2, MT2A), metastasis-associated genes (i.e. TACSTD2, MACC1, MALAT1), tumor biomarker (CEACAM5), metabolic genes (i.e. INSIG1, LIPG), messenger molecule genes (i.e. DAPP, CREB3L2). Increased protein levels of CK20, E-cadherin, and DNMT3a were observed after tumor DNA treatment in HT-29 cells. Healthy DNA treatment affected mRNA expression of 613 genes (logFc≥1, p≤0.05), including increased expression of key adaptor molecules of the TLR9 pathway (e.g. MYD88, IRAK2, NFκB, IL8, IL-1β), the STING pathway (ADAR, IRF7, CXCL10, CASP1) and the FGF2 gene. DNA from tumorous colon epithelium, but not from the normal epithelial cells, acts as a pro-metastatic factor to HT-29 cells through the overexpression of pro-metastatic genes through a TLR9/MYD88 independent pathway. In contrast, DNA derived from healthy colonic epithelium induced TLR9 and STING signaling
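
    The gene counts in this record rest on a threshold of logFc≥1 and p≤0.05. A minimal sketch of such a filter is shown below; the gene names and values are hypothetical, and applying the cut-off to the absolute log fold change (so that down-regulated genes are also kept) is an assumption, since the abstract does not say whether the sign is considered.

```python
from typing import NamedTuple

class Result(NamedTuple):
    gene: str
    log_fc: float   # log fold change, treated vs. control
    p_value: float

def select_degs(results, min_abs_log_fc=1.0, max_p=0.05):
    """Return names of genes passing both the fold-change and p-value cut-offs.

    Assumption: the fold-change cut-off is applied to |log_fc|.
    """
    return [
        r.gene
        for r in results
        if abs(r.log_fc) >= min_abs_log_fc and r.p_value <= max_p
    ]

# Hypothetical results; not data from the study.
toy = [
    Result("geneA", 1.8, 0.001),
    Result("geneB", 0.4, 0.0005),  # significant but below the fold-change cut-off
    Result("geneC", -1.2, 0.03),   # down-regulated, passes on absolute value
    Result("geneD", 2.5, 0.2),     # large change but not significant
]
selected = select_degs(toy)
```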

  12. Genome-wide profiling to analyze the effects of FXR activation on mouse renal proximal tubular cells

    OpenAIRE

    Gui, Ting; Gai, Zhibo

    2015-01-01

    To assess the effect of farnesoid X receptor (FXR), a bile acid nuclear receptor, on renal proximal tubular cells, primary cultured mouse kidney proximal tubular cells were treated with GW4064 (a FXR agonist) or DMSO (as controls) overnight. Analysis of gene expression in the proximal tubular cells by whole genome microarrays indicated that FXR activation induced genes involved in fatty acid degradation and oxidation reduction. Among them, genes involved in glutathione metabolism were mostly ...

  13. Molecular characterization of c-Abl/c-Src kinase inhibitors targeted against murine tumour progenitor cells that express stem cell markers.

    Directory of Open Access Journals (Sweden)

    Thomas Kruewel

    Full Text Available BACKGROUND: The non-receptor tyrosine kinases c-Abl and c-Src are overexpressed in various solid human tumours. Inhibition of their hyperactivity represents a molecular rationale in the combat of cancerous diseases. Here we examined the effects of a new family of pyrazolo[3,4-d]pyrimidines on a panel of 11 different murine lung tumour progenitor cell lines that express stem cell markers, as well as on the human lung adenocarcinoma cell line A549, the human hepatoma cell line HepG2 and the human colon cancer cell line CaCo2, to obtain insight into the mode of action of these experimental drugs. METHODOLOGY/PRINCIPAL FINDINGS: Treatment with the dual kinase inhibitors blocked c-Abl and c-Src kinase activity efficiently in the nanomolar range, induced apoptosis, reduced cell viability and caused cell cycle arrest predominantly at G0/G1 phase, while western blot analysis confirmed repressed protein expression of c-Abl and c-Src as well as the interacting partners p38 mitogen activated protein kinase, heterogeneous ribonucleoprotein K, cyclin dependent kinase 1 and further proteins that are crucial for tumour progression. Importantly, a significant repression of the epidermal growth factor receptor was observed, while whole genome gene expression analysis evidenced regulation of many cell cycle regulated genes as well as integrin and focal adhesion kinase (FAK) signalling to impact cytoskeleton dynamics, migration, invasion and metastasis. CONCLUSIONS/SIGNIFICANCE: Our experiments and recently published in vivo engraftment studies with various tumour cell lines revealed the dual kinase inhibitors to be efficient in their antitumour activity.

  14. An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella

    Directory of Open Access Journals (Sweden)

    James B. Pettengill

    2014-10-01

    Full Text Available Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to construct a SNP (single nucleotide polymorphism) matrix (reference-based and reference-free), and (3) phylogenetic inference method (FastTreeMP, GARLI, and RAxML). We carried out these analyses on 194 whole genome sequences representing 107 unique Salmonella enterica subsp. enterica ser. Montevideo strains. Reference-based approaches for identifying SNPs produced trees that were significantly more similar to one another than those produced under the reference-free approach. Topologies inferred using a core matrix (i.e., no missing data) were significantly more discordant than those inferred using a non-core matrix that allows for some missing data. However, allowing for too much missing data likely results in a high false discovery rate of SNPs. When analyzing the same SNP matrix, we observed that the more thorough inference methods implemented in GARLI and RAxML produced more similar topologies than FastTreeMP. Our results also confirm that reproducibility varies among NGS platforms where the MiSeq had the lowest number of pairwise differences among replicate runs. Our investigation into the robustness of clustering patterns illustrates the importance of carefully considering how data from different platforms are combined and analyzed. We found clear differences in the topologies inferred, and certain methods performed significantly better than others for discriminating between the highly clonal organisms investigated here. The methods supported by ...

  15. Diversification and evolution of the SDG gene family in Brassica rapa after the whole genome triplication

    OpenAIRE

    Heng Dong; Dandan Liu; Tianyu Han; Yuxue Zhao; Ji Sun; Sue Lin; Jiashu Cao; Zhong-Hua Chen; Li Huang

    2015-01-01

    Histone lysine methylation, controlled by the SET Domain Group (SDG) gene family, is part of the histone code that regulates chromatin function and epigenetic control of gene expression. Analyzing the SDG gene family in Brassica rapa for their gene structure, domain architecture, subcellular localization, rate of molecular evolution and gene expression pattern revealed common occurrences of subfunctionalization and neofunctionalization in BrSDGs. In comparison with Arabidopsis thaliana, the B...

  16. Different responsiveness to a high-fat/cholesterol diet in two inbred mice and underlying genetic factors: a whole genome microarray analysis

    Directory of Open Access Journals (Sweden)

    Jin Gang

    2009-10-01

    Full Text Available Abstract Background To investigate different responses to a high-fat/cholesterol diet and uncover their underlying genetic factors between C57BL/6J (B6) and DBA/2J (D2) inbred mice. Methods B6 and D2 mice were fed a high-fat/cholesterol diet for a series of time-points. Serum and bile lipid profiles, bile acid yields, hepatic apoptosis, gallstones and atherosclerosis formation were measured. Furthermore, a whole genome microarray was performed to screen the hepatic gene expression profile. Quantitative real-time PCR, western blot and TUNEL assay were conducted to validate microarray data. Results After being fed the high-fat/cholesterol diet, serum and bile total cholesterol, serum cholesterol esters, HDL cholesterol and non-HDL cholesterol levels were altered in B6 but not significantly changed in D2; meanwhile, biliary bile acid was decreased in B6 but increased in D2. At the same time, hepatic apoptosis, gallstones and atherosclerotic lesions occurred in B6 but not in D2. The hepatic microarray analysis revealed distinctly different gene expression patterns between B6 and D2 mice. Their functional pathway groups included lipid metabolism, oxidative stress, immune/inflammation response and apoptosis. Quantitative real-time PCR, TUNEL assay and western blot results were consistent with microarray analysis. Conclusion Different gene expression patterns between B6 and D2 mice might provide a genetic basis for their distinctive responses to a high-fat/cholesterol diet, and give us an opportunity to identify novel pharmaceutical targets in related diseases in the future.

  17. Inability of ‘Whole Genome Amplification’ to Improve Success Rates for the Biomolecular Detection of Tuberculosis in Archaeological Samples

    Science.gov (United States)

    Forst, Jannine; Brown, Terence A.

    2016-01-01

    We assessed the ability of whole genome amplification (WGA) to improve the efficiency of downstream polymerase chain reactions (PCRs) directed at ancient DNA (aDNA) of members of the Mycobacterium tuberculosis complex (MTBC). Using extracts from a variety of bones and a tooth from human skeletons with or without lesions indicative of tuberculosis, from multiple time periods, we obtained inconsistent results. We conclude that WGA does not provide any advantage in studies of MTBC aDNA. The sporadic nature of our results is probably due to the fact that WGA is itself a PCR-based procedure which, although designed to deal with fragmented DNA, might be inefficient with the low concentration of templates in an aDNA extract. As such, WGA is subject to similar, if not the same, restrictions as PCR when applied to aDNA. PMID:27654468

  18. Rapid and Easy In Silico Serotyping of Escherichia coli Isolates by Use of Whole-Genome Sequencing Data

    DEFF Research Database (Denmark)

    Joensen, Katrine Grimstrup; Tetzschner, Anna M. M.; Iguchi, Atsushi;

    2015-01-01

    typing and surveillance. The aim of this study was to establish a valid and publicly available tool for WGS-based in silico serotyping of E. coli applicable for routine typing and surveillance. A FASTA database of specific O-antigen processing system genes for O typing and flagellin genes for H typing was created as a component of the publicly available Web tools hosted by the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org). All E. coli isolates available with WGS data and conventional serotype information were subjected to WGS-based serotyping employing this specific SerotypeFinder CGE tool. SerotypeFinder was evaluated on 682 E. coli genomes, 108 of which were sequenced for this study, where both the whole genome and the serotype were available. In total, 601 and 509 isolates were included for O and H typing, respectively. The O-antigen genes wzx, wzy, wzm, and wzt and the...

  19. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Science.gov (United States)

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed. PMID:27100228

  20. Genome-wide association analyses based on whole-genome sequencing in Sardinia provide insights into regulation of hemoglobin levels.

    Science.gov (United States)

    Danjou, Fabrice; Zoledziewska, Magdalena; Sidore, Carlo; Steri, Maristella; Busonero, Fabio; Maschio, Andrea; Mulas, Antonella; Perseu, Lucia; Barella, Susanna; Porcu, Eleonora; Pistis, Giorgio; Pitzalis, Maristella; Pala, Mauro; Menzel, Stephan; Metrustry, Sarah; Spector, Timothy D; Leoni, Lidia; Angius, Andrea; Uda, Manuela; Moi, Paolo; Thein, Swee Lay; Galanello, Renzo; Abecasis, Gonçalo R; Schlessinger, David; Sanna, Serena; Cucca, Francesco

    2015-11-01

    We report genome-wide association study results for the levels of A1, A2 and fetal hemoglobins, analyzed for the first time concurrently. Integrating high-density array genotyping and whole-genome sequencing in a large general population cohort from Sardinia, we detected 23 associations at 10 loci. Five signals are due to variants at previously undetected loci: MPHOSPH9, PLTP-PCIF1, ZFPM1 (FOG1), NFIX and CCND3. Among the signals at known loci, ten are new lead variants and four are new independent signals. Half of all variants also showed pleiotropic associations with different hemoglobins, which further corroborated some of the detected associations and identified features of coordinated hemoglobin species production. PMID:26366553

  1. Two listeria outbreaks caused by smoked fish consumption-using whole-genome sequencing for outbreak investigations

    DEFF Research Database (Denmark)

    Gillesberg Lassen, S.; Ethelberg, S.; Björkman, J. T.;

    2016-01-01

    Listeria monocytogenes may contaminate and persist in food production facilities and cause repeated, seemingly sporadic, illnesses over extended periods of time. We report on the investigation of two such concurrent outbreaks. We compared patient isolates and available isolates from foods and food production facilities by use of whole-genome sequencing and subsequent multilocus sequence type and single nucleotide polymorphism analysis. Outbreak cases shared outbreak strains, defined as Listeria monocytogenes isolates belonging to the same sequence type with fewer than five single nucleotide....... Listeria monocytogenes isolates from cold smoked or gravad fish products or their two respective production environments were repeatedly found to belong to the outbreak strains. Outbreak cases more often than sporadic cases stated that they consumed the relevant fish products, odds ratio 10.7. Routine...

  2. Genetic Diversity and Fingerprint Profiles of Commercial Lentinula edodes Cultivars Based on SSR Markers Developed from the Whole Genome Sequence

    Institute of Scientific and Technical Information of China (English)

    ZHANG Dan; SONG Chunyan; ZHANG Lujun; WU Ping; BAO Dapeng; SHANG Xiaodong; TAN Qi

    2014-01-01

    Lentinula edodes is an important cultivated mushroom in China, and accurate and reliable identification of individual cultivars is a prerequisite for successful cultivation and variety protection. In this study, the whole genome sequence of L. edodes was used to generate 200 simple sequence repeat (SSR) markers for delineating 25 commercial cultivars and for determining their genetic diversity. Our data revealed a relatively high level of genetic similarity among the cultivars, with average, minimum and maximum genetic similarity coefficient values of 0.776, 0.567 and 1.000, respectively. Seven SSR primer pairs delineated eleven of the cultivars (Cr-02, Minfeng-1, Xianggu 241-4, Senyuan-1, Senyuan-8404, Xiang-9, Guangxiang-51, Huaxiang-5, L952, L9319 and L808) based on their unique multilocus SSR fingerprint profiles.

  3. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Science.gov (United States)

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.

  4. Gene expression and pathway analysis of ovarian cancer cells selected for resistance to cisplatin, paclitaxel, or doxorubicin

    Directory of Open Access Journals (Sweden)

    Sherman-Baust Cheryl A

    2011-12-01

    Full Text Available Abstract Background Resistance to current chemotherapeutic agents is a major cause of therapy failure in ovarian cancer patients, but the exact mechanisms leading to the development of drug resistance remain unclear. Methods To better understand mechanisms of drug resistance, and possibly identify novel targets for therapy, we generated a series of drug resistant ovarian cancer cell lines through repeated exposure to three chemotherapeutic drugs (cisplatin, doxorubicin, or paclitaxel), and identified changes in gene expression patterns using Illumina whole-genome expression microarrays. Validation of selected genes was performed by RT-PCR and immunoblotting. Pathway enrichment analysis using the KEGG, GO, and Reactome databases was performed to identify pathways that may be important in each drug resistance phenotype. Results A total of 845 genes (p Conclusions Ovarian cancer cells develop drug resistance through different pathways depending on the drug used in the generation of chemoresistance. A better understanding of these mechanisms may lead to the development of novel strategies to circumvent the problem of drug resistance.

  5. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  6. Whole-Genome Sequencing Allows for Improved Identification of Persistent Listeria monocytogenes in Food-Associated Environments.

    Science.gov (United States)

    Stasiewicz, Matthew J; Oliver, Haley F; Wiedmann, Martin; den Bakker, Henk C

    2015-09-01

    While the food-borne pathogen Listeria monocytogenes can persist in food associated environments, there are no whole-genome sequence (WGS) based methods to differentiate persistent from sporadic strains. Whole-genome sequencing of 188 isolates from a longitudinal study of L. monocytogenes in retail delis was used to (i) apply single-nucleotide polymorphism (SNP)-based phylogenetics for subtyping of L. monocytogenes, (ii) use SNP counts to differentiate persistent from repeatedly reintroduced strains, and (iii) identify genetic determinants of L. monocytogenes persistence. WGS analysis revealed three prophage regions that explained differences between three pairs of phylogenetically similar populations with pulsed-field gel electrophoresis types that differed by ≤3 bands. WGS-SNP-based phylogenetics found that putatively persistent L. monocytogenes represent SNP patterns (i) unique to a single retail deli, supporting persistence within the deli (11 clades), (ii) unique to a single state, supporting clonal spread within a state (7 clades), or (iii) spanning multiple states (5 clades). Isolates that formed one of 11 deli-specific clades differed by a median of 10 SNPs or fewer. Isolates from 12 putative persistence events had significantly fewer SNPs (median, 2 to 22 SNPs) than between isolates of the same subtype from other delis (median up to 77 SNPs), supporting persistence of the strain. In 13 events, nearly indistinguishable isolates (0 to 1 SNP) were found across multiple delis. No individual genes were enriched among persistent isolates compared to sporadic isolates. Our data show that WGS analysis improves food-borne pathogen subtyping and identification of persistent bacterial pathogens in food associated environments. PMID:26116683
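    The SNP-count comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes each isolate is already represented as an aligned, equal-length string of alleles at SNP sites; the function names, isolate labels, and sequences are hypothetical and are not data or code from the study.

```python
from __future__ import annotations

from itertools import combinations
from statistics import median


def snp_distance(a: str, b: str) -> int:
    """Count sites where two aligned sequences differ, skipping gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in skip and y not in skip
    )


def pairwise_median(isolates: dict[str, str]) -> float:
    """Median SNP distance over all pairs of isolates in one group."""
    dists = [
        snp_distance(isolates[p], isolates[q])
        for p, q in combinations(sorted(isolates), 2)
    ]
    return median(dists)


# Hypothetical isolates from a single deli (illustrative only).
deli_a = {"A1": "ACGTACGT", "A2": "ACGTACGA", "A3": "ACGTACGT"}
within_deli_median = pairwise_median(deli_a)
```

    A low within-group median compared with the median between groups is the kind of signal the abstract interprets as supporting persistence; any threshold for calling it is a study-specific choice that this sketch does not attempt to reproduce.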

  7. Whole genome sequencing and evolutionary analysis of human respiratory syncytial virus A and B from Milwaukee, WI 1998-2010.

    Directory of Open Access Journals (Sweden)

    Cecilia Rebuffo-Scheer

    Full Text Available BACKGROUND: Respiratory Syncytial Virus (RSV) is the leading cause of lower respiratory-tract infections in infants and young children worldwide. Despite this, only six complete genome sequences of original strains have been previously published, the most recent of which dates back 35 and 26 years for RSV group A and group B respectively. METHODOLOGY/PRINCIPAL FINDINGS: We present a semi-automated sequencing method allowing for the sequencing of four RSV whole genomes simultaneously. We were able to sequence the complete coding sequences of 13 RSV A and 4 RSV B strains from Milwaukee collected from 1998-2010. Another 12 RSV A and 5 RSV B strains sequenced in this study cover the majority of the genome. All RSV A and RSV B sequences were analyzed by neighbor-joining, maximum parsimony and Bayesian phylogeny methods. Genetic diversity was high among RSV A viruses in Milwaukee including the circulation of multiple genotypes (GA1, GA2, GA5, GA7) with GA2 persisting throughout the 13 years of the study. However, RSV B genomes showed little variation with all belonging to the BA genotype. For RSV A, the same evolutionary patterns and clades were seen consistently across the whole genome including all intergenic, coding, and non-coding region sequences. CONCLUSIONS/SIGNIFICANCE: The sequencing strategy presented in this work allows for RSV A and B genomes to be sequenced simultaneously in two working days and at low cost. We have significantly increased the amount of genomic data that is available for both RSV A and B, providing the basic molecular characteristics of RSV strains circulating in Milwaukee over the last 13 years. This information can be used for comparative analysis with strains circulating in other communities around the world which should also help with the development of new strategies for control of RSV, specifically vaccine development and improvement of RSV diagnostics.

  8. Whole-genome SNP association in the horse: identification of a deletion in myosin Va responsible for Lavender Foal Syndrome.

    Directory of Open Access Journals (Sweden)

    Samantha A Brooks

    2010-04-01

    Full Text Available Lavender Foal Syndrome (LFS) is a lethal inherited disease of horses with a suspected autosomal recessive mode of inheritance. LFS has been primarily diagnosed in a subgroup of the Arabian breed, the Egyptian Arabian horse. The condition is characterized by multiple neurological abnormalities and a dilute coat color. Candidate genes based on comparative phenotypes in mice and humans include the ras-associated protein RAB27a (RAB27A) and myosin Va (MYO5A). Here we report mapping of the locus responsible for LFS using a small set of 36 horses segregating for LFS. These horses were genotyped using a newly available single nucleotide polymorphism (SNP) chip containing 56,402 discriminatory elements. The whole genome scan identified an associated region containing these two functional candidate genes. Exon sequencing of the MYO5A gene from an affected foal revealed a single base deletion in exon 30 that changes the reading frame and introduces a premature stop codon. A PCR-based Restriction Fragment Length Polymorphism (PCR-RFLP) assay was designed and used to investigate the frequency of the mutant gene. All affected horses tested were homozygous for this mutation. Heterozygous carriers were detected in high frequency in families segregating for this trait, and the frequency of carriers in unrelated Egyptian Arabians was 10.3%. The mapping and discovery of the LFS mutation represents the first successful use of whole-genome SNP scanning in the horse for any trait. The RFLP assay can be used to assist breeders in avoiding carrier-to-carrier matings and thus in preventing the birth of affected foals.
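    The breeding implication of the reported carrier frequency can be worked through in a short Python sketch. It assumes simple autosomal recessive inheritance, that breeding adults are either carriers or clear (the condition is described as lethal), and that mates are chosen independently of carrier status; the function name is hypothetical, and only the 10.3% figure comes from the abstract.

```python
def affected_foal_risk(p_sire_carrier: float, p_dam_carrier: float) -> float:
    """Probability that a foal is homozygous for a recessive allele.

    Each argument is the probability that the parent is a heterozygous
    carrier. A carrier passes on the mutant allele with probability 1/2,
    so two carriers produce an affected foal with probability 1/4.
    """
    for p in (p_sire_carrier, p_dam_carrier):
        if not 0.0 <= p <= 1.0:
            raise ValueError("carrier probabilities must lie in [0, 1]")
    return p_sire_carrier * p_dam_carrier * 0.25


# Carrier frequency reported for unrelated Egyptian Arabians.
CARRIER_FREQ = 0.103

untested_pair = affected_foal_risk(CARRIER_FREQ, CARRIER_FREQ)  # neither parent tested
known_carriers = affected_foal_risk(1.0, 1.0)                   # carrier-to-carrier mating
one_clear = affected_foal_risk(1.0, 0.0)                        # carrier mated to a tested-clear horse
```

    Under these assumptions the risk for an untested pair is roughly 0.27%, against 25% for a known carrier-to-carrier mating and zero when one parent has tested clear, which is why a genotyping assay is useful to breeders.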

  9. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Full Text Available Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  10. Rapid identification of genetic modifications in Bacillus anthracis using whole genome draft sequences generated by 454 pyrosequencing.

    Directory of Open Access Journals (Sweden)

    Peter E Chen

    Full Text Available BACKGROUND: The anthrax letter attacks of 2001 highlighted the need for rapid identification of biothreat agents not only for epidemiological surveillance of the intentional outbreak but also for implementing appropriate countermeasures, such as antibiotic treatment, in a timely manner to prevent further casualties. It is clear from the 2001 cases that survival may be markedly improved by administration of antimicrobial therapy during the early symptomatic phase of the illness; i.e., within 3 days of appearance of symptoms. Microbiological detection methods are feasible only for organisms that can be cultured in vitro and cannot detect all genetic modifications with the exception of antibiotic resistance. Currently available immuno or nucleic acid-based rapid detection assays utilize known, organism-specific proteins or genomic DNA signatures respectively. Hence, these assays lack the ability to detect novel natural variations or intentional genetic modifications that circumvent the targets of the detection assays or in the case of a biological attack using an antibiotic resistant or virulence enhanced Bacillus anthracis, to advise on therapeutic treatments. METHODOLOGY/PRINCIPAL FINDINGS: We show here that the Roche 454-based pyrosequencing can generate whole genome draft sequences of deep and broad enough coverage of a bacterial genome in less than 24 hours. Furthermore, using the unfinished draft sequences, we demonstrate that unbiased identification of known as well as heretofore-unreported genetic modifications that include indels and single nucleotide polymorphisms conferring antibiotic and phage resistances is feasible within the next 12 hours. CONCLUSIONS/SIGNIFICANCE: Second generation sequencing technologies have paved the way for sequence-based rapid identification of both known and previously undocumented genetic modifications in cultured, conventional and newly emerging biothreat agents. 
Our findings have significant implications in

  11. Comprehensive characterization of human genome variation by high coverage whole-genome sequencing of forty four Caucasians.

    Directory of Open Access Journals (Sweden)

    Hui Shen

    Full Text Available Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, and the majority of which (∼96%) have a minor allele frequency less than 5%. On average, each individual genome carried ∼3.3 million SNPs and ∼492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely "knock out" the corresponding genes. Across all the 44 genomes, a total of 182 genes were "knocked-out" in at least one individual genome, among which 46 genes were "knocked out" in over 30% of our samples, suggesting that a number of genes are commonly "knocked-out" in general populations. Gene ontology analysis suggested that these commonly "knocked-out" genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.

  12. Improving the accuracy of whole genome prediction for complex traits using the results of genome wide association studies.

    Directory of Open Access Journals (Sweden)

    Zhe Zhang

    Full Text Available Utilizing the whole genomic variation of complex traits to predict the yet-to-be observed phenotypes or unobserved genetic values via whole genome prediction (WGP) and to infer the underlying genetic architecture via genome wide association study (GWAS) is an interesting and fast developing area in the context of human disease studies as well as in animal and plant breeding. Though thousands of significant loci for several species were detected via GWAS in the past decade, they were not used directly to improve WGP due to lack of proper models. Here, we propose a generalized way of building trait-specific genomic relationship matrices which can exploit GWAS results in WGP via a best linear unbiased prediction (BLUP) model for which we suggest the name BLUP|GA. Results from two illustrative examples show that using already existing GWAS results from public databases in BLUP|GA improved the accuracy of WGP for two out of the three model traits in a dairy cattle data set, and for nine out of the 11 traits in a rice diversity data set, compared to the reference methods GBLUP and BayesB. While BLUP|GA outperforms BayesB, its required computing time is comparable to GBLUP. Further simulation results suggest that accounting for publicly available GWAS results is potentially more useful for WGP utilizing smaller data sets and/or traits of low heritability, depending on the genetic architecture of the trait under consideration. To our knowledge, this is the first study incorporating public GWAS results formally into the standard GBLUP model and we think that the BLUP|GA approach deserves further investigations in animal breeding, plant breeding as well as human genetics.

  13. Lessons learned from the application of whole-genome analysis to the treatment of patients with advanced cancers

    Science.gov (United States)

    Laskin, Janessa; Jones, Steven; Aparicio, Samuel; Chia, Stephen; Ch'ng, Carolyn; Deyell, Rebecca; Eirew, Peter; Fok, Alexandra; Gelmon, Karen; Ho, Cheryl; Huntsman, David; Jones, Martin; Kasaian, Katayoon; Karsan, Aly; Leelakumari, Sreeja; Li, Yvonne; Lim, Howard; Ma, Yussanne; Mar, Colin; Martin, Monty; Moore, Richard; Mungall, Andrew; Mungall, Karen; Pleasance, Erin; Rassekh, S. Rod; Renouf, Daniel; Shen, Yaoqing; Schein, Jacqueline; Schrader, Kasmintan; Sun, Sophie; Tinker, Anna; Zhao, Eric; Yip, Stephen; Marra, Marco A.

    2015-01-01

    Given the success of targeted agents in specific populations it is expected that some degree of molecular biomarker testing will become standard of care for many, if not all, cancers. To facilitate this, cancer centers worldwide are experimenting with targeted “panel” sequencing of selected mutations. Recent advances in genomic technology enable the generation of genome-scale data sets for individual patients. Recognizing the risk, inherent in panel sequencing, of failing to detect meaningful somatic alterations, we sought to establish processes to integrate data from whole-genome analysis (WGA) into routine cancer care. Between June 2012 and August 2014, 100 adult patients with incurable cancers consented to participate in the Personalized OncoGenomics (POG) study. Fresh tumor and blood samples were obtained and used for whole-genome and RNA sequencing. Computational approaches were used to identify candidate driver mutations, genes, and pathways. Diagnostic and drug information were then sought based on these candidate “drivers.” Reports were generated and discussed weekly in a multidisciplinary team setting. Other multidisciplinary working groups were assembled to establish guidelines on the interpretation, communication, and integration of individual genomic findings into patient care. Of 78 patients for whom WGA was possible, results were considered actionable in 55 cases. In 23 of these 55 cases, the patients received treatments motivated by WGA. Our experience indicates that a multidisciplinary team of clinicians and scientists can implement a paradigm in which WGA is integrated into the care of late stage cancer patients to inform systemic therapy decisions. PMID:27148575

  14. Whole-genome phylogenomic heterogeneity of Neisseria gonorrhoeae isolates with decreased cephalosporin susceptibility collected in Canada between 1989 and 2013.

    Science.gov (United States)

    Demczuk, Walter; Lynch, Tarah; Martin, Irene; Van Domselaar, Gary; Graham, Morag; Bharat, Amrita; Allen, Vanessa; Hoang, Linda; Lefebvre, Brigitte; Tyrrell, Greg; Horsman, Greg; Haldane, David; Garceau, Richard; Wylie, John; Wong, Tom; Mulvey, Michael R

    2015-01-01

    A large-scale, whole-genome comparison of Canadian Neisseria gonorrhoeae isolates with high-level cephalosporin MICs was used to demonstrate a genomic epidemiology approach to investigate strain relatedness and dynamics. Although current typing methods have been very successful in tracing short-chain transmission of gonorrheal disease, investigating the temporal evolutionary relationships and geographical dissemination of highly clonal lineages requires enhanced resolution only available through whole-genome sequencing (WGS). Phylogenomic cluster analysis grouped 169 Canadian strains into 12 distinct clades. While some N. gonorrhoeae multiantigen sequence types (NG-MAST) agreed with specific phylogenomic clades or subclades, other sequence types (ST) and closely related groups of ST were widely distributed among clades. Decreased susceptibility to extended-spectrum cephalosporins (ESC-DS) emerged among a group of diverse strains in Canada during the 1990s with a variety of nonmosaic penA alleles, followed in 2000/2001 with the penA mosaic X allele and then in 2007 with ST1407 strains with the penA mosaic XXXIV allele. Five genetically distinct ESC-DS lineages were associated with penA mosaic X, XXXV, and XXXIV alleles and nonmosaic XII and XIII alleles. ESC-DS with coresistance to azithromycin was observed in 5 strains with 23S rRNA C2599T or A2143G mutations. As the costs associated with WGS decline and analysis tools are streamlined, WGS can provide a more thorough understanding of strain dynamics, facilitate epidemiological studies to better resolve social networks, and improve surveillance to optimize treatment for gonorrheal infections.

  15. Evidence and evolutionary analysis of ancient whole-genome duplication in barley predating the divergence from rice

    Directory of Open Access Journals (Sweden)

    Grosse Ivo

    2009-08-01

    Full Text Available Abstract Background Well-preserved genomic colinearity among agronomically important grass species such as rice, maize, Sorghum, wheat and barley provides access to whole-genome structure information even in species lacking a reference genome sequence. We investigated footprints of whole-genome duplication (WGD) in barley that shaped the cereal ancestor genome by analyzing shared synteny with rice using a ~2000 gene-based barley genetic map and the rice genome reference sequence. Results Based on a recent annotation of the rice genome, we reviewed the WGD in rice and identified 24 pairs of duplicated genomic segments involving 70% of the rice genome. Using 968 putative orthologous gene pairs, synteny covered 89% of the barley genetic map and 63% of the rice genome. We found strong evidence for seven shared segmental genome duplications, corresponding to more than 50% of the segmental genome duplications previously determined in rice. Analysis of synonymous substitution rates (Ks) suggested that shared duplications originated before the divergence of these two species. While major genome rearrangements affected the ancestral genome of both species, small paracentric inversions were found to be species specific. Conclusion We provide a thorough analysis of comparative genome evolution between barley and rice. A barley genetic map of approximately 2000 non-redundant EST sequences provided sufficient density to allow a detailed view of shared synteny with the rice genome. Using an indirect approach that included the localization of WGD-derived duplicated genome segments in the rice genome, we determined the current extent of shared WGD-derived genome duplications that occurred prior to species divergence.

  16. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-03-23

    Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drug-binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel resistance...

  18. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, ‘SCNU1154’, ‘Edisto47’, ‘MR-1’, and ‘PMR5’. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  19. A dense linkage map for Chinook salmon (Oncorhynchus tshawytscha) reveals variable chromosomal divergence after an ancestral whole genome duplication event.

    Science.gov (United States)

    Brieuc, Marine S O; Waters, Charles D; Seeb, James E; Naish, Kerry A

    2014-03-20

    Comparisons between the genomes of salmon species reveal that they underwent extensive chromosomal rearrangements following whole genome duplication that occurred in their lineage 58-63 million years ago. Extant salmonids are diploid, but occasional pairing between homeologous chromosomes exists in males. The consequences of re-diploidization can be characterized by mapping the position of duplicated loci in such species. Linkage maps are also a valuable tool for genome-wide applications such as genome-wide association studies, quantitative trait loci mapping or genome scans. Here, we investigated chromosomal evolution in Chinook salmon (Oncorhynchus tshawytscha) after genome duplication by mapping 7146 restriction-site associated DNA loci in gynogenetic haploid, gynogenetic diploid, and diploid crosses. In the process, we developed a reference database of restriction-site associated DNA loci for Chinook salmon comprising 48528 non-duplicated loci and 6409 known duplicated loci, which will facilitate locus identification and data sharing. We created a very dense linkage map anchored to all 34 chromosomes for the species, and all arms were identified through centromere mapping. The map positions of 799 duplicated loci revealed that homeologous pairs have diverged at different rates following whole genome duplication, and that degree of differentiation along arms was variable. Many of the homeologous pairs with high numbers of duplicated markers appear conserved with other salmon species, suggesting that retention of conserved homeologous pairing in some arms preceded species divergence. As chromosome arms are highly conserved across species, the major resources developed for Chinook salmon in this study are also relevant for other related species.

  20. Whole genome sequencing of Saccharomyces cerevisiae: from genotype to phenotype for improved metabolic engineering applications

    DEFF Research Database (Denmark)

    Otero, José Manuel; Vongsangnak, Wanwipa; Asadollahi, Mohammadali;

    2010-01-01

    BACKGROUND: Rapid and efficient microbial cell factory design and construction are possible through the enabling technology of metabolic engineering, which is now being facilitated by systems biology approaches. Metabolic engineering is often complemented by directed evolution, where...