WorldWideScience

Sample records for cell whole-genome expression

  1. Whole-Genome Expression Analysis of Human Mesenchymal Stromal Cells Exposed to Ultrasmooth Tantalum vs. Titanium Oxide Surfaces

    Stiehler, C.; Bunger, C.; Overall, R. W.

    2013-01-01

    to titanium (Ti) surface. The aim of this study was to extend the previous investigation of biocompatibility by monitoring temporal gene expression of MSCs on topographically comparable smooth Ta and Ti surfaces using whole-genome gene expression analysis. Total RNA samples from telomerase-immortalized human...... MSCs cultivated on plain sputter-coated surfaces of Ta or Ti for 1, 2, 4, and 8 days were hybridized to n = 16 U133 Plus 2.0 arrays (Affymetrix®). Functional annotation, cluster and pathway analyses were performed. The vast majority of genes were differentially regulated after 4 days...... of cultivation and genes upregulated by MSCs exposed to Ta and Ti were predominantly related to the processes of differentiation and transcription, respectively. Functional annotation analysis of the 1,000 temporally most significantly regulated genes suggests earlier cellular differentiation on Ta compared...

  2. Whole-genome gene expression modifications associated with nitrosamine exposure and micronucleus frequency in human blood cells

    Hebels, Dennie G A J; Jennen, Danyel G J; van Herwijnen, Marcel H M

    2011-01-01

    association between MN frequency and urinary NOCs (r = 0.41, P = 0.025) and identified modifications in, among others, cytoskeleton remodeling, cell cycle, apoptosis and survival, signal transduction, immune response, G-protein signaling and development pathways, which indicate a response to NOC......-induced genotoxicity. Moreover, we established a network of genes, the most important of which include FBXW7, BUB3, Caspase 2, Caspase 8, SMAD3, Huntingtin and MGMT, which are involved in processes relevant to carcinogenesis. The modified genetic processes and genes found in this study may be of interest...

  3. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    Fröhlich, Eleonore; Meindl, Claudia; Wagner, Karin; Leitinger, Gerd; Roblegg, Eva

    2014-01-01

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay

  4. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    Fröhlich, Eleonore, E-mail: eleonore.froehlich@medunigraz.at [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Meindl, Claudia; Wagner, Karin [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Leitinger, Gerd [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Institute for Cell Biology, Histology and Embryology, Medical University of Graz, Harrachgasse 21, 8010 Graz (Austria); Roblegg, Eva [Institute of Pharmaceutical Sciences, Department of Pharmaceutical Technology, Karl-Franzens-University of Graz, Universitätsplatz 1, 8010 Graz (Austria)

    2014-10-15

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay.

  5. Current Developments in Prokaryotic Single Cell Whole Genome Amplification

    Goudeau, Danielle; Nath, Nandita; Ciobanu, Doina; Cheng, Jan-Fang; Malmstrom, Rex

    2014-03-14

    Our approach to prokaryotic single-cell Whole Genome Amplification (WGA) at the JGI continues to evolve. To increase both the quality and number of single-cell genomes produced, we explore all aspects of the process from cell sorting to sequencing. For example, we now utilize specialized reagents, acoustic liquid handling, and reduced reaction volumes to eliminate non-target DNA contamination in WGA reactions. More specifically, we use a cleaner commercial WGA kit from Qiagen that employs a UV decontamination procedure initially developed at the JGI, and we use the Labcyte Echo for tip-less liquid transfer to set up 2 µL reactions. Acoustic liquid handling also dramatically reduces reagent costs. In addition, we are exploring new cell lysis methods, including treatment with Proteinase K, lysozyme, and detergents, in order to complement standard alkaline lysis and allow for more efficient disruption of a wider range of cells. Incomplete lysis represents a major hurdle for WGA on some environmental samples, especially rhizosphere, peatland, and other soils. Finding effective lysis strategies that are also compatible with WGA is challenging, and we are currently assessing the impact of various strategies on genome recovery.

  6. Impact of antenatal glucocorticosteroids on whole-genome expression in preterm babies.

    Saugstad, Ola Didrik; Kwinta, Przemko; Wollen, Embjørg Julianne; Bik-Multanowski, Mirosław; Madetko-Talowska, Anna; Jagła, Mateusz; Tomasik, Tomasz; Pietrzyk, Jacek Józef

    2013-04-01

    To study the impact that using antenatal steroid to treat threatened preterm delivery has on whole-genome expression. A prospective whole-genome expression study was carried out on 50 newborn infants, delivered before 32 weeks gestation, who had been exposed to antenatal steroids, including 40 who had received a full antenatal steroid course. Seventy infants not exposed to antenatal steroids formed the control group. Microarray analyses were performed five and 28 days after delivery, and the results were validated by real-time PCR. The study was conducted between September 2008 and November 2010. Twenty thousand six hundred and ninety-three genes were studied in the infants' leucocytes. Thirteen were differentially expressed 5 days after delivery, but there were no differences at day 28. Four genes related to cancer or inflammation were up-regulated. Nine genes were down-regulated: six were Y-linked and associated with malignancies, graft-versus-host disease, male infertility and cell differentiation and three were associated with pre-eclampsia, oxidative stress and chloride/bicarbonate exchange. Seven gene pathways were up-regulated at day five and only one at day 28. These were associated with cell growth, cell cycle regulation, metabolism and apoptosis. Antenatal steroid therapy affects a limited number of genes and gene pathways in leucocytes in preterm babies at day five of life. The effect is short-lived, but long-term effects cannot be ruled out. ©2013 The Author(s)/Acta Paediatrica ©2013 Foundation Acta Paediatrica.

  7. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    Jingsong Shi

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN.

  8. Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

    Lane, William J; Westhoff, Connie M; Gleadall, Nicholas S; Aguad, Maria; Smeland-Wagman, Robin; Vege, Sunitha; Simmons, Daimon P; Mah, Helen H; Lebo, Matthew S; Walter, Klaudia; Soranzo, Nicole; Di Angelantonio, Emanuele; Danesh, John; Roberts, David J; Watkins, Nick A; Ouwehand, Willem H; Butterworth, Adam S; Kaufman, Richard M; Rehm, Heidi L; Silberstein, Leslie E; Green, Robert C

    2018-06-01

    There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh, the most important blood groups, cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens. This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30× depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15× depth) with serological comparisons. We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 Med

  9. Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications.

    Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney

    2015-01-01

    We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos.

  10. Single Cell HLA Matching Feasibility by Whole Genomic Amplification and Nested PCR

    Xiao-hong Li; Fang-yin Meng

    2004-01-01

    PCR-based single-cell DNA analysis has been widely used in forensic science, preimplantation genetic diagnosis and other fields. However, the original sample cannot be efficiently retrieved following single-cell PCR, so the amount of information gained is limited. The HLA system is so complex that it is very hard to complete HLA typing from a single cell. A Taq polymerase-based method that uses random primers to amplify the whole genome, termed whole genome amplification (WGA), has been demonstrated to be useful for increasing the number of copies obtained from a minimal sample. In this study we establish a technique to amplify the HLA-A and HLA-B loci at the same time in a single cell using WGA.

  11. Comparison of whole genome amplification techniques for human single cell exome sequencing.

    Borgström, Erik; Paterlini, Marta; Mold, Jeff E; Frisen, Jonas; Lundeberg, Joakim

    2017-01-01

    Whole genome amplification (WGA) is currently a prerequisite for single cell whole genome or exome sequencing. Depending on the method used the rate of artifact formation, allelic dropout and sequence coverage over the genome may differ significantly. The largest difference between the evaluated protocols was observed when analyzing the target coverage and read depth distribution. These differences also had impact on the downstream variant calling. Conclusively, the products from the AMPLI1 and MALBAC kits were shown to be most similar to the bulk samples and are therefore recommended for WGA of single cells. In this study four commercial kits for WGA (AMPLI1, MALBAC, Repli-G and PicoPlex) were used to amplify human single cells. The WGA products were exome sequenced together with non-amplified bulk samples from the same source. The resulting data was evaluated in terms of genomic coverage, allelic dropout and SNP calling.

  12. Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

    Hou, Yong; Wu, Kui; Shi, Xulian

    2015-01-01

    methods, focusing particularly on variations detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly...... performance using SCRS amplified by different WGA methods. It will guide researchers to determine which WGA method is best suited to individual experimental needs at single-cell level....

  13. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples.

    Craig April

    2009-12-01

    We have developed a gene expression assay (Whole-Genome DASL), capable of generating whole-genome gene expression profiles from degraded samples such as formalin-fixed, paraffin-embedded (FFPE) specimens. We demonstrated a similar level of sensitivity in gene detection between matched fresh-frozen (FF) and FFPE samples, with the number and overlap of probes detected in the FFPE samples being approximately 88% and 95% of that in the corresponding FF samples, respectively; 74% of the differentially expressed probes overlapped between the FF and FFPE pairs. The WG-DASL assay is also able to detect 1.3-1.5-fold and 1.5-2-fold changes in intact and FFPE samples, respectively. The dynamic range for the assay is approximately 3 logs. Comparing the WG-DASL assay with an in vitro transcription-based labeling method yielded fold-change correlations of R² ≈ 0.83, while fold-change comparisons with quantitative RT-PCR assays yielded R² ≈ 0.86 and R² ≈ 0.55 for intact and FFPE samples, respectively. Additionally, the WG-DASL assay yielded high self-correlations (R² > 0.98) with low intact RNA inputs ranging from 1 ng to 100 ng; reproducible expression profiles were also obtained with 250 pg total RNA (R² ≈ 0.92), with approximately 71% of the probes detected in 100 ng total RNA also detected at the 250 pg level. When FFPE samples were assayed, 1 ng total RNA yielded self-correlations of R² ≈ 0.80, while still maintaining a correlation of R² ≈ 0.75 with standard FFPE inputs (200 ng). Taken together, these results show that the WG-DASL assay provides a reliable platform for genome-wide expression profiling in archived materials. It also possesses utility within clinical settings where only limited quantities of samples may be available (e.g. microdissected material) or when minimally invasive procedures are performed (e.g. biopsied specimens).
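
The R² figures quoted in this abstract compare fold changes measured on two platforms or at two RNA inputs. As a minimal sketch, assuming R² here means the squared Pearson correlation of per-probe log fold changes (the abstract does not state the exact definition), the calculation could look like this. The function names and the toy values are illustrative only and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Squared Pearson correlation (assumed meaning of R^2 in the abstract)."""
    return pearson_r(xs, ys) ** 2


# Hypothetical log2 fold changes for the same four probes on two platforms.
platform_a = [1.0, 2.0, 3.0, 4.0]
platform_b = [2.0, 4.0, 6.0, 8.0]  # exactly proportional to platform_a
perfect_r2 = r_squared(platform_a, platform_b)
```

With exactly proportional inputs the result is 1 up to floating-point rounding; real platform comparisons give lower values such as those reported above.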

  14. Whole-genome sequencing of a malignant granular cell tumor with metabolic response to pazopanib

    Wei, Lei; Liu, Song; Conroy, Jeffrey; Wang, Jianmin; Papanicolau-Sengos, Antonios; Glenn, Sean T.; Murakami, Mitsuko; Liu, Lu; Hu, Qiang; Conroy, Jacob; Miles, Kiersten Marie; Nowak, David E.; Liu, Biao; Qin, Maochun; Bshara, Wiam; Omilian, Angela R.; Head, Karen; Bianchi, Michael; Burgher, Blake; Darlak, Christopher; Kane, John; Merzianu, Mihai; Cheney, Richard; Fabiano, Andrew; Salerno, Kilian; Talati, Chetasi; Khushalani, Nikhil I.; Trump, Donald L.; Johnson, Candace S.; Morrison, Carl D.

    2015-01-01

    Granular cell tumors are an uncommon soft tissue neoplasm. Malignant granular cell tumors comprise T transitions, particularly when immediately preceded by a 5′ G. A loss-of-function mutation was detected in a newly recognized tumor suppressor candidate, BRD7. No mutations were found in known targets of pazopanib. However, we identified a receptor tyrosine kinase pathway mutation in GFRA2 that warrants further evaluation. To the best of our knowledge, this is only the second reported case of a malignant granular cell tumor exhibiting a response to pazopanib, and the first whole-genome sequencing of this uncommon tumor type. The findings provide insight into the genetic basis of malignant granular cell tumors and identify potential targets for further investigation. PMID:27148567

  15. Disturbance of gene expression in primary human hepatocytes by hepatotoxic pyrrolizidine alkaloids: A whole genome transcriptome analysis.

    Luckert, Claudia; Hessel, Stefanie; Lenze, Dido; Lampen, Alfonso

    2015-10-01

    1,2-unsaturated pyrrolizidine alkaloids (PA) are plant metabolites predominantly occurring in the plant families Asteraceae and Boraginaceae. Acute and chronic PA poisoning causes severe hepatotoxicity. So far, the molecular mechanisms of PA toxicity are not well understood. To analyze their mode of action, primary human hepatocytes were exposed to a non-cytotoxic dose of 100 μM of four structurally different PA: echimidine, heliotrine, senecionine, senkirkine. Changes in mRNA expression were analyzed by a whole genome microarray. Employing cut-off values with a |fold change| of 2 and a q-value of 0.01, data analysis revealed numerous changes in gene expression. In total, 4556, 1806, 3406 and 8623 genes were regulated by echimidine, heliotrine, senecionine and senkirkine, respectively. 1304 genes were identified as commonly regulated. PA affected pathways related to cell cycle regulation, cell death and cancer development. The transcription factors TP53, MYC, NFκB and NUPR1 were predicted to be activated upon PA treatment. Furthermore, gene expression data showed a considerable interference with lipid metabolism and bile acid flow. The associated transcription factors FXR, LXR, SREBF1/2, and PPARα/γ/δ were predicted to be inhibited. In conclusion, though structurally different, all four PA significantly regulated a great number of genes in common. This suggests similar molecular mechanisms, although the extent seems to differ between the analyzed PA as reflected by the potential hepatotoxicity and individual PA structure. Copyright © 2015 Elsevier Ltd. All rights reserved.
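
The cut-off filtering and the "commonly regulated" set described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: fold changes are taken to be signed (+2.5 for 2.5-fold up, -3.0 for 3-fold down), the cut-offs are applied inclusively, and the compound and gene names are hypothetical.

```python
def regulated_genes(results, min_abs_fc=2.0, max_q=0.01):
    """Return the genes passing both cut-offs.

    results maps gene name -> (signed fold change, q-value).
    Inclusive comparisons are an assumption; the abstract gives only the values.
    """
    return {
        gene
        for gene, (fold_change, q_value) in results.items()
        if abs(fold_change) >= min_abs_fc and q_value <= max_q
    }


def commonly_regulated(per_compound):
    """Intersect the regulated gene sets of all compounds."""
    return set.intersection(*(regulated_genes(r) for r in per_compound.values()))


# Hypothetical toy data: compound -> gene -> (signed fold change, q-value).
toy = {
    "compound_A": {"GENE1": (2.5, 0.001), "GENE2": (-3.0, 0.005), "GENE3": (1.2, 0.0001)},
    "compound_B": {"GENE1": (4.0, 0.002), "GENE2": (-1.5, 0.001), "GENE3": (2.2, 0.2)},
}
common = commonly_regulated(toy)
```

In the toy data only GENE1 passes both cut-offs for both compounds, which mirrors how a common set (1304 genes in the abstract) can be much smaller than any per-compound set.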

  16. Massively parallel whole genome amplification for single-cell sequencing using droplet microfluidics.

    Hosokawa, Masahito; Nishikawa, Yohei; Kogawa, Masato; Takeyama, Haruko

    2017-07-12

    Massively parallel single-cell genome sequencing is required to further understand genetic diversities in complex biological systems. Whole genome amplification (WGA) is the first step for single-cell sequencing, but its throughput and accuracy are insufficient in conventional reaction platforms. Here, we introduce single droplet multiple displacement amplification (sd-MDA), a method that enables massively parallel amplification of single cell genomes while maintaining sequence accuracy and specificity. Tens of thousands of single cells are compartmentalized in millions of picoliter droplets and then subjected to lysis and WGA by passive droplet fusion in microfluidic channels. Because single cells are isolated in compartments, their genomes are amplified to saturation without contamination. This enables the high-throughput acquisition of contamination-free and cell specific sequence reads from single cells (21,000 single-cells/h), resulting in enhancement of the sequence data quality compared to conventional methods. This method allowed WGA of both single bacterial cells and human cancer cells. The obtained sequencing coverage rivals those of conventional techniques with superior sequence quality. In addition, we also demonstrate de novo assembly of uncultured soil bacteria and obtain draft genomes from single cell sequencing. This sd-MDA is promising for flexible and scalable use in single-cell sequencing.

  17. Whole-genome expression analyses of type 2 diabetes in human skin reveal altered immune function and burden of infection.

    Wu, Chun; Chen, Xiaopan; Shu, Jing; Lee, Chun-Ting

    2017-05-23

    Skin disorders are among the most common complications associated with type 2 diabetes mellitus (T2DM). Although T2DM patients are known to have increased risk of infections and other T2DM-related skin disorders, their molecular mechanisms are largely unknown. This study aims to identify dysregulated genes and gene networks that are associated with T2DM in human skin. We compared the expression profiles of 56,318 transcribed genes on 74 T2DM cases and 148 gender-, age-, and race-matched non-diabetes controls from the Genotype-Tissue Expression (GTEx) database. RNA-Sequencing data indicate that diabetic skin is characterized by increased expression of genes that are related to immune responses (CCL20, CXCL9, CXCL10, CXCL11, CXCL13, and CCL18), JAK/STAT signaling pathway (JAK3, STAT1, and STAT2), tumor necrosis factor superfamily (TNFSF10 and TNFSF15), and infectious disease pathways (OAS1, OAS2, OAS3, and IFIH1). Genes in the cell adhesion molecules pathway (NCAM1 and L1CAM) and collagen family (PCOLCE2 and COL9A3) are downregulated, suggesting structural changes in the skin of T2DM patients. For the first time, to the best of our knowledge, this pioneering analytic study reports comprehensive unbiased gene expression changes and dysregulated pathways in the non-diseased skin of T2DM patients. This comprehensive understanding derived from whole-genome expression profiles could advance our knowledge in determining molecular targets for the prevention and treatment of T2DM-associated skin disorders.

  18. Single Cell Analysis of Dystrophin and SRY Gene by Using Whole Genome Amplification

    徐晨明; 金帆; 黄荷凤; 陶冶; 叶英辉

    2001-01-01

    Objective: To develop a reliable and sensitive method for detecting sex and multiple loci of the Duchenne muscular dystrophy (DMD) gene in single cells. Materials & methods: The whole genome of single cells was amplified using 15-base random primers (primer extension preamplification, PEP), then a small aliquot of the PEP product was analyzed by locus-specific nested PCR amplification. The procedure was evaluated by detecting dystrophin exons 8, 17, 19, 44, 45, 48 and the human testis-determining gene (SRY) in single lymphocytes from known sources and single blastomeres from couples with no family history of DMD. Results: The amplification efficiency rates of six dystrophin exons from single lymphocytes and single blastomeres were 97.2% (175/180) and 100% (60/60), respectively. SRY showed 100% (15/15) amplification in single male-derived lymphocytes and 0% (0/15) amplification in single female-derived lymphocytes. Conclusion: Single-cell PEP-nested PCR for dystrophin exons 8, 17, 19, 44, 45, 48 and SRY is highly specific. PEP-nested PCR is suitable for preimplantation genetic diagnosis (PGD) of DMD at the single-cell level.
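
The efficiency percentages in this abstract are simple ratios of successful amplifications to attempts. A short sketch of the arithmetic, using only the counts quoted in the abstract (the helper name is illustrative):

```python
def efficiency_percent(successes, attempts):
    """Amplification efficiency as a percentage, rounded to one decimal place."""
    return round(100.0 * successes / attempts, 1)


# Counts quoted in the abstract.
lymphocyte_exons = efficiency_percent(175, 180)    # dystrophin exons, single lymphocytes
blastomere_exons = efficiency_percent(60, 60)      # dystrophin exons, single blastomeres
sry_male = efficiency_percent(15, 15)              # SRY, male-derived lymphocytes
sry_female = efficiency_percent(0, 15)             # SRY, female-derived lymphocytes
```

The first ratio, 175/180, is about 0.9722, which rounds to the reported 97.2%.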

  19. A quantitative comparison of single-cell whole genome amplification methods.

    Charles F A de Bourcy

    Single-cell sequencing is emerging as an important tool for studies of genomic heterogeneity. Whole genome amplification (WGA) is a key step in single-cell sequencing workflows and a multitude of methods have been introduced. Here, we compare three state-of-the-art methods on both bulk and single-cell samples of E. coli DNA: Multiple Displacement Amplification (MDA), Multiple Annealing and Looping Based Amplification Cycles (MALBAC), and the PicoPLEX single-cell WGA kit (NEB-WGA). We considered the effects of reaction gain on coverage uniformity, error rates and the level of background contamination. We compared the suitability of the different WGA methods for the detection of copy-number variations, for the detection of single-nucleotide polymorphisms and for de-novo genome assembly. No single method performed best across all criteria and significant differences in characteristics were observed; the choice of which amplifier to use will depend strongly on the details of the type of question being asked in any given experiment.

  20. Whole-genome analysis of a patient with early-stage small-cell lung cancer.

    Han, J-Y; Lee, Y-S; Kim, B C; Lee, G K; Lee, S; Kim, E-H; Kim, H-M; Bhak, J

    2014-12-01

    We performed whole-genome sequencing (WGS) of a case of early-stage small-cell lung cancer (SCLC) to analyze the genomic features. WGS revealed many single-nucleotide variations (SNVs), small insertions/deletions and chromosomal abnormalities. Chromosomes 4p, 5q, 13q, 15q, 17p and 22q contained many block deletions. Notably, copy loss was observed in the tumor suppressor genes RB1 and TP53, and copy gain in the oncogene hTERT. Somatic mutations were found in TP53 and CREBBP. Novel nonsynonymous (ns) SNVs in the C6ORF103 and SLC5A4 genes were also found. Sanger sequencing of the SLC5A4 gene in 23 independent SCLC samples showed another nsSNV in the SLC5A4 gene, indicating that nsSNVs in the SLC5A4 gene are recurrent in SCLC. WGS of an early-stage SCLC identified novel recurrent mutations and validated known variations, including copy number variations. These findings provide insight into the genomic landscape contributing to SCLC development.

  1. Whole genome transcription profiling of Anaplasma phagocytophilum in human and tick host cells by tiling array analysis

    Chavez Adela

    2008-07-01

    Background: Anaplasma phagocytophilum (Ap) is an obligate intracellular bacterium and the agent of human granulocytic anaplasmosis, an emerging tick-borne disease. Ap alternately infects ticks and mammals and a variety of cell types within each. Understanding the biology behind such versatile cellular parasitism may be derived through the use of tiling microarrays to establish high resolution, genome-wide transcription profiles of the organism as it infects cell lines representative of its life cycle (tick; ISE6) and pathogenesis (human; HL-60 and HMEC-1). Results: Detailed, host cell specific transcriptional behavior was revealed. There was extensive differential Ap gene transcription between the tick (ISE6) and the human (HL-60 and HMEC-1) cell lines, with far fewer differentially transcribed genes between the human cell lines, and all disproportionately represented by membrane or surface proteins. There were Ap genes exclusively transcribed in each cell line, apparent human- and tick-specific operons and paralogs, and anti-sense transcripts that suggest novel expression regulation processes. Seven virB2 paralogs (of the bacterial type IV secretion system) showed human or tick cell dependent transcription. Previously unrecognized genes and coding sequences were identified, as were the expressed p44/msp2 (major surface proteins) paralogs (of 114 total), through elevated signal produced to the unique hypervariable region of each: 2/114 in HL-60, 3/114 in HMEC-1, and none in ISE6. Conclusion: Using these methods, whole genome transcription profiles can likely be generated for Ap, as well as other obligate intracellular organisms, in any host cells and for all stages of the cell infection process. Visual representation of comprehensive transcription data alongside an annotated map of the genome renders complex transcription into discernible patterns.

  2. Effects of a diet high in monounsaturated fat and a full Mediterranean diet on PBMC whole genome gene expression and plasma proteins

    Dijk, van, Susan; Feskens, Edith; Bos, M.B.; Groot, de, Lisette; Vries, de, Jeanne; Muller, Michael; Afman, Lydia

    2012-01-01

    This study aimed to identify the effects of replacement of saturated fat (SFA) by monounsaturated fat (MUFA) in a western-type diet and the effects of a full Mediterranean (MED) diet on whole genome PBMC gene expression and plasma protein profiles. Abdominally overweight subjects were randomized to an 8 wk completely controlled SFA-rich diet, a SFA-by-MUFA-replaced diet (MUFA diet) or a MED diet. Concentrations of 124 plasma proteins and PBMC whole genome transcriptional profiles were assessed...

  3. Whole genome expression profiling using DNA microarray for determining biocompatibility of polymeric surfaces

    Stangegaard, Michael; Wang, Zhenyu; Kutter, Jörg Peter

    2006-01-01

    There is an ever increasing need to find surfaces that are biocompatible for applications like medical implants and microfluidics-based cell culture systems. The biocompatibility of five different surfaces with different hydrophobicity was determined using gene expression profiling as well as more...

  4. Whole genome expression array profiling highlights differences in mucosal defense genes in Barrett's esophagus and esophageal adenocarcinoma.

    Derek J Nancarrow

    Esophageal adenocarcinoma (EAC) has become a major concern in Western countries due to rapid rises in incidence coupled with very poor survival rates. One of the key risk factors for the development of this cancer is the presence of Barrett's esophagus (BE), which is believed to form in response to repeated gastro-esophageal reflux. In this study we performed comparative, genome-wide expression profiling (using Illumina whole-genome Beadarrays) on total RNA extracted from esophageal biopsy tissues from individuals with EAC, BE (in the absence of EAC) and those with normal squamous epithelium. We combined these data with publicly accessible raw data from three similar studies to investigate key gene and ontology differences between these three tissue states. The results support the deduction that BE is a tissue with enhanced glycoprotein synthesis machinery (DPP4, ATP2A3, AGR2) designed to provide strong mucosal defenses aimed at resisting gastro-esophageal reflux. EAC exhibits the enhanced extracellular matrix remodeling (collagens, IGFBP7, PLAU) effects expected in an aggressive form of cancer, as well as evidence of reduced expression of genes associated with mucosal (MUC6, CA2, TFF1) and xenobiotic (AKR1C2, AKR1B10) defenses. When our results are compared to previous whole-genome expression profiling studies, keratin, mucin, annexin and trefoil factor gene groups are the most frequently represented differentially expressed gene families. Eleven genes identified here are also represented in at least 3 other profiling studies. We used these genes to discriminate between squamous epithelium, BE and EAC within the two largest cohorts using a support vector machine leave-one-out cross-validation (LOOCV) analysis. While this method was satisfactory for discriminating squamous epithelium and BE, it demonstrates the need for more detailed investigations into profiling changes between BE and EAC.
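
    The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: to stay dependency-free, a nearest-centroid classifier is swapped in for the support vector machine, and the function names and toy expression values are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (expression features, tissue label)


def nearest_centroid(train: List[Sample], x: Sequence[float]) -> str:
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for d, value in enumerate(features):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1

    def dist2(label: str) -> float:
        # Squared Euclidean distance from x to the class centroid.
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=dist2)


def loocv_accuracy(data: List[Sample],
                   classify: Callable[[List[Sample], Sequence[float]], str]) -> float:
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if classify(train, features) == label:
            correct += 1
    return correct / len(data)


# Hypothetical two-gene expression values, for illustration only.
toy = [
    ((1.0, 0.9), "squamous"), ((1.1, 1.0), "squamous"), ((0.9, 1.2), "squamous"),
    ((3.0, 3.1), "BE"), ((3.2, 2.9), "BE"), ((2.9, 3.0), "BE"),
]
accuracy = loocv_accuracy(toy, nearest_centroid)
```

    Any classifier with the same `(train, x) -> label` signature could be passed to `loocv_accuracy`, including an SVM wrapper.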

  5. Application of Whole Genome Expression Analysis to Assess Bacterial Responses to Environmental Conditions

    Vukanti, R. V.; Mintz, E. M.; Leff, L. G.

    2005-05-01

    Bacterial responses to environmental signals are multifactorial and are coupled to changes in gene expression. An understanding of bacterial responses to environmental conditions is possible using microarray expression analysis. In this study, the utility of microarrays for examining changes in gene expression in Escherichia coli under different environmental conditions was assessed. RNA was isolated, hybridized to Affymetrix E. coli Genome 2.0 chips and analyzed using Affymetrix GCOS and Genespring software. Major limiting factors were obtaining enough quality RNA (10^7-10^8 cells to get 10 μg RNA) and accounting for differences in growth rates under different conditions. Stabilization of RNA prior to isolation and taking extreme precautions while handling RNA were crucial. In addition, use of this method in ecological studies is limited by availability and cost of commercial arrays, choice of primers for cDNA synthesis, reproducibility, complexity of results generated and need to validate findings. This method may be more widely applicable with the development of better approaches for RNA recovery from environmental samples and increased number of available strain-specific arrays. Diligent experimental design and verification of results with real-time PCR or northern blots is needed. Overall, there is a great potential for use of this technology to discover mechanisms underlying organisms' responses to environmental conditions.

  6. Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array

    Sugnet Charles

    2006-12-01

    Background: Alternative splicing is a mechanism for increasing protein diversity by excluding or including exons during post-transcriptional processing. Alternatively spliced proteins are particularly relevant in oncology since they may contribute to the etiology of cancer, provide selective drug targets, or serve as a marker set for cancer diagnosis. While conventional identification of splice variants generally targets individual genes, we present here a new exon-centric array (GeneChip Human Exon 1.0 ST) that allows genome-wide identification of differential splice variation, and concurrently provides a flexible and inclusive analysis of gene expression. Results: We analyzed 20 paired tumor-normal colon cancer samples using a microarray designed to detect over one million putative exons that can be virtually assembled into potential gene-level transcripts according to various levels of prior supporting evidence. Analysis of high confidence (empirically supported) transcripts identified 160 differentially expressed genes, with 42 genes occupying a network impacting cell proliferation and another twenty nine genes with unknown functions. A more speculative analysis, including transcripts based solely on computational prediction, produced another 160 differentially expressed genes, three-fourths of which have no previous annotation. We also present a comparison of gene signal estimations from the Exon 1.0 ST and the U133 Plus 2.0 arrays. Novel splicing events were predicted by experimental algorithms that compare the relative contribution of each exon to the cognate transcript intensity in each tissue. The resulting candidate splice variants were validated with RT-PCR. We found nine genes that were differentially spliced between colon tumors and normal colon tissues, several of which have not been previously implicated in cancer. Top scoring candidates from our analysis were also found to substantially overlap with EST-based bioinformatic
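
    The idea of comparing each exon's relative contribution to its transcript across tissues can be sketched as a log-ratio of gene-normalized exon intensities. This is one common formulation, often called a splicing index. The abstract does not specify the algorithms used, so the function name, formula, and numbers below are illustrative assumptions.

```python
import math


def splicing_index(exon_tumor: float, gene_tumor: float,
                   exon_normal: float, gene_normal: float) -> float:
    """log2 ratio of the exon's share of gene signal, tumor versus normal.

    Values near 0 suggest the exon tracks overall gene expression; strongly
    negative or positive values flag candidate exon skipping or inclusion.
    All intensities are assumed to be positive, background-corrected signals.
    """
    share_tumor = exon_tumor / gene_tumor      # exon contribution in tumor
    share_normal = exon_normal / gene_normal   # exon contribution in normal
    return math.log2(share_tumor / share_normal)


# Hypothetical intensities: the gene doubles in tumor, but this exon halves.
si = splicing_index(exon_tumor=50.0, gene_tumor=400.0,
                    exon_normal=100.0, gene_normal=200.0)
```

    Normalizing by the gene-level signal first is what separates a splicing change from a plain change in overall expression.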

  7. Microarray analysis of serum mRNA in patients with head and neck squamous cell carcinoma at whole-genome scale

    Čapková, M.; Šáchová, Jana; Strnad, Hynek; Kolář, Michal; Hroudová, Miluše; Chovanec, M.; Čada, Z.; Štefl, M.; Valach, J.; Kastner, J.; Smetana, K. Jr.; Plzák, J.

    -, April 23 (2014) ISSN 2314-6141 R&D Projects: GA MZd(CZ) NT13488 Institutional support: RVO:68378050 Keywords : Microarray Analysis * Head and Neck Squamous Cell Carcinoma * whole-genome scale Subject RIV: EB - Genetics ; Molecular Biology

  8. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    2013-01-01

    Background: Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results: Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors, 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions: Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research, offer novel insights into chrysanthemum responses to dehydration stress, and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255
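
    GO term enrichment of the kind reported here is often assessed with a one-sided hypergeometric test. The abstract does not say which test was used, so the sketch below is an assumption, and the function name and small counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       selected: int, hits: int) -> float:
    """P(X >= hits) under the hypergeometric distribution.

    population: all transcripts considered
    annotated:  transcripts in the population carrying the GO term
    selected:   differentially expressed transcripts
    hits:       selected transcripts carrying the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability of every outcome at least as extreme as `hits`.
    tail = sum(comb(annotated, i) * comb(population - annotated, selected - i)
               for i in range(hits, upper + 1))
    return tail / total


# Toy example: 10 transcripts, 4 annotated, 3 selected, 2 of those annotated.
p = enrichment_p_value(population=10, annotated=4, selected=3, hits=2)
```

    In practice one p-value is computed per GO term and the set is then corrected for multiple testing.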

  9. Effects of a diet high in monounsaturated fat and a full Mediterranean diet on PBMC whole genome gene expression and plasma proteins

    Dijk, van Susan; Feskens, Edith; Bos, M.B.; Groot, de Lisette; Vries, de Jeanne; Muller, Michael; Afman, Lydia

    2012-01-01

    This study aimed to identify the effects of replacement of saturated fat (SFA) by monounsaturated fat (MUFA) in a western-type diet and the effects of a full Mediterranean (MED) diet on whole genome PBMC gene expression and plasma protein profiles. Abdominally overweight subjects were randomized to a

  10. Gene expression profiling to characterize sediment toxicity – a pilot study using Caenorhabditis elegans whole genome microarrays

    Reifferscheid Georg

    2009-04-01

    Background: Traditionally, toxicity of river sediments is assessed using whole sediment tests with benthic organisms. The challenge, however, is the differentiation between multiple effects caused by complex contaminant mixtures and the unspecific toxicity endpoints such as survival, growth or reproduction. The use of gene expression profiling facilitates the identification of transcriptional changes at the molecular level that are specific to the bio-available fraction of pollutants. Results: In this pilot study, we exposed the nematode Caenorhabditis elegans to three sediments of German rivers with varying (low, medium and high) levels of heavy metal and organic contamination. Besides chemical analysis, three standard bioassays were performed: reproduction of C. elegans, genotoxicity (Comet assay) and endocrine disruption (YES test). Gene expression was profiled using a whole genome DNA-microarray approach to identify overrepresented functional gene categories and derived cellular processes. Disaccharide and glycogen metabolism were found to be affected, whereas further functional pathways, such as oxidative phosphorylation, ribosome biogenesis, metabolism of xenobiotics, aging and several developmental processes were found to be differentially regulated only in response to the most contaminated sediment. Conclusion: This study demonstrates how ecotoxicogenomics can identify transcriptional responses in complex mixture scenarios to distinguish different samples of river sediments.

  11. Whole-Genome Sequencing of Invasion-Resistant Cells Identifies Laminin α2 as a Host Factor for Bacterial Invasion

    van Wijk, Xander M.; Döhrmann, Simon; Hallstrom, Bjorn

    2017-01-01

    cells. Whole-genome sequencing and transcriptome sequencing (RNA-Seq) uncovered a deletion in the gene encoding the laminin subunit α2 (Lama2) that eliminated much of domain L4a. Silencing of the long Lama2 isoform in wild-type cells strongly reduced bacterial invasion, whereas transfection with human...... LAMA2 cDNA significantly enhanced invasion in pgsA745 cells. The addition of exogenous laminin-α2β1γ1/laminin-α2β2γ1 strongly increased bacterial invasion in CHO cells, as well as in human alveolar basal epithelial and human brain microvascular endothelial cells. Thus, the L4a domain in laminin α2...

  12. Whole genome expression and biochemical correlates of extreme constitutional types defined in Ayurveda.

    Prasher, Bhavana; Negi, Sapna; Aggarwal, Shilpi; Mandal, Amit K; Sethi, Tav P; Deshmukh, Shailaja R; Purohit, Sudha G; Sengupta, Shantanu; Khanna, Sangeeta; Mohammad, Farhan; Garg, Gaurav; Brahmachari, Samir K; Mukerji, Mitali

    2008-09-09

    Ayurveda is an ancient system of personalized medicine documented and practiced in India since 1500 B.C. According to this system an individual's basic constitution to a large extent determines predisposition and prognosis to diseases as well as therapy and life-style regime. Ayurveda describes seven broad constitution types (Prakritis), each with a varying degree of predisposition to different diseases. Amongst these, the three most contrasting types, Vata, Pitta and Kapha, are the most vulnerable to diseases. In the realm of modern predictive medicine, efforts are being directed towards capturing disease phenotypes with greater precision for successful identification of markers for prospective disease conditions. In this study, we explore whether the different constitution types as described in Ayurveda have molecular correlates. Normal individuals of the three most contrasting constitutional types were identified following phenotyping criteria described in Ayurveda in an Indian population of Indo-European origin. The peripheral blood samples of these individuals were analysed for genome wide expression levels, biochemical and hematological parameters. Gene Ontology (GO) and pathway based analysis was carried out on differentially expressed genes to explore if there were significant enrichments of functional categories among Prakriti types. Individuals from the three most contrasting constitutional types exhibit striking differences with respect to biochemical and hematological parameters and at genome wide expression levels. Biochemical profiles like liver function tests, lipid profiles, and hematological parameters like haemoglobin exhibited differences between Prakriti types. Functional categories of genes showing differential expression among Prakriti types were significantly enriched in core biological processes like transport, regulation of cyclin dependent protein kinase activity, immune response and regulation of blood coagulation. A significant enrichment of

  13. Multi-platform whole-genome microarray analyses refine the epigenetic signature of breast cancer metastasis with gene expression and copy number.

    Joseph Andrews

    2010-01-01

    We have previously identified genome-wide DNA methylation changes in a cell line model of breast cancer metastasis. These complex epigenetic changes that we observed, along with concurrent karyotype analyses, have led us to hypothesize that complex genomic alterations in cancer cells (deletions, translocations and ploidy) are superimposed over promoter-specific methylation events that are responsible for gene-specific expression changes observed in breast cancer metastasis. We undertook simultaneous high-resolution, whole-genome analyses of MDA-MB-468GFP and MDA-MB-468GFP-LN human breast cancer cell lines (an isogenic, paired lymphatic metastasis cell line model) using Affymetrix gene expression (U133), promoter (1.0R), and SNP/CNV (SNP 6.0) microarray platforms to correlate data from gene expression, epigenetic (DNA methylation), and combination copy number variant/single nucleotide polymorphism microarrays. Using Partek Software and Ingenuity Pathway Analysis we integrated datasets from these three platforms and detected multiple hypomethylation and hypermethylation events. Many of these epigenetic alterations correlated with gene expression changes. In addition, gene dosage events correlated with the karyotypic differences observed between the cell lines and were reflected in specific promoter methylation patterns. Gene subsets were identified that correlated hyper- (and hypo-) methylation with the loss (or gain) of gene expression and in parallel, with gene dosage losses and gains, respectively. Individual gene targets from these subsets were also validated for their methylation, expression and copy number status, and susceptible gene pathways were identified that may indicate how selective advantage drives the processes of tumourigenesis and metastasis. Our approach allows more precise profiling of functionally relevant epigenetic signatures that are associated with cancer progression and metastasis.

  14. Monodisperse Picoliter Droplets for Low-Bias and Contamination-Free Reactions in Single-Cell Whole Genome Amplification.

    Yohei Nishikawa

    Whole genome amplification (WGA) is essential for obtaining genome sequences from single bacterial cells because the quantity of template DNA contained in a single cell is very low. Multiple displacement amplification (MDA), using Phi29 DNA polymerase and random primers, is the most widely used method for single-cell WGA. However, single-cell MDA usually results in uneven genome coverage because of amplification bias, background amplification of contaminating DNA, and formation of chimeras by linking of non-contiguous chromosomal regions. Here, we present a novel MDA method, termed droplet MDA, that minimizes amplification bias and amplification of contaminants by using picoliter-sized droplets for compartmentalized WGA reactions. Extracted DNA fragments from a lysed cell in MDA mixture are divided into 10^5 droplets (67 pL) within minutes via flow through simple microfluidic channels. Compartmentalized genome fragments can be individually amplified in these droplets without the risk of encounter with reagent-borne or environmental contaminants. Following quality assessment of WGA products from single Escherichia coli cells, we showed that droplet MDA minimized unexpected amplification and improved the percentage of genome recovery from 59% to 89%. Our results demonstrate that microfluidic-generated droplets show potential as an efficient tool for effective amplification of low-input DNA for single-cell genomics and greatly reduce the cost and labor investment required for determination of nearly complete genome sequences of uncultured bacteria from environmental samples.

  15. Coriandrum sativum L. (Coriander) essential oil: antifungal activity and mode of action on Candida spp., and molecular targets affected in human whole-genome expression.

    Irlan de Almeida Freires

    Oral candidiasis is an opportunistic fungal infection of the oral cavity with increasing worldwide prevalence and incidence rates. Novel specifically-targeted strategies to manage this ailment have been proposed using essential oils (EO) known to have antifungal properties. In this study, we aim to investigate the antifungal activity and mode of action of the EO from Coriandrum sativum L. (coriander) leaves on Candida spp. In addition, we detected the molecular targets affected in whole-genome expression in human cells. The EO phytochemical profile indicates monoterpenes and sesquiterpenes as major components, which are likely to negatively impact the viability of yeast cells. There seems to be a synergistic activity of the EO chemical compounds as their isolation into fractions led to a decreased antimicrobial effect. C. sativum EO may bind to membrane ergosterol, increasing ionic permeability and causing membrane damage leading to cell death, but it does not act on cell wall biosynthesis-related pathways. This mode of action is illustrated by photomicrographs showing disruption in biofilm integrity caused by the EO at varied concentrations. The EO also inhibited Candida biofilm adherence to a polystyrene substrate at low concentrations, and decreased the proteolytic activity of Candida albicans at minimum inhibitory concentration. Finally, the EO and its selected active fraction had low cytotoxicity on human cells, with putative mechanisms affecting gene expression in pathways involving chemokines and MAP-kinase (proliferation/apoptosis), as well as adhesion proteins. These findings highlight the potential antifungal activity of the EO from C. sativum leaves and suggest avenues for future translational toxicological research.

  16. Whole-Genome Sequence of the Metastatic PC3 and LNCaP Human Prostate Cancer Cell Lines

    Inge Seim

    2017-06-01

    The bone metastasis-derived PC3 and the lymph node metastasis-derived LNCaP prostate cancer cell lines are widely studied, having been described in thousands of publications over the last four decades. Here, we report short-read whole-genome sequencing (WGS) and de novo assembly of PC3 (ATCC CRL-1435) and LNCaP (clone FGC; ATCC CRL-1740) at ∼70 × coverage. A known homozygous mutation in TP53 and homozygous loss of PTEN were robustly identified in the PC3 cell line, whereas the LNCaP cell line exhibited a larger number of putative inactivating somatic point and indel mutations (and in particular a loss of stop codon events). This study also provides preliminary evidence that loss of one or both copies of the tumor suppressor Capicua (CIC) contributes to primary tumor relapse and metastatic progression, potentially offering a treatment target for castration-resistant prostate cancer (CRPC). Our work provides a resource for genetic, genomic, and biological studies employing two commonly-used prostate cancer cell lines.

  17. Whole-Genome Sequence of the Metastatic PC3 and LNCaP Human Prostate Cancer Cell Lines.

    Seim, Inge; Jeffery, Penny L; Thomas, Patrick B; Nelson, Colleen C; Chopin, Lisa K

    2017-06-07

    The bone metastasis-derived PC3 and the lymph node metastasis-derived LNCaP prostate cancer cell lines are widely studied, having been described in thousands of publications over the last four decades. Here, we report short-read whole-genome sequencing (WGS) and de novo assembly of PC3 (ATCC CRL-1435) and LNCaP (clone FGC; ATCC CRL-1740) at ∼70 × coverage. A known homozygous mutation in TP53 and homozygous loss of PTEN were robustly identified in the PC3 cell line, whereas the LNCaP cell line exhibited a larger number of putative inactivating somatic point and indel mutations (and in particular a loss of stop codon events). This study also provides preliminary evidence that loss of one or both copies of the tumor suppressor Capicua (CIC) contributes to primary tumor relapse and metastatic progression, potentially offering a treatment target for castration-resistant prostate cancer (CRPC). Our work provides a resource for genetic, genomic, and biological studies employing two commonly-used prostate cancer cell lines. Copyright © 2017 Seim et al.

  18. Effect of Wortmannin on the repair profiles of DNA double-strand breaks in the whole genome and in interstitial telomeric sequences of Chinese hamster cells

    Losada, Raquel; Rivero, Maria Teresa; Slijepcevic, Predrag; Goyanes, Vicente; Fernandez, Jose Luis

    2005-01-01

    The DNA breakage detection-fluorescence in situ hybridization (DBD-FISH) procedure was applied to analyze the effect of Wortmannin (WM) in the rejoining kinetics of ionizing radiation-induced DNA double-strand breaks (DSBs) in the whole genome and in the long interstitial telomeric repeat sequence (ITRS) blocks from Chinese hamster cell lines. The results indicate that the ITRS blocks from wild-type Chinese hamster cell lines, CHO9 and V79B, exhibit a slower initial rejoining rate of ionizing radiation-induced DSBs than the genome overall. Neither Rad51C nor the catalytic subunit of DNA-dependent protein kinase (DNA-PKcs) activities, involved in homologous recombination (HR) and in non-homologous end-joining (NHEJ) pathways of DSB repair respectively, influenced the rejoining kinetics within ITRS in contrast to DNA sequences in the whole genome. Nevertheless, DSB removal rate within ITRS was decreased in the absence of Ku86 activity, though at a lower affectation level than in the whole genome, thus homogenizing both rejoining kinetics rates. WM treatment slowed down the DSB rejoining kinetics rate in ITRS, this effect being more pronounced in the whole genome, resulting in a similar pattern to that of the Ku86 deficient cells. In fact, no WM effect was detected in the Ku86 deficient Chinese hamster cells, so probably WM does not add further impairment in DSB rejoining than that resulted as a consequence of absence of Ku activity. The same slowing effect was also observed after treatment of Rad51C and DNA-PKcs defective hamster cells by WM, suggesting that: (1) there is no potentiation of the HR when the NHEJ is impaired by WM, either in the whole genome or in the ITRS, and (2) that this impairment may probably involve more targets than DNA-PKcs. These results suggest that there is an intragenomic heterogeneity in DSB repair, as well as in the effect of WM on this process

  19. A simple method for encapsulating single cells in alginate microspheres allows for direct PCR and whole genome amplification.

    Saharnaz Bigdeli

    Microdroplets are an effective platform for segregating individual cells and amplifying DNA. However, a key challenge is to recover the contents of individual droplets for downstream analysis. This paper offers a method for embedding cells in alginate microspheres and performing multiple serial operations on the isolated cells. Rhodobacter sphaeroides cells were diluted in alginate polymer and sprayed into microdroplets using a fingertip aerosol sprayer. The encapsulated cells were lysed and subjected either to conventional PCR, or whole genome amplification using either multiple displacement amplification (MDA) or a two-step PCR protocol. Microscopic examination after PCR showed that the lumen of the occupied microspheres contained fluorescently stained DNA product, but multiple displacement amplification with phi29 produced only a small number of polymerase colonies. The 2-step WGA protocol was successful in generating fluorescent material, and quantitative PCR from DNA extracted from aliquots of microspheres suggested that the copy number inside the microspheres was amplified up to 3 orders of magnitude. Microspheres containing fluorescent material were sorted by a dilution series and screened with a fluorescent plate reader to identify single microspheres. The DNA was extracted from individual isolates, re-amplified with full-length sequencing adapters, and then a single isolate was sequenced using the Illumina MiSeq platform. After filtering the reads, the only sequences that collectively matched a genome in the NCBI nucleotide database belonged to R. sphaeroides. This demonstrated that sequencing-ready DNA could be generated from the contents of a single microsphere without culturing. However, the 2-step WGA strategy showed limitations in terms of low genome coverage and an uneven frequency distribution of reads across the genome. This paper offers a simple method for embedding cells in alginate microspheres and performing PCR on isolated

  20. Evaluation of a Stenotrophomonas maltophilia bacteremia cluster in hematopoietic stem cell transplantation recipients using whole genome sequencing

    Stefanie Kampmeier

    2017-11-01

    Background: Stenotrophomonas maltophilia ubiquitously occurs in the hospital environment. This opportunistic pathogen can cause severe infections in immunocompromised hosts such as hematopoietic stem cell transplantation (HSCT) recipients. Between February and July 2016, a cluster of four patients on the HSCT unit suffered from S. maltophilia bloodstream infections (BSI). Methods: For epidemiological investigation we retrospectively identified the colonization status of patients admitted to the ward during this time period and performed environmental monitoring of shower heads, shower outlets, washbasins and toilets in patient rooms. We tested antibiotic susceptibility of detected S. maltophilia isolates. Environmental and blood culture samples were subjected to whole genome sequence (WGS)-based typing. Results: Of four patients with S. maltophilia BSI, three were found to be colonized previously. In addition, retrospective investigations revealed two patients being colonized in anal swab samples but not infected. Environmental monitoring revealed one shower outlet contaminated with S. maltophilia. Antibiotic susceptibility testing of seven S. maltophilia strains resulted in two trimethoprim/sulfamethoxazole resistant and five susceptible isolates, however, not excluding an outbreak scenario. WGS-based typing did not result in any close genotypic relationship among the patients' isolates. In contrast, one environmental isolate from a shower outlet was closely related to a single patient's isolate. Conclusion: WGS-based typing successfully refuted an outbreak of S. maltophilia on an HSCT ward but uncovered that sanitary installations can be an actual source of S. maltophilia transmissions.

  1. Alignment of whole genomes.

    Delcher, A L; Kasif, S; Fleischmann, R D; Peterson, J; White, O; Salzberg, S L

    1999-01-01

    A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications. PMID:10325427
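
    The role of a suffix-based index in finding shared sequence can be illustrated with a small sketch. It is not the system described in this abstract: a naively built suffix array is swapped in for the suffix tree, the code only reports one longest shared substring rather than a full alignment, and the function name and toy sequences are hypothetical.

```python
def longest_shared_substring(a: str, b: str) -> str:
    """Return one longest substring common to a and b (empty string if none)."""
    sep = "#"  # separator, assumed absent from both sequences
    text = a + sep + b
    # Naive suffix array: adequate for toy inputs, far too slow for genomes.
    suffixes = sorted(range(len(text)), key=lambda i: text[i:])
    best = ""
    for i, j in zip(suffixes, suffixes[1:]):
        # Only neighbours that start in different sequences are of interest.
        if (i < len(a)) == (j < len(a)):
            continue
        # Length of the common prefix of the two suffixes, stopping at sep.
        k = 0
        while (i + k < len(text) and j + k < len(text)
               and text[i + k] == text[j + k] and text[i + k] != sep):
            k += 1
        if k > len(best):
            best = text[i:i + k]
    return best


# Toy sequences, for illustration only.
anchor = longest_shared_substring("ACGTTGCA", "TTGCACGT")
```

    Sorting the suffixes places every occurrence of a shared substring next to each other, which is the property a suffix tree provides in linear time and space.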

  2. Whole genome mRNA transcriptomics analysis reveals different modes of action of the diarrheic shellfish poisons okadaic acid and dinophysis toxin-1 versus azaspiracid-1 in Caco-2 cells.

    Bodero, Marcia; Hoogenboom, Ron L A P; Bovee, Toine F H; Portier, Liza; de Haan, Laura; Peijnenburg, Ad; Hendriksen, Peter J M

    2018-02-01

    A study with DNA microarrays was performed to investigate the effects of two diarrhetic and one azaspiracid shellfish poison, okadaic acid (OA), dinophysistoxin-1 (DTX-1) and azaspiracid-1 (AZA-1) respectively, on the whole-genome mRNA expression of undifferentiated intestinal Caco-2 cells. Previously, the most responsive genes were used to develop a dedicated array tube test to screen shellfish samples for the presence of these toxins. In the present study the whole genome mRNA expression was analyzed in order to reveal modes of action and obtain hints on potential biomarkers suitable to be used in alternative bioassays. Effects on key genes in the most affected pathways and processes were confirmed by qPCR. OA and DTX-1 induced almost identical effects on mRNA expression, which strongly indicates that OA and DTX-1 induce similar toxic effects. Biological interpretation of the microarray data indicates that both compounds induce hypoxia related pathways/processes, the unfolded protein response (UPR) and endoplasmic reticulum (ER) stress. The gene expression profile of AZA-1 is different and shows increased mRNA expression of genes involved in cholesterol synthesis and glycolysis, suggesting a different mode of action for this toxin. Future studies should reveal whether identified pathways provide suitable biomarkers for rapid detection of DSPs in shellfish. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  3. Whole-Genome Sequencing and iPLEX MassARRAY Genotyping Map an EMS-Induced Mutation Affecting Cell Competition in Drosophila melanogaster.

    Lee, Chang-Hyun; Rimesso, Gerard; Reynolds, David M; Cai, Jinlu; Baker, Nicholas E

    2016-10-13

    Cell competition, the conditional loss of viable genotypes only when surrounded by other cells, is a phenomenon observed in certain genetic mosaic conditions. We conducted a chemical mutagenesis and screen to recover new mutations that affect cell competition between wild-type and RpS3 heterozygous cells. Mutations were identified by whole-genome sequencing, making use of software tools that greatly facilitate the distinction between newly induced mutations and other sources of apparent sequence polymorphism, thereby reducing false-positive and false-negative identification rates. In addition, we utilized iPLEX MassARRAY for genotyping recombinant chromosomes. These approaches permitted the mapping of a new mutation affecting cell competition when only a single allele existed, with a phenotype assessed only in genetic mosaics, without the benefit of complementation with existing mutations, deletions, or duplications. These techniques expand the utility of chemical mutagenesis and whole-genome sequencing for mutant identification. We discuss mutations in the Atm and Xrp1 genes identified in this screen. Copyright © 2016 Lee et al.

  4. Whole-Genome Sequencing and iPLEX MassARRAY Genotyping Map an EMS-Induced Mutation Affecting Cell Competition in Drosophila melanogaster

    Chang-Hyun Lee

    2016-10-01

    Cell competition, the conditional loss of viable genotypes only when surrounded by other cells, is a phenomenon observed in certain genetic mosaic conditions. We conducted a chemical mutagenesis and screen to recover new mutations that affect cell competition between wild-type and RpS3 heterozygous cells. Mutations were identified by whole-genome sequencing, making use of software tools that greatly facilitate the distinction between newly induced mutations and other sources of apparent sequence polymorphism, thereby reducing false-positive and false-negative identification rates. In addition, we utilized iPLEX MassARRAY for genotyping recombinant chromosomes. These approaches permitted the mapping of a new mutation affecting cell competition when only a single allele existed, with a phenotype assessed only in genetic mosaics, without the benefit of complementation with existing mutations, deletions, or duplications. These techniques expand the utility of chemical mutagenesis and whole-genome sequencing for mutant identification. We discuss mutations in the Atm and Xrp1 genes identified in this screen.

  5. Effects of temperature on gene expression patterns in Leptospira interrogans serovar Lai as assessed by whole-genome microarrays.

    Lo, Miranda; Bulach, Dieter M; Powell, David R; Haake, David A; Matsunaga, James; Paustian, Michael L; Zuerner, Richard L; Adler, Ben

    2006-10-01

    Leptospirosis is an important zoonosis of worldwide distribution. Humans become infected via exposure to pathogenic Leptospira spp. from infected animals or contaminated water or soil. The availability of genome sequences for Leptospira interrogans, serovars Lai and Copenhageni, has opened up opportunities to examine global transcription profiles using microarray technology. Temperature is a key environmental factor known to affect leptospiral protein expression. Leptospira spp. can grow in artificial media at a range of temperatures reflecting conditions found in the environment and the mammalian host. Therefore, transcriptional changes were compared between cultures grown at 20 °C, 30 °C, 37 °C, and 39 °C to represent ambient temperatures in the environment, growth under laboratory conditions, and temperatures in healthy and febrile hosts. Data from direct pairwise comparisons of the four temperatures were consolidated to examine transcriptional changes at two generalized biological conditions representing mammalian physiological temperatures (37 °C and 39 °C) versus environmental temperatures (20 °C and 30 °C). Additionally, cultures grown at 30 °C then shifted overnight to 37 °C were compared with those grown long-term at 30 °C and 37 °C to identify genes potentially expressed in the early stages of infection. Comparison of data sets from physiological versus environmental experiments with upshift experiments provided novel insights into possible transcriptional changes at different stages of infection. Changes included differential expression of chemotaxis and motility genes, signal transduction systems, and genes encoding proteins involved in alteration of the outer membrane. These findings indicate that temperature is an important factor regulating expression of proteins that facilitate invasion and establishment of disease.

  6. Carbon ion irradiation of the human prostate cancer cell line PC3: A whole genome microarray study

    Suetens, Annelies; Moreels, Marjan; Quintens, Roel; Chiriotti, Sabina; Tabury, Kevin; Michaux, Arlette; Grégoire, Vincent; Baatout, Sarah

    2014-01-01

    Hadrontherapy is a form of external radiation therapy, which uses beams of charged particles such as carbon ions. Compared to conventional radiotherapy with photons, the main advantage of carbon ion therapy is the precise dose localization along with an increased biological effectiveness. The first results obtained from prostate cancer patients treated with carbon ion therapy showed good local tumor control and survival rates. In view of this advanced treatment modality we investigated the effects of irradiation with different beam qualities on gene expression changes in the PC3 prostate adenocarcinoma cell line. For this purpose, PC3 cells were irradiated with various doses (0.0, 0.5 and 2.0 Gy) of carbon ions (LET=33.7 keV/μm) at the beam of the Grand Accélérateur National d’Ions Lourds (Caen, France). Comparative experiments with X-rays were performed at the Belgian Nuclear Research Centre. Genome-wide gene expression was analyzed using microarrays. Our results show a downregulation in many genes involved in cell cycle and cell organization processes after 2.0 Gy irradiation. This effect was more pronounced after carbon ion irradiation compared with X-rays. Furthermore, we found a significant downregulation of many genes related to cell motility. Several of these changes were confirmed using qPCR. In addition, recurrence-free survival analysis of prostate cancer patients based on one of these motility genes (FN1) revealed that patients with low expression levels had a prolonged recurrence-free survival time, indicating that this gene may be a potential prognostic biomarker for prostate cancer. Understanding how different radiation qualities affect the cellular behavior of prostate cancer cells is important to improve the clinical outcome of cancer radiation therapy. PMID:24504141

  7. Whole genome expression profiling associates activation of unfolded protein response with impaired production and release of epinephrine after recurrent hypoglycemia.

    Juhye Lena Kim

    Recurrent hypoglycemia can occur as a major complication of insulin replacement therapy, limiting the long-term health benefits of intense glycemic control in type 1 and advanced type 2 diabetic patients. It impairs the normal counter-regulatory hormonal and behavioral responses to glucose deprivation, a phenomenon known as hypoglycemia-associated autonomic failure (HAAF). The molecular mechanisms leading to defective counter-regulation are not completely understood. We hypothesized that both neuronal (excessive cholinergic signaling between the splanchnic nerve fibers and the adrenal medulla) and humoral factors contribute to the impaired epinephrine production and release in HAAF. To gain further insight into the molecular mechanism(s) mediating the blunted epinephrine responses following recurrent hypoglycemia, we utilized a global gene expression profiling approach. We characterized the transcriptomes during recurrent (defective counter-regulation model) and acute hypoglycemia (normal counter-regulation group) in the adrenal medulla of normal Sprague-Dawley rats. Based on comparison analysis of differentially expressed genes, a set of unique genes that are activated only at specific time points after recurrent hypoglycemia was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network indicated activation of the unfolded protein response. Furthermore, at least three additional pathways/interaction networks altered in the adrenal medulla following recurrent hypoglycemia were identified, which may contribute to the impaired epinephrine secretion in HAAF: greatly increased neuropeptide signaling (proenkephalin, neuropeptide Y, galanin); altered ion homeostasis (Na+, K+, Ca2+) and downregulation of genes involved in Ca2+-dependent exocytosis of secretory vesicles. Given the pleiotropic effects of the unfolded protein response in different organs, involved in maintaining glucose homeostasis, these

  8. Whole Genome and Global Gene Expression Analyses of the Model Mushroom Flammulina velutipes Reveal a High Capacity for Lignocellulose Degradation

    Park, Young-Jin; Baek, Jeong Hun; Lee, Seonwook; Kim, Changhoon; Rhee, Hwanseok; Kim, Hyungtae; Seo, Jeong-Sun; Park, Hae-Ran; Yoon, Dae-Eun; Nam, Jae-Young; Kim, Hong-Il; Kim, Jong-Guk; Yoon, Hyeokjun; Kang, Hee-Wan; Cho, Jae-Yong; Song, Eun-Sung; Sung, Gi-Ho; Yoo, Young-Bok; Lee, Chang-Soo; Lee, Byoung-Moo; Kong, Won-Sik

    2014-01-01

    Flammulina velutipes is a fungus with health and medicinal benefits that has been used for consumption and cultivation in East Asia. F. velutipes is also known to degrade lignocellulose and produce ethanol. The overlapping interests of mushroom production and wood bioconversion make F. velutipes an attractive new model for fungal wood related studies. Here, we present the complete sequence of the F. velutipes genome. This is the first sequenced genome for a commercially produced edible mushroom that also degrades wood. The 35.6-Mb genome contained 12,218 predicted protein-encoding genes and 287 tRNA genes assembled into 11 scaffolds corresponding with the 11 chromosomes of strain KACC42780. The 88.4-kb mitochondrial genome contained 35 genes. Well-developed wood degrading machinery with strong potential for lignin degradation (69 auxiliary activities, formerly FOLymes) and carbohydrate degradation (392 CAZymes), along with 58 alcohol dehydrogenase genes were highly expressed in the mycelium, demonstrating the potential application of this organism to bioethanol production. Thus, the newly uncovered wood degrading capacity and sequential nature of this process in F. velutipes, offer interesting possibilities for more detailed studies on either lignin or (hemi-) cellulose degradation in complex wood substrates. The mutual interest in wood degradation by the mushroom industry and (ligno-)cellulose biomass related industries further increase the significance of F. velutipes as a new model. PMID:24714189

  9. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  10. Whole Genome Epigenetics

    Carmell, Michelle A; Hannon, Gregory J

    2004-01-01

    .... However, this is only part of the picture. Increasingly, we are learning that epigenetic changes, that is, changes in chromatin structure, are critically important in regulating cellular gene expression...

  11. Whole Genome Epigenetics

    Carmell, Michelle A; Hannon, Gregory J

    2005-01-01

    .... However, this is only part of the picture. Increasingly, we are learning that epigenetic changes, that is, changes in chromatin structure, are critically important in regulating cellular gene expression...

  12. Whole Genome Epigenetics

    Carmell, Michelle

    2003-01-01

    .... However, this is only part of the picture. Increasingly, we are learning that epigenetic changes, that is, changes in chromatin structure, are critically important in regulating cellular gene expression...

  13. Whole-genome transcription and DNA methylation analysis of peripheral blood mononuclear cells identified aberrant gene regulation pathways in systemic lupus erythematosus.

    Zhu, Honglin; Mi, Wentao; Luo, Hui; Chen, Tao; Liu, Shengxi; Raman, Indu; Zuo, Xiaoxia; Li, Quan-Zhen

    2016-07-13

    Recent achievements in genetics and epigenetics have led to the exploration of the pathogenesis of systemic lupus erythematosus (SLE). Identification of differentially expressed genes and their regulatory mechanism(s) at whole-genome level will provide a comprehensive understanding of the development of SLE and its devastating complication, lupus nephritis (LN). We performed whole-genome transcription and DNA methylation analysis in PBMC of 30 SLE patients, including 15 with LN (SLE LN(+)) and 15 without LN (SLE LN(-)), and 25 normal controls (NC) using HumanHT-12 Beadchips and Illumina HumanMethylation450 chips. The serum proinflammatory cytokines were quantified using Bio-plex Human Cytokine 27-plex assay. Differentially expressed genes and differentially methylated CpG were analyzed with GenomeStudio, R, and SAM software. The association between DNA methylation and gene expression was tested. Gene interaction pathways of the differentially expressed genes were analyzed by IPA software. We identified 552 upregulated genes and 550 downregulated genes in PBMC of SLE. Integration of DNA methylation and gene expression profiling showed that 334 upregulated genes were hypomethylated, and 479 downregulated genes were hypermethylated. Pathway analysis on the differential genes in SLE revealed significant enrichment in interferon (IFN) signaling and toll-like receptor (TLR) signaling pathways. Nine IFN- and seven TLR-related genes were identified and displayed step-wise increase in SLE LN(-) and SLE LN(+). Hypomethylated CpG sites were detected on these genes. The gene expressions for MX1, GPR84, and E2F2 were increased in SLE LN(+) as compared to SLE LN(-) patients. The serum levels of inflammatory cytokines, including IL17A, IP-10, bFGF, TNF-α, IL-6, IL-15, GM-CSF, IL-1RA, IL-5, and IL-12p70, were significantly elevated in SLE compared with NC. The levels of IL-15 and IL-1RA correlated with their mRNA expression.
The upregulation of IL-15 may be regulated by hypomethylated

  14. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
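The high-quality SNP criterion in this abstract (at least four sequences, with the minor allele seen in at least two of them) can be sketched as a simple filter. This is a minimal illustration, not the consortium's pipeline: representing a site as a string of aligned base calls, applying the depth threshold per site rather than per contig, and the function and parameter names are all assumptions made for this example.

```python
from collections import Counter


def is_high_quality_snp(bases_at_site: str, min_depth: int = 4, min_minor: int = 2) -> bool:
    """Decide whether one aligned column of EST base calls passes the filter.

    bases_at_site: one character per EST covering the site (hypothetical
    input format, e.g. "AAGG" for four ESTs).
    """
    counts = Counter(bases_at_site.upper())
    depth = sum(counts.values())
    # Need enough sequences and at least two distinct alleles.
    if depth < min_depth or len(counts) < 2:
        return False
    # The second most common base is treated as the minor allele.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


if __name__ == "__main__":
    for column in ["AAGG", "AAAG", "AGG", "AAAA"]:
        print(column, is_high_quality_snp(column))
```

Requiring the minor allele in two or more sequences reduces the chance that a single sequencing error is called as a polymorphism.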

  15. Analysis of antisense expression by whole genome tiling microarrays and siRNAs suggests mis-annotation of Arabidopsis orphan protein-coding genes.

    Casey R Richardson

    2010-05-01

    MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20-22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis 'orphan' hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the "ancient" (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for "new" rapidly-evolving MIRNA genes. Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other

  16. Development of a fluorescence-activated cell sorting method coupled with whole genome amplification to analyze minority and trace Dehalococcoides genomes in microbial communities.

    Lee, Patrick K H; Men, Yujie; Wang, Shanquan; He, Jianzhong; Alvarez-Cohen, Lisa

    2015-02-03

    Dehalococcoides mccartyi are functionally important bacteria that catalyze the reductive dechlorination of chlorinated ethenes. However, these anaerobic bacteria are fastidious to isolate, making downstream genomic characterization challenging. In order to facilitate genomic analysis, a fluorescence-activated cell sorting (FACS) method was developed in this study to separate D. mccartyi cells from a microbial community, and the DNA of the isolated cells was processed by whole genome amplification (WGA) and hybridized onto a D. mccartyi microarray for comparative genomics against four sequenced strains. First, FACS was successfully applied to a D. mccartyi isolate as positive control, and then microarray results verified that WGA from 10^6 cells or ∼1 ng of genomic DNA yielded high-quality coverage detecting nearly all genes across the genome. As expected, some inter- and intrasample variability in WGA was observed, but these biases were minimized by performing multiple parallel amplifications. Subsequent application of the FACS and WGA protocols to two enrichment cultures containing ∼10% and ∼1% D. mccartyi cells successfully enabled genomic analysis. As proof of concept, this study demonstrates that coupling FACS with WGA and microarrays is a promising tool to expedite genomic characterization of target strains in environmental communities where the relative concentrations are low.

  17. Whole-genome methylation caller designed for methyl- DNA ...

    etchie

    2013-02-20

    Feb 20, 2013 ... Our method uses a single-CpG-resolution, whole-genome methylation ... Key words: Methyl-DNA immunoprecipitation, next-generation sequencing, ...... methylation is prevalent in embryonic stem cells and may be mediated.

  18. Thiopurine treatment in patients with Crohn's disease leads to a selective reduction of an effector cytotoxic gene expression signature revealed by whole-genome expression profiling.

    Bouma, G; Baggen, J M; van Bodegraven, A A; Mulder, C J J; Kraal, G; Zwiers, A; Horrevoets, A J; van der Pouw Kraan, C T M

    2013-07-01

    Crohn's disease (CD) is characterized by chronic inflammation of the gastrointestinal tract, as a result of aberrant activation of the innate immune system through TLR stimulation by bacterial products. The conventional immunosuppressive thiopurine derivatives (azathioprine and mercaptopurine) are used to treat CD. The effects of thiopurines on circulating immune cells and TLR responsiveness are unknown. To obtain a global view of affected gene expression of the immune system in CD patients and the treatment effect of thiopurine derivatives, we performed genome-wide transcriptome analysis on whole blood samples from 20 CD patients in remission, of which 10 patients received thiopurine treatment, compared to 16 healthy controls, before and after TLR4 stimulation with LPS. Several immune abnormalities were observed, including increased baseline interferon activity, while baseline expression of ribosomal genes was reduced. After LPS stimulation, CD patients showed reduced cytokine and chemokine expression. None of these effects were related to treatment. Strikingly, only one highly correlated set of 69 genes was affected by treatment, not influenced by LPS stimulation and consisted of genes reminiscent of effector cytotoxic NK cells. The most reduced cytotoxicity-related gene in CD was the cell surface marker CD160. Concordantly, we could demonstrate an in vivo reduction of circulating CD160(+)CD3(-)CD8(-) cells in CD patients after treatment with thiopurine derivatives in an independent cohort. In conclusion, using genome-wide profiling, we identified a disturbed immune activation status in peripheral blood cells from CD patients and a clear treatment effect of thiopurine derivatives selectively affecting effector cytotoxic CD160-positive cells. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. Whole-genome sequencing of multiple myeloma from diagnosis to plasma cell leukemia reveals genomic initiating events, evolution, and clonal tides.

    Egan, Jan B; Shi, Chang-Xin; Tembe, Waibhav; Christoforides, Alexis; Kurdoglu, Ahmet; Sinari, Shripad; Middha, Sumit; Asmann, Yan; Schmidt, Jessica; Braggio, Esteban; Keats, Jonathan J; Fonseca, Rafael; Bergsagel, P Leif; Craig, David W; Carpten, John D; Stewart, A Keith

    2012-08-02

    The longitudinal evolution of a myeloma genome from diagnosis to plasma cell leukemia has not previously been reported. We used whole-genome sequencing (WGS) on 4 purified tumor samples and patient germline DNA drawn over a 5-year period in a t(4;14) multiple myeloma patient. Tumor samples were acquired at diagnosis, first relapse, second relapse, and end-stage secondary plasma cell leukemia (sPCL). In addition to the t(4;14), all tumor time points also shared 10 common single-nucleotide variants (SNVs) on WGS comprising shared initiating events. Interestingly, we observed genomic sequence variants that waxed and waned with time in progressive tumors, suggesting the presence of multiple independent, yet related, clones at diagnosis that rose and fell in dominance. Five newly acquired SNVs, including truncating mutations of RB1 and ZKSCAN3, were observed only in the final sPCL sample suggesting leukemic transformation events. This longitudinal WGS characterization of the natural history of a high-risk myeloma patient demonstrated tumor heterogeneity at diagnosis with shifting dominance of tumor clones over time and has also identified potential mutations contributing to myelomagenesis as well as transformation from myeloma to overt extramedullary disease such as sPCL.

  20. Discovering human germ cell mutagens with whole genome sequencing: Insights from power calculations reveal the importance of controlling for between-family variability.

    Webster, R J; Williams, A; Marchetti, F; Yauk, C L

    2018-07-01

    Mutations in germ cells pose potential genetic risks to offspring. However, de novo mutations are rare events that are spread across the genome and are difficult to detect. Thus, studies in this area have generally been under-powered, and no human germ cell mutagen has been identified. Whole Genome Sequencing (WGS) of human pedigrees has been proposed as an approach to overcome these technical and statistical challenges. WGS enables analysis of a much wider breadth of the genome than traditional approaches. Here, we performed power analyses to determine the feasibility of using WGS in human families to identify germ cell mutagens. Different statistical models were compared in the power analyses (ANOVA and multiple regression for one-child families, and mixed effect model sampling between two to four siblings per family). Assumptions were made based on parameters from the existing literature, such as the mutation-by-paternal age effect. We explored two scenarios: a constant effect due to an exposure that occurred in the past, and an accumulating effect where the exposure is continuing. Our analysis revealed the importance of modeling inter-family variability of the mutation-by-paternal age effect. Statistical power was improved by models accounting for the family-to-family variability. Our power analyses suggest that sufficient statistical power can be attained with 4-28 four-sibling families per treatment group, when the increase in mutations ranges from 40 to 10% respectively. Modeling family variability using mixed effect models provided a reduction in sample size compared to a multiple regression approach. Much larger sample sizes were required to detect an interaction effect between environmental exposures and paternal age. These findings inform study design and statistical modeling approaches to improve power and reduce sequencing costs for future studies in this area. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.

  1. Whole genome-wide transcript profiling to identify differentially expressed genes associated with seed field emergence in two soybean low phytate mutants.

    Yuan, Fengjie; Yu, Xiaomin; Dong, Dekun; Yang, Qinghua; Fu, Xujun; Zhu, Shenlong; Zhu, Danhua

    2017-01-18

    Seed germination is important to soybean (Glycine max) growth and development, ultimately affecting soybean yield. A lower seed field emergence has been the main hindrance for breeding soybeans low in phytate. Although this reduction could be overcome by additional breeding and selection, the mechanisms of seed germination in different low phytate mutants remain unknown. In this study, we performed a comparative transcript analysis of two low phytate soybean mutants (TW-1 and TW-1-M), which have the same mutation, a 2 bp deletion in GmMIPS1, but show a significant difference in seed field emergence, with that of TW-1-M higher than that of TW-1. Numerous genes analyzed by RNA-Seq showed markedly different expression levels between TW-1-M and TW-1 mutants. Approximately 30,000-35,000 read-mapped genes and ~21,000-25,000 expressed genes were identified for each library. There were ~3,900-9,200 differentially expressed genes (DEGs) in each contrast library, and the number of up-regulated genes was similar to the number of down-regulated genes in the mutants TW-1 and TW-1-M. Gene ontology functional categories of DEGs indicated that the ethylene-mediated signaling pathway, the abscisic acid-mediated signaling pathway, response to hormone, ethylene biosynthetic process, ethylene metabolic process, regulation of hormone levels, oxidation-reduction process, regulation of flavonoid biosynthetic process and regulation of abscisic acid-activated signaling pathway had high correlations with seed germination. In total, 2,457 DEGs involved in the above functional categories were identified. Twenty-two genes with 20 biological functions were the most highly up/down-regulated (absolute value Log2FC >5) in the high field emergence mutant TW-1-M and were related to metabolic or signaling pathways. Fifty-seven genes with 36 biological functions had the greatest expression abundance (FRPM >100) in germination-related pathways. Seed germination in the soybean low phytate mutants is a very complex process
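The "absolute value Log2FC >5" cutoff quoted in this abstract can be illustrated with a short calculation. The sketch below is not the authors' pipeline: the gene identifiers and expression values are invented, and the pseudocount of 1 added to avoid division by zero is an assumption of this example.

```python
import math


def log2_fold_change(expr_a: float, expr_b: float, pseudocount: float = 1.0) -> float:
    """log2 of the expression ratio A/B, with a pseudocount (an assumption) on both values."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def strongly_regulated(expression: dict[str, tuple[float, float]], threshold: float = 5.0) -> dict[str, float]:
    """Return genes whose |log2FC| exceeds the threshold.

    expression maps a gene id to (value in TW-1-M, value in TW-1);
    the ids and values used below are hypothetical.
    """
    result = {}
    for gene, (in_tw1m, in_tw1) in expression.items():
        fc = log2_fold_change(in_tw1m, in_tw1)
        if abs(fc) > threshold:
            result[gene] = fc
    return result


if __name__ == "__main__":
    toy = {
        "geneA": (1279.0, 19.0),  # strongly up in TW-1-M
        "geneB": (4.0, 319.0),    # strongly down in TW-1-M
        "geneC": (100.0, 90.0),   # nearly unchanged
    }
    print(strongly_regulated(toy))
```

A threshold of 5 on the log2 scale corresponds to a more than 32-fold difference in expression between the two mutants.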

  2. Characterization and analysis of CCR and CAD gene families at the whole-genome level for lignin synthesis of stone cells in pear (Pyrus bretschneideri) fruit.

    Cheng, Xi; Li, Manli; Li, Dahui; Zhang, Jinyun; Jin, Qing; Sheng, Lingling; Cai, Yongping; Lin, Yi

    2017-11-15

    The content of stone cells has significant effects on the flavour and quality of pear fruit. Previous research suggested that lignin deposition is closely related to stone cell formation. In the lignin biosynthetic pathway, cinnamoyl-CoA reductase (CCR) and cinnamyl alcohol dehydrogenase (CAD), dehydrogenase/reductase family members, catalyse the last two steps in monolignol synthesis. However, there is little knowledge of the characteristics of the CCR and CAD families in pear and their involvement in lignin synthesis of stone cells. In this study, 31 CCRs and 26 CADs were identified in the pear genome. Phylogenetic trees for CCRs and CADs were constructed; key amino acid residues were analysed, and three-dimensional structures were predicted. Using quantitative real-time polymerase chain reaction (qRT-PCR), PbCAD2, PbCCR1, -2 and -3 were identified as participating in lignin synthesis of stone cells in pear fruit. Subcellular localization analysis showed that the expressed proteins (PbCAD2, PbCCR1, -2 and -3) are found in the cytoplasm or at the cell membrane. These results reveal the evolutionary features of the CCR and CAD families in pear as well as the genes responsible for regulation of lignin synthesis and stone cell development in pear fruit. © 2017. Published by The Company of Biologists Ltd.

  3. Characterization and analysis of CCR and CAD gene families at the whole-genome level for lignin synthesis of stone cells in pear (Pyrus bretschneideri) fruit

    Xi Cheng

    2017-11-01

    The content of stone cells has significant effects on the flavour and quality of pear fruit. Previous research suggested that lignin deposition is closely related to stone cell formation. In the lignin biosynthetic pathway, cinnamoyl-CoA reductase (CCR) and cinnamyl alcohol dehydrogenase (CAD), dehydrogenase/reductase family members, catalyse the last two steps in monolignol synthesis. However, there is little knowledge of the characteristics of the CCR and CAD families in pear and their involvement in lignin synthesis of stone cells. In this study, 31 CCRs and 26 CADs were identified in the pear genome. Phylogenetic trees for CCRs and CADs were constructed; key amino acid residues were analysed, and three-dimensional structures were predicted. Using quantitative real-time polymerase chain reaction (qRT-PCR), PbCAD2, PbCCR1, -2 and -3 were identified as participating in lignin synthesis of stone cells in pear fruit. Subcellular localization analysis showed that the expressed proteins (PbCAD2, PbCCR1, -2 and -3) are found in the cytoplasm or at the cell membrane. These results reveal the evolutionary features of the CCR and CAD families in pear as well as the genes responsible for regulation of lignin synthesis and stone cell development in pear fruit.

  4. A generic approach for the design of whole-genome oligoarrays, validated for genomotyping, deletion mapping and gene expression analysis on Staphylococcus aureus

    Renzoni Adriana

    2005-06-01

    Background: DNA microarray technology is widely used to determine the expression levels of thousands of genes in a single experiment, for a broad range of organisms. Optimal design of immobilized nucleic acids has a direct impact on the reliability of microarray results. However, despite their small genome size and complexity, prokaryotic organisms are not frequently studied to validate selected bioinformatics approaches. Relying on parameters shown to affect the hybridization of nucleic acids, we designed freely available software and experimentally validated its performance on the bacterial pathogen Staphylococcus aureus. Results: We describe an efficient procedure for selecting 40–60 mer oligonucleotide probes combining optimal thermodynamic properties with high target specificity, suitable for genomic studies of microbial species. The algorithm for filtering probes from extensive oligonucleotide libraries fitting standard thermodynamic criteria includes positional information of predicted target-probe binding regions. This algorithm efficiently selected probes recognizing homologous gene targets across three different sequenced genomes of Staphylococcus aureus. BLAST analysis of the final selection of 5,427 probes yielded >97%, 93%, and 81% of Staphylococcus aureus genome coverage in strains N315, Mu50, and COL, respectively. A manufactured oligoarray including a subset of control Escherichia coli probes was validated for applications in the fields of comparative genomics and molecular epidemiology, mapping of deletion mutations and transcription profiling. Conclusion: This generic chip-design process merging sequence information from several related genomes improves genome coverage even in conserved regions.

  5. Whole Genome Epidemiological Typing of Escherichia coli

    Kaas, Rolf Sommer

    validating each position analyzed and ignoring the positions that cannot be validated thereby creating a distance matrix that is used as input to an UPGMA method that creates the final phylogeny. The ND method was also implemented as a web server and published. If whole genome sequencing is to be used...
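
    The final step mentioned in this excerpt, building a tree from a distance matrix with UPGMA, can be sketched as follows. This is only an illustration of UPGMA clustering under stated assumptions; it does not reproduce the ND method's position validation, the isolate names and distances are invented, and the function name `upgma` is hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster leaves by UPGMA and return the tree as nested tuples.

    names: list of leaf labels.
    dist:  dict mapping a pair (a, b) of leaf labels to their distance.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        for other in sizes:
            if other in (a, b):
                continue
            d[frozenset((merged, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Invented pairwise distances between four hypothetical isolates.
example_dist = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
tree = upgma(["A", "B", "C", "D"], example_dist)
print(tree)
```

    Nested tuples keep the sketch dependency-free; a real analysis would more likely rely on an established phylogenetics or clustering library.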

  6. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
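
    An optical map is compared with an assembled sequence by way of an ordered list of restriction fragment sizes. The sketch below shows how such an in silico map could be derived from a sequence for the three enzymes named above. It is an illustration only: it treats the molecule as linear, the toy sequence is invented, and the function names are hypothetical. The recognition sites (XbaI T^CTAGA, NheI G^CTAGC, HindIII A^AGCTT) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sites; each enzyme cuts after the first base of its site.
SITES = {"XbaI": "TCTAGA", "NheI": "GCTAGC", "HindIII": "AAGCTT"}
CUT_OFFSET = 1


def cut_positions(sequence, enzyme):
    """Return cut coordinates (0-based, between bases) along the sequence."""
    site = SITES[enzyme]
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(site, start + 1)
    return cuts


def fragment_lengths(sequence, enzyme):
    """Return the ordered fragment sizes of a linear molecule."""
    bounds = [0] + cut_positions(sequence, enzyme) + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


toy_sequence = "GGGAAGCTTCCCCAAGCTTAA"  # invented 21-base example
print(fragment_lengths(toy_sequence, "HindIII"))
print(fragment_lengths(toy_sequence, "XbaI"))
```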

  7. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Reslewic, S. [Univ. Wisc.-Madison; Zhou, S. [Univ. Wisc.-Madison; Place, M. [Univ. Wisc.-Madison; Zhang, Y. [Univ. Wisc.-Madison; Briska, A. [Univ. Wisc.-Madison; Goldstein, S. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Lim, A. [Univ. Wisc.-Madison; Lapidus, A. [Univ. Wisc.-Madison; Han, C. S. [Univ. Wisc.-Madison; Roberts, G. P. [Univ. Wisc.-Madison; Schwartz, D. C. [Univ. Wisc.-Madison]

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  8. Whole genome phylogenies for multiple Drosophila species

    Seetharam Arun

    2012-12-01

    Background: Reconstructing the evolutionary history of organisms using traditional phylogenetic methods may suffer from inaccurate sequence alignment. An alternative approach, particularly effective when whole genome sequences are available, is to employ methods that don’t use explicit sequence alignments. We extend a novel phylogenetic method based on Singular Value Decomposition (SVD) to reconstruct the phylogeny of 12 sequenced Drosophila species. SVD analysis provides accurate comparisons for a high fraction of sequences within whole genomes without the prior identification of orthologs or homologous sites. With this method all protein sequences are converted to peptide frequency vectors within a matrix that is decomposed to provide simplified vector representations for each protein of the genome in a reduced dimensional space. These vectors are summed together to provide a vector representation for each species, and the angle between these vectors provides distance measures that are used to construct species trees. Results: An unfiltered whole genome analysis (193,622 predicted proteins) strongly supports the currently accepted phylogeny for 12 Drosophila species at higher dimensions except for the generally accepted but difficult to discern sister relationship between D. erecta and D. yakuba. Also, in accordance with previous studies, many sequences appear to support alternative phylogenies. In this case, we observed grouping of D. erecta with D. sechellia when approximately 55% to 95% of the proteins were removed using a filter based on projection values or by reducing resolution by using fewer dimensions. Similar results were obtained when just the melanogaster subgroup was analyzed. Conclusions: These results indicate that using our novel phylogenetic method, it is possible to consult and interpret all predicted protein sequences within multiple whole genomes to produce accurate phylogenetic estimations of relatedness between
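
    The vector pipeline described in this abstract (peptide frequency vectors per protein, summed per species, with the angle between species vectors as a distance) can be sketched in simplified form. The sketch deliberately omits the SVD dimension-reduction step, so it is not the authors' method; the peptide length `k`, the toy sequences and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def peptide_counts(protein, k=3):
    """Count overlapping peptides of length k in one protein sequence."""
    return Counter(protein[i:i + k] for i in range(len(protein) - k + 1))


def species_vector(proteins, k=3):
    """Sum the peptide count vectors of all proteins of one species."""
    total = Counter()
    for protein in proteins:
        total.update(peptide_counts(protein, k))
    return total


def angle_between(u, v):
    """Angle in radians between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


# Toy proteomes, invented for illustration.
species_a = species_vector(["MKVLAAGIV", "MKTLLAG"])
species_b = species_vector(["MKVLAAGIV", "MKTLLAG"])
species_c = species_vector(["WWWWWW"])

print(angle_between(species_a, species_b))  # identical proteomes
print(angle_between(species_a, species_c))  # no shared peptides
```

    The resulting pairwise angles would then feed a distance-based tree-building method to produce a species tree.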

  9. Whole-genome sequencing of veterinary pathogens

    Ronco, Troels

    -electrophoresis and single-locus sequencing has been widely used to characterize such types of veterinary pathogens. However, DNA sequencing techniques have become fast and cost effective in recent years and whole-genome sequencing data provide a much higher discriminative power and reproducibility than any...... genetic background. This indicates that dairy cows can be natural carriers of S. aureus subtypes that in certain cases lead to CM. A group of isolates that mostly belonged to ST151 carried three pathogenicity islands that were primarily found in this group. The prevalence of resistance genes was generally...

  10. Harnessing Whole Genome Sequencing in Medical Mycology.

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  11. Whole genome sequence analysis of Mycobacterium suricattae

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; Van Der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah; Siame, Kabengele Keith; Gey Van Pittius, Nicolaas Claudius; Van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-01-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  12. Whole genome amplification in preimplantation genetic diagnosis

    Zheng, Ying-ming; Wang, Ning; Li, Lei; Jin, Fan

    2011-01-01

    Preimplantation genetic diagnosis (PGD) refers to a procedure for genetically analyzing embryos prior to implantation, improving the chance of conception for patients at high risk of transmitting specific inherited disorders. This method has been widely used for a large number of genetic disorders since the first successful application in the early 1990s. Polymerase chain reaction (PCR) and fluorescent in situ hybridization (FISH) are the two main methods in PGD, but there are some inevitable shortcomings limiting the scope of genetic diagnosis. Fortunately, different whole genome amplification (WGA) techniques have been developed to overcome these problems. Sufficient DNA can be amplified and multiple tasks which need abundant DNA can be performed. Moreover, WGA products can be analyzed as a template for multi-loci and multi-gene during the subsequent DNA analysis. In this review, we will focus on the currently available WGA techniques and their applications, as well as the new technical trends from WGA products. PMID:21194180

  13. Whole genome sequence analysis of Mycobacterium suricattae

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  14. Aligning the unalignable: bacteriophage whole genome alignments.

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  15. Tolerance of Whole-Genome Doubling Propagates Chromosomal Instability and Accelerates Cancer Genome Evolution

    Dewhurst, Sally M.; McGranahan, Nicholas; Burrell, Rebecca A.

    2014-01-01

    The contribution of whole-genome doubling to chromosomal instability (CIN) and tumor evolution is unclear. We use long-term culture of isogenic tetraploid cells from a stable diploid colon cancer progenitor to investigate how a genome-doubling event affects genome stability over time. Rare cells...

  16. Microbial species delineation using whole genome sequences.

    Varghese, Neha J; Mukherjee, Supratim; Ivanova, Natalia; Konstantinidis, Konstantinos T; Mavrommatis, Kostas; Kyrpides, Nikos C; Pati, Amrita

    2015-08-18

    Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of AF,gANI and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach, as required. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
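
    The abstract names two quantities, gANI and the alignment fraction (AF), without giving formulas. The sketch below assumes one plausible reading: gANI as the length-weighted mean identity over aligned genes, and AF as the aligned gene length divided by the total gene length of the genome. Both definitions, the input layout and the function name are assumptions for illustration, not the published implementation. The pair count is simple arithmetic: 13,151 genomes give 13,151 × 13,150 / 2 unordered pairs, which agrees with the 86.5M quoted.

```python
def gani_and_af(hits, total_gene_length):
    """Return (gANI, AF) for one genome compared against another.

    hits: list of (identity_fraction, aligned_length) per aligned gene.
    total_gene_length: summed length of all genes in the query genome.
    Both definitions are assumed for illustration.
    """
    aligned = sum(length for _, length in hits)
    if aligned == 0:
        return 0.0, 0.0
    gani = sum(identity * length for identity, length in hits) / aligned
    af = aligned / total_gene_length
    return gani, af


# Invented example: two aligned genes out of 8,000 bases of genes.
gani, af = gani_and_af([(0.98, 1000), (0.96, 3000)], total_gene_length=8000)
print(round(gani, 3), af)

# Number of unordered genome pairs among the 13,151 genomes in the abstract.
N_GENOMES = 13151
N_PAIRS = N_GENOMES * (N_GENOMES - 1) // 2
print(N_PAIRS)
```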

  17. Small sample whole-genome amplification

    Hara, Christine; Nguyen, Christine; Wheeler, Elizabeth; Sorensen, Karen; Arroyo, Erin; Vrankovich, Greg; Christian, Allen

    2005-11-01

    Many challenges arise when trying to amplify and analyze human samples collected in the field due to limitations in sample quantity, and contamination of the starting material. Tests such as DNA fingerprinting and mitochondrial typing require a certain sample size and are carried out in large volume reactions; in cases where insufficient sample is present whole genome amplification (WGA) can be used. WGA allows very small quantities of DNA to be amplified in a way that enables subsequent DNA-based tests to be performed. A limiting step to WGA is sample preparation. To minimize the necessary sample size, we have developed two modifications of WGA: the first allows for an increase in amplified product from small, nanoscale, purified samples with the use of carrier DNA while the second is a single-step method for cleaning and amplifying samples all in one column. Conventional DNA cleanup involves binding the DNA to silica, washing away impurities, and then releasing the DNA for subsequent testing. We have eliminated losses associated with incomplete sample release, thereby decreasing the required amount of starting template for DNA testing. Both techniques address the limitations of sample size by providing ample copies of genomic samples. Carrier DNA, included in our WGA reactions, can be used when amplifying samples with the standard purification method, or can be used in conjunction with our single-step DNA purification technique to potentially further decrease the amount of starting sample necessary for future forensic DNA-based assays.

  18. Whole-genome sequence-based analysis of thyroid function

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  19. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  20. Rapid whole genome sequencing and precision neonatology.

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow or perceived to be impractical for the initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. Early passage bone marrow stromal cells express genes involved in nervous system development supporting their relevance for neural repair

    Nandoe Tewarie, R.D.S.; Bossers, K.; Ritfeld, G.J.; Blits, B.; Grotenhuis, J.A.; Verhaagen, J.; Oudega, M.

    2011-01-01

    PURPOSE: The assessment of the capacity of bone marrow stromal cells (BMSC) to repair the nervous system using gene expression profiling. The evaluation of effects of long-term culturing on the gene expression profile of BMSC. METHODS: Forty-four k whole genome rat microarrays were used to study

  2. Differential retention of metabolic genes following whole-genome duplication.

    Gout, Jean-François; Duret, Laurent; Kahn, Daniel

    2009-05-01

    Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.

  3. Identification of somatic mutations in postmortem human brains by whole genome sequencing and their implications for psychiatric disorders.

    Nishioka, Masaki; Bundo, Miki; Ueda, Junko; Katsuoka, Fumiki; Sato, Yukuto; Kuroki, Yoko; Ishii, Takao; Ukai, Wataru; Murayama, Shigeo; Hashimoto, Eri; Nagasaki, Masao; Yasuda, Jun; Kasai, Kiyoto; Kato, Tadafumi; Iwamoto, Kazuya

    2018-04-01

    Somatic mutations in the human brain are hypothesized to contribute to the functional diversity of brain cells as well as the pathophysiology of neuropsychiatric diseases. However, there are still few reports on somatic mutations in non-neoplastic human brain tissues. This study attempted to unveil the landscape of somatic mutations in the human brain. We explored the landscape of somatic mutations in human brain tissues derived from three individuals with no neuropsychiatric diseases by whole-genome deep sequencing at a depth of around 100. The candidate mutations underwent multi-layered filtering, and were validated by ultra-deep target amplicon sequencing at a depth of around 200 000. Thirty-one somatic mutations were identified in the human brain, demonstrating the utility of whole-genome sequencing of bulk brain tissue. The mutations were enriched in neuron-expressed genes, and two-thirds of the identified somatic single nucleotide variants in the brain tissues were cytosine-to-thymine transitions, half of which were in CpG dinucleotides. Our developed filtering and validation approaches will be useful to identify somatic mutations in the human brain. The vulnerability of neuron-expressed genes to mutational events suggests their potential relevance to neuropsychiatric diseases. © 2017 The Authors. Psychiatry and Clinical Neurosciences published by John Wiley & Sons Australia, Ltd on behalf of Japanese Society of Psychiatry and Neurology.
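
    The mutation-spectrum summary in this abstract (the share of cytosine-to-thymine transitions, and how many of those fall in CpG dinucleotides) can be illustrated with a small classifier. This is a sketch under stated assumptions, not the authors' filtering pipeline: each variant is given as a hypothetical three-base reference context centred on the mutated base plus the alternate base, and G>A is counted as the same change read from the opposite strand.

```python
def is_c_to_t(ref, alt):
    """True for C>T, or for G>A (the same change on the other strand)."""
    return (ref, alt) in {("C", "T"), ("G", "A")}


def in_cpg(context):
    """True if the central base of a 3-base context lies in a CpG."""
    left, ref, right = context
    return (ref == "C" and right == "G") or (ref == "G" and left == "C")


def summarize(snvs):
    """Return (number of C>T transitions, number of those in CpG)."""
    c_to_t = [(ctx, alt) for ctx, alt in snvs if is_c_to_t(ctx[1], alt)]
    cpg = [(ctx, alt) for ctx, alt in c_to_t if in_cpg(ctx)]
    return len(c_to_t), len(cpg)


# Invented variants: (reference context, alternate base).
example_snvs = [
    ("ACG", "T"),  # C>T inside a CpG
    ("CGT", "A"),  # G>A inside a CpG (reverse-strand C>T)
    ("ACA", "T"),  # C>T outside a CpG
    ("TAT", "G"),  # A>G, not a C>T transition
]
print(summarize(example_snvs))
```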

  4. Multiple Whole Genome Alignments Without a Reference Organism

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  5. Assessing molecular initiating events (MIEs), key events (KEs) and modulating factors (MFs) for styrene responses in mouse lungs using whole genome gene expression profiling following 1-day and multi-week exposures.

    Andersen, Melvin E; Cruzan, George; Black, Michael B; Pendse, Salil N; Dodd, Darol; Bus, James S; Sarang, Satinder S; Banton, Marcy I; Waites, Robbie; McMullen, Patrick D

    2017-11-15

    Styrene increased lung tumors in mice at chronic inhalation exposures of 20 ppm and greater. MIEs, KEs and MFs were examined using gene expression in three strains of male mice (the parental C57BL/6 strain, a CYP2F2(-/-) knockout and a CYP2F2(-/-) transgenic containing human CYP2F1, 2A13 and 2B6). Exposures were for 1 day and 1, 4 and 26 weeks. After 1-day exposures at 1, 5, 10, 20, 40 and 120 ppm, significant increases in differentially expressed genes (DEGs) occurred only in parental strain lungs, where there was already an increase in DEGs at 5 ppm and then many thousands of DEGs by 120 ppm. Enrichment for 1-day and 1-week exposures included cell cycle, mitotic M-M/G1 phases, DNA-synthesis and metabolism of lipids and lipoproteins pathways. The numbers of DEGs decreased steadily over time with no DEGs meeting both statistical significance and fold-change criteria at 26 weeks. At 4 and 26 weeks, some key transcription factors (TFs) - Nr1d1, Nr1d2, Dbp, Tef, Hlf, Per3, Per2 and Bhlhe40 - were upregulated (|FC|>1.5), while others - Npas, Arntl, Nfil3, Nr4a1, Nr4a2, and Nr4a3 - were down-regulated. At all times, consistent changes in gene expression only occurred in the parental strain. Our results support a MIE for styrene of direct mitogenicity from mouse-specific CYP2F2-mediated metabolites activating Nr4a signaling. Longer-term MFs include down-regulation of Nr4a genes and shifts in both circadian clock TFs and other TFs, linking circadian clock to cellular metabolism. We found no gene expression changes indicative of cytotoxicity or activation of p53-mediated DNA-damage pathways. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  6. Whole-genome landscape of pancreatic neuroendocrine tumours.

    Scarpa, Aldo; Chang, David K; Nones, Katia; Corbo, Vincenzo; Patch, Ann-Marie; Bailey, Peter; Lawlor, Rita T; Johns, Amber L; Miller, David K; Mafficini, Andrea; Rusev, Borislav; Scardoni, Maria; Antonello, Davide; Barbi, Stefano; Sikora, Katarzyna O; Cingarlini, Sara; Vicentini, Caterina; McKay, Skye; Quinn, Michael C J; Bruxner, Timothy J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; McLean, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wilson, Peter J; Anderson, Matthew J; Fink, J Lynn; Newell, Felicity; Waddell, Nick; Holmes, Oliver; Kazakoff, Stephen H; Leonard, Conrad; Wood, Scott; Xu, Qinying; Nagaraj, Shivashankar Hiriyur; Amato, Eliana; Dalai, Irene; Bersani, Samantha; Cataldo, Ivana; Dei Tos, Angelo P; Capelli, Paola; Davì, Maria Vittoria; Landoni, Luca; Malpaga, Anna; Miotto, Marco; Whitehall, Vicki L J; Leggett, Barbara A; Harris, Janelle L; Harris, Jonathan; Jones, Marc D; Humphris, Jeremy; Chantrill, Lorraine A; Chin, Venessa; Nagrial, Adnan M; Pajic, Marina; Scarlett, Christopher J; Pinho, Andreia; Rooman, Ilse; Toon, Christopher; Wu, Jianmin; Pinese, Mark; Cowley, Mark; Barbour, Andrew; Mawson, Amanda; Humphrey, Emily S; Colvin, Emily K; Chou, Angela; Lovell, Jessica A; Jamieson, Nigel B; Duthie, Fraser; Gingras, Marie-Claude; Fisher, William E; Dagg, Rebecca A; Lau, Loretta M S; Lee, Michael; Pickett, Hilda A; Reddel, Roger R; Samra, Jaswinder S; Kench, James G; Merrett, Neil D; Epari, Krishna; Nguyen, Nam Q; Zeps, Nikolajs; Falconi, Massimo; Simbolo, Michele; Butturini, Giovanni; Van Buren, George; Partelli, Stefano; Fassan, Matteo; Khanna, Kum Kum; Gill, Anthony J; Wheeler, David A; Gibbs, Richard A; Musgrove, Elizabeth A; Bassi, Claudio; Tortora, Giampaolo; Pederzoli, Paolo; Pearson, John V; Waddell, Nicola; Biankin, Andrew V; Grimmond, Sean M

    2017-03-02

    The diagnosis of pancreatic neuroendocrine tumours (PanNETs) is increasing owing to more sensitive detection methods, and this increase is creating challenges for clinical management. We performed whole-genome sequencing of 102 primary PanNETs and defined the genomic events that characterize their pathogenesis. Here we describe the mutational signatures they harbour, including a deficiency in G:C > T:A base excision repair due to inactivation of MUTYH, which encodes a DNA glycosylase. Clinically sporadic PanNETs contain a larger-than-expected proportion of germline mutations, including previously unreported mutations in the DNA repair genes MUTYH, CHEK2 and BRCA2. Together with mutations in MEN1 and VHL, these mutations occur in 17% of patients. Somatic mutations, including point mutations and gene fusions, were commonly found in genes involved in four main pathways: chromatin remodelling, DNA damage repair, activation of mTOR signalling (including previously undescribed EWSR1 gene fusions), and telomere maintenance. In addition, our gene expression analyses identified a subgroup of tumours associated with hypoxia and HIF signalling.

  7. Genomic V exons from whole genome shotgun data in reptiles.

    Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

    2014-08-01

    Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).

  8. Whole genome shotgun sequencing of Indian strains of Streptococcus agalactiae

    Balaji Veeraraghavan

    2017-12-01

    Full Text Available Group B streptococcus is known as a leading cause of neonatal infections in developing countries. The present study describes the whole genome shotgun sequences of four Group B Streptococcus (GBS) isolates. Molecular data on clonality is lacking for GBS in India. The present genome report will add important information on the scarce genome data of GBS and will help in deriving comparative genome studies of GBS isolates at global level. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession numbers NHPL00000000 – NHPO00000000.

  9. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  10. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    Neave, Matthew J.; Michell, Craig; Apprill, Amy; Voolstra, Christian R.

    2014-01-01

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  11. Soybean (Glycine max) SWEET gene family: insights through comparative genomics, transcriptome profiling and whole genome re-sequence analysis.

    Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T

    2015-07-11

    SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species, from primitive single-cell algae to angiosperms, with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotypes) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. The present investigation utilized comparative genomics, transcriptome profiling and whole genome re-sequencing approaches, provided a systematic description of soybean SWEET genes and identified putative candidates with probable roles in reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will serve as an important resource for the soybean research

  12. Whole-Genome Sequencing in Microbial Forensic Analysis of Gamma-Irradiated Microbial Materials.

    Broomall, Stacey M; Ait Ichou, Mohamed; Krepps, Michael D; Johnsky, Lauren A; Karavis, Mark A; Hubbard, Kyle S; Insalaco, Joseph M; Betters, Janet L; Redmond, Brady W; Rivers, Bryan A; Liem, Alvin T; Hill, Jessica M; Fochler, Edward T; Roth, Pierce A; Rosenzweig, C Nicole; Skowronski, Evan W; Gibbons, Henry S

    2016-01-15

    Effective microbial forensic analysis of materials used in a potential biological attack requires robust methods of morphological and genetic characterization of the attack materials in order to enable the attribution of the materials to potential sources and to exclude other potential sources. The genetic homogeneity and potential intersample variability of many of the category A to C bioterrorism agents offer a particular challenge to the generation of attributive signatures, potentially requiring whole-genome or proteomic approaches to be utilized. Currently, irradiation of mail is standard practice at several government facilities judged to be at particularly high risk. Thus, initial forensic signatures would need to be recovered from inactivated (nonviable) material. In the study described in this report, we determined the effects of high-dose gamma irradiation on forensic markers of bacterial biothreat agent surrogate organisms with a particular emphasis on the suitability of genomic DNA (gDNA) recovered from such sources as a template for whole-genome analysis. While irradiation of spores and vegetative cells affected the retention of Gram and spore stains and sheared gDNA into small fragments, we found that irradiated material could be utilized to generate accurate whole-genome sequence data on the Illumina and Roche 454 sequencing platforms. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  13. Whole genome amplification - Review of applications and advances

    Hawkins, Trevor L.; Detter, J.C.; Richardson, Paul

    2001-11-15

    The concept of Whole Genome Amplification has arisen in the past few years as modifications to the polymerase chain reaction (PCR) have been adapted to replicate regions of genomes which are of biological interest. The applications are many: forensics, embryonic disease diagnosis, bioterrorism genome detection, "immortalization" of clinical samples, microbial diversity, and genotyping. The key question is whether DNA can be replicated a genome at a time without bias or nonrandom distribution of the target. Several papers published in the last year and currently in preparation may lead to the conclusion that whole genome amplification may indeed be possible and therefore open up a new avenue to molecular biology.

  14. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  15. Whole genomes redefine the mutational landscape of pancreatic cancer

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K.; Kassahn, Karin S.; Bailey, Peter; Johns, Amber L.; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C. J.; Robertson, Alan J.; Fadlullah, Muhammad Z. H.; Bruxner, Tim J. C.; Christ, Angelika N.

    2015-01-01

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (...

  16. A Site Specific Model And Analysis Of The Neutral Somatic Mutation Rate In Whole-Genome Cancer Data

    Bertl, Johanna; Guo, Qianyun; Rasmussen, Malene Juul

    2017-01-01

    Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation ra...

  17. WGSQuikr: fast whole-genome shotgun metagenomic classification.

    David Koslicki

    Full Text Available With the decrease in cost and increase in output of whole-genome shotgun technologies, many metagenomic studies are utilizing this approach in lieu of the more traditional 16S rRNA amplicon technique. Due to the large number of relatively short reads output from whole-genome shotgun technologies, there is a need for fast and accurate short-read OTU classifiers. While there are relatively fast and accurate algorithms available, such as MetaPhlAn, MetaPhyler, PhyloPythiaS, and PhymmBL, these algorithms still classify samples in a read-by-read fashion and so execution times can range from hours to days on large datasets. We introduce WGSQuikr, a reconstruction method which can compute a vector of taxonomic assignments and their proportions in the sample with remarkable speed and accuracy. We demonstrate on simulated data that WGSQuikr is typically more accurate and up to an order of magnitude faster than the aforementioned classification algorithms. We also verify the utility of WGSQuikr on real biological data in the form of a mock community. WGSQuikr is a Whole-Genome Shotgun QUadratic, Iterative, K-mer based Reconstruction method which extends the previously introduced 16S rRNA-based algorithm Quikr. A MATLAB implementation of WGSQuikr is available at: http://sourceforge.net/projects/wgsquikr.
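    The abstract above describes WGSQuikr as a k-mer based reconstruction method but gives no algorithmic detail. As a minimal sketch of the starting point such a method would plausibly need, the Python snippet below turns a set of reads into a normalized k-mer frequency vector. It is not the WGSQuikr implementation (which is distributed as MATLAB code); the function name `kmer_frequencies`, the default `k`, and the handling of ambiguous bases are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(reads, k=3):
    """Return the normalized k-mer frequency vector of a read set.

    The vector has 4**k entries, ordered lexicographically over the
    alphabet ACGT. K-mers containing any other symbol (for example N)
    are skipped; that choice is an assumption of this sketch.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # An empty input yields an all-zero vector instead of dividing by zero.
    return [counts[km] / total if total else 0.0 for km in all_kmers]
```

    A reconstruction method would then look for a sparse, nonnegative mixture of reference-genome k-mer vectors that approximates this sample vector; that optimization step is deliberately left out here.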

  18. Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

    Xu, Kelin; Jin, Li; Xiong, Momiao

    2017-05-18

    Epistasis plays an essential role in understanding the regulation mechanisms and is an essential component of the genetic architecture of the gene expressions. However, interaction analysis of gene expressions remains fundamentally unexplored due to great computational challenges and data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position level read count curves. A single number for measuring gene expression, which is widely used for microarray measured gene expression analysis, is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using the RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairwise SNPs, the FRGM takes a gene as a basic unit for epistasis analysis, which tests for the interaction of all possible pairs of genes and uses all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project. The numbers of pairs of significantly interacting genes after Bonferroni correction

  19. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Schutzer, S. E.; Dunn, J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  20. DON shares a similar mode of action as the ribotoxic stress inducer anisomycin while TBTO shares ER stress patterns with the ER stress inducer Thapsigargin based on comparative gene expression profiling in Jurkat T cells

    Schmeits, P.C.J.; Katika, M.R.; Peijnenburg, A.A.C.M.; Loveren, van H.; Hendriksen, P.J.M.

    2014-01-01

    Previously, we studied the effects of deoxynivalenol (DON) and tributyltin oxide (TBTO) on whole genome mRNA expression profiles of human T lymphocyte Jurkat cells. These studies indicated that DON induces ribotoxic stress and both DON and TBTO induced ER stress which resulted in T-cell activation

  1. Integration of transcriptome and whole genomic resequencing data to identify key genes affecting swine fat deposition.

    Kai Xing

    Full Text Available Fat deposition is highly correlated with the growth, meat quality, reproductive performance and immunity of pigs. Fatty acid synthesis takes place mainly in the adipose tissue of pigs; therefore, in this study, a high-throughput massively parallel sequencing approach was used to generate adipose tissue transcriptomes from two groups of Songliao black pigs that had opposite backfat thickness phenotypes. The total number of paired-end reads produced for each sample was in the range of 39.29-49.36 million. Approximately 188 genes were differentially expressed in adipose tissue and were enriched for metabolic processes, such as fatty acid biosynthesis, lipid synthesis, metabolism of fatty acids, retinol, caffeine and arachidonic acid, and immunity. Additionally, many genetic variations were detected between the two groups through pooled whole-genome resequencing. Integration of transcriptome and whole-genome resequencing data revealed important genomic variations among the differentially expressed genes for fat deposition, for example, the lipogenic genes. Further studies are required to investigate the roles of candidate genes in fat deposition to improve pig breeding programs.

  2. Genome U-Plot: a whole genome visualization.

    Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

    2018-05-15

    The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. There are several diverse advantages regarding both these representations, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allow researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making apparent important features and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. Contact: vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.

  3. Rapid identification of lettuce seed germination mutants by bulked segregant analysis and whole genome sequencing.

    Huo, Heqiang; Henry, Isabelle M; Coppoolse, Eric R; Verhoef-Post, Miriam; Schut, Johan W; de Rooij, Han; Vogelaar, Aat; Joosen, Ronny V L; Woudenberg, Leo; Comai, Luca; Bradford, Kent J

    2016-11-01

    Lettuce (Lactuca sativa) seeds exhibit thermoinhibition, or failure to complete germination when imbibed at warm temperatures. Chemical mutagenesis was employed to develop lettuce lines that exhibit germination thermotolerance. Two independent thermotolerant lettuce seed mutant lines, TG01 and TG10, were generated through ethyl methanesulfonate mutagenesis. Genetic and physiological analyses indicated that these two mutations were allelic and recessive. To identify the causal gene(s), we applied bulked segregant analysis by whole genome sequencing. For each mutant, bulked DNA samples of segregating thermotolerant (mutant) seeds were sequenced and analyzed for homozygous single-nucleotide polymorphisms. Two independent candidate mutations were identified at different physical positions in the zeaxanthin epoxidase gene (ABSCISIC ACID DEFICIENT 1/ZEAXANTHIN EPOXIDASE, or ABA1/ZEP) in TG01 and TG10. The mutation in TG01 caused an amino acid replacement, whereas the mutation in TG10 resulted in alternative mRNA splicing. Endogenous abscisic acid contents were reduced in both mutants, and expression of the ABA1 gene from wild-type lettuce under its own promoter fully complemented the TG01 mutant. Conventional genetic mapping confirmed that the causal mutations were located near the ZEP/ABA1 gene, but the bulked segregant whole genome sequencing approach more efficiently identified the specific gene responsible for the phenotype. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  4. Dirofilaria immitis JYD-34 isolate: whole genome analysis

    Catherine Bourguinat

    2017-11-01

    Full Text Available Background: Macrocyclic lactone (ML) anthelmintics are used for chemoprophylaxis for heartworm infection in dogs and cats. Cases of dogs becoming infected with heartworms, despite apparent compliance to recommended chemoprophylaxis with approved preventives, have led to such cases being considered as suspected lack of efficacy (LOE). Recently, microfilariae collected from a small number of LOE isolates were used as a source of infection of new host dogs and confirmed to have reduced susceptibility to ML in controlled efficacy studies using L3 challenge in dogs. A specific Dirofilaria immitis laboratory isolate named JYD-34 has also been confirmed to have less than 100% susceptibility to ML-based preventives. For preventive claims against heartworm disease, evidence of 100% efficacy is required by FDA-CVM. It was therefore of interest to determine whether JYD-34 has a genetic profile similar to other documented LOE and confirmed reduced susceptibility isolates or has a genetic profile similar to known ML-susceptible isolates. Methods: In this study, the 90 Mbp whole genome of the JYD-34 strain was sequenced. This genome was compared using bioinformatics tools to pooled whole genomes of four well-characterized susceptible D. immitis populations, one susceptible Missouri laboratory isolate, as well as the pooled whole genomes of four LOE D. immitis populations. Fixation indexes (FST), which allow the genetic structure of each population (isolate) to be compared at the level of single nucleotide polymorphisms (SNP) across the genome, have been calculated. Forty-one previously reported SNP, that appeared to differentiate between susceptible and LOE and confirmed reduced susceptibility isolates, were also investigated in the JYD-34 isolate. Results: The FST analysis, and the analysis of the 41 SNP that appeared to differentiate reduced susceptibility from fully susceptible isolates, confirmed that the JYD-34 isolate has a genome similar to previously
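    The record above compares populations through per-SNP fixation indexes (FST) without stating which estimator was used. As an illustrative sketch only, the Python function below computes Wright's FST for one biallelic SNP from the allele frequencies in two populations, assuming equal population weights. The function name and the equal-weight assumption are not taken from the study.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of
    the pooled population and H_S the mean within-population heterozygosity.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t
```

    Values near 0 indicate similar allele frequencies in the two populations, and values near 1 indicate that they are fixed for different alleles. Pooled-sequencing studies usually apply estimators that also correct for sample size and read depth, which this sketch ignores.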

  5. Identification of antimicrobial resistance genes in multidrug-resistant clinical Bacteroides fragilis isolates by whole genome shotgun sequencing

    Sydenham, Thomas Vognbjerg; Sóki, József; Hasman, Henrik

    2015-01-01

    Bacteroides fragilis constitutes the most frequent anaerobic bacterium causing bacteremia in humans. The genetic background for antimicrobial resistance in B. fragilis is diverse with some genes requiring insertion sequence (IS) elements inserted upstream for increased expression. To evaluate whole...... genome shotgun sequencing as a method for predicting antimicrobial resistance properties, one meropenem resistant and five multidrug-resistant blood culture isolates were sequenced and antimicrobial resistance genes and IS elements identified using ResFinder 2.1 (http...

  6. Whole genome sequencing: an efficient approach to ensuring food safety

    Lakicevic, B.; Nastasijevic, I.; Dimitrijevic, M.

    2017-09-01

    Whole genome sequencing is an effective, powerful tool that can be applied to a wide range of public health and food safety applications. A major difference between WGS and the traditional typing techniques is that WGS allows all genes to be included in the analysis, instead of a well-defined subset of genes or variable intergenic regions. Also, the use of WGS can facilitate the understanding of contamination/colonization routes of foodborne pathogens within the food production environment, and can also afford efficient tracking of pathogens’ entry routes and distribution from farm-to-consumer. Tracking foodborne pathogens in the food processing-distribution-retail-consumer continuum is of the utmost importance for facilitation of outbreak investigations and rapid action in controlling/preventing foodborne outbreaks. Therefore, WGS likely will replace most of the numerous workflows used in public health laboratories to characterize foodborne pathogens into one consolidated, efficient workflow.

  7. Whole genome sequencing in clinical and public health microbiology.

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.

  8. SNPassoc: an R package to perform whole genome association studies.

    González, Juan R; Armengol, Lluís; Solé, Xavier; Guinó, Elisabet; Mercader, Josep M; Estivill, Xavier; Moreno, Víctor

    2007-03-01

    The popularization of large-scale genotyping projects has led to the widespread adoption of genetic association studies as the tool of choice in the search for single nucleotide polymorphisms (SNPs) underlying susceptibility to complex diseases. Although the analysis of individual SNPs is a relatively trivial task, when the number is large and multiple genetic models need to be explored, a tool to automate the analyses becomes necessary. In order to address this issue, we developed SNPassoc, an R package to carry out most common analyses in whole genome association studies. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Package SNPassoc is available at CRAN from http://cran.r-project.org. A tutorial is available on Bioinformatics online and in http://davinci.crg.es/estivill_lab/snpassoc.
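    One of the analyses listed above is the calculation of Hardy-Weinberg equilibrium. The Python sketch below shows the textbook chi-square goodness-of-fit version of that check for a single biallelic SNP. It is not SNPassoc code (SNPassoc is an R package and may use a different test, such as an exact test); the function name and input layout are assumptions for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Takes the observed genotype counts (AA, AB, BB) and returns the
    statistic and its p-value on 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Classes with zero expectation (monomorphic SNP) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

    For example, `hwe_chi_square(25, 50, 25)` matches the Hardy-Weinberg expectation exactly, so the statistic is 0 and the p-value is 1. With small counts an exact test is generally preferred to this approximation.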

  9. Methyl-Analyzer--whole genome DNA methylation profiling.

    Xin, Yurong; Ge, Yongchao; Haghighi, Fatemeh G

    2011-08-15

    Methyl-Analyzer is a Python package that analyzes genome-wide DNA methylation data produced by the Methyl-MAPS (methylation mapping analysis by paired-end sequencing) method. Methyl-MAPS is an enzymatic-based method that uses both methylation-sensitive and -dependent enzymes covering >80% of CpG dinucleotides within mammalian genomes. It combines enzymatic-based approaches with high-throughput next-generation sequencing technology to provide whole genome DNA methylation profiles. Methyl-Analyzer processes and integrates sequencing reads from methylated and unmethylated compartments and estimates CpG methylation probabilities at single base resolution. Methyl-Analyzer is available at http://github.com/epigenomics/methylmaps. Sample dataset is available for download at http://epigenomicspub.columbia.edu/methylanalyzer_data.html. Contact: fgh3@columbia.edu. Supplementary data are available at Bioinformatics online.
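
    As a rough illustration of estimating a per-CpG methylation probability from the two compartments, the sketch below assumes that read counts covering a CpG in the methylated and unmethylated compartments are already available, and it uses the posterior mean under a Beta prior. This is an assumed estimator chosen for illustration, not the model implemented in Methyl-Analyzer; the function name and the prior parameters are hypothetical.

```python
def methylation_probability(
    methylated_reads: int,
    unmethylated_reads: int,
    prior_a: float = 1.0,
    prior_b: float = 1.0,
) -> float:
    """Posterior-mean methylation probability for a single CpG.

    methylated_reads / unmethylated_reads: reads covering the CpG in each
    compartment (assumed inputs). prior_a / prior_b: Beta prior parameters
    (assumed; Beta(1, 1) is a flat prior).
    """
    if methylated_reads < 0 or unmethylated_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = methylated_reads + unmethylated_reads + prior_a + prior_b
    return (methylated_reads + prior_a) / total
```

    With the flat prior, a CpG with no coverage in either compartment is assigned 0.5, which makes the lack of evidence explicit instead of raising a division error.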

  10. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  11. Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing

    Plant Ramona N

    2006-08-01

    Full Text Available. Abstract. Background: Whole genome amplification is an increasingly common technique through which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis. Questions of amplification-induced error and template bias generated by these methods have previously been addressed through either small scale (SNPs) or large scale (CGH array, FISH) methodologies. Here we utilized whole genome sequencing to assess amplification-induced bias in both coding and non-coding regions of two bacterial genomes. Halobacterium species NRC-1 DNA and Campylobacter jejuni were amplified by several common, commercially available protocols: multiple displacement amplification, primer extension pre-amplification and degenerate oligonucleotide primed PCR. The amplification-induced bias of each method was assessed by sequencing both genomes in their entirety using the 454 Sequencing System technology and comparing the results with those obtained from unamplified controls. Results: All amplification methodologies induced statistically significant bias relative to the unamplified control. For the Halobacterium species NRC-1 genome, assessed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 119 times greater than those from unamplified material, 164.0 times greater for Repli-G, 165.0 times greater for PEP-PCR and 252.0 times greater than the unamplified controls for DOP-PCR. For Campylobacter jejuni, also analyzed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 15 times greater than those from unamplified material, 19.8 times greater for Repli-G, 61.8 times greater for PEP-PCR and 220.5 times greater than the unamplified controls for DOP-PCR. Conclusion: Of the amplification methodologies examined in this paper, the multiple displacement amplification products generated the least bias, and produced significantly higher yields of amplified DNA.

  12. A whole genome screen for HIV restriction factors

    Liu Li

    2011-11-01

    Full Text Available. Abstract. Background: Upon cellular entry, retroviruses must avoid innate restriction factors produced by the host cell. For human immunodeficiency virus (HIV), the human restriction factors APOBEC3 (apolipoprotein-B-mRNA-editing-enzyme), p21 and tetherin are well characterised. Results: To identify intrinsic resistance factors to HIV-1 replication we screened 19,121 human genes and identified 114 factors with significant inhibition of infection. Those with a known function are involved in a broad spectrum of cellular processes including receptor signalling, vesicle trafficking, transcription, apoptosis, cross-nuclear membrane transport, meiosis, DNA damage repair, ubiquitination and RNA processing. We focused on the PAF1 complex, which has been previously implicated in gene transcription, cell cycle control and mRNA surveillance. Knockdown of all members of the PAF1 family of proteins enhanced HIV-1 reverse transcription and integration of provirus. Over-expression of PAF1 in host cells renders them refractory to HIV-1. Simian Immunodeficiency Viruses and HIV-2 are also restricted in PAF1 expressing cells. PAF1 is expressed in primary monocytes, macrophages and T-lymphocytes and we demonstrate strong activity in MonoMac1, a monocyte cell line. Conclusions: We propose that the PAF1c establishes an anti-viral state to prevent infection by incoming retroviruses. This previously unrecognised mechanism of restriction could have implications for invasion of cells by any pathogen.

  13. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.

    2015-01-01

    There is an increasing demand for biotech-based production of recombinant proteins for use as pharmaceuticals in the food and feed industry and in industrial applications. Yeast Saccharomyces cerevisiae is among preferred cell factories for recombinant protein production, and there is increasing...... interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV...... mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant α-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330...

  14. Whole-Genome Resequencing of Experimental Populations Reveals Polygenic Basis of Egg-Size Variation in Drosophila melanogaster.

    Jha, Aashish R; Miles, Cecelia M; Lippert, Nodia R; Brown, Christopher D; White, Kevin P; Kreitman, Martin

    2015-10-01

    Complete genome resequencing of populations holds great promise in deconstructing complex polygenic traits to elucidate molecular and developmental mechanisms of adaptation. Egg size is a classic adaptive trait in insects, birds, and other taxa, but its highly polygenic architecture has prevented high-resolution genetic analysis. We used replicated experimental evolution in Drosophila melanogaster and whole-genome sequencing to identify consistent signatures of polygenic egg-size adaptation. A generalized linear-mixed model revealed reproducible allele frequency differences between replicated experimental populations selected for large and small egg volumes at approximately 4,000 single nucleotide polymorphisms (SNPs). Several hundred distinct genomic regions contain clusters of these SNPs and have lower heterozygosity than the genomic background, consistent with selection acting on polymorphisms in these regions. These SNPs are also enriched among genes expressed in Drosophila ovaries and many of these genes have well-defined functions in Drosophila oogenesis. Additional genes regulating egg development, growth, and cell size show evidence of directional selection as genes regulating these biological processes are enriched for highly differentiated SNPs. Genetic crosses performed with a subset of candidate genes demonstrated that these genes influence egg size, at least in the large genetic background. These findings confirm the highly polygenic architecture of this adaptive trait, and suggest the involvement of many novel candidate genes in regulating egg size. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  16. Whole genomes redefine the mutational landscape of pancreatic cancer.

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K; Kassahn, Karin S; Bailey, Peter; Johns, Amber L; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C J; Robertson, Alan J; Fadlullah, Muhammad Z H; Bruxner, Tim J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J; Fink, J Lynn; Holmes, Oliver; Kazakoff, Stephen H; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Lee, Hong C; Jones, Marc D; Nagrial, Adnan M; Humphris, Jeremy; Chantrill, Lorraine A; Chin, Venessa; Steinmann, Angela M; Mawson, Amanda; Humphrey, Emily S; Colvin, Emily K; Chou, Angela; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Pettitt, Jessica A; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B; Graham, Janet S; Niclou, Simone P; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A; Gill, Anthony J; Eshleman, James R; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A; Pearson, John V; Biankin, Andrew V; Grimmond, Sean M

    2015-02-26

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded.

  17. Whole genomes redefine the mutational landscape of pancreatic cancer

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K.; Kassahn, Karin S.; Bailey, Peter; Johns, Amber L.; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C. J.; Robertson, Alan J.; Fadlullah, Muhammad Z. H.; Bruxner, Tim J. C.; Christ, Angelika N.; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J.; Fink, J. Lynn; Holmes, Oliver; Kazakoff, Stephen H.; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J.; Lee, Hong C.; Jones, Marc D.; Nagrial, Adnan M.; Humphris, Jeremy; Chantrill, Lorraine A.; Chin, Venessa; Steinmann, Angela M.; Mawson, Amanda; Humphrey, Emily S.; Colvin, Emily K.; Chou, Angela; Scarlett, Christopher J.; Pinho, Andreia V.; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S.; Kench, James G.; Pettitt, Jessica A.; Merrett, Neil D.; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q.; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B.; Graham, Janet S.; Niclou, Simone P.; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H.; Maitra, Anirban; Iacobuzio-Donahue, Christine A.; Wolfgang, Christopher L.; Morgan, Richard A.; Lawlor, Rita T.; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A.; Gill, Anthony J.; Eshleman, James R.; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A.; Pearson, John V.; Biankin, Andrew V.; Grimmond, Sean M.

    2015-01-01

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded. PMID:25719666

  18. Review: Whole genome amplification in preimplantation genetic diagnosis

    Ying-ming ZHENG; Ning WANG; Lei LI; Fan JIN

    2011-01-01

    Preimplantation genetic diagnosis (PGD) refers to a procedure for genetically analyzing embryos prior to implantation, improving the chance of conception for patients at high risk of transmitting specific inherited disorders. This method has been widely used for a large number of genetic disorders since the first successful application in the early 1990s. Polymerase chain reaction (PCR) and fluorescent in situ hybridization (FISH) are the two main methods in PGD, but there are some inevitable shortcomings limiting the scope of genetic diagnosis. Fortunately, different whole genome amplification (WGA) techniques have been developed to overcome these problems. Sufficient DNA can be amplified and multiple tasks which need abundant DNA can be performed. Moreover, WGA products can be analyzed as a template for multi-loci and multi-gene during the subsequent DNA analysis. In this review, we will focus on the currently available WGA techniques and their applications, as well as the new technical trends from WGA products.

  19. MIPS: analysis and annotation of proteins from whole genomes.

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  20. Whole-Genome Sequencing for National Surveillance of Shigella flexneri

    Marie A. Chattaway

    2017-09-01

    Full Text Available. National surveillance of Shigella flexneri ensures the rapid detection of outbreaks to facilitate public health investigation and intervention strategies. In this study, we used whole-genome sequencing (WGS) to type S. flexneri in order to detect linked cases and support epidemiological investigations. We prospectively analyzed 330 isolates of S. flexneri received at the Gastrointestinal Bacteria Reference Unit at Public Health England between August 2015 and January 2016. Traditional phenotypic and WGS sub-typing methods were compared. PCR was carried out on isolates exhibiting phenotypic/genotypic discrepancies with respect to serotype. Phylogenetic relationships between isolates were analyzed by WGS using single nucleotide polymorphism (SNP) typing to facilitate cluster detection. For 306/330 (93%) isolates there was concordance between serotype derived from the genome and phenotypic serology. Discrepant results between the phenotypic and genotypic tests were attributed to novel O-antigen synthesis/modification gene combinations or indels identified in O-antigen synthesis/modification genes rendering them dysfunctional. SNP typing identified 36 clusters of two isolates or more. WGS provided microbiological evidence of epidemiologically linked clusters and detected novel O-antigen synthesis/modification gene combinations associated with two outbreaks. WGS provided reliable and robust data for monitoring trends in the incidence of different serotypes over time. SNP typing can be used to facilitate outbreak investigations in real-time thereby informing surveillance strategies and providing the opportunities for implementing timely public health interventions.
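
    Cluster detection from SNP typing can be sketched as single-linkage grouping on pairwise SNP distance. The abstract does not state the clustering rule or the distance threshold used in the study, so the threshold, the function names and the toy isolate profiles below are all assumptions made for illustration.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned SNP profiles.

    Positions where either profile holds 'N' (no call) are ignored.
    """
    if len(a) != len(b):
        raise ValueError("profiles must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def single_linkage_clusters(profiles: dict[str, str], threshold: int) -> list[list[str]]:
    """Group isolates connected by chains of pairwise distances <= threshold.

    Returns only clusters of two or more isolates, each sorted by name.
    """
    parent = {name: name for name in profiles}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if snp_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Hypothetical aligned SNP profiles for four isolates.
example_profiles = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGTACGTTT",
    "iso4": "TTTTTTTTTT",
}
example_clusters = single_linkage_clusters(example_profiles, threshold=1)
```

    With single linkage, iso1 and iso3 end up in the same cluster even though they differ at two positions, because each is within one SNP of iso2. Whether that chaining behaviour is desirable depends on the surveillance question.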

  1. Signatures of selection in tilapia revealed by whole genome resequencing.

    Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua

    2015-09-16

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can help accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10 to 100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.

  2. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    Francioli, Laurent C.; Menelaou, Andronild; Pulit, Sara L.; Van Dijk, Freerk; Palamara, Pier Francesco; Elbers, Clara C.; Neerincx, Pieter B. T.; Ye, Kai; Guryev, Victor; Kloosterman, Wigard P.; Deelen, Patrick; Abdellaoui, Abdel; Van Leeuwen, Elisabeth M.; Van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F. J.; Karssen, Lennart C.; Kanterakis, Alexandros; Amin, Najaf; Hottenga, Jouke Jan; Lameijer, Eric-Wubbo; Kattenberg, Mathijs; Dijkstra, Martijn; Byelas, Heorhiy; Van Settenl, Jessica; Van Schaik, Barbera D. C.; Bot, Jan; Nijman, Isaac J.; Renkens, Ivo; Marscha, Tobias; Schonhuth, Alexander; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Polak, Paz; Sohail, Mashaal; Vuzman, Dana; Hormozdiari, Fereydoun; Van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Moed, Ma-Tthijs H.; Van der Velde, K. Joeri; Rivadeneira, Fernando; Estrada, Karol; Medina-Gomez, Carolina; Isaacs, Aaron; Platteel, Mathieu; Swertz, Morris A.; Wijmenga, Cisca

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  3. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    The Genome of the Netherlands Consortium; T. Marschall (Tobias); A. Schönhuth (Alexander)

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch

  4. Environmental whole-genome amplification to access microbial populations in contaminated sediments

    Abulencia, Carl B [Diversa Corporation; Wyborski, Denise L. [Diversa Corporation; Garcia, Joseph A. [Diversa Corporation; Podar, Mircea [ORNL; Chen, Wenqiong [Diversa Corporation; Chang, Sherman H. [Diversa Corporation; Chang, Hwai W. [Diversa Corporation; Watson, David B [ORNL; Brodie, Eoin L. [Lawrence Berkeley National Laboratory (LBNL); Hazen, Terry [Lawrence Berkeley National Laboratory (LBNL); Keller, Martin [ORNL

    2006-05-01

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and 'clusters of orthologous groups' (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

  5. Environmental Whole-Genome Amplification to Access Microbial Diversity in Contaminated Sediments

    Abulencia, C.B.; Wyborski, D.L.; Garcia, J.; Podar, M.; Chen, W.; Chang, S.H.; Chang, H.W.; Watson, D.; Brodie,E.I.; Hazen, T.C.; Keller, M.

    2005-12-10

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2 percent genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9 percent of the sequences had significant similarities to known proteins, and 'clusters of orthologous groups' (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

  6. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation.

    Frank Technow

    Full Text Available. Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.
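
    The general idea of approximate Bayesian computation can be sketched with a basic rejection sampler: draw a parameter from the prior, run a simulator, and keep the draw if the simulated data fall close enough to the observed data. The sketch below is a generic illustration and not the procedure of the study; the one-parameter toy "growth model", the environment values, the prior, the distance and the tolerance are all hypothetical stand-ins for a real crop growth model and marker effects.

```python
import math
import random

# Hypothetical environmental covariate for four trials (assumed values).
ENVIRONMENT = [1.0, 2.0, 3.0, 4.0]


def toy_growth_model(theta: float, rng: random.Random) -> list[float]:
    """Toy simulator: yield is theta times the covariate plus small noise."""
    return [theta * x + rng.gauss(0.0, 0.1) for x in ENVIRONMENT]


def rmse(a: list[float], b: list[float]) -> float:
    """Root mean squared difference between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def abc_rejection(observed, simulate, sample_prior, distance, tolerance, n_draws, rng):
    """Basic ABC rejection sampler.

    Keeps every prior draw whose simulated data lie within `tolerance`
    of the observed data; the kept draws approximate the posterior.
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        simulated = simulate(theta, rng)
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)                   # fixed seed for reproducibility
observed_yields = [2.0, 4.0, 6.0, 8.0]   # synthetic data generated with theta = 2
accepted = abc_rejection(
    observed_yields,
    toy_growth_model,
    lambda r: r.uniform(0.0, 5.0),       # assumed flat prior on theta
    rmse,
    tolerance=0.5,
    n_draws=5000,
    rng=rng,
)
posterior_mean = sum(accepted) / len(accepted)
```

    The appeal of this scheme is that the simulator is treated as a black box, so no likelihood has to be written down for it. Rejection sampling becomes inefficient as the number of parameters grows, so a whole-genome application would be expected to need a more elaborate ABC variant.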

  7. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  8. Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.

    Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei

    2015-05-01

    Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminate power were used to scan asthma-specific diagnostic markers. For fold-change>2 and p childhood asthma model for prediction and diagnosis.

  9. Whole genome transcript profiling of drug induced steatosis in rats reveals a gene signature predictive of outcome.

    Nishika Sahini

    Drug induced steatosis (DIS) is characterised by excess triglyceride accumulation in the form of lipid droplets (LD) in liver cells. To explore mechanisms underlying DIS we interrogated the publicly available microarray data from the Japanese Toxicogenomics Project (TGP) to study comprehensively whole genome gene expression changes in the liver of treated rats. For this purpose a total of 17 and 12 drugs which are diverse in molecular structure and mode of action were considered based on their ability to cause either steatosis or phospholipidosis, respectively, while 7 drugs served as negative controls. In our efforts we focused on 200 genes which are considered to be mechanistically relevant in the process of lipid droplet biogenesis in hepatocytes as recently published (Sahini and Borlak, 2014). Based on mechanistic considerations we identified 19 genes which displayed dose dependent responses while 10 genes showed time dependency. Importantly, the present study defined 9 genes (ANGPTL4, FABP7, FADS1, FGF21, GOT1, LDLR, GK, STAT3, and PKLR) as signature genes to predict DIS. Moreover, cross tabulation revealed 9 genes to be regulated ≥10 times amongst the various conditions and included genes linked to glucose metabolism, lipid transport and lipogenesis as well as signalling events. Additionally, a comparison between drugs causing phospholipidosis and/or steatosis revealed 26 genes to be regulated in common including 4 signature genes to predict DIS (PKLR, GK, FABP7 and FADS1). Furthermore, a comparison between in vivo single dose (3, 6, 9 and 24 h) and findings from rat hepatocyte studies (2 h, 8 h, 24 h) identified 10 genes which are regulated in common and contained 2 DIS signature genes (FABP7, FGF21). Altogether, our studies provide comprehensive information on mechanistically linked gene expression changes of a range of drugs causing steatosis and phospholipidosis and encourage the screening of DIS signature genes at the preclinical stage.

  10. Whole genome sequence analysis of the arctic-lineage strain responsible for distemper in Italian wolves and dogs through a fast and robust next generation sequencing protocol.

    Marcacci, Maurilia; Ancora, Massimo; Mangone, Iolanda; Teodori, Liana; Di Sabatino, Daria; De Massis, Fabrizio; Camma', Cesare; Savini, Giovanni; Lorusso, Alessio

    2014-06-01

    Dynamic surveillance and characterization of canine distemper virus (CDV) circulating strains are essential against possible vaccine breakthroughs events. This study describes the setup of a fast and robust next-generation sequencing (NGS) Ion PGM™ protocol that was used to obtain the complete genome sequence of a CDV isolate (CDV2784/2013). CDV2784/2013 is the prototype of CDV strains responsible for severe clinical distemper in dogs and wolves in Italy during 2013. CDV2784/2013 was isolated on cell culture and total RNA was used for NGS sample preparation. A total of 112.3 Mb of reads were assembled de novo using MIRA version 4.0rc4, which yielded a total number of 403 contigs with 12.1% coverage. The whole genome (15,690 bp) was recovered successfully and compared to those of existing CDV whole genomes. CDV2784/2013 was shown to have 92% nt identity with the Onderstepoort vaccine strain. This study describes for the first time a fast and robust Ion PGM™ platform-based whole genome amplification protocol for non-segmented negative stranded RNA viruses starting from total cell-purified RNA. Additionally, this is the first study reporting the whole genome analysis of an Arctic lineage strain that is known to circulate widely in Europe, Asia and USA. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. New perspectives on microbial community distortion after whole-genome amplification

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass, however potential selective biases during the amplification processes are poorly understood. Here, we describe the e...

  12. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification.

    Direito, S.; Zaura, E.; Little, M.; Ehrenfreund, P.; Roling, W.F.M.

    2014-01-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement

  13. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA; Vos, M. de; Louw, GE; Merwe, RG van der; Dippenaar, A.; Streicher, EM; Abdallah, AM; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; Helden, PD van; Warren, RM; Pain, Arnab

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug

  14. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification

    Direito, S.O.L.; Zaura, E.; Little, M.; Ehrenfreund, P.; Röling, W.F.M.

    2014-01-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement

  15. Tracing Mycobacterium tuberculosis transmission by whole genome sequencing in a high incidence setting

    Bjorn-Mortensen, K; Soborg, B; Koch, A

    2016-01-01

    In East Greenland, a dramatic increase of tuberculosis (TB) incidence has been observed in recent years. Classical genotyping suggests a genetically similar Mycobacterium tuberculosis (Mtb) strain population as cause, however, precise transmission patterns are unclear. We performed whole genome...

  16. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences

    Coll, Francesc; McNerney, Ruth; Preston, Mark D; Guerra-Assunção, José Afonso; Warry, Andrew; Hill-Cawthorne, Grant A.; Mallard, Kim; Nair, Mridul; Miranda, Anabela; Alves, Adriana; Perdigão, João; Viveiros, Miguel; Portugal, Isabel; Hasan, Zahra; Hasan, Rumina; Glynn, Judith R; Martin, Nigel; Pain, Arnab; Clark, Taane G

    2015-01-01

    Mycobacterium tuberculosis drug resistance (DR) challenges effective tuberculosis disease control. Current molecular tests examine limited numbers of mutations, and although whole genome sequencing approaches could fully characterise DR, data

  17. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  18. Whole genome sequencing as the ultimate tool to diagnose tuberculosis

    Dick van Soolingen

    2016-01-01

    In the past two decades, DNA techniques have been increasingly used in the laboratory diagnosis of tuberculosis (TB). The (sub)species of the Mycobacterium tuberculosis complex are usually identified using reverse line blot techniques. The resistance is predicted by the detection of mutations in genes associated with resistance. Nevertheless, all cases are still subjected to cumbersome phenotypic resistance testing. The production of a strain-characteristic DNA fingerprint, to investigate the epidemiology of TB, is done by the 24-locus variable number tandem repeat (VNTR) typing. However, most of the molecular techniques in the diagnosis of TB can eventually be replaced by whole genome sequencing (WGS). Many international TB reference laboratories are currently working on the introduction of WGS; however, standardization in the international context is lacking. The European Centre for Infectious Disease Prevention and Control in Stockholm, Sweden organizes a yearly round of quality control on VNTR typing and in 2015 for the first time also WGS. In this first proficiency study, only three out of eight international TB laboratories produced WGS results in line with those of the reference laboratory. The whole process of DNA isolation, purification, quantification, sequencing, and analysis/interpretation of data is still under development. In this presentation, many aspects will be covered that influence the quality and interpretation of WGS results. The turn-around-time, analysis, and utility of WGS will be discussed. Moreover, the experiences in the use of WGS in the molecular epidemiology of TB in The Netherlands are detailed. It can be concluded that many difficulties still have to be conquered. The state of the art is that bacteria still have to be cultured to have sufficient quality and quantity of DNA for successful WGS. The quality of sequencing has improved significantly over the past 7 years, and the detection of mutations has, therefore

  19. A site specific model and analysis of the neutral somatic mutation rate in whole-genome cancer data.

    Bertl, Johanna; Guo, Qianyun; Juul, Malene; Besenbacher, Søren; Nielsen, Morten Muhlig; Hornshøj, Henrik; Pedersen, Jakob Skou; Hobolth, Asger

    2018-04-19

    Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation rate differs between cancer types, between patients and along the genome depending on the genetic and epigenetic context. Therefore, methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. A major drawback of most methods is the need to average the explanatory variables across the entire region or genomic element. This procedure is particularly problematic if the explanatory variable varies dramatically in the element under consideration. To take into account the fine scale of the explanatory variables, we model the probabilities of different types of mutations for each position in the genome by multinomial logistic regression. We analyse 505 cancer genomes from 14 different cancer types and compare the performance in predicting mutation rate for both regional based models and site-specific models. We show that for 1000 randomly selected genomic positions, the site-specific model predicts the mutation rate much better than regional based models. We use a forward selection procedure to identify the most important explanatory variables. The procedure identifies site-specific conservation (phyloP), replication timing, and expression level as the best predictors for the mutation rate. Finally, our model confirms and quantifies certain well-known mutational signatures. We find that our site-specific multinomial regression model outperforms the regional based models. The possibility of including genomic variables on different scales and patient specific variables makes it a versatile framework for studying different mutational mechanisms. Our model can serve as the neutral null model
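    A per-position multinomial logistic model of the kind described above can be sketched as a softmax over outcome classes. The sketch below is an assumption-laden illustration: the outcome labels, the coefficient values, and the choice of "none" (no mutation) as the reference class are hypothetical; only the three explanatory variables are named in the abstract.

```python
import math

def mutation_probabilities(features, coefficients):
    """Softmax over outcome classes for one genomic position.

    'none' (no mutation) is the reference class with a fixed score of 0.
    """
    scores = {"none": 0.0}
    for outcome, weights in coefficients.items():
        scores[outcome] = weights["intercept"] + sum(
            weights[name] * value for name, value in features.items())
    max_score = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {k: math.exp(v - max_score) for k, v in scores.items()}
    total = sum(exp_scores.values())
    return {k: v / total for k, v in exp_scores.items()}

# Hypothetical coefficients; a real model would estimate these from data.
coefficients = {
    "transition":   {"intercept": -12.0, "phyloP": -0.20,
                     "replication_timing": 0.50, "expression": -0.10},
    "transversion": {"intercept": -13.0, "phyloP": -0.15,
                     "replication_timing": 0.40, "expression": -0.05},
}
site = {"phyloP": 1.2, "replication_timing": 0.8, "expression": 3.0}

probabilities = mutation_probabilities(site, coefficients)
print(probabilities)
```

    Because the explanatory variables enter per position, nothing has to be averaged over a region; summing the per-site probabilities over an element gives its expected mutation count.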

  20. An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data.

    Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John

    2018-03-07

    DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis is employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrate a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of
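    The two information-theoretic quantities named in this abstract have simple definitions for discrete distributions. The sketch below applies them to hypothetical distributions over a handful of methylation levels; it does not reproduce the Ising-model machinery that the authors use to obtain such distributions from WGBS reads.

```python
import math

def shannon_entropy(p):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence; lies in [0, 1] with log base 2."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding

# Hypothetical distributions over four methylation levels in one genomic region.
reference = [0.70, 0.20, 0.08, 0.02]   # mostly unmethylated
test = [0.05, 0.15, 0.30, 0.50]        # mostly methylated

print(shannon_entropy(reference), shannon_entropy(test))
print(jensen_shannon_distance(reference, test))
```

    Entropy summarizes how variable methylation is within one sample, while the Jensen-Shannon distance compares two samples and is symmetric and bounded, which makes regions comparable genome-wide.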

  1. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation.

    Cuypers, Thomas D; Hogeweg, Paulien

    2014-04-01

    Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and adaptation remains unknown. Here, we study the duplicate retention pattern postWGD, by letting virtual cells adapt to environmental changes. The virtual cells have structured genomes that encode a regulatory network and simple metabolism. Populations are under selection for homeostasis and evolve by point mutations, small indels and WGD. After populations had initially adapted fully to fluctuating resource conditions re-adaptation to a broad range of novel environments was studied by tracking mutations in the line of descent. WGD was established in a minority (≈30%) of lineages, yet, these were significantly more successful at re-adaptation. Unexpectedly, WGD lineages conserved more seemingly redundant genes, yet had higher per gene mutation rates. While WGD duplicates of all functional classes were significantly over-retained compared to a model of neutral losses, duplicate retention was clearly biased towards highly connected TFs. Importantly, no subfunctionalization occurred in conserved pairs, strongly suggesting that dosage balance shaped retention. Meanwhile, singles diverged significantly. WGD, therefore, is a powerful mechanism to cope with environmental change, allowing conservation of a core machinery, while adapting the peripheral network to accommodate change.

  2. Whole-genome transcriptional analysis of heavy metal stresses in Caulobacter crescentus

    Hu, Ping; Brodie, Eoin L.; Suzuki, Yohey; McAdams, Harley H.; Andersen, Gary L.

    2005-09-21

    The bacterium Caulobacter crescentus and related stalk bacterial species are known for their distinctive ability to live in low nutrient environments, a characteristic of most heavy metal contaminated sites. Caulobacter crescentus is a model organism for studying cell cycle regulation with well developed genetics. We have identified the pathways responding to heavy metal toxicity in C. crescentus to provide insights for possible application of Caulobacter to environmental restoration. We exposed C. crescentus cells to four heavy metals (chromium, cadmium, selenium and uranium) and analyzed genome wide transcriptional activities post exposure using an Affymetrix GeneChip microarray. C. crescentus showed surprisingly high tolerance to uranium, a possible mechanism for which may be formation of extracellular calcium-uranium-phosphate precipitates. The principal response to these metals was protection against oxidative stress (up-regulation of manganese-dependent superoxide dismutase, sodA). Glutathione S-transferase, thioredoxin, glutaredoxins and DNA repair enzymes responded most strongly to cadmium and chromate. The cadmium and chromium stress response also focused on reducing the intracellular metal concentration, with multiple efflux pumps employed to remove cadmium while a sulfate transporter was down-regulated to reduce non-specific uptake of chromium. Membrane proteins were also up-regulated in response to most of the metals tested. A two-component signal transduction system involved in the uranium response was identified. Several differentially regulated transcripts from regions previously not known to encode proteins were identified, demonstrating the advantage of evaluating the transcriptome using whole genome microarrays.

  3. Using beta-binomial regression for high-precision differential methylation analysis in multifactor whole-genome bisulfite sequencing experiments

    2014-01-01

    Background: Whole-genome bisulfite sequencing currently provides the highest-precision view of the epigenome, with quantitative information about populations of cells down to single nucleotide resolution. Several studies have demonstrated the value of this precision: meaningful features that correlate strongly with biological functions can be found associated with only a few CpG sites. Understanding the role of DNA methylation, and more broadly the role of DNA accessibility, requires that methylation differences between populations of cells are identified with extreme precision and in complex experimental designs. Results: In this work we investigated the use of beta-binomial regression as a general approach for modeling whole-genome bisulfite data to identify differentially methylated sites and genomic intervals. Conclusions: The regression-based analysis can handle medium- and large-scale experiments where it becomes critical to accurately model variation in methylation levels between replicates and account for influence of various experimental factors like cell types or batch effects. PMID:24962134
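    As an illustration of the modeling idea, the following sketch evaluates a beta-binomial log-likelihood for methylated read counts at a single CpG site, with the mean methylation level tied to one experimental factor through a logit link. The mean/dispersion parameterization, the single binary covariate, the fixed parameter values, and the read counts are all assumptions made for this example; they are not the authors' implementation or data.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function, computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_logpmf(k, n, mu, rho):
    """log P(k methylated reads out of n) with mean mu and overdispersion rho in (0, 1)."""
    a = mu * (1.0 - rho) / rho
    b = (1.0 - mu) * (1.0 - rho) / rho
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def log_likelihood(samples, intercept, group_effect, rho):
    """Sum over replicates; each sample is (methylated, coverage, group)."""
    total = 0.0
    for methylated, coverage, group in samples:
        # Logit link: the covariate shifts the log-odds of methylation.
        mu = 1.0 / (1.0 + math.exp(-(intercept + group_effect * group)))
        total += beta_binomial_logpmf(methylated, coverage, mu, rho)
    return total

# Hypothetical read counts at one site: two replicates per group.
samples = [(18, 20, 0), (17, 22, 0), (6, 19, 1), (4, 21, 1)]

ll_null = log_likelihood(samples, intercept=0.2, group_effect=0.0, rho=0.05)
ll_effect = log_likelihood(samples, intercept=1.7, group_effect=-2.8, rho=0.05)
print(ll_null, ll_effect)
```

    In a full analysis the coefficients and the dispersion would be estimated by maximizing this likelihood, and the fit with and without the factor of interest would be compared to call a site differentially methylated.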

  4. Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.

    McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael

    2014-08-01

    Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.

  5. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  6. Whole-genome sequencing reveals the mechanisms for evolution of streptomycin resistance in Lactobacillus plantarum.

    Zhang, Fuxin; Gao, Jiayuan; Wang, Bini; Huo, Dongxue; Wang, Zhaoxia; Zhang, Jiachao; Shao, Yuyu

    2018-04-01

    In this research, we investigated the evolution of streptomycin resistance in Lactobacillus plantarum ATCC14917, which was passaged in medium containing a gradually increasing concentration of streptomycin. After 25 d, the minimum inhibitory concentration (MIC) of L. plantarum ATCC14917 had reached 131,072 µg/mL, which was 8,192-fold higher than the MIC of the original parent isolate. The highly resistant L. plantarum ATCC14917 isolate was then passaged in antibiotic-free medium to determine the stability of resistance. The MIC value of the L. plantarum ATCC14917 isolate decreased to 2,048 µg/mL after 35 d but remained constant thereafter, indicating that resistance was irreversible even in the absence of selection pressure. Whole-genome sequencing of parent isolates, control isolates, and isolates following passage was used to study the resistance mechanism of L. plantarum ATCC14917 to streptomycin and adaptation in the presence and absence of selection pressure. Five mutated genes (single nucleotide polymorphisms and structural variants) were verified in highly resistant L. plantarum ATCC14917 isolates, which were related to ribosomal protein S12, LPXTG-motif cell wall anchor domain protein, LrgA family protein, Ser/Thr phosphatase family protein, and a hypothetical protein that may correlate with resistance to streptomycin. After passage in streptomycin-free medium, only the mutant gene encoding ribosomal protein S12 remained; the other 4 mutant genes had reverted to the wild type as found in the parent isolate. Although the MIC value of L. plantarum ATCC14917 was reduced in the absence of selection pressure, it remained 128-fold higher than the MIC value of the parent isolate, indicating that ribosomal protein S12 may play an important role in streptomycin resistance. Using the mobile elements database, we demonstrated that streptomycin resistance-related genes in L. plantarum ATCC14917 were not located on mobile elements. This research offers a way of
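    The fold changes quoted in this abstract are mutually consistent, which a short calculation makes explicit. The MIC values and fold changes are taken from the abstract; the parent MIC itself is not stated there and is derived here, and the variable names are illustrative only.

```python
import math

evolved_mic = 131072          # µg/mL after 25 d of selection (from the abstract)
fold_over_parent = 8192       # reported fold increase over the parent isolate
mic_after_relaxation = 2048   # µg/mL after 35 d without streptomycin

parent_mic = evolved_mic / fold_over_parent           # implied parent MIC in µg/mL
residual_fold = mic_after_relaxation / parent_mic     # fold over parent that persists
doublings_gained = math.log2(fold_over_parent)        # two-fold steps gained under selection
doublings_lost = math.log2(evolved_mic / mic_after_relaxation)  # two-fold steps lost afterwards

print(parent_mic, residual_fold, doublings_gained, doublings_lost)
```

    The implied parent MIC of 16 µg/mL reproduces the reported 128-fold residual resistance, and expressing the changes as two-fold steps shows that fewer than half of the gained doublings were lost once selection was removed.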

  7. Effects of DNA mass on multiple displacement whole genome amplification and genotyping performance

    Haque Kashif A

    2005-09-01

    Background: Whole genome amplification (WGA) promises to eliminate practical molecular genetic analysis limitations associated with genomic DNA (gDNA) quantity. We evaluated the performance of multiple displacement amplification (MDA) WGA using gDNA extracted from lymphoblastoid cell lines (N = 27) with a range of starting gDNA input of 1–200 ng into the WGA reaction. Yield and composition analysis of whole genome amplified DNA (wgaDNA) was performed using three DNA quantification methods (OD, PicoGreen® and RT-PCR). Two panels of N = 15 STR (using the AmpFlSTR® Identifiler® panel) and N = 49 SNP (TaqMan®) genotyping assays were performed on each gDNA and wgaDNA sample in duplicate. gDNA and wgaDNA masses of 1, 4 and 20 ng were used in the SNP assays to evaluate the effects of DNA mass on SNP genotyping assay performance. A total of N = 6,880 STR and N = 56,448 SNP genotype attempts provided adequate power to detect differences in STR and SNP genotyping performance between gDNA and wgaDNA, and among wgaDNA produced from a range of gDNA template inputs. Results: The proportion of double-stranded wgaDNA and human-specific PCR amplifiable wgaDNA increased with increased gDNA input into the WGA reaction. Increased amounts of gDNA input into the WGA reaction improved wgaDNA genotyping performance. Genotype completion or genotype concordance rates of wgaDNA produced from all gDNA input levels were observed to be reduced compared to gDNA, although the reduction was not always statistically significant. Reduced wgaDNA genotyping performance was primarily due to the increased variance of allelic amplification, resulting in loss of heterozygosity or increased undetermined genotypes. MDA WGA produces wgaDNA from no template control samples; such samples exhibited substantial false-positive genotyping rates. Conclusion: The amount of gDNA input into the MDA WGA reaction is a critical determinant of genotyping performance of wgaDNA. At least 10 ng of

  8. The "most wanted" taxa from the human microbiome for whole genome sequencing.

    Anthony A Fodor

    Full Text Available The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP's 16S data sets to several reference 16S collections to create a 'most wanted' list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, few novel taxa are represented by these OTUs and many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the 'most wanted', and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the 'most wanted' organisms are poorly represented among culture collections, suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.

  9. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. Amplification of the whole genome sequence of mtDNA was performed with 4 pairs of primers. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequences of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in the present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which show good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
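    The abstract mentions a consistency check "by the Kappa method" without giving details. As a rough illustration only, and assuming this refers to Cohen's kappa computed over per-position base calls from two tissues, a minimal Python sketch might look like the following. The function name `cohens_kappa` and the example base calls are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions where both calls match.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies, summed per category.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both sequences consist of one identical category; agreement is total.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical base calls at the same mtDNA positions in two tissues.
blood = ["A", "C", "G", "T", "A", "C", "G", "T"]
hair = ["A", "C", "G", "T", "A", "C", "G", "C"]
kappa = cohens_kappa(blood, hair)
```

    Values near 1 indicate agreement well beyond chance; whether the study used this exact statistic or a variant of it is not stated in the abstract.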

  10. Advanced Whole-Genome Sequencing and Analysis of Fetal Genomes from Amniotic Fluid.

    Mao, Qing; Chin, Robert; Xie, Weiwei; Deng, Yuqing; Zhang, Wenwei; Xu, Huixin; Zhang, Rebecca Yu; Shi, Quan; Peters, Erin E; Gulbahce, Natali; Li, Zhenyu; Chen, Fang; Drmanac, Radoje; Peters, Brock A

    2018-04-01

    Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 (CHD8) and LDL receptor-related protein 1 (LRP1), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures. © 2018 American Association for Clinical Chemistry.

  11. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  12. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave

    2011-01-01

    The development in sequencing methods has made it possible to perform whole genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge...... (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The amount of generated sequences amounts to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced...... comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a beginning of generating the whole genome sequence for these species. The continuation of establishing the genome...

  13. Whole-Genome Sequences of Two Borrelia afzelii and Two Borrelia garinii Lyme Disease Agent Isolates

    Casjens, S.R.; Dunn, J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Fraser-Liggett, C. M.; Schutzer, S. E.

    2011-12-01

    Human Lyme disease is commonly caused by several species of spirochetes in the Borrelia genus. In Eurasia these species are largely Borrelia afzelii, B. garinii, B. burgdorferi, and B. bavariensis sp. nov. Whole-genome sequencing is an excellent tool for investigating and understanding the influence of bacterial diversity on the pathogenesis and etiology of Lyme disease. We report here the whole-genome sequences of four isolates from two of the Borrelia species that cause human Lyme disease, B. afzelii isolates ACA-1 and PKo and B. garinii isolates PBr and Far04.

  14. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...

  15. Increased frequency of single base substitutions in a population of transcripts expressed in cancer cells

    Bianchetti Laurent

    2012-11-01

    Full Text Available Abstract. Background: Single Base Substitutions (SBS) that alter transcripts expressed in cancer originate from somatic mutations. However, recent studies report SBS in transcripts that are not supported by the genomic DNA of tumor cells. Methods: We used sequence based whole genome expression profiling, namely Long-SAGE (L-SAGE) and Tag-seq (a combination of L-SAGE and deep sequencing), and computational methods to identify transcripts with greater SBS frequencies in cancer. Millions of tags produced by 40 healthy and 47 cancer L-SAGE experiments were compared to 1,959 Reference Tags (RT), i.e. tags matching the human genome exactly once. Similarly, tens of millions of tags produced by 7 healthy and 8 cancer Tag-seq experiments were compared to 8,572 RT. For each transcript, SBS frequencies in healthy and cancer cells were statistically tested for equality. Results: In the L-SAGE and Tag-seq experiments, 372 and 4,289 transcripts, respectively, showed greater SBS frequencies in cancer. Increased SBS frequencies could not be attributed to known Single Nucleotide Polymorphisms (SNP), catalogued somatic mutations or RNA-editing enzymes. Hypothesizing that Single Tags (ST), i.e. tags sequenced only once, were indicators of SBS, we observed that ST proportions were heterogeneously distributed across Embryonic Stem Cells (ESC), healthy differentiated and cancer cells. ESC had the lowest ST proportions, whereas cancer cells had the greatest. Finally, in a series of experiments carried out on a single patient at 1 healthy and 3 consecutive tumor stages, we could show that SBS frequencies increased during cancer progression. Conclusion: If the mechanisms generating the base substitutions could be known, increased SBS frequency in transcripts would be a new useful biomarker of cancer. With the reduction of sequencing cost, sequence based whole genome expression profiling could be used to characterize increased SBS frequency in a patient’s tumor and aid diagnosis.

  16. Increased frequency of single base substitutions in a population of transcripts expressed in cancer cells

    Bianchetti, Laurent; Kieffer, David; Féderkeil, Rémi; Poch, Olivier

    2012-01-01

    Single Base Substitutions (SBS) that alter transcripts expressed in cancer originate from somatic mutations. However, recent studies report SBS in transcripts that are not supported by the genomic DNA of tumor cells. We used sequence based whole genome expression profiling, namely Long-SAGE (L-SAGE) and Tag-seq (a combination of L-SAGE and deep sequencing), and computational methods to identify transcripts with greater SBS frequencies in cancer. Millions of tags produced by 40 healthy and 47 cancer L-SAGE experiments were compared to 1,959 Reference Tags (RT), i.e. tags matching the human genome exactly once. Similarly, tens of millions of tags produced by 7 healthy and 8 cancer Tag-seq experiments were compared to 8,572 RT. For each transcript, SBS frequencies in healthy and cancer cells were statistically tested for equality. In the L-SAGE and Tag-seq experiments, 372 and 4,289 transcripts, respectively, showed greater SBS frequencies in cancer. Increased SBS frequencies could not be attributed to known Single Nucleotide Polymorphisms (SNP), catalogued somatic mutations or RNA-editing enzymes. Hypothesizing that Single Tags (ST), i.e. tags sequenced only once, were indicators of SBS, we observed that ST proportions were heterogeneously distributed across Embryonic Stem Cells (ESC), healthy differentiated and cancer cells. ESC had the lowest ST proportions, whereas cancer cells had the greatest. Finally, in a series of experiments carried out on a single patient at 1 healthy and 3 consecutive tumor stages, we could show that SBS frequencies increased during cancer progression. If the mechanisms generating the base substitutions could be known, increased SBS frequency in transcripts would be a new useful biomarker of cancer. With the reduction of sequencing cost, sequence based whole genome expression profiling could be used to characterize increased SBS frequency in a patient’s tumor and aid diagnosis.
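    The abstract says that, for each transcript, SBS frequencies in healthy and cancer cells were "statistically tested for equality" but does not name the test. Purely as an illustration, the sketch below assumes a pooled two-proportion z-test on tag counts; the function name `two_proportion_z_test` and all counts are hypothetical and do not come from the paper.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for equality of two proportions.

    x1/n1: substituted tags / total tags in group 1 (e.g. healthy)
    x2/n2: substituted tags / total tags in group 2 (e.g. cancer)
    Returns (z, p_value); z > 0 means group 2 has the higher frequency.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # No variation at all (all or none substituted): nothing to test.
        return 0.0, 1.0
    z = (p2 - p1) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical tag counts for one transcript (assumed values).
z, p = two_proportion_z_test(x1=20, n1=10_000, x2=60, n2=10_000)
```

    With thousands of transcripts tested, a real analysis would also need a multiple-testing correction; the abstract does not say which, if any, was applied.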

  17. Challenging a bioinformatic tool's ability to detect microbial contaminants using in silico whole genome sequencing data.

    Olson, Nathan D; Zook, Justin M; Morrow, Jayne B; Lin, Nancy J

    2017-01-01

    High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in-silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of one in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods.

  18. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation.

    Thomas D Cuypers

    2014-04-01

    Full Text Available Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and, more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and adaptation remains unknown. Here, we study the duplicate retention pattern post-WGD by letting virtual cells adapt to environmental changes. The virtual cells have structured genomes that encode a regulatory network and simple metabolism. Populations are under selection for homeostasis and evolve by point mutations, small indels and WGD. After populations had initially adapted fully to fluctuating resource conditions, re-adaptation to a broad range of novel environments was studied by tracking mutations in the line of descent. WGD was established in a minority (≈30%) of lineages, yet these were significantly more successful at re-adaptation. Unexpectedly, WGD lineages conserved more seemingly redundant genes, yet had higher per gene mutation rates. While WGD duplicates of all functional classes were significantly over-retained compared to a model of neutral losses, duplicate retention was clearly biased towards highly connected TFs. Importantly, no subfunctionalization occurred in conserved pairs, strongly suggesting that dosage balance shaped retention. Meanwhile, singles diverged significantly. WGD, therefore, is a powerful mechanism to cope with environmental change, allowing conservation of a core machinery, while adapting the peripheral network to accommodate change.

  19. Whole genome grey and white matter DNA methylation profiles in dorsolateral prefrontal cortex.

    Sanchez-Mut, Jose Vicente; Heyn, Holger; Vidal, Enrique; Delgado-Morales, Raúl; Moran, Sebastian; Sayols, Sergi; Sandoval, Juan; Ferrer, Isidre; Esteller, Manel; Gräff, Johannes

    2017-06-01

    The brain's neocortex is anatomically organized into grey and white matter, which are mainly composed of neuronal and glial cells, respectively. The neocortex can be further divided into different Brodmann areas according to their cytoarchitectural organization, which are associated with distinct cortical functions. There is increasing evidence that brain development and function are governed by epigenetic processes, yet their contribution to the functional organization of the neocortex remains incompletely understood. Herein, we determined the DNA methylation patterns of grey and white matter of dorsolateral prefrontal cortex (Brodmann area 9), an important region for higher cognitive skills that is particularly affected in various neurological diseases. To avoid interindividual differences, we analyzed white and grey matter from the same donor using whole genome bisulfite sequencing, and to validate their biological significance, we used Infinium HumanMethylation450 BeadChip and pyrosequencing in ten and twenty independent samples, respectively. The combination of these analyses indicated robust grey-white matter differences in DNA methylation. What is more, cell type-specific markers were enriched among the most differentially methylated genes. Interestingly, we also found an outstanding number of grey-white matter differentially methylated genes that have previously been associated with Alzheimer's, Parkinson's, and Huntington's disease, as well as Multiple and Amyotrophic lateral sclerosis. The data presented here thus constitute an important resource for future studies not only to gain insight into brain regional as well as grey and white matter differences, but also to unmask epigenetic alterations that might underlie neurological and neurodegenerative diseases. © 2017 Wiley Periodicals, Inc.

  20. Rice-arsenate interactions in hydroponics: whole genome transcriptional analysis.

    Norton, Gareth J; Lou-Hing, Daniel E; Meharg, Andrew A; Price, Adam H

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population.

  1. Rice–arsenate interactions in hydroponics: whole genome transcriptional analysis

    Norton, Gareth J.; Lou-Hing, Daniel E.; Meharg, Andrew A.; Price, Adam H.

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population. PMID:18453530

  2. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

    Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

    2015-04-23

    With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.

  3. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    Younhee Shin

    2016-09-01

    Full Text Available Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large amount of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD 3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 (Btau 4.6.1) SNPs that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding.
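    The novel-SNP percentages quoted in this abstract can be recomputed from the SNP counts it reports. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the counts are copied from the abstract text.

```python
# (novel SNPs, total putative SNPs) per reference assembly, as reported above.
snp_counts = {
    "UMD 3.1": (1_095_922, 2_473_884),
    "Btau 4.6.1": (982_674, 2_402_997),
}

# Share of putative SNPs that are novel, as a percentage rounded to one decimal.
novel_pct = {
    ref: round(100 * novel / total, 1)
    for ref, (novel, total) in snp_counts.items()
}

for ref, pct in novel_pct.items():
    print(f"{ref}: {pct}% of putative SNPs are novel")
```

    Both recomputed values agree with the reported 44.3% and 40.9%.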

  4. Whole genome transcript profiling from fingerstick blood samples: a comparison and feasibility study

    Williams Adam R

    2009-12-01

    Full Text Available Abstract. Background: Whole genome gene expression profiling has revolutionized research in the past decade, especially with the advent of microarrays. Recently, there have been significant improvements in whole blood RNA isolation techniques which, through stabilization of RNA at the time of sample collection, avoid bias and artifacts introduced during sample handling. Despite these improvements, current human whole blood RNA stabilization/isolation kits are limited by the requirement of a venous blood sample of at least 2.5 mL. While fingerstick blood collection has been used for many different assays, there has yet to be a kit developed to isolate high quality RNA for use in gene expression studies from such small human samples. The clinical and field testing advantages of obtaining reliable and reproducible gene expression data from a fingerstick are many; it is less invasive, time saving, more mobile, and eliminates the need for a trained phlebotomist. Furthermore, this method could also be employed in small animal studies, e.g. mice, where larger sample collections often require sacrificing the animal. In this study, we offer a rapid and simple method to extract sufficient amounts of high quality total RNA from approximately 70 μL of whole blood collected via a fingerstick using a modified protocol of the commercially available Qiagen PAXgene RNA Blood Kit. Results: From two sets of fingerstick collections, about 70 μL whole blood collected via finger lancet and capillary tube, we recovered an average of 252.6 ng total RNA with an average RIN of 9.3. The post-amplification yields for 50 ng of total RNA averaged 7.0 μg cDNA. The cDNA hybridized to Affymetrix HG-U133 Plus 2.0 GeneChips had an average % Present call of 52.5%. Both fingerstick collections were highly correlated with r² values ranging from 0.94 to 0.97. Similarly both fingerstick collections were highly correlated to the venous collection with r² values ranging from 0.88 to 0

  5. Development of a Method to Implement Whole-Genome Bisulfite Sequencing of cfDNA from Cancer Patients and a Mouse Tumor Model

    Elaine C. Maggi

    2018-01-01

    Full Text Available The goal of this study was to develop a method for whole genome cell-free DNA (cfDNA) methylation analysis in humans and mice with the ultimate goal to facilitate the identification of tumor derived DNA methylation changes in the blood. Plasma or serum from patients with pancreatic neuroendocrine tumors or lung cancer, and plasma from a murine model of pancreatic adenocarcinoma, was used to develop a protocol for cfDNA isolation, library preparation and whole-genome bisulfite sequencing of ultra low quantities of cfDNA, including tumor-specific DNA. The protocol developed produced high quality libraries consistently generating a conversion rate >98% that will be applicable for the analysis of human and mouse plasma or serum to detect tumor-derived changes in DNA methylation.

  6. Whole Genome and Tandem Duplicate Retention facilitated Glucosinolate Pathway Diversification in the Mustard Family.

    Hofberger, J.A.; Lyons, E.; Edger, P.P.; Pires, J.C.; Schranz, M.E.

    2013-01-01

    Plants share a common history of successive whole genome duplication (WGD) events retaining genomic patterns of duplicate gene copies (ohnologs) organized in conserved syntenic blocks. Duplication was often proposed to affect the origin of novel traits during evolution. However, genetic evidence

  7. Whole-genome regression and prediction methods applied to plant and animal breeding

    Los Campos, De G.; Hickey, J.M.; Pong-Wong, R.; Daetwyler, H.D.; Calus, M.P.L.

    2013-01-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding, and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of

  8. Direct whole-genome sequencing of Plasmodium falciparum specimens from dried erythrocyte spots

    Nag, Sidsel; Kofoed, Poul Erik; Ursing, Johan

    2018-01-01

    -infected individuals living in rural areas, away from main infrastructure and the electrical grid. The aim of this study was to describe a low-tech procedure to sample P. falciparum specimens for direct whole genome sequencing (WGS), without use of electricity and cold-chain. Methods: Venous blood samples were...

  9. Bos taurus strain:dairy beef (cattle): 1000 Bull Genomes Run 2, Bovine Whole Genome Sequence

    Bouwman, A.C.; Daetwyler, H.D.; Chamberlain, Amanda J.; Ponce, Carla Hurtado; Sargolzaei, Mehdi; Schenkel, Flavio S.; Sahana, Goutam; Govignon-Gion, Armelle; Boitard, Simon; Dolezal, Marlies; Pausch, Hubert; Brøndum, Rasmus F.; Bowman, Phil J.; Thomsen, Bo; Guldbrandtsen, Bernt; Lund, Mogens S.; Servin, Bertrand; Garrick, Dorian J.; Reecy, James M.; Vilkki, Johanna; Bagnato, Alessandro; Wang, Min; Hoff, Jesse L.; Schnabel, Robert D.; Taylor, Jeremy F.; Vinkhuyzen, Anna A.E.; Panitz, Frank; Bendixen, Christian; Holm, Lars-Erik; Gredler, Birgit; Hozé, Chris; Boussaha, Mekki; Sanchez, Marie Pierre; Rocha, Dominique; Capitan, Aurelien; Tribout, Thierry; Barbat, Anne; Croiseau, Pascal; Drögemüller, Cord; Jagannathan, Vidhya; Vander Jagt, Christy; Crowley, John J.; Bieber, Anna; Purfield, Deirdre C.; Berry, Donagh P.; Emmerling, Reiner; Götz, Kay Uwe; Frischknecht, Mirjam; Russ, Ingolf; Sölkner, Johann; Tassell, van Curtis P.; Fries, Ruedi; Stothard, Paul; Veerkamp, R.F.; Boichard, Didier; Goddard, Mike E.; Hayes, Ben J.

    2014-01-01

    Whole genome sequence data (BAM format) of 234 bovine individuals aligned to UMD3.1. The aim of the study was to identify genetic variants (SNPs and indels) for downstream analysis such as imputation, GWAS, and detection of lethal recessives. Additional sequences for later 1000 bull genomes runs can

  10. Genotype call for chromosomal deletions using read-depth from whole genome sequence variants in cattle

    Mesbah-Uddin, Md; Guldbrandtsen, Bernt; Lund, Mogens Sandø

    2018-01-01

    We presented a deletion genotyping (copy-number estimation) method that leverages population-scale whole genome sequence variants data from 1K bull genomes project (1KBGP) to build reference panel for imputation. To estimate deletion-genotype likelihood, we extracted read-depth (RD) data of all...

  11. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation

    Cuypers, Thomas D; Hogeweg, Paulien; Hogeweg, P.

    Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes.

  12. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    Yuen, Ryan K C; Merico, Daniele; Bookman, Matt; Howe, Jennifer L.; Thiruvahindrapuram, Bhooma; Patel, Rohan V.; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A.; Walker, Susan; Marshall, Christian R.; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L.; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J.; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R.; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J.; Wei, John; Xu, Lizhen; Tasse, Anne Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie Mackinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M.; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H.; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A.; Parr, Jeremy R.; Spence, Sarah J.; Vorstman, Jacob; Frey, Brendan J.; Robinson, James T.; Strug, Lisa J.; Fernandez, Bridget A.; Elsabbagh, Mayada; Carter, Melissa T.; Hallmayer, Joachim; Knoppers, Bartha M.; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H.; Glazer, David; Pletcher, Mathew T.; Scherer, Stephen W.

    2017-01-01

    We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information,

  13. Determination of Elizabethkingia Diversity by MALDI-TOF Mass Spectrometry and Whole-Genome Sequencing

    Eriksen, Helle Brander; Gumpert, Heidi; Faurholt, Cecilie Haase

    2017-01-01

    In a hospital-acquired infection with multidrug-resistant Elizabethkingia, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry and 16S rRNA gene analysis identified the pathogen as Elizabethkingia miricola. Whole-genome sequencing, genus-level core genome analysis, and in...

  14. Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits

    I. Tachmazidou (Ioanna); Süveges, D. (Dániel); J. Min (Josine); G.R.S. Ritchie (Graham R.S.); Steinberg, J. (Julia); K. Walter (Klaudia); V. Iotchkova (Valentina); J.A. Schwartzentruber (Jeremy); J. Huang (Jian); Y. Memari (Yasin); McCarthy, S. (Shane); Crawford, A.A. (Andrew A.); C. Bombieri (Cristina); M. Cocca (Massimiliano); A.-E. Farmaki (Aliki-Eleni); T.R. Gaunt (Tom); P. Jousilahti (Pekka); M.N. Kooijman (Marjolein ); Lehne, B. (Benjamin); G. Malerba (Giovanni); S. Männistö (Satu); A. Matchan (Angela); M.C. Medina-Gomez (Carolina); S. Metrustry (Sarah); A. Nag (Abhishek); I. Ntalla (Ioanna); L. Paternoster (Lavinia); N.W. Rayner (Nigel William); C. Sala (Cinzia); W.R. Scott (William R.); H.A. Shihab (Hashem A.); L. Southam (Lorraine); B. St Pourcain (Beate); M. Traglia (Michela); K. Trajanoska (Katerina); Zaza, G. (Gialuigi); W. Zhang (Weihua); M.S. Artigas; Bansal, N. (Narinder); M. Benn (Marianne); Chen, Z. (Zhongsheng); P. Danecek (Petr); Lin, W.-Y. (Wei-Yu); A. Locke (Adam); J. Luan (Jian'An); A.K. Manning (Alisa); Mulas, A. (Antonella); C. Sidore (Carlo); A. Tybjaerg-Hansen; A. Varbo (Anette); M. Zoledziewska (Magdalena); C. Finan (Chris); Hatzikotoulas, K. (Konstantinos); A.E. Hendricks (Audrey E.); J.P. Kemp (John); A. Moayyeri (Alireza); Panoutsopoulou, K. (Kalliope); Szpak, M. (Michal); S.G. Wilson (Scott); M. Boehnke (Michael); F. Cucca (Francesco); Di Angelantonio, E. (Emanuele); C. Langenberg (Claudia); C.M. Lindgren (Cecilia M.); McCarthy, M.I. (Mark I.); A.P. Morris (Andrew); B.G. Nordestgaard (Børge); R.A. Scott (Robert); M.D. Tobin (Martin); N.J. Wareham (Nick); P.R. Burton (Paul); J.C. Chambers (John); Smith, G.D. (George Davey); G.V. Dedoussis (George); J.F. Felix (Janine); O.H. Franco (Oscar); Gambaro, G. (Giovanni); P. Gasparini (Paolo); C.J. Hammond (Christopher J.); A. Hofman (Albert); V.W.V. Jaddoe (Vincent); M.E. Kleber (Marcus); J.S. Kooner (Jaspal S.); M. Perola (Markus); C.L. Relton (Caroline); S.M. Ring (Susan); F. 
Rivadeneira Ramirez (Fernando); V. Salomaa (Veikko); T.D. Spector (Timothy); O. Stegle (Oliver); D. Toniolo (Daniela); A.G. Uitterlinden (André); I.E. Barroso (Inês); C.M.T. Greenwood (Celia); Perry, J.R.B. (John R.B.); Walker, B.R. (Brian R.); A.S. Butterworth (Adam); Y. Xue (Yali); R. Durbin (Richard); K.S. Small (Kerrin); N. Soranzo (Nicole); N.J. Timpson (Nicholas); E. Zeggini (Eleftheria)

    2016-01-01

    Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the

  15. Whole-Genome Sequence and Classification of 11 Endophytic Bacteria from Poison Ivy (Toxicodendron radicans).

    Tran, Phuong N; Tan, Nicholas E H; Lee, Yin Peng; Gan, Han Ming; Polter, Steven J; Dailey, Lucas K; Hudson, André O; Savka, Michael A

    2015-11-19

    Here, we report the whole-genome sequences and annotation of 11 endophytic bacteria from poison ivy (Toxicodendron radicans) vine tissue. Five bacteria belong to the genus Pseudomonas, and six single members from other genera were found present in interior vine tissue of poison ivy. Copyright © 2015 Tran et al.

  16. Whole-Genome Sequence and Classification of 11 Endophytic Bacteria from Poison Ivy (Toxicodendron radicans)

    Tran, Phuong N.; Tan, Nicholas E. H.; Lee, Yin Peng; Gan, Han Ming; Polter, Steven J.; Dailey, Lucas K.; Hudson, André O.; Savka, Michael A.

    2015-01-01

    Here, we report the whole-genome sequences and annotation of 11 endophytic bacteria from poison ivy (Toxicodendron radicans) vine tissue. Five bacteria belong to the genus Pseudomonas, and six single members from other genera were found present in interior vine tissue of poison ivy.

  17. The effect of whole genome amplification on samples originating from more than one donor

    Thacker, C.R.; Balogh, M.K.; Børsting, Claus

    2006-01-01

    In this study, the GenomiPhi(TM) DNA Amplification Kit (Amersham Biosciences) was used to investigate the potential of whole genome amplification (WGA) when considering samples originating from more than one donor. DNA was extracted from blood samples, quantified and normalised before being mixed...

  18. Evolutionary insight from whole-genome sequencing of Pseudomonas aeruginosa from cystic fibrosis patients

    Marvig, Rasmus Lykke; Madsen Sommer, Lea Mette; Jelsbak, Lars

    2015-01-01

    is suggested to be due to the large genetic repertoire of P. aeruginosa and its ability to genetically adapt to the host environment. Here, we review the recent work that has applied whole-genome sequencing to understand P. aeruginosa population genomics, within-host microevolution and diversity, mutational...

  19. Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits

    Tachmazidou, Ioanna; Süveges, Dániel; Min, Josine L

    2017-01-01

    Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader alleli...

  20. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    Kok-Gan Chan

    2016-03-01

    Full Text Available Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. Keywords: Human tongue surface, Oral cavity, Oral bacteria, Virulence

  1. Challenges and opportunities for whole-genome sequencing–based surveillance of antibiotic resistance

    Schürch, Anita C.; van Schaik, Willem

    2017-01-01

    Infections caused by drug-resistant bacteria are increasingly reported across the planet, and drug-resistant bacteria are recognized to be a major threat to public health and modern medicine. In this review, we discuss how whole-genome sequencing (WGS)–based approaches can contribute to the

  2. Rapid whole genome sequencing for the detection and characterization of microorganisms directly from clinical samples

    Hasman, Henrik; Saputra, Dhany; Sicheritz-Pontén, Thomas

    2014-01-01

    Whole genome sequencing (WGS) is becoming available as a routine tool for clinical microbiology. If applied directly on clinical samples this could further reduce diagnostic time and thereby improve control and treatment. A major bottle-neck is the availability of fast and reliable bioinformatics...

  3. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D.

    2012-01-01

    Background: Whole genome sequencing enables a high-resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of

  4. Whole-genome sequence of the bacteriophage-sensitive strain Campylobacter jejuni NCTC12662

    Gencay, Yilmaz Emre; Sørensen, Martine C.H.; Brøndsted, Lone

    2017-01-01

    Campylobacter jejuni NCTC12662 has been the choice bacteriophage isolation strain due to its susceptibility to C. jejuni bacteriophages. This trait makes it a good candidate for studying bacteriophage-host interactions. We report here the whole-genome sequence of NCTC12662, allowing future...

  5. Construction of a river buffalo (Bubalus bubalis) whole-genome radiation hybrid panel and preliminary RH mapping of chromosomes 3 and 10

    J.E. Womack

    2010-02-01

    Full Text Available The buffalo (Bubalus bubalis) is not only a useful source of milk; it also provides meat and works as a natural source of labor and biogas. To establish a project for buffalo genome mapping, a 5,000-rad whole genome radiation hybrid panel was constructed for river buffalo and used to build preliminary RH maps from two chromosomes (BBU3 and BBU10). The preliminary maps contain 66 markers, including coding genes, cattle ESTs and microsatellite loci. The RH maps presented here are the starting point for mapping additional loci, in particular, genes and expressed sequence tags that will allow detailed comparative maps between buffalo, cattle and other species to be constructed. A large quantity of DNA has been prepared from the cell lines forming the RH panel reported here and will be made publicly available to the international community both for the study of chromosome evolution and for the improvement of traits important to the role of buffalo in animal agriculture.

  6. A high-resolution whole genome radiation hybrid map of human chromosome 17q22-q25.3 across the genes for GH and TK

    Foster, J.W.; Schafer, A.J.; Critcher, R. [Univ. of Cambridge (United Kingdom)] [and others]

    1996-04-15

    We have constructed a whole genome radiation hybrid (WG-RH) map across a region of human chromosome 17q, from growth hormone (GH) to thymidine kinase (TK). A panel of 128 WG-RH hybrid cell lines generated by X-irradiation and fusion has been tested for the retention of 39 sequence-tagged site (STS) markers by the polymerase chain reaction. This genome mapping technique has allowed the integration of existing VNTR and microsatellite markers with additional new markers and existing STS markers previously mapped to this region by other means. The WG-RH map includes eight expressed sequence tag (EST) and three anonymous markers developed for this study, together with 23 anonymous microsatellites and five existing ESTs. Analysis of these data resulted in a high-density comprehensive map across this region of the genome. A subset of these markers has been used to produce a framework map consisting of 20 loci ordered with odds greater than 1000:1. The markers are of sufficient density to build a YAC contig across this region based on marker content. We have developed sequence tags for both ends of a 2.1-Mb YAC and mapped these using the WG-RH panel, allowing a direct comparison of cRay_6000 to physical distance. 31 refs., 3 figs., 2 tabs.

  7. A Whole-Genome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 Days on Board of Shenzhou 8

    Svenja Fengler

    2015-01-01

    Full Text Available The Simbox mission was the first joint space project between Germany and China in November 2011. Eleven-day-old Arabidopsis thaliana wild type semisolid callus cultures were integrated into fully automated plant cultivation containers and exposed to spaceflight conditions within the Simbox hardware on board of the spacecraft Shenzhou 8. The related ground experiment was conducted under similar conditions. The use of an in-flight centrifuge provided a 1 g gravitational field in space. The cells were metabolically quenched after 5 days via RNAlater injection. The impact on the Arabidopsis transcriptome was investigated by means of whole-genome gene expression analysis. The results show a major impact of nonmicrogravity related spaceflight conditions. Genes that were significantly altered in transcript abundance are mainly involved in protein phosphorylation and MAPK cascade-related signaling processes, as well as in the cellular defense and stress responses. In contrast to short-term effects of microgravity (seconds, minutes), this mission identified only minor changes after 5 days of microgravity. These concerned genes coding for proteins involved in the plastid-associated translation machinery, mitochondrial electron transport, and energy production.

  8. Whole-genome resequencing reveals candidate mutations for pig prolificacy.

    Li, Wen-Ting; Zhang, Meng-Meng; Li, Qi-Gang; Tang, Hui; Zhang, Li-Fan; Wang, Ke-Jun; Zhu, Mu-Zhen; Lu, Yun-Feng; Bao, Hai-Gang; Zhang, Yuan-Ming; Li, Qiu-Yan; Wu, Ke-Liang; Wu, Chang-Xin

    2017-12-20

    Changes in pig fertility have occurred as a result of domestication, but are not understood at the level of genetic variation. To identify variations potentially responsible for prolificacy, we sequenced the genomes of the highly prolific Taihu pig breed and four control breeds. Genes involved in embryogenesis and morphogenesis were targeted in the Taihu pig, consistent with the morphological differences observed between the Taihu pig and others during pregnancy. Additionally, excessive functional non-coding mutations have been specifically fixed or nearly fixed in the Taihu pig. We focused attention on an oestrogen response element (ERE) within the first intron of the bone morphogenetic protein receptor type-1B gene (BMPR1B) that overlaps with a known quantitative trait locus (QTL) for pig fecundity. Using 242 pigs from 30 different breeds, we confirmed that the genotype of the ERE was nearly fixed in the Taihu pig. ERE function was assessed by luciferase assays, examination of histological sections, chromatin immunoprecipitation, quantitative polymerase chain reactions, and western blots. The results suggest that the ERE may control pig prolificacy via the cis-regulation of BMPR1B expression. This study provides new insight into changes in reproductive performance and highlights the role of non-coding mutations in generating phenotypic diversity between breeds. © 2017 The Author(s).

  9. Differences in gene expression of cells growing in conventional 2D versus 3D cell culture

    Zschenker, Oliver; Cordes, Nils; Streichert, Thomas

    2009-01-01

    Full text: Telomeres are DNA protein complexes on the ends of chromosomes that distinguish the ends of chromosomes from double strand breaks and prevent degradation or fusion by nonhomologous end-joining. The loss of telomeres is associated with a loss of heterochromatic features leading to a less compact chromatin structure which allows e.g. DNA repair proteins to get better access to the site of the DNA damage and facilitate chromosome fusions. Telomerase is an enzyme that can counteract the loss of telomeres by adding telomeric repeats on the ends of chromosomes. Since telomerase is active in most tumor cells, telomerase is suggested to be the reason for the unlimited number of cell divisions of cancer cells. TRF2 is one of the most important proteins of the Shelterin complex protecting the telomeres from shortening by inhibiting ATM which is up-stream of the DNA repair mechanisms. Thus, we are concentrating on TRF2 and telomerase to investigate the differences in DNA repair in telomeric (heterochromatic) versus euchromatic regions. Human cancer cells with differences in status of p53 and telomerase like A549, UT-SCC15 and FaDu cells are used. Without any treatment, FaDu cells express high levels of telomerase and TRF2 in conventional 2D cell culture which is in contrast to e.g. A549. We found that telomerase is even higher expressed in 3D than in 2D cell culture. To connect telomere associated processes to both repair of radiogenic DNA damage/lesions and to cell-extracellular matrix interactions, we performed whole genome microarray analysis. By comparing the differential expression of genes associated with these three cell functions, we intend to yield new molecular insight into radiotherapy relevant tumor characteristics, particularly radioresistance and DNA damage response network processing. (author)

  10. Germline contamination and leakage in whole genome somatic single nucleotide variant detection.

    Sendorek, Dorota H; Caloian, Cristian; Ellrott, Kyle; Bare, J Christopher; Yamaguchi, Takafumi N; Ewing, Adam D; Houlahan, Kathleen E; Norman, Thea C; Margolin, Adam A; Stuart, Joshua M; Boutros, Paul C

    2018-01-31

    The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well-understood, and it is uncertain whether or not somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNVs) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.
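    The notion of germline leakage described in this abstract can be illustrated with a minimal sketch. It assumes that variants are represented as (chromosome, position, ref, alt) tuples and that a list of the patient's known germline variants is available; the function names and the example data are hypothetical, and this is not the GermlineFilter tool itself, whose interface the abstract does not describe.

```python
from typing import List, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
# This representation is an assumption made for illustration only.
Variant = Tuple[str, int, str, str]


def find_leaked(somatic_calls: List[Variant],
                germline_variants: List[Variant]) -> List[Variant]:
    """Return the somatic calls that coincide with known germline variants."""
    germline = set(germline_variants)
    return [v for v in somatic_calls if v in germline]


def remove_leaked(somatic_calls: List[Variant],
                  germline_variants: List[Variant]) -> List[Variant]:
    """Return the somatic calls with germline matches filtered out."""
    germline = set(germline_variants)
    return [v for v in somatic_calls if v not in germline]


def leakage_rate(somatic_calls: List[Variant],
                 germline_variants: List[Variant]) -> float:
    """Fraction of somatic calls that are actually germline variants."""
    if not somatic_calls:
        return 0.0
    return len(find_leaked(somatic_calls, germline_variants)) / len(somatic_calls)


# Hypothetical toy data, not taken from the study.
somatic = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "A")]
germline = [("chr2", 200, "G", "C"), ("chr9", 900, "T", "G")]

leaked = find_leaked(somatic, germline)
cleaned = remove_leaked(somatic, germline)
rate = leakage_rate(somatic, germline)
```

    On the median figures quoted above (4325 somatic SNVs, one leaked germline polymorphism), the same ratio would be 1/4325, or roughly 0.02% of calls.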

  11. Combined sequencing of mRNA and DNA from human embryonic stem cells

    Florian Mertes

    2016-06-01

    Full Text Available Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471.

  12. Whole genomic DNA probe for detection of Porphyromonas endodontalis.

    Nissan, R; Makkar, S R; Sela, M N; Stevens, R

    2000-04-01

    The purpose of the present study was to develop a DNA probe for Porphyromonas endodontalis. Pure cultures of P. endodontalis were grown in TYP medium, in an anaerobic chamber. DNA was extracted from the P. endodontalis and labeled using the Genius System by Boehringer Mannheim. The labeled P. endodontalis DNA was used in dot-blot hybridization reactions with homologous (P. endodontalis) and unrelated bacterial samples. To determine specificity, strains of 40 other oral bacterial species (e.g. Porphyromonas gingivalis, Porphyromonas asaccharolytica, and Prevotella intermedia) were spotted and reacted with the P. endodontalis DNA probe. None of the panel of 40 oral bacteria hybridized with the P. endodontalis probe, whereas the blot of the homologous organism showed a strong positive reaction. To determine the sensitivity of the probe, dilutions of a P. endodontalis suspension of known concentration were blotted onto a nylon membrane and reacted with the probe. The results of our investigation indicate that the DNA probe that we have prepared specifically detects only P. endodontalis and can detect at least 3 × 10^4 cells.

  13. Expression of a retinoic acid signature in circulating CD34 cells from coronary artery disease patients

    van der Laan Anja M

    2010-06-01

    Full Text Available Abstract Background: Circulating CD34+ progenitor cells have the potential to differentiate into a variety of cells, including endothelial cells. Knowledge is still scarce about the transcriptional programs used by CD34+ cells from peripheral blood, and how these are affected in coronary artery disease (CAD) patients. Results: We performed a whole genome transcriptome analysis of CD34+ cells, CD4+ T cells, CD14+ monocytes, and macrophages from 12 patients with CAD and 11 matched controls. CD34+ cells, compared to other mononuclear cells from the same individuals, showed high levels of KRAB box transcription factors, known to be involved in gene silencing. This correlated with high expression levels in CD34+ cells for the progenitor markers HOXA5 and HOXA9, which are known to control expression of KRAB factor genes. The comparison of expression profiles of CD34+ cells from CAD patients and controls revealed a less naïve phenotype in patients' CD34+ cells, with increased expression of genes from the Mitogen Activated Kinase network and a lowered expression of a panel of histone genes, reaching levels comparable to that in more differentiated circulating cells. Furthermore, we observed a reduced expression of several genes involved in CXCR4-signaling and migration to SDF1/CXCL12. Conclusions: The altered gene expression profile of CD34+ cells in CAD patients was related to activation/differentiation by a retinoic acid-induced differentiation program. These results suggest that circulating CD34+ cells in CAD patients are programmed by retinoic acid, leading to a reduced capacity to migrate to ischemic tissues.

  14. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum.

    Gerda Saxer

    Full Text Available Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76×10^-9, with a Poisson confidence interval of 4.1×10^-9 to 9.5×10^-9, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9×10^-11, with a Poisson confidence interval ranging from 7.4×10^-13 to 1.6×10^-10, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.
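    As a rough illustration of how a per-nucleotide, per-generation mutation rate and a Poisson confidence interval of the kind quoted above can be obtained from a count of observed mutations, the following sketch computes an exact (Garwood-type) Poisson interval by bisection. The abstract does not say which interval construction the authors used, so this is an assumption, as are the function names and the input numbers in the usage lines.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def _bisect(f, lo: float, hi: float, iters: int = 200) -> float:
    """Find a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def poisson_exact_ci(k: int, alpha: float = 0.05):
    """Exact two-sided confidence interval for a Poisson mean given count k."""
    hi_start = k + 10.0 * math.sqrt(k + 1) + 20.0  # generous search bound
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | lam) = alpha / 2
        lower = _bisect(lambda lam: (1.0 - poisson_cdf(k - 1, lam)) - alpha / 2,
                        0.0, hi_start)
    # Upper limit: P(X <= k | lam) = alpha / 2
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, hi_start)
    return lower, upper


def mutation_rate_ci(k: int, sites: float, generations: float, alpha: float = 0.05):
    """Rate per site per generation, with its interval scaled the same way."""
    denom = sites * generations
    lower, upper = poisson_exact_ci(k, alpha)
    return k / denom, lower / denom, upper / denom


# Hypothetical inputs for illustration only (not the study's data):
# 2 mutations observed across 1e8 callable sites over 1000 generations.
rate, rate_lo, rate_hi = mutation_rate_ci(2, 1e8, 1000)
```

    With very few observed mutations the interval is strongly asymmetric around the point estimate, which is why intervals for low counts can span orders of magnitude.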

  15. Whole genome sequence of Enterobacter ludwigii type strain EN-119T, isolated from clinical specimens.

    Li, Gengmi; Hu, Zonghai; Zeng, Ping; Zhu, Bing; Wu, Lijuan

    2015-04-01

    Enterobacter ludwigii strain EN-119T is the type strain of E. ludwigii, which belongs to the E. cloacae complex (Ecc). This strain was first reported and named in 2005 and has since been found in many hospitals. In this paper, the whole genome sequencing of this strain was carried out. The total genome size of EN-119T is 4,952,770 bp with 4578 coding sequences, 88 tRNAs and 10 rRNAs. The genome sequence of EN-119T is the first whole genome sequence of E. ludwigii, which will further our understanding of Ecc. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    ://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence......The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced...... genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including...

  17. Whole genome sequencing of Mycobacterium tuberculosis SB24 isolated from Sabah, Malaysia

    Noraini Philip

    2016-09-01

    Full Text Available Mycobacterium tuberculosis (M. tuberculosis) is the causative agent of tuberculosis (TB), which causes millions of deaths every year. We have sequenced the genome of M. tuberculosis isolated from the cerebrospinal fluid (CSF) of a patient diagnosed with tuberculous meningitis (TBM). The isolated strain was referred to as M. tuberculosis SB24. Genomic DNA of M. tuberculosis SB24 was extracted and subjected to whole genome sequencing using the PacBio platform. The draft genome size of M. tuberculosis SB24 was determined to be 4,452,489 bp with a G + C content of 65.6%. The whole genome shotgun project has been deposited in NCBI SRA under the accession number SRP076503.

  18. Determining the cause of recurrent Clostridium difficile infection using whole genome sequencing.

    Sim, James Heng Chiak; Truong, Cynthia; Minot, Samuel S; Greenfield, Nick; Budvytiene, Indre; Lohith, Akshar; Anikst, Victoria; Pourmand, Nader; Banaei, Niaz

    2017-01-01

    Understanding the contribution of relapse and reinfection to recurrent Clostridium difficile infection (CDI) has implications for therapy and infection prevention, respectively. We used whole genome sequencing to determine the relation of C. difficile strains isolated from patients with recurrent CDI at an academic medical center in the United States. Thirty-five toxigenic C. difficile isolates from 16 patients with 19 recurrent CDI episodes, with a median time of 53.5 days (range, 13-362) between episodes, were whole genome sequenced on the Illumina MiSeq platform. In 84% (16) of recurrences, the cause of recurrence was relapse with the prior strain of C. difficile. In 16% (3) of recurrent episodes, reinfection with a new strain of C. difficile was the cause. In conclusion, the majority of CDI recurrences at our institution were due to infection with the same strain rather than infection with a new strain. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. How could disclosing incidental information from whole-genome sequencing affect patient behavior?

    Christensen, Kurt D; Green, Robert C

    2013-06-01

    In this article, we argue that disclosure of incidental findings from whole-genome sequencing has the potential to motivate individuals to change health behaviors through psychological mechanisms that differ from typical risk assessment interventions. Their ability to do so, however, is likely to be highly contingent upon the nature of the incidental findings and how they are disclosed, the context of the disclosure and the characteristics of the patient. Moreover, clinicians need to be aware that behavioral responses may occur in unanticipated ways. This article argues for commentators and policy makers to take a cautious but optimistic perspective while empirical evidence is collected through ongoing research involving whole-genome sequencing and the disclosure of incidental information.

  20. The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

    Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

    2018-02-05

    The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set ( n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.

  1. Microarray-based whole-genome hybridization as a tool for determining procaryotic species relatedness

    Wu, L.; Liu, X.; Fields, M.W.; Thompson, D.K.; Bagwell, C.E.; Tiedje, J. M.; Hazen, T.C.; Zhou, J.

    2008-01-15

    The definition and delineation of microbial species are of great importance and challenge due to the extent of evolution and diversity. Whole-genome DNA-DNA hybridization is the cornerstone for defining procaryotic species relatedness, but obtaining pairwise DNA-DNA reassociation values for a comprehensive phylogenetic analysis of procaryotes is tedious and time consuming. A previously described microarray format containing whole-genomic DNA (the community genome array or CGA) was rigorously evaluated as a high-throughput alternative to the traditional DNA-DNA reassociation approach for delineating procaryotic species relationships. DNA similarities for multiple bacterial strains obtained with the CGA-based hybridization were comparable to those obtained with various traditional whole-genome hybridization methods (r=0.87, P<0.01). Significant linear relationships were also observed between the CGA-based genome similarities and those derived from small subunit (SSU) rRNA gene sequences (r=0.79, P<0.0001), gyrB sequences (r=0.95, P<0.0001) or REP- and BOX-PCR fingerprinting profiles (r=0.82, P<0.0001). The CGA hybridization-revealed species relationships in several representative genera, including Pseudomonas, Azoarcus and Shewanella, were largely congruent with previous classifications based on various conventional whole-genome DNA-DNA reassociation, SSU rRNA and/or gyrB analyses. These results suggest that CGA-based DNA-DNA hybridization could serve as a powerful, high-throughput format for determining species relatedness among microorganisms.

  2. The need for high-quality whole-genome sequence databases in microbial forensics.

    Sjödin, Andreas; Broman, Tina; Melefors, Öjar; Andersson, Gunnar; Rasmusson, Birgitta; Knutsson, Rickard; Forsman, Mats

    2013-09-01

    Microbial forensics is an important part of a strengthened capability to respond to biocrime and bioterrorism incidents to aid in the complex task of distinguishing between natural outbreaks and deliberate acts. The goal of a microbial forensic investigation is to identify and criminally prosecute those responsible for a biological attack, and it involves a detailed analysis of the weapon--that is, the pathogen. The recent development of next-generation sequencing (NGS) technologies has greatly increased the resolution that can be achieved in microbial forensic analyses. It is now possible to identify, quickly and in an unbiased manner, previously undetectable genome differences between closely related isolates. This development is particularly relevant for the most deadly bacterial diseases that are caused by bacterial lineages with extremely low levels of genetic diversity. Whole-genome analysis of pathogens is envisaged to be increasingly essential for this purpose. In a microbial forensic context, whole-genome sequence analysis is the ultimate method for strain comparisons as it is informative during identification, characterization, and attribution--all 3 major stages of the investigation--and at all levels of microbial strain identity resolution (ie, it resolves the full spectrum from family to isolate). Given these capabilities, one bottleneck in microbial forensics investigations is the availability of high-quality reference databases of bacterial whole-genome sequences. To be of high quality, databases need to be curated and accurate in terms of sequences, metadata, and genetic diversity coverage. The development of whole-genome sequence databases will be instrumental in successfully tracing pathogens in the future.

  3. Demographic history and biologically relevant genetic variation of Native Mexicans inferred from whole-genome sequencing

    Romero-Hidalgo, Sandra; Ochoa-Leyva, Adrián; Garcíarrubio, Alejandro; Acuña-Alonzo, Victor; Antúnez-Argüelles, Erika; Balcazar-Quintero, Martha; Barquera-Lozano, Rodrigo; Carnevale, Alessandra; Cornejo-Granados, Fernanda; Fernández-López, Juan Carlos; García-Herrera, Rodrigo; García-Ortíz, Humberto; Granados-Silvestre, Ángeles; Granados, Julio; Guerrero-Romero, Fernando

    2017-01-01

    Understanding the genetic structure of Native American populations is important to clarify their diversity, demographic history, and to identify genetic factors relevant for biomedical traits. Here, we show a demographic history reconstruction from 12 Native American whole genomes belonging to six distinct ethnic groups representing the three main described genetic clusters of Mexico (Northern, Southern, and Maya). Effective population size estimates of all Native American groups remained bel...

  4. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation

    Cuypers, Thomas D; Hogeweg, Paulien

    2014-01-01

    Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and ada...

  5. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation.

    Thomas D Cuypers; Paulien Hogeweg

    2014-01-01

    Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and ada...

  6. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds

    Xu, Yao; Jiang, Yu; Shi, Tao; Cai, Hanfang; Lan, Xianyong; Zhao, Xin; Plath, Martin; Chen, Hong

    2017-01-01

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle with 10 ...

  7. Whole-Genome Sequence of Chlamydia abortus Strain GN6 Isolated from Aborted Yak Fetus

    Li, Zhaocai; Cai, Jinshan; Cao, Xiaoan; Lou, Zhongzi; Chao, Yilin; Kan, Wei; Zhou, Jizhang

    2017-01-01

    The obligate intracellular Gram-negative bacterium Chlamydia abortus is one of the causative agents of abortion and fetal loss in sheep, goats, and cattle in many countries. It also affects the reproductivity of yaks (Bos grunniens). This study reports the whole-genome sequence of Chlamydia abortus strain GN6, which was isolated from an aborted yak fetus in the Qinghai-Tibetan Plateau, China.

  8. Whole-Genome Sequence of Chlamydia abortus Strain GN6 Isolated from Aborted Yak Fetus.

    Li, Zhaocai; Cai, Jinshan; Cao, Xiaoan; Lou, Zhongzi; Chao, Yilin; Kan, Wei; Zhou, Jizhang

    2017-08-31

    The obligate intracellular Gram-negative bacterium Chlamydia abortus is one of the causative agents of abortion and fetal loss in sheep, goats, and cattle in many countries. It also affects the reproductivity of yaks (Bos grunniens). This study reports the whole-genome sequence of Chlamydia abortus strain GN6, which was isolated from an aborted yak fetus in the Qinghai-Tibetan Plateau, China. Copyright © 2017 Li et al.

  9. The Future of Whole-Genome Sequencing for Public Health and the Clinic

    Allard, Marc W.

    2016-01-01

    An American Society for Microbiology (ASM) conference titled the Conference on Rapid Next-Generation Sequencing and Bioinformatic Pipelines for Enhanced Molecular Epidemiological Investigation of Pathogens provided a venue for discussing how technologies surrounding whole-genome sequencing (WGS) are advancing microbiology. Several applications in microbial taxonomy, microbial forensics, and genomics for public health pathogen surveillance were presented at the meeting and are reviewed. All of...

  10. Whole-genome sequence of the orchid anthracnose pathogen Colletotrichum orchidophilum.

    Baroncelli, Riccardo; Sukno, Serenella; Sarrocco, Sabrina; Cafà, Giovanni; Le Floch, Gaetan; Thon, Michael R

    2018-04-12

    Colletotrichum orchidophilum is a plant pathogenic fungus infecting a wide range of plant species belonging to the family Orchidaceae. Besides its economic impact, C. orchidophilum has been used in recent years in evolutionary studies as it represents the closest related species to the C. acutatum species complex. Here we present the first draft whole-genome sequence of C. orchidophilum IMI 309357, providing a resource for future research on anthracnose of Orchidaceae and other hosts.

  11. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

    2016-09-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.

  12. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Yao Xu

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle with 10- to 12-fold depth on average and 97.86% and 98.98% coverage of the genomes, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed high genetic diversity compared to Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the leptin receptor gene (LEPR), overlapping with CNV_1815, showed remarkably higher copy number in Qinchuan than in Nanyang (log2 ratio = -2.34988; P value = 1.53E-102). Further qPCR and association analysis indicated that the copy number of the LEPR gene was positively correlated with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs

  13. Mural granulosa cell gene expression associated with oocyte developmental competence

    Jiang Jin-Yi

    2010-03-01

    Background: Ovarian follicle development is a complex process. Paracrine interactions between somatic and germ cells are critical for normal follicular development and oocyte maturation. Studies have suggested that the health and function of the granulosa and cumulus cells may be reflective of the health status of the enclosed oocyte. The objective of the present study is to assess, using an in vivo immature rat model, the gene expression profile in granulosa cells, which may be linked to the developmental competence of the oocyte. We hypothesized that expression of specific genes in granulosa cells may be correlated with the developmental competence of the oocyte. Methods: Immature rats were injected with eCG and 24 h thereafter with anti-eCG antibody to induce follicular atresia or with pre-immune serum to stimulate follicle development. A high percentage (30-50%, normal developmental competence, NDC) of oocytes from the eCG/pre-immune serum group developed to term after embryo transfer compared to those from the eCG/anti-eCG group (0%, poor developmental competence, PDC). Gene expression profiles of mural granulosa cells from the above oocyte-collected follicles were assessed by Affymetrix rat whole genome array. Results: Twelve genes were up-regulated, while one gene was down-regulated, more than 1.5-fold in the NDC group compared with the PDC group. Gene ontology classification showed that the up-regulated genes included lysyl oxidase (Lox) and nerve growth factor receptor associated protein 1 (Ngfrap1), which are important in the regulation of protein-lysine 6-oxidase activity and in apoptosis induction, respectively. The down-regulated genes included glycoprotein-4-beta galactosyltransferase 2 (Ggbt2), which is involved in the regulation of extracellular matrix organization and biogenesis. Conclusions: The data in the present study demonstrate a close association between specific gene expression in mural granulosa cells and

  14. Whole genome DNA copy number changes identified by high density oligonucleotide arrays

    Huang Jing

    2004-05-01

    Changes in DNA copy number are one of the hallmarks of the genetic instability common to most human cancers. Previous micro-array-based methods have been used to identify chromosomal gains and losses; however, they are unable to genotype alleles at the level of single nucleotide polymorphisms (SNPs). Here we describe a novel algorithm that uses a recently developed high-density oligonucleotide array-based SNP genotyping method, whole genome sampling analysis (WGSA), to identify genome-wide chromosomal gains and losses at high resolution. WGSA simultaneously genotypes over 10,000 SNPs by allele-specific hybridisation to perfect match (PM) and mismatch (MM) probes synthesised on a single array. The copy number algorithm jointly uses PM intensity and discrimination ratios between paired PM and MM intensity values to identify and estimate genetic copy number changes. Values from an experimental sample are compared with SNP-specific distributions derived from a reference set containing over 100 normal individuals to gain statistical power. Genomic regions with statistically significant copy number changes can be identified using both single point analysis and contiguous point analysis of SNP intensities. We identified multiple regions of amplification and deletion using a panel of human breast cancer cell lines. We verified these results using an independent method based on quantitative polymerase chain reaction and found that our approach is both sensitive and specific and can tolerate samples which contain a mixture of both tumour and normal DNA. In addition, by using known allele frequencies from the reference set, statistically significant genomic intervals can be identified containing contiguous stretches of homozygous markers, potentially allowing the detection of regions undergoing loss of heterozygosity (LOH) without the need for a matched normal control sample. The coupling of LOH analysis, via SNP genotyping, with copy number

  15. Pancreatic cancer circulating tumour cells express a cell motility gene signature that predicts survival after surgery

    Sergeant, Gregory; Eijsden, Rudy van; Roskams, Tania; Van Duppen, Victor; Topal, Baki

    2012-01-01

    Most cancer deaths are caused by metastases, resulting from circulating tumor cells (CTC) that detach from the primary cancer and survive in distant organs. The aim of the present study was to develop a CTC gene signature and to assess its prognostic relevance after surgery for pancreatic ductal adenocarcinoma (PDAC). Negative depletion fluorescence activated cell sorting (FACS) was developed and validated with spiking experiments using cancer cell lines in whole human blood samples. This FACS-based method was used to enrich for CTC from the blood of 10 patients who underwent surgery for PDAC. Total RNA was isolated from 4 subgroup samples, i.e. CTC, haematological cells (G), original tumour (T), and non-tumoural pancreatic control tissue (P). After RNA quality control, samples of 6 patients were eligible for further analysis. Whole genome microarray analysis was performed after double linear amplification of RNA. ‘Ingenuity Pathway Analysis’ software and AmiGO were used for functional data analyses. A CTC gene signature was developed and validated with the nCounter system on expression data of 78 primary PDAC using Cox regression analysis for disease-free (DFS) and overall survival (OS). Using stringent statistical analysis, we retained 8,152 genes to compare expression profiles of CTC vs. other subgroups, and found 1,059 genes to be differentially expressed. The pathway with the highest expression ratio in CTC was p38 mitogen-activated protein kinase (p38 MAPK) signaling, known to be involved in cancer cell migration. In the p38 MAPK pathway, TGF-β1, cPLA2, and MAX were significantly upregulated. In addition, 9 other genes associated with both p38 MAPK signaling and cell motility were overexpressed in CTC. High co-expression of TGF-β1 and our cell motility panel (≥ 4 out of 9 genes for DFS and ≥ 6 out of 9 genes for OS) in primary PDAC was identified as an independent predictor of DFS (p=0.041, HR (95% CI) = 1.885 (1.025 – 3.559)) and OS (p=0.047, HR

  16. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    Phelan, Jody; Coll, Francesc; McNerney, Ruth; Ascher, David; Pires, Douglas; Furnham, Nick; Coeck, Nele; Hill-Cawthorne, Grant; Nair, Mridul; Mallard, Kim; Ramsay, Andrew; Campino, Susana; Hibberd, Martin; Pain, Arnab; Rigouts, Leen; Clark, Taane

    2016-01-01

    Background: Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure

  17. Whole-genome characterization in pedigreed non-human primates using Genotyping-By-Sequencing and imputation.

    Cervera-Juanes, Rita; Vinson, Amanda; Ferguson, Betsy; Carbone, Lucia; Spindel, Eliot; Mccouch, Susan; Spindel, Jennifer; Nevonen, Kimberly; Letaw, John; Raboin, Michael; Bimber, Ben

    2016-01-01

    Background: Rhesus macaques are widely used in biomedical research, but the application of genomic information in this species to better understand human disease is still undeveloped. Whole-genome sequence (WGS) data in pedigreed macaque colonies could provide substantial experimental power, but the collection of WGS data in large cohorts remains a formidable expense. Here, we describe a cost-effective approach that selects the most informative macaques in a pedigree for whole-genome sequenci...

  18. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-01-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. Keywords: Insect, Larval gut, Whole genome shot-gun sequencing

  19. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    Yookyung Lee

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. Keywords: Insect, Larval gut, Whole genome shot-gun sequencing

  20. Estrogen Receptor-Mediated Effects of Isoflavone Supplementation Were Not Observed in Whole-Genome Gene Expression Profiles of Peripheral Blood Mononuclear Cells in Postmenopausal, Equol-Producing Women

    Velpen, van der V.; Geelen, A.; Schouten, E.G.; Hollman, P.C.H.; Afman, L.A.; Veer, van 't P.

    2013-01-01

    Isoflavones (genistein, daidzein, and glycitein) are suggested to have benefits as well as risks for human health. Approximately one-third of the Western population is able to metabolize daidzein into the more potent metabolite equol. Having little endogenous estradiol, equol-producing

  1. Global genetic response in a cancer cell: self-organized coherent expression dynamics.

    Masa Tsuchiya

    Understanding the basic mechanism of the spatio-temporal self-control of genome-wide gene expression engaged with the complex epigenetic molecular assembly is one of the major challenges in current biological science. In this study, the genome-wide dynamical profile of gene expression was analyzed for MCF-7 breast cancer cells induced by two distinct ErbB receptor ligands: epidermal growth factor (EGF) and heregulin (HRG), which drive cell proliferation and differentiation, respectively. We focused on elucidating how global genetic responses emerge and on deciphering the underlying principle for dynamic self-control of genome-wide gene expression. The whole mRNA expression was classified into about a hundred groups according to the root mean square fluctuation (rmsf). These expression groups showed characteristic time-dependent correlations, indicating the existence of collective behaviors in the ensemble of genes with respect to mRNA expression and also to temporal changes in expression. All-or-none responses were observed for HRG and EGF (biphasic statistics) at around 10-20 min. The emergence of time-dependent collective behaviors of expression occurred through bifurcation of a coherent expression state (CES). In the ensemble of mRNA expression, the self-organized CESs reveal distinct characteristic expression domains for biphasic statistics, which notably exhibit the presence of criticality in the expression profile as a route for genomic transition. In time-dependent changes in the expression domains, the dynamics of CES reveal that the temporal development of the characteristic domains is characterized as an autonomous bistable switch, which exhibits dynamic criticality (the temporal development of criticality) in the genome-wide coherent expression dynamics. It is expected that elucidation of the biophysical origin of such critical behavior will shed light on the underlying mechanism of the control of the whole genome.

  2. Whole genome comparisons of Fragaria, Prunus and Malus reveal different modes of evolution between Rosaceous subfamilies.

    Jung, Sook; Cestaro, Alessandro; Troggio, Michela; Main, Dorrie; Zheng, Ping; Cho, Ilhyung; Folta, Kevin M; Sosinski, Bryon; Abbott, Albert; Celton, Jean-Marc; Arús, Pere; Shulaev, Vladimir; Verde, Ignazio; Morgante, Michele; Rokhsar, Daniel; Velasco, Riccardo; Sargent, Daniel James

    2012-04-04

    Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes. Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes. Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae.

  3. High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic.

    Sealfon, Rachel; Gire, Stephen; Ellis, Crystal; Calderwood, Stephen; Qadri, Firdausi; Hensley, Lisa; Kellis, Manolis; Ryan, Edward T; LaRocque, Regina C; Harris, Jason B; Sabeti, Pardis C

    2012-09-11

    Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced. Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis between the newly sequenced and thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations identified that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways. Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.

  4. High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic

    Sealfon Rachel

    2012-09-01

    Full Text Available Background: Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced. Results: Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis between the newly sequenced and thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations identified that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways. Conclusions: Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.

  5. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus.

    Elizabeth M Driebe

    Full Text Available Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss.

  6. High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies.

    Anjana Srivatsan

    2008-08-01

    Full Text Available Whole-genome sequencing is a powerful technique for obtaining the reference sequence information of multiple organisms. Its use can be dramatically expanded to rapidly identify genomic variations, which can be linked with phenotypes to obtain biological insights. We explored these potential applications using the emerging next-generation sequencing platform Solexa Genome Analyzer, and the well-characterized model bacterium Bacillus subtilis. Combining sequencing with experimental verification, we first improved the accuracy of the published sequence of the B. subtilis reference strain 168, then obtained sequences of multiple related laboratory strains and different isolates of each strain. This provides a framework for comparing the divergence between different laboratory strains and between their individual isolates. We also demonstrated the power of Solexa sequencing by using its results to predict a defect in the citrate signal transduction pathway of a common laboratory strain, which we verified experimentally. Finally, we examined the molecular nature of spontaneously generated mutations that suppress the growth defect caused by deletion of the stringent response mediator relA. Using whole-genome sequencing, we rapidly mapped these suppressor mutations to two small homologs of relA. Interestingly, stable suppressor strains had mutations in both genes, with each mutation alone partially relieving the relA growth defect. This supports an intriguing three-locus interaction module that is not easily identifiable through traditional suppressor mapping. We conclude that whole-genome sequencing can drastically accelerate the identification of suppressor mutations and complex genetic interactions, and it can be applied as a standard tool to investigate the genetic traits of model organisms.

  7. Whole genome comparisons of Fragaria, Prunus and Malus reveal different modes of evolution between Rosaceous subfamilies

    Jung Sook

    2012-04-01

    Full Text Available Background: Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes. Results: Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae 7 composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes. Conclusion: Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae.

  8. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango.

    Rakhashiya, Purvi M; Patel, Pooja P; Thaker, Vrinda S

    2015-12-01

    The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with G + C content of 69.80% and contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000.

  9. Refining QTL with high-density SNP genotyping and whole genome sequence in three cattle breeds

    Sahana, Goutam; Guldbrandtsen, Bernt; Lund, Mogens Sandø

    2012-01-01

    A genome-wide association study was carried out in Nordic Holsteins, Nordic Red and Jersey breeds for functional traits using the BovineHD Genotyping BeadChip (Illumina, San Diego, CA). The association analyses were carried out using both a linear mixed model approach and a Bayesian variable selection...... method. Principal components were used to account for population structure. The QTL segregating in all three breeds were selected and a few of the most significant ones were followed in further analyses. The polymorphisms in the identified QTL regions were imputed using 90 whole genome sequences...

  10. A Danish Salmonella Bareilly outbreak investigated by the use of whole genome sequencing

    Torpdahl, M.; Kiil, K.; Litrup, E.

    2013-01-01

    with several band changes and others are defined by one PFGE profile thereby excluding closely related profiles. We decided to investigate whether whole genome sequencing (WGS) could resolve this issue and be useful in outbreak investigations. Several analyses were performed, including a SNP tree based...... on the core genome, MLST profiles and detection of phages in the genome. The human cluster and the broiler isolates belonged to the same ST, but the isolates were divided into two groups, 9 SNPs apart, according to an MP phylogeny. When using PHAST, we found that two phage regions were 100% similar...

  11. Whole-genome sequencing for identification of the source in hospital-acquired Legionnaires' disease

    Rosendahl Madsen, A M; Holm, A; Jensen, T G

    2017-01-01

    Acquisition of Legionnaires' disease is a serious complication of hospitalization. Rapid determination of whether or not the infection is caused by strains of Legionella pneumophila in the hospital environment is crucial to avoid further cases. This study investigated the use of whole-genome sequ......

  12. Isolation and whole-genome sequencing of a Crimean-Congo hemorrhagic fever virus strain, Greece.

    Papa, Anna; Papadopoulou, Elpida; Tsioka, Katerina; Kontana, Anastasia; Pappa, Styliani; Melidou, Ageliki; Giadinis, Nektarios D

    2018-03-01

    Crimean-Congo hemorrhagic fever virus (CCHFV) was isolated from a pool of two adult Rhipicephalus bursa ticks removed from a goat in 2015 in Greece. The strain clusters into lineage Europe 2 representing the second available whole-genome sequenced isolate of this lineage. CCHFV IgG antibodies were detected in 8 of 19 goats of the farm. Currently CCHFV is not associated with disease in mammals other than humans. Studies in animal models are needed to investigate the pathogenicity level of lineage Europe 2 and compare it with that of other lineages. Copyright © 2018 Elsevier GmbH. All rights reserved.

  13. Association analysis of whole genome sequencing data accounting for longitudinal and family designs.

    Hu, Yijuan; Hui, Qin; Sun, Yan V

    2014-01-01

    Using the whole genome sequencing data and the simulated longitudinal phenotypes for 849 pedigree-based individuals from Genetic Analysis Workshop 18, we investigated various approaches to detecting the association of rare and common variants with blood pressure traits. We compared three strategies for longitudinal data: (a) using the baseline measurement only, (b) using the average from multiple visits, and (c) using all individual measurements. We also compared the power of using all of the pedigree-based data and the unrelated subset. The analyses were performed without knowledge of the underlying simulating model.

  14. Reflections on the cost of "low-cost" whole genome sequencing: framing the health policy debate.

    Timothy Caulfield

    2013-11-01

    Full Text Available The cost of whole genome sequencing is dropping rapidly. There has been a great deal of enthusiasm about the potential for this technological advance to transform clinical care. Given the interest and significant investment in genomics, this seems an ideal time to consider what the evidence tells us about potential benefits and harms, particularly in the context of health care policy. The scale and pace of adoption of this powerful new technology should be driven by clinical need, clinical evidence, and a commitment to put patients at the centre of health care policy.

  15. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    and genetic improvement were identified. CONCLUSIONS: Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes...... BACKGROUND: Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  16. SV2: accurate structural variation genotyping and de novo mutation detection from whole genomes.

    Antaki, Danny; Brandler, William M; Sebat, Jonathan

    2018-05-15

    Structural variation (SV) detection from short-read whole genome sequencing is error prone, presenting significant challenges for population or family-based studies of disease. Here, we describe SV2, a machine-learning algorithm for genotyping deletions and duplications from paired-end sequencing data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified call set with high genotyping accuracy and capability to detect de novo mutations. SV2 is freely available on GitHub (https://github.com/dantaki/SV2). Contact: jsebat@ucsd.edu. Supplementary data are available at Bioinformatics online.

  17. Recommendations to address the difficulties encountered when determining linezolid resistance from whole genome sequencing data.

    Beukers, Alicia G; Hasman, Henrik; Hegstad, Kristin; van Hal, Sebastiaan J

    2018-05-29

    Mutations associated with linezolid resistance within the V domain of 23S rRNA are annotated using an Escherichia coli numbering system. The 23S rRNA gene varies in length, nucleotide sequence and copy number between bacterial species. Consequently, this numbering system is not intuitive and can lead to confusion when locating mutation sites using whole genome sequencing data. Using the mutation G2576T as an example, we demonstrate the difficulties associated with using the E. coli numbering system. © Crown copyright 2018.

  18. Highly efficient PCR assay to discriminate allelic DNA methylation status using whole genome amplification

    Ito Takashi

    2011-06-01

    Full Text Available Background: We previously developed a simple method termed HpaII-McrBC PCR (HM-PCR) to discriminate allelic methylation status of the genomic sites of interest, and successfully applied it to a comprehensive analysis of CpG islands (CGIs) on human chromosome 21q. However, HM-PCR requires 200 ng of genomic DNA to examine one target site, thereby precluding its application to such samples that are limited in quantity. Findings: We developed HpaII-McrBC whole-genome-amplification PCR (HM-WGA-PCR) that uses whole-genome-amplified DNA as the template. HM-WGA-PCR uses only 1/100th the genomic template material required for HM-PCR. Indeed, we successfully analyzed 147 CGIs by HM-WGA-PCR using only ~300 ng of DNA, whereas previous HM-PCR study had required ~30 μg. Furthermore, we confirmed that allelic methylation status revealed by HM-WGA-PCR is identical to that by HM-PCR in every case of the 147 CGIs tested, proving high consistency between the two methods. Conclusions: HM-WGA-PCR would serve as a reliable alternative to HM-PCR in the analysis of allelic methylation status when the quantity of DNA available is limited.
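    The DNA quantities quoted in this abstract are mutually consistent, which a short calculation makes explicit. The sketch below only uses figures stated in the abstract (200 ng per site, 147 sites, a 1/100 reduction); the variable names are illustrative.

```python
SITES = 147                  # CpG islands analyzed
NG_PER_SITE_HM_PCR = 200     # genomic DNA needed per target site by HM-PCR
REDUCTION_FACTOR = 100       # HM-WGA-PCR uses 1/100th of the template

total_hm_pcr_ng = SITES * NG_PER_SITE_HM_PCR
total_hm_wga_pcr_ng = total_hm_pcr_ng / REDUCTION_FACTOR

print(f"HM-PCR:     {total_hm_pcr_ng / 1000:.1f} ug")   # HM-PCR:     29.4 ug
print(f"HM-WGA-PCR: {total_hm_wga_pcr_ng:.0f} ng")      # HM-WGA-PCR: 294 ng
```

    These values round to the ~30 μg and ~300 ng reported for the two methods.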

  19. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Can Alkan

    2007-09-01

    Full Text Available The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  20. Whole genome sequence typing to investigate the Apophysomyces outbreak following a tornado in Joplin, Missouri, 2011.

    Etienne, Kizee A; Gillece, John; Hilsabeck, Remy; Schupp, Jim M; Colman, Rebecca; Lockhart, Shawn R; Gade, Lalitha; Thompson, Elizabeth H; Sutton, Deanna A; Neblett-Fanfair, Robyn; Park, Benjamin J; Turabelidze, George; Keim, Paul; Brandt, Mary E; Deak, Eszter; Engelthaler, David M

    2012-01-01

    Case reports of Apophysomyces spp. in immunocompetent hosts have been a result of traumatic deep implantation of Apophysomyces spp. spore-contaminated soil or debris. On May 22, 2011 a tornado occurred in Joplin, MO, leaving 13 tornado victims with Apophysomyces trapeziformis infections as a result of lacerations from airborne material. We used whole genome sequence typing (WGST) for high-resolution phylogenetic SNP analysis of 17 outbreak Apophysomyces isolates and five additional temporally and spatially diverse Apophysomyces control isolates (three A. trapeziformis and two A. variabilis isolates). Whole genome SNP phylogenetic analysis revealed three clusters of genotypically related or identical A. trapeziformis isolates and multiple distinct isolates among the Joplin group; this indicated multiple genotypes from a single or multiple sources. Though no linkage between genotype and location of exposure was observed, WGST analysis determined that the Joplin isolates were more closely related to each other than to the control isolates, suggesting local population structure. Additionally, species delineation based on WGST demonstrated the need to reassess currently accepted taxonomic classifications of phylogenetic species within the genus Apophysomyces.

  1. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification.

    Direito, Susana O L; Zaura, Egija; Little, Miranda; Ehrenfreund, Pascale; Röling, Wilfred F M

    2014-03-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement amplification (MDA)] and one new primer-free method [primase-based whole genome amplification (pWGA)] were compared using a polymerase chain reaction (PCR)-based method as control. Pyrosequencing of an environmental sample and principal component analysis revealed that MDA impacted community profiles more strongly than pWGA and indicated that this related to species GC content, although an influence of DNA integrity could not be excluded. Subsequently, biases by species GC content, DNA integrity and fragment size were separately analysed using defined mixtures of DNA from various species. We found significantly less amplification of species with the highest GC content for MDA-based templates and, to a lesser extent, for pWGA. DNA fragmentation also interfered severely: species with more fragmented DNA were less amplified with MDA and pWGA. pWGA was unable to amplify low molecular weight DNA (...) ... microbial communities in low-biomass environments and for currently planned astrobiological missions to Mars. © 2013 Society for Applied Microbiology and John Wiley & Sons Ltd.

  3. Whole genome duplication affects evolvability of flowering time in an autotetraploid plant.

    Sara L Martin

    Full Text Available Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids ($\hat{b}_T = 0.31$) than diploids ($\hat{b}_T = 0.40$). Neotetraploids exhibited the highest evolutionary response ($\hat{b}_T = 0.55$). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes.
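    Realized heritability is commonly estimated as the slope of cumulative response to selection regressed on cumulative selection differential, following the breeder's equation R = h²S. The sketch below illustrates that general idea only; the abstract does not describe the authors' exact estimation procedure, and the flowering-time numbers are hypothetical, not data from the study.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = sum((xi - mean_x) ** 2 for xi in x)
    return num / den

# Hypothetical data for generations 0..4 of selection for earlier flowering.
# Values are cumulative changes in days to flowering (negative = earlier).
cumulative_selection_differential = [0.0, -2.0, -4.0, -6.0, -8.0]
cumulative_response = [0.0, -1.0, -2.0, -3.0, -4.0]

realized_heritability = ols_slope(cumulative_selection_differential,
                                  cumulative_response)
print(realized_heritability)  # 0.5
```

    In this toy series half of each generation's selection differential is transmitted to the offspring, so the slope is 0.5.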

  4. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  5. Are Escherichia coli Pathotypes Still Relevant in the Era of Whole-Genome Sequencing?

    Robins-Browne, Roy M.; Holt, Kathryn E.; Ingle, Danielle J.; Hocking, Dianna M.; Yang, Ji; Tauschek, Marija

    2016-01-01

    The empirical and pragmatic nature of diagnostic microbiology has given rise to several different schemes to subtype E. coli, including biotyping, serotyping, and pathotyping. These schemes have proved invaluable in identifying and tracking outbreaks, and for prognostication in individual cases of infection, but they are imprecise and potentially misleading due to the malleability and continuous evolution of E. coli. Whole genome sequencing can be used to accurately determine E. coli subtypes that are based on allelic variation or differences in gene content, such as serotyping and pathotyping. Whole genome sequencing also provides information about single nucleotide polymorphisms in the core genome of E. coli, which form the basis of sequence typing, and is more reliable than other systems for tracking the evolution and spread of individual strains. A typing scheme for E. coli based on genome sequences that includes elements of both the core and accessory genomes should reduce typing anomalies and promote understanding of how different varieties of E. coli spread and cause disease. Such a scheme could also define pathotypes more precisely than current methods. PMID:27917373

  6. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Alkan, Can; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk; Eichler, Evan E

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  7. Independent Evolution of Winner Traits without Whole Genome Duplication in Dekkera Yeasts.

    Yi-Cheng Guo

    Full Text Available Dekkera yeasts have often been considered as alternative sources of ethanol production that could compete with S. cerevisiae. The two lineages of yeasts independently evolved traits that include high glucose and ethanol tolerance, aerobic fermentation, and a rapid ethanol fermentation rate. The Saccharomyces yeasts attained these traits mainly through whole genome duplication approximately 100 million years ago (Mya). However, the Dekkera yeasts, which were separated from S. cerevisiae approximately 200 Mya, did not undergo whole genome duplication (WGD) but still occupy a niche similar to S. cerevisiae. Upon analysis of two Dekkera yeasts and five closely related non-WGD yeasts, we found that a massive loss of cis-regulatory elements occurred in an ancestor of the Dekkera yeasts, which led to improved mitochondrial functions similar to the S. cerevisiae yeasts. The evolutionary analysis indicated that genes involved in the transcription and translation process exhibited faster evolution in the Dekkera yeasts. We detected 90 positively selected genes, suggesting that the Dekkera yeasts evolved an efficient translation system to facilitate adaptive evolution. Moreover, we identified 12 vacuolar H+-ATPase (V-ATPase) function genes that were under positive selection, which assists in developing tolerance to high alcohol and high sugar stress. We also revealed that the enzyme PGK1 is responsible for the increased rate of glycolysis in the Dekkera yeasts. These results provide important insights to understand the independent adaptive evolution of the Dekkera yeasts and provide tools for genetic modification promoting industrial usage.

  8. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

    2018-05-01

    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
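    Single-linkage clustering at a SNP threshold places two isolates in the same cluster whenever a chain of isolates connects them in which each consecutive pair differs by no more than the threshold. The following is a minimal sketch of that idea using a union-find structure; the isolate names and pairwise distances are hypothetical, and treating the threshold as inclusive (distance <= threshold) is an assumption, not something the abstract specifies.

```python
from itertools import combinations

def single_linkage_clusters(isolates, snp_distance, threshold):
    """Group isolates linked by chains of pairwise distances <= threshold."""
    parent = {iso: iso for iso in isolates}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if snp_distance[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for iso in isolates:
        clusters.setdefault(find(iso), []).append(iso)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical pairwise SNP distances between five isolates.
isolates = ["A", "B", "C", "D", "E"]
snp_distance = {
    frozenset(("A", "B")): 3,  frozenset(("A", "C")): 11,
    frozenset(("A", "D")): 20, frozenset(("A", "E")): 20,
    frozenset(("B", "C")): 8,  frozenset(("B", "D")): 15,
    frozenset(("B", "E")): 15, frozenset(("C", "D")): 12,
    frozenset(("C", "E")): 12, frozenset(("D", "E")): 0,
}

for threshold in (10, 5, 0):
    print(threshold, single_linkage_clusters(isolates, snp_distance, threshold))
```

    In this toy example A and C share a cluster at the 10-SNP level even though they differ by 11 SNPs, because B links them; tightening the threshold to 5 and then 0 splits the clusters progressively.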

  9. Kernel-based whole-genome prediction of complex traits: a review.

    Morota, Gota; Gianola, Daniel

    2014-01-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
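As a rough illustration of kernel-based whole-genome regression, the sketch below fits a ridge regression with a Gaussian kernel on marker genotypes coded 0/1/2. The choice of kernel, the bandwidth and penalty values, and all function names are assumptions made for illustration; the review covers a broader family of kernels and estimation schemes.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel on two marker genotype vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-bandwidth * squared_distance)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_kernel_ridge(genotypes, phenotypes, bandwidth, penalty):
    """Return dual coefficients alpha solving (K + penalty * I) alpha = y."""
    n = len(genotypes)
    gram = [[gaussian_kernel(genotypes[i], genotypes[j], bandwidth)
             for j in range(n)] for i in range(n)]
    for i in range(n):
        gram[i][i] += penalty
    return solve(gram, phenotypes)


def predict(genotypes, alpha, new_genotype, bandwidth):
    """Predict a genetic value as a kernel-weighted sum over training lines."""
    return sum(a * gaussian_kernel(g, new_genotype, bandwidth)
               for a, g in zip(alpha, genotypes))


# Hypothetical toy data: four individuals, three markers coded 0/1/2.
X = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
y = [1.0, 2.0, 0.5, -1.0]
alpha = fit_kernel_ridge(X, y, bandwidth=0.5, penalty=1e-8)
fitted = [predict(X, alpha, x, bandwidth=0.5) for x in X]
```

Because the kernel depends on the whole marker vector rather than on each marker separately, predictions can reflect non-additive patterns without specifying interaction terms. In practice the bandwidth and penalty would be tuned by cross-validation.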

  10. Kernel-based whole-genome prediction of complex traits: a review

    Gota Morota

    2014-10-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

  11. Bacterial phylogenetic reconstruction from whole genomes is robust to recombination but demographic inference is not.

    Hedge, Jessica; Wilson, Daniel J

    2014-11-25

    Phylogenetic inference in bacterial genomics is fundamental to understanding problems such as population history, antimicrobial resistance, and transmission dynamics. The field has been plagued by an apparent state of contradiction since the distorting effects of recombination on phylogeny were discovered more than a decade ago. Researchers persist with detailed phylogenetic analyses while simultaneously acknowledging that recombination seriously misleads inference of population dynamics and selection. Here we resolve this paradox by showing that phylogenetic tree topologies based on whole genomes robustly reconstruct the clonal frame topology but that branch lengths are badly skewed. Surprisingly, removing recombining sites can exacerbate branch length distortion caused by recombination. Phylogenetic tree reconstruction is a popular approach for understanding the relatedness of bacteria in a population from differences in their genome sequences. However, bacteria frequently exchange regions of their genomes by a process called homologous recombination, which violates a fundamental assumption of phylogenetic methods. Since many researchers continue to use phylogenetics for recombining bacteria, it is important to understand how recombination affects the conclusions drawn from these analyses. We find that whole-genome sequences afford great accuracy in reconstructing evolutionary relationships despite concerns surrounding the presence of recombination, but the branch lengths of the phylogenetic tree are indeed badly distorted. Surprisingly, methods to reduce the impact of recombination on branch lengths can exacerbate the problem. Copyright © 2014 Hedge and Wilson.

  12. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Soichirou Satoh

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  13. Whole genome analysis of linezolid resistance in Streptococcus pneumoniae reveals resistance and compensatory mutations

    Légaré Danielle

    2011-10-01

    Background: Several mutations were present in the genome of Streptococcus pneumoniae linezolid-resistant strains but the role of several of these mutations had not been experimentally tested. To analyze the role of these mutations, we reconstituted resistance by serial whole genome transformation of a novel resistant isolate into two strains with sensitive background. We sequenced the parent mutant and two independent transformants exhibiting similar minimum inhibitory concentration to linezolid. Results: Comparative genomic analyses revealed that transformants acquired G2576T transversions in every gene copy of 23S rRNA and that the number of altered copies correlated with the level of linezolid resistance and cross-resistance to florfenicol and chloramphenicol. One of the transformants also acquired a mutation present in the parent mutant leading to the overexpression of an ABC transporter (spr1021). The acquisition of these mutations conferred a fitness cost, however, which was further enhanced by the acquisition of a mutation in an RNA methyltransferase implicated in resistance. Interestingly, the fitness of the transformants could be restored in part by the acquisition of altered copies of the L3 and L16 ribosomal proteins and by mutations leading to the overexpression of the spr1887 ABC transporter that were present in the original linezolid-resistant mutant. Conclusions: Our results demonstrate the usefulness of whole genome approaches for detecting major determinants of resistance as well as compensatory mutations that alleviate the fitness cost associated with resistance.

  14. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution1,2. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes3,4. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  15. Whole-genome analysis of herbicide-tolerant mutant rice generated by Agrobacterium-mediated gene targeting.

    Endo, Masaki; Kumagai, Masahiko; Motoyama, Ritsuko; Sasaki-Yamagata, Harumi; Mori-Hosokawa, Satomi; Hamada, Masao; Kanamori, Hiroyuki; Nagamura, Yoshiaki; Katayose, Yuichi; Itoh, Takeshi; Toki, Seiichi

    2015-01-01

    Gene targeting (GT) is a technique used to modify endogenous genes in target genomes precisely via homologous recombination (HR). Although GT plants are produced using genetic transformation techniques, if the difference between the endogenous and the modified gene is limited to point mutations, GT crops can be considered equivalent to non-genetically modified mutant crops generated by conventional mutagenesis techniques. However, it is difficult to guarantee the non-incorporation of DNA fragments from Agrobacterium in GT plants created by Agrobacterium-mediated GT despite screening with conventional Southern blot and/or PCR techniques. Here, we report a comprehensive analysis of herbicide-tolerant rice plants generated by inducing point mutations in the rice ALS gene via Agrobacterium-mediated GT. We performed genome comparative genomic hybridization (CGH) array analysis and whole-genome sequencing to evaluate the molecular composition of GT rice plants. Thus far, no integration of Agrobacterium-derived DNA fragments has been detected in GT rice plants. However, >1,000 single nucleotide polymorphisms (SNPs) and insertion/deletion (InDels) were found in GT plants. Among these mutations, 20-100 variants might have some effect on expression levels and/or protein function. Information about additive mutations should be useful in clearing out unwanted mutations by backcrossing. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  16. High-resolution whole-genome analysis of skull base chordomas implicates FHIT loss in chordoma pathogenesis.

    Diaz, Roberto Jose; Guduk, Mustafa; Romagnuolo, Rocco; Smith, Christian A; Northcott, Paul; Shih, David; Berisha, Fitim; Flanagan, Adrienne; Munoz, David G; Cusimano, Michael D; Pamir, M Necmettin; Rutka, James T

    2012-09-01

    Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm.

  17. High-resolution Whole-Genome Analysis of Skull Base Chordomas Implicates FHIT Loss in Chordoma Pathogenesis

    Roberto Jose Diaz

    2012-09-01

    Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm.

  18. Whole-genome analysis of human papillomavirus genotypes 52 and 58 isolated from Japanese women with cervical intraepithelial neoplasia and invasive cervical cancer.

    Tenjimbayashi, Yuri; Onuki, Mamiko; Hirose, Yusuke; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2017-01-01

    Human papillomavirus genotypes 52 and 58 (HPV52/58) are frequently detected in patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) in East Asian countries including Japan. As with other HPV genotypes, HPV52/58 consist of multiple lineages of genetic variants harboring less than 10% differences between complete genome sequences of the same HPV genotype. However, site variations of nucleotide and amino acid sequences across the viral whole-genome have not been fully examined for HPV52/58. The aim of this study was to investigate genetic variations of HPV52/58 prevalent among Japanese women by analyzing the viral whole-genome sequences. The entire genomic region of HPV52/58 was amplified by long-range PCR with total cellular DNA extracted from cervical exfoliated cells isolated from Japanese patients with CIN or ICC. The amplified DNA was subjected to next generation sequencing to determine the complete viral genome sequences. Phylogenetic analyses were performed with the whole-genome sequences to assign variant lineages/sublineages to the HPV52/58 isolates. The variability in amino acid sequences of viral proteins was assessed by calculating the Shannon entropy scores at individual amino acid positions of HPV proteins. Among 52 isolates of HPV52 (CIN1, n  = 20; CIN2/3, n  = 21; ICC, n  = 11), 50 isolates belonged to lineage B (sublineage B2) and two isolates belonged to lineage A (sublineage A1). Among 48 isolates of HPV58 (CIN1, n  = 21; CIN2/3, n  = 19; ICC, n  = 8), 47 isolates belonged to lineage A (sublineages A1/A2/A3) and one isolate belonged to lineage C. Single nucleotide polymorphisms specific for individual variant lineages were determined throughout the viral genome based on multiple sequence alignments of the Japanese HPV52/58 isolates and reference HPV52/58 genomes. Entropy analyses revealed that the E1 protein was relatively variable among the HPV52 isolates, whereas the E7, E4, and L2 proteins showed
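The per-position Shannon entropy used in the abstract above to assess amino acid variability can be sketched as follows. The sketch assumes aligned protein sequences of equal length and a base-2 logarithm; the logarithm base and the toy sequences are assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one position."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(aligned_sequences):
    """Return one entropy score per alignment column."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*...) transposes the alignment into columns.
    return [shannon_entropy(column) for column in zip(*aligned_sequences)]


# Hypothetical toy alignment: position 1 is conserved, positions 2 and 3
# each show two residues at equal frequency.
toy_alignment = ["MKV", "MRV", "MKL", "MRL"]
profile = entropy_profile(toy_alignment)
```

A fully conserved position scores 0, and the score rises as more residues are observed at more even frequencies, so averaging the profile over a protein gives a simple measure of how variable that protein is among isolates.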

  19. Gene expression in peripheral immune cells following cardioembolic stroke is sexually dimorphic.

    Boryana Stamova

    Epidemiological studies suggest that sex has a role in the pathogenesis of cardioembolic stroke. Since stroke is a vascular disease, identifying sexually dimorphic gene expression changes in blood leukocytes can inform on sex-specific risk factors, response and outcome biology. We aimed to examine the sexually dimorphic immune response following cardioembolic stroke by studying the differential gene expression in peripheral white blood cells. Blood samples from patients with cardioembolic stroke were obtained at ≤3 hours (prior to treatment), 5 hours and 24 hours (after treatment) after stroke onset (n = 23; 69 samples) and compared with vascular risk factor controls without symptomatic vascular diseases (n = 23; 23 samples) (ANCOVA, false discovery rate p ≤ 0.05, |fold change| ≥ 1.2). mRNA levels were measured on whole-genome Affymetrix microarrays. There were more up-regulated than down-regulated genes in both sexes, and females had more differentially expressed genes than males following cardioembolic stroke. Female gene expression was associated with cell death and survival, cell-cell signaling and inflammation. Male gene expression was associated with cellular assembly, organization and compromise. Immune response pathways were over-represented at ≤3, 5 and 24 h after stroke in female subjects but only at 24 h in males. Neutrophil-specific genes were differentially expressed at 3, 5 and 24 h in females but only at 5 h and 24 h in males. There are sexually dimorphic immune cell expression profiles following cardioembolic stroke. Future studies are needed to confirm the findings using qRT-PCR in an independent cohort, to determine how they relate to risk and outcome, and to compare to other causes of ischemic stroke.

  20. Whole Genome Analyses of a Well-Differentiated Liposarcoma Reveals Novel SYT1 and DDR2 Rearrangements

    Egan, Jan B.; Barrett, Michael T.; Champion, Mia D.; Middha, Sumit; Lenkiewicz, Elizabeth; Evers, Lisa; Francis, Princy; Schmidt, Jessica; Shi, Chang-Xin; Van Wier, Scott; Badar, Sandra; Ahmann, Gregory; Kortuem, K. Martin; Boczek, Nicole J.; Fonseca, Rafael; Craig, David W.; Carpten, John D.; Borad, Mitesh J.; Stewart, A. Keith

    2014-01-01

    Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498, which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dasatinib that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2. PMID:24505276

  1. Whole genome analyses of a well-differentiated liposarcoma reveals novel SYT1 and DDR2 rearrangements.

    Jan B Egan

    Liposarcoma is the most common soft tissue sarcoma, but little is known about the genomic basis of this disease. Given the low cell content of this tumor type, we utilized flow cytometry to isolate the diploid normal and aneuploid tumor populations from a well-differentiated liposarcoma prior to array comparative genomic hybridization and whole genome sequencing. This work revealed massive highly focal amplifications throughout the aneuploid tumor genome including MDM2, a gene that has previously been found to be amplified in well-differentiated liposarcoma. Structural analysis revealed massive rearrangement of chromosome 12 and 11 gene fusions, some of which may be part of double minute chromosomes commonly present in well-differentiated liposarcoma. We identified a hotspot of genomic instability localized to a region of chromosome 12 that includes a highly conserved, putative L1 retrotransposon element, LOC100507498, which resides within a gene cluster (NAV3, SYT1, PAWR) where 6 of the 11 fusion events occurred. Interestingly, a potential gene fusion was also identified in amplified DDR2, which is a potential therapeutic target of kinase inhibitors such as dasatinib that are not routinely used in the treatment of patients with liposarcoma. Furthermore, 7 somatic, damaging single nucleotide variants have also been identified, including D125N in the PTPRQ protein. In conclusion, this work is the first to report the entire genome of a well-differentiated liposarcoma with novel chromosomal rearrangements associated with amplification of therapeutically targetable genes such as MDM2 and DDR2.

  2. Patterns of somatic alterations between matched primary and metastatic colorectal tumors characterized by whole-genome sequencing.

    Xie, Tao; Cho, Yong Beom; Wang, Kai; Huang, Donghui; Hong, Hye Kyung; Choi, Yoon-La; Ko, Young Hyeh; Nam, Do-Hyun; Jin, Juyoun; Yang, Heekyoung; Fernandez, Julio; Deng, Shibing; Rejto, Paul A; Lee, Woo Yong; Mao, Mao

    2014-10-01

    Colorectal cancer (CRC) patients have poor prognosis after formation of distant metastasis. Understanding the molecular mechanisms by which genetic changes facilitate metastasis is critical for the development of targeted therapeutic strategies aimed at controlling disease progression while minimizing toxic side effects. A comprehensive portrait of somatic alterations in CRC and the changes between primary and metastatic tumors has yet to be developed. We performed whole genome sequencing of two primary CRC tumors and their matched liver metastases. By comparing to matched germline DNA, we catalogued somatic alterations at multiple scales, including single nucleotide variations, small insertions and deletions, copy number aberrations and structural variations in both the primary and matched metastasis. We found that the majority of these somatic alterations are present in both sites. Despite the overall similarity, several de novo alterations in the metastases were predicted to be deleterious, in genes including FBXW7, DCLK1 and FAT2, which might contribute to the initiation and progression of distant metastasis. Through careful examination of the mutation prevalence among tumor cells at each site, we also proposed distinct clonal evolution patterns between primary and metastatic tumors in the two cases. These results suggest that somatic alterations may play an important role in driving the development of colorectal cancer metastasis and present challenges and opportunities when considering the choice of treatment. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. Yeast "make-accumulate-consume" life strategy evolved as a multi-step process that predates the whole genome duplication.

    Hagman, Arne; Säll, Torbjörn; Compagno, Concetta; Piskur, Jure

    2013-01-01

    When fruits ripen, microbial communities start a fierce competition for the freely available fruit sugars. Three yeast lineages, including baker's yeast Saccharomyces cerevisiae, have independently developed the metabolic activity to convert simple sugars into ethanol even under fully aerobic conditions. This fermentation capacity, named Crabtree effect, reduces the cell-biomass production but provides in nature a tool to out-compete other microorganisms. Here, we analyzed over forty Saccharomycetaceae yeasts, covering over 200 million years of the evolutionary history, for their carbon metabolism. The experiments were done under strictly controlled and uniform conditions, which has not been done before. We show that the origin of Crabtree effect in Saccharomycetaceae predates the whole genome duplication and became a settled metabolic trait after the split of the S. cerevisiae and Kluyveromyces lineages, and coincided with the origin of modern fruit bearing plants. Our results suggest that ethanol fermentation evolved progressively, involving several successive molecular events that have gradually remodeled the yeast carbon metabolism. While some of the final evolutionary events, like gene duplications of glucose transporters and glycolytic enzymes, have been deduced, the earliest molecular events initiating Crabtree effect are still to be determined.

  4. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-06-27

    Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. 
Rainbow is available

  5. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA

    2015-10-24

    Background: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods: Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results: Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30% to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions: Our study demonstrated true levels of genetic diversity

  6. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic
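
    The 30 % read frequency cut-off described in this abstract can be illustrated with a minimal sketch. The field names (`pos`, `alt_reads`, `depth`), the toy values, and the choice to treat the cut-off as inclusive are assumptions made for illustration; the authors' in-house pipeline is not described here.

```python
def passes_cutoff(alt_reads, depth, cutoff_pct=30):
    """Return True if the variant read frequency is at least cutoff_pct percent.

    Integer arithmetic avoids floating-point edge cases at exactly 30 %.
    Treating the cut-off as inclusive is an assumption.
    """
    if depth <= 0:
        return False
    return alt_reads * 100 >= cutoff_pct * depth


def filter_variants(variants, cutoff_pct=30):
    """Keep only variants whose read frequency meets the cut-off."""
    return [v for v in variants
            if passes_cutoff(v["alt_reads"], v["depth"], cutoff_pct)]


# Hypothetical toy variant calls, not data from the study.
toy_variants = [
    {"pos": 100, "alt_reads": 12, "depth": 100},  # 12 %: below the cut-off
    {"pos": 200, "alt_reads": 30, "depth": 100},  # 30 %: kept (inclusive)
    {"pos": 300, "alt_reads": 95, "depth": 100},  # 95 %: kept
]
kept_positions = [v["pos"] for v in filter_variants(toy_variants)]
```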

  7. Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data.

    Nishito, Yukari; Osana, Yasunori; Hachiya, Tsuyoshi; Popendorf, Kris; Toyoda, Atsushi; Fujiyama, Asao; Itaya, Mitsuhiro; Sakakibara, Yasubumi

    2010-04-16

    Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks

  8. Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data

    Fujiyama Asao

    2010-04-01

    Abstract Background Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. Results We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for γ-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. Conclusions The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B

  9. Significance of functional disease-causal/susceptible variants identified by whole-genome analyses for the understanding of human diseases.

    Hitomi, Yuki; Tokunaga, Katsushi

    2017-01-01

    Human genome variation may cause differences in traits and disease risks. Disease-causal/susceptible genes and variants for both common and rare diseases can be detected by comprehensive whole-genome analyses, such as whole-genome sequencing (WGS), using next-generation sequencing (NGS) technology and genome-wide association studies (GWAS). Here, in addition to the application of an NGS as a whole-genome analysis method, we summarize approaches for the identification of functional disease-causal/susceptible variants from abundant genetic variants in the human genome and methods for evaluating their functional effects in human diseases, using an NGS and in silico and in vitro functional analyses. We also discuss the clinical applications of the functional disease causal/susceptible variants to personalized medicine.

  10. High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA

    Poulsen, Jesper Buchhave; Lescai, Francesco; Grove, Jakob

    2016-01-01

    Stored neonatal dried blood spot (DBS) samples from neonatal screening programmes are a valuable diagnostic and research resource. Combined with information from national health registries they can be used in population-based studies of genetic diseases. DNA extracted from neonatal DBSs can...... be amplified to obtain micrograms of an otherwise limited resource, referred to as whole-genome amplified DNA (wgaDNA). Here we investigate the robustness of exome sequencing of wgaDNA of neonatal DBS samples. We conducted three pilot studies of seven, eight and seven subjects, respectively. For each subject...... we analysed a neonatal DBS sample and corresponding adult whole-blood (WB) reference sample. Different DNA sample types were prepared for each of the subjects. Pilot 1: wgaDNA of 2x3.2mm neonatal DBSs (DBS_2x3.2) and raw DNA extract of the WB reference sample (WB_ref). Pilot 2: DBS_2x3.2, WB...

  11. Whole-Genome Scans Provide Evidence of Adaptive Evolution in Malawian Plasmodium falciparum Isolates

    Ocholla, Harold; Preston, Mark D; Mipando, Mwapatsa

    2014-01-01

    BACKGROUND: Selection by host immunity and antimalarial drugs has driven extensive adaptive evolution in Plasmodium falciparum and continues to produce ever-changing landscapes of genetic variation. METHODS: We performed whole-genome sequencing of 69 P. falciparum isolates from Malawi and used......, an area of high malaria transmission. Allele frequency-based tests provided evidence of recent population growth in Malawi and detected potential targets of host immunity and candidate vaccine antigens. Comparison of the sequence variation between isolates from Malawi and those from 5 geographically...... dispersed countries (Kenya, Burkina Faso, Mali, Cambodia, and Thailand) detected population genetic differences between Africa and Asia, within Southeast Asia, and within Africa. Haplotype-based tests of selection to sequence data from all 6 populations identified signals of directional selection at known...

  12. Whole-genome analyses resolve early branches in the tree of life of modern birds

    Sicheritz-Pontén, Thomas; Li, Cai; Li, Bo

    2014-01-01

    To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister...... or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator...... and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high...

  13. Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica

    Leekitcharoenphon, Pimlapas; Nielsen, Eva M.; Kaas, Rolf Sommer

    2014-01-01

    Salmonella enterica is a common cause of minor and large food borne outbreaks. To achieve successful and nearly ‘real-time’ monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise for use as a routine epidemiological typing....... Enteritidis and 5 S. Derby were also sequenced and used for comparison. A number of different bioinformatics approaches were applied to the data, including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association...... of the isolates to specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100% of the strains within S. Typhimurium. The resulting outcome of the four phylogenetic...

  14. A strategic stakeholder approach for addressing further analysis requests in whole genome sequencing research.

    Thornock, Bradley Steven O

    2016-01-01

    Whole genome sequencing (WGS) can be a cost-effective and efficient means of diagnosis for some children, but it also raises a number of ethical concerns. One such concern is how researchers derive and communicate results from WGS, including future requests for further analysis of stored sequences. The purpose of this paper is to think about what is at stake, and for whom, in any solution that is developed to deal with such requests. To accomplish this task, this paper will utilize stakeholder theory, a common method used in business ethics. Several scenarios that connect stakeholder concerns and WGS will also be posited and analyzed. This paper concludes by developing criteria composed of a series of questions that researchers can answer in order to more effectively address requests for further analysis of stored sequences.

  15. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding

    de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.

    2013-01-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228
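
    As an illustration of a large-p with small-n regression, the sketch below fits ridge regression, one common parametric WGR model, in its dual form, so that only an n x n system is solved however many markers there are. The choice of ridge regression, the penalty `lam`, the function names, and the toy genotypes are assumptions for illustration; the abstract does not single out a particular method.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects: beta = X^T (X X^T + lam I)^-1 y.

    X is an n x p genotype matrix (n individuals, p markers, p may exceed n);
    y holds the n phenotypes; lam > 0 is the ridge penalty.
    """
    n, p = len(X), len(X[0])
    K = [[sum(X[i][k] * X[j][k] for k in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y)
    return [sum(X[i][k] * alpha[i] for i in range(n)) for k in range(p)]


# Hypothetical toy data: 2 individuals, 3 markers (p > n).
toy_X = [[1, 0, 0],
         [0, 1, 0]]
toy_y = [2, 4]
effects = ridge_marker_effects(toy_X, toy_y, lam=1.0)
```

    The dual form is algebraically identical to the usual primal ridge solution but scales with the number of individuals rather than the number of markers, which is the relevant regime when thousands of markers are regressed concurrently.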

  16. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation

    Zhao, Shancen; Zheng, Pingping; Dong, Shanshan

    2013-01-01

    The panda lineage dates back to the late Miocene and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography their contribution to panda population...... dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two...... panda populations that show genetic adaptation to their environments. However, in all three populations, anthropogenic activities have negatively affected pandas for 3,000 years....

  17. Mapping genomic features to functional traits through microbial whole genome sequences.

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. The increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link the genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped to Cluster of Orthologs (COGs), and were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.
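
    The TF-IDF step can be sketched by treating each genome as a "document" and each COG assignment as a "term". The weighting used below (relative frequency times log(N/df)), the function name, and the toy genome and COG identifiers are assumptions for illustration; the abstract does not state which TF-IDF variant was applied.

```python
import math
from collections import Counter


def cog_tfidf(genomes):
    """Compute TF-IDF weights for COGs.

    genomes maps a genome id to the list of COG ids assigned to its genes
    (one entry per gene, so repeated COGs reflect abundance).
    Returns {genome_id: {cog_id: weight}}.
    """
    n_genomes = len(genomes)
    counts = {g: Counter(cogs) for g, cogs in genomes.items()}

    # Document frequency: number of genomes in which each COG occurs.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    weights = {}
    for g, c in counts.items():
        total = sum(c.values())
        weights[g] = {
            cog: (k / total) * math.log(n_genomes / doc_freq[cog])
            for cog, k in c.items()
        }
    return weights


# Hypothetical toy input, not data from the study.
toy_genomes = {
    "genome_A": ["COG0001", "COG0001", "COG0002"],
    "genome_B": ["COG0002", "COG0003"],
}
toy_weights = cog_tfidf(toy_genomes)
```

    In this toy input, a COG found in every genome receives weight zero, while a COG that is abundant in one genome and absent from the other receives a high weight, which is the property that makes the transformed values usable as input to feature ranking.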

  18. Evaluation of whole genome amplified DNA to decrease material expenditure and increase quality

    Marie Bækvad-Hansen

    2017-06-01

    Discussion: Whole genome amplified DNA samples from dried blood spots are well suited for array genotyping and produce robust and reliable genotype data. However, the amplification process introduces additional noise to the data, making detection of structural variants such as copy number variants difficult. With this study, we explore ways of optimizing the amplification protocol in order to reduce noise and increase data quality. We found that the amplification process was very robust, and that changes in amplification time or temperature did not alter the genotyping calls or quality of the array data. Adding additional replicates of each sample also led to insignificant changes in the array data. Thus, the amount of noise introduced by the amplification process was consistent regardless of changes made to the amplification protocol. We also explored ways of decreasing material expenditure by reducing the spot size or the amplification reaction volume. The reduction did not affect the quality of the genotyping data.

  19. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc

    2015-01-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected...... with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index...... itself. Depending on the trait’s economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage...

  20. Small homologous blocks in phytophthora genomes do not point to an ancient whole-genome duplication.

    van Hooff, Jolien J E; Snel, Berend; Seidl, Michael F

    2014-05-01

    Genomes of the plant-pathogenic genus Phytophthora are characterized by small duplicated blocks consisting of two consecutive genes (2HOM blocks) and by an elevated abundance of similarly aged gene duplicates. Both properties, in particular the presence of 2HOM blocks, have been attributed to a whole-genome duplication (WGD) at the last common ancestor of Phytophthora. However, large intraspecies synteny (compelling evidence for a WGD) has not been detected. Here, we revisited the WGD hypothesis by deducing the age of 2HOM blocks. Two independent timing methods reveal that the majority of 2HOM blocks arose after divergence of the Phytophthora lineages. In addition, a large proportion of the 2HOM block copies colocalize on the same scaffold. Therefore, the presence of 2HOM blocks does not support a WGD at the last common ancestor of Phytophthora. Thus, genome evolution of Phytophthora is likely driven by alternative mechanisms, such as bursts of transposon activity.

  1. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic......-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment....... alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME...

  2. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes ("MLST+").

    Markus H Antwerpen

    The zoonotic disease tularemia is caused by the bacterium Francisella tularensis. This pathogen is considered a category A select agent with potential to be misused in bioterrorism. Molecular typing based on DNA sequence, such as canSNP-typing or MLVA, has become the accepted standard for this organism. Due to the organism's highly clonal nature, the current typing methods have reached their limit of discrimination for classifying closely related subpopulations within the subspecies F. tularensis ssp. holarctica. We introduce a new gene-by-gene approach, MLST+, based on whole genome data of 15 sequenced F. tularensis ssp. holarctica strains and apply this approach to investigate an epidemic of lethal tularemia among non-human primates in two animal facilities in Germany. Due to the high resolution of MLST+ we are able to demonstrate that three independent clones of this highly infectious pathogen were responsible for these spatially and temporally restricted outbreaks.

  3. Clinical decision support for whole genome sequence information leveraging a service-oriented architecture: a prototype.

    Welch, Brandon M; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time.

  4. Whole-genome sequencing of bloodstream Staphylococcus aureus isolates does not distinguish bacteraemia from endocarditis

    Lilje, Berit; Rasmussen, Rasmus Vedby; Dahl, Anders

    2017-01-01

    Most Staphylococcus aureus isolates can cause invasive disease given the right circumstances, but it is unknown if some isolates are more likely to cause severe infections than others. S. aureus bloodstream isolates from 120 patients with definite infective endocarditis and 121 with S. aureus...... bacteraemia without infective endocarditis underwent whole-genome sequencing. Genome-wide association analysis was performed using a variety of bioinformatics approaches including SNP analysis, accessory genome analysis and k-mer based analysis. Core and accessory genome analyses found no association...... with either of the two clinical groups. In this study, the genome sequences of S. aureus bloodstream isolates did not discriminate between bacteraemia and infective endocarditis. Based on our study and the current literature, it is not convincing that a specific S. aureus genotype is clearly associated...

  5. Real time application of whole genome sequencing for outbreak investigation - What is an achievable turnaround time?

    McGann, Patrick; Bunin, Jessica L; Snesrud, Erik; Singh, Seema; Maybank, Rosslyn; Ong, Ana C; Kwak, Yoon I; Seronello, Scott; Clifford, Robert J; Hinkle, Mary; Yamada, Stephen; Barnhill, Jason; Lesho, Emil

    2016-07-01

    Whole genome sequencing (WGS) is increasingly employed in clinical settings, though few assessments of turnaround times (TAT) have been performed in real-time. In this study, WGS was used to investigate an unfolding outbreak of vancomycin resistant Enterococcus faecium (VRE) among 3 patients in the ICU of a tertiary care hospital. Including overnight culturing, a TAT of just 48.5 h for a comprehensive report was achievable using an Illumina MiSeq benchtop sequencer. WGS revealed that isolates from patients 2 and 3 differed from that of patient 1 by a single nucleotide polymorphism (SNP), indicating nosocomial transmission. However, the unparalleled resolution provided by WGS suggested that nosocomial transmission involved two separate events, from patient 1 to patient 2 and from patient 1 to patient 3, and not the linear transmission suspected from the timeline. Rapid TATs are achievable using WGS in the clinical setting and can provide an unprecedented level of resolution for outbreak investigations. Published by Elsevier Inc.
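
    The kind of pairwise SNP comparison this abstract relies on can be sketched as follows. The function names and the toy aligned sequences are hypothetical and chosen only so that two isolates each differ from a third by one SNP; they are not the study's data, and the sketch assumes the sequences are already aligned to a common reference.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"  # skip gaps and ambiguous calls
    )


def pairwise_snp_distances(isolates):
    """Return {(id_i, id_j): SNP distance} for every pair of isolates."""
    return {
        (i, j): snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    }


# Hypothetical toy alignment, not data from the study.
toy_isolates = {
    "patient1": "ACGTACGTAC",
    "patient2": "ACGTACGTAT",  # differs from patient1 at the last position
    "patient3": "ACGAACGTAC",  # differs from patient1 at the fourth position
}
distances = pairwise_snp_distances(toy_isolates)
```

    In this toy case patients 2 and 3 are each one SNP from patient 1 but two SNPs from each other, the pattern that would point to two separate transmission events from a common source rather than a linear chain.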

  6. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences

    Coll, Francesc

    2015-05-27

    Mycobacterium tuberculosis drug resistance (DR) challenges effective tuberculosis disease control. Current molecular tests examine limited numbers of mutations, and although whole genome sequencing approaches could fully characterise DR, data complexity has restricted their clinical application. A library (1,325 mutations) predictive of DR for 15 anti-tuberculosis drugs was compiled and validated for 11 of them using genomic-phenotypic data from 792 strains. A rapid online ‘TB-Profiler’ tool was developed to report DR and strain-type profiles directly from raw sequences. Using our DR mutation library, in silico diagnostic accuracy was superior to some commercial diagnostics and alternative databases. The library will facilitate sequence-based drug-susceptibility testing.

  7. Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database.

    Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth

    2016-08-01

    The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  8. Phylogenetics and differentiation of Salmonella Newport lineages by whole genome sequencing.

    Guojie Cao

    Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16-24× coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3' end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S

  9. Whole genome investigation of a divergent clade of the pathogen Streptococcus suis

    Abiyad Baig

    2015-11-01

    Streptococcus suis is a major porcine and zoonotic pathogen responsible for significant economic losses in the pig industry and an increasing number of human cases. Multiple isolates of S. suis show marked genomic diversity. Here we report the analysis of whole genome sequences of nine pig isolates that caused disease typical of S. suis and had phenotypic characteristics of S. suis, but their genomes were divergent from those of many other S. suis isolates. Comparison of protein sequences predicted from divergent genomes with those from normal S. suis reduced the size of core genome from 793 to only 397 genes. Divergence was clear if phylogenetic analysis was performed on reduced core genes and MLST alleles. Phylogenies based on certain other genes (16S rRNA, sodA, recN and cpn60) did not show divergence for all isolates, suggesting recombination between some divergent isolates with normal S. suis for these genes. Indeed, there is evidence of recent recombination between the divergent and normal S. suis genomes for 249 of 397 core genes. In addition, phylogenetic analysis based on the 16S rRNA gene and 132 genes that were conserved between the divergent isolates and representatives of the broader Streptococcus genus showed that divergent isolates were more closely related to S. suis. Six out of nine divergent isolates possessed a S. suis-like capsule region with variation in capsular gene sequences but the remaining three did not have a discrete capsule locus. The majority (40/70) of virulence-associated genes in normal S. suis were present in the divergent genomes. Overall, the divergent isolates extend the current diversity of S. suis species but the phenotypic similarities and the large amount of gene exchange with normal S. suis give insufficient evidence to assign these isolates to a new species or subspecies. Further sampling and whole genome analysis of more isolates is warranted to understand the diversity of the species.

  10. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, discovering causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild-type copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  11. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

    Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

    2017-08-01

    Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology.

    Rossen, J W A; Friedrich, A W; Moran-Gilad, J

    2018-04-01

    Next generation sequencing (NGS) is increasingly being used in clinical microbiology. Like every new technology adopted in microbiology, the integration of NGS into clinical and routine workflows must be carefully managed. The aim is to review the practical aspects of implementing bacterial whole genome sequencing (WGS) in routine diagnostic laboratories. The sources are a review of the literature and expert opinion. In this review, we discuss when and how to integrate whole genome sequencing (WGS) in the routine workflow of the clinical laboratory. In addition, as the microbiology laboratories have to adhere to various national and international regulations and criteria for their accreditation, we deliberate on quality control issues for using WGS in microbiology, including the importance of proficiency testing. Furthermore, the current and future place of this technology in the diagnostic hierarchy of microbiology is described as well as the necessity of maintaining backwards compatibility with already established methods. Finally, we speculate on the question of whether WGS can entirely replace routine microbiology in the future and the tension between the fact that most sequencers are designed to process multiple samples in parallel whereas for optimal diagnosis a one-by-one processing of the samples is preferred. Special reference is made to the cost and turnaround time of WGS in diagnostic laboratories. Further development is required to improve the workflow for WGS, in particular to shorten the turnaround time, reduce costs, and streamline downstream data analyses. Only when these processes reach maturity will reliance on WGS for routine patient management and infection control management become feasible, enabling the transformation of clinical microbiology into a genome-based and personalized diagnostic field. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  13. Whole-genome phylogeny of Escherichia coli/Shigella group by feature frequency profiles (FFPs)

    Sims, Gregory E.; Kim, Sung-Hou

    2011-01-01

    A whole-genome phylogeny of the Escherichia coli/Shigella group was constructed by using the feature frequency profile (FFP) method. This alignment-free approach uses the frequencies of l-mer features of whole genomes to infer phylogenic distances. We present two phylogenies that accentuate different aspects of E. coli/Shigella genomic evolution: (i) one based on the compositions of all possible features of length l = 24 (∼8.4 million features), which are likely to reveal the phenetic grouping and relationship among the organisms and (ii) the other based on the compositions of core features with low frequency and low variability (∼0.56 million features), which account for ∼69% of all commonly shared features among 38 taxa examined and are likely to have genome-wide lineal evolutionary signal. Shigella appears as a single clade when all possible features are used without filtering of noncore features. However, results using core features show that Shigella consists of at least two distantly related subclades, implying that the subclades evolved into a single clade because of a high degree of convergence influenced by mobile genetic elements and niche adaptation. In both FFP trees, the basal group of the E. coli/Shigella phylogeny is the B2 phylogroup, which contains primarily uropathogenic strains, suggesting that the E. coli/Shigella ancestor was likely a facultative or opportunistic pathogen. The extant commensal strains diverged relatively late and appear to be the result of reductive evolution of genomes. We also identify clade distinguishing features and their associated genomic regions within each phylogroup. Such features may provide useful information for understanding evolution of the groups and for quick diagnostic identification of each phylogroup. PMID:21536867
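
    The alignment-free idea described above (l-mer frequencies turned into pairwise distances) can be sketched as follows. This is a simplified illustration, not the published FFP implementation: the Jensen-Shannon divergence is used here as an assumed distance measure, the toy sequences are hypothetical, and the core-feature filtering described in the abstract is omitted.

```python
from collections import Counter
from math import log2


def feature_frequency_profile(sequence, l):
    """Return normalized frequencies of all overlapping l-mers in a sequence."""
    sequence = sequence.upper()
    if len(sequence) < l:
        raise ValueError("sequence is shorter than the feature length l")
    counts = Counter(sequence[i:i + l] for i in range(len(sequence) - l + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles."""
    divergence = 0.0
    for feature in set(p) | set(q):
        pf = p.get(feature, 0.0)
        qf = q.get(feature, 0.0)
        mf = 0.5 * (pf + qf)  # mixture of the two profiles
        if pf > 0:
            divergence += 0.5 * pf * log2(pf / mf)
        if qf > 0:
            divergence += 0.5 * qf * log2(qf / mf)
    return divergence


# Hypothetical toy sequences; real use would pass whole genomes and l = 24.
profile_a = feature_frequency_profile("ACGTACGTACGT", 3)
profile_b = feature_frequency_profile("ACGTTCGTACGA", 3)
print(jensen_shannon_divergence(profile_a, profile_b))
```

    A matrix of such pairwise values over all taxa could then be passed to a standard distance-based tree-building method.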

  14. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation, utilizing their significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and...

  15. Light whole genome sequence for SNP discovery across domestic cat breeds

    Driscoll Carlos

    2010-06-01

    Background: The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus -- FeLV, feline coronavirus -- FECV, feline immunodeficiency virus -- FIV) that are homologues to human scourges (cancer, SARS, and AIDS respectively). However, to realize this bio-medical potential, a high density single nucleotide polymorphism (SNP) map is required in order to accomplish disease and phenotype association discovery. Description: To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper-limit was placed on sequence coverage and a high lower-limit on the quality of the discrepant bases at a potential variant site. In all, domestic cats of different breeds (a female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese and male Ragdoll) and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cat breeds. An empirical sampling of 94 discovered SNPs was tested in the sequenced cats, resulting in a SNP validation rate of 99%. Conclusions: These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.
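
    The false-positive filter described above (an upper limit on coverage and a lower limit on the quality of the discrepant bases) can be sketched as a simple predicate. The record layout, field names and threshold values below are hypothetical; the abstract does not state the actual cutoffs.

```python
from dataclasses import dataclass


@dataclass
class CandidateSNP:
    position: int          # coordinate in the assembly
    coverage: int          # number of reads covering the site
    min_base_quality: int  # lowest quality score among the discrepant bases


def filter_snps(candidates, max_coverage, min_quality):
    """Keep sites with coverage at or below the cap and quality at or above the floor."""
    return [
        snp for snp in candidates
        if snp.coverage <= max_coverage and snp.min_base_quality >= min_quality
    ]


# Hypothetical candidates and thresholds, for illustration only.
candidates = [
    CandidateSNP(position=101, coverage=3, min_base_quality=40),
    CandidateSNP(position=202, coverage=25, min_base_quality=45),  # likely collapsed repeat
    CandidateSNP(position=303, coverage=4, min_base_quality=12),   # low-quality base call
]
kept = filter_snps(candidates, max_coverage=8, min_quality=30)
print([snp.position for snp in kept])
```

    The coverage cap targets sites where excess depth suggests collapsed repeats or paralogs, while the quality floor targets sequencing errors.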

  16. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations.

    Fuentes-Pardo, Angela P; Ruzzante, Daniel E

    2017-10-01

    Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.

  17. The Whole-Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum).

    Mun, Seyoung; Kim, Yun-Ji; Markkandan, Kesavan; Shin, Wonseok; Oh, Sumin; Woo, Jiyoung; Yoo, Jongsu; An, Hyesuck; Han, Kyudong

    2017-06-01

    The Manila clam, Ruditapes philippinarum, is an important bivalve species in worldwide aquaculture including Korea. The aquaculture production of R. philippinarum is under threat from diverse environmental factors including viruses, microorganisms, parasites, and water conditions, with subsequently declining production. In spite of its importance as a marine resource, the reference genome of R. philippinarum for comprehensive genetic studies is largely unexplored. Here, we report the de novo whole-genome and transcriptome assembly of R. philippinarum across three different tissues (foot, gill, and adductor muscle), and provide the basic data for advanced studies in selective breeding and disease control in order to obtain successful aquaculture systems. An approximately 2.56 Gb high quality whole-genome was assembled with various library construction methods. A total of 108,034 protein coding gene models were predicted and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with the bivalve marine invertebrates revealed that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis with three different tissues in order to support genome annotation and then identified 41,275 transcripts which were annotated. The R. philippinarum genome resource will markedly advance a wide range of potential genetic studies and serve as a reference genome for comparative analysis of bivalve species and for unraveling mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. QTL Mapping by Whole Genome Re-sequencing and Analysis of Candidate Genes for Nitrogen Use Efficiency in Rice

    Xinghai Yang

    2017-09-01

    Nitrogen is a major nutritional element in rice production. However, excessive application of nitrogen fertilizer has caused severe environmental pollution. Therefore, development of rice varieties with improved nitrogen use efficiency (NUE) is urgent for sustainable agriculture. In this study, bulked segregant analysis (BSA) combined with whole genome re-sequencing (WGS) technology was applied to finely map quantitative trait loci (QTL) for NUE. A key QTL, designated as qNUE6, was identified on chromosome 6 and further validated by Insertion/Deletion (InDel) marker-based substitutional mapping in recombinants from an F2 population (NIL-13B4 × GH998). Forty-four genes were identified in this 266.5-kb region. According to detection and annotation analysis of variation sites, 39 genes with large-effect single-nucleotide polymorphisms (SNPs) and large-effect InDels were selected as candidates and their expression levels were analyzed by qRT-PCR. Significant differences in the expression levels of LOC_Os06g15370 (peptide transporter PTR2) and LOC_Os06g15420 (asparagine synthetase) were observed between two parents (Y11 and GH998). Phylogenetic analysis in Arabidopsis thaliana identified two closely related homologs, AT1G68570 (AtNPF3.1) and AT5G65010 (ASN2), which share 72.3 and 87.5% amino acid similarity with LOC_Os06g15370 and LOC_Os06g15420, respectively. Taken together, our results suggested that qNUE6 is a possible candidate gene for NUE in rice. The fine mapping and candidate gene analysis of qNUE6 provide the basis of molecular breeding for genetic improvement of rice varieties with high NUE, and lay the foundation for further cloning and functional analysis.

  19. "We don't know her history, her background": adoptive parents' perspectives on whole genome sequencing results.

    Crouch, Julia; Yu, Joon-Ho; Shankar, Aditi G; Tabor, Holly K

    2015-02-01

    Exome sequencing and whole genome sequencing (ES/WGS) can provide parents with a wide range of genetic information about their children, and adoptive parents may have unique issues to consider regarding possible access to this information. The few papers published on adoption and genetics have focused on targeted genetic testing of children in the pre-adoption context. There are no data on adoptive parents' perspectives about pediatric ES/WGS, including their preferences about different kinds of results, and the potential benefits and risks of receiving results. To explore these issues, we conducted four exploratory focus groups with adoptive parents (N = 26). The majority lacked information about their children's biological family health history and ancestry, and many viewed WGS results as a way to fill in these gaps in knowledge. Some expressed concerns about protecting their children's future privacy and autonomy, but at the same time stated that WGS results could possibly help them be proactive about their children's health. A few parents expressed concerns about the risks of WGS in a pre-adoption context, specifically about decreasing a child's chance of adoption. These results suggest that issues surrounding genetic information in the post-adoption and ES/WGS contexts need to be considered, as well as concerns about risks in the pre-adoption context. A critical challenge for ES/WGS in the context of adoption will be balancing the right to know different kinds of genetic information with the right not to know. Specific guidance for geneticists and genetic counselors may be needed to maximize benefits of WGS while minimizing harms and prohibiting misuse of the information in the adoption process.

  20. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis

    2016-01-01

    and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services...... and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services...

  1. Teleost Fish-Specific Preferential Retention of Pigmentation Gene-Containing Families After Whole Genome Duplications in Vertebrates

    Lorin, Thibault; Brunet, Frédéric G.; Laudet, Vincent; Volff, Jean-Nicolas

    2018-01-01

    Vertebrate pigmentation is a highly diverse trait mainly determined by neural crest cell derivatives. It has been suggested that two rounds (1R/2R) of whole-genome duplications (WGDs) at the basis of vertebrates allowed changes in gene regulation associated with neural crest evolution. Subsequently, the teleost fish lineage experienced other WGDs, including the teleost-specific Ts3R before teleost radiation and the more recent Ss4R at the basis of salmonids. As the teleost lineage harbors the highest number of pigment cell types and pigmentation diversity in vertebrates, WGDs might have contributed to the evolution and diversification of the pigmentation gene repertoire in teleosts. We have compared the impact of the basal vertebrate 1R/2R duplications with that of the teleost-specific Ts3R and salmonid-specific Ss4R WGDs on 181 gene families containing genes involved in pigmentation. We show that pigmentation genes (PGs) have been globally more frequently retained as duplicates than other genes after Ts3R and Ss4R but not after the early 1R/2R. This is also true for non-pigmentary paralogs of PGs, suggesting that the function in pigmentation is not the sole key driver of gene retention after WGDs. In the long term, specific categories of PGs have been repeatedly preferentially retained after ancient 1R/2R and Ts3R WGDs, possibly linked to the molecular nature of their proteins (e.g., DNA binding transcriptional regulators) and their central position in protein-protein interaction networks. Taken together, our results support a major role of WGDs in the diversification of the pigmentation gene repertoire in the teleost lineage, with a possible link with the diversity of pigment cell lineages observed in these animals compared to other vertebrates. PMID:29599177

  2. Whole-genome typing and characterization of blaVIM19-harbouring ST383 Klebsiella pneumoniae by PFGE, whole-genome mapping and WGS.

    Sabirova, Julia S; Xavier, Basil Britto; Coppens, Jasmine; Zarkotou, Olympia; Lammens, Christine; Janssens, Lore; Burggrave, Ronald; Wagner, Trevor; Goossens, Herman; Malhotra-Kumar, Surbhi

    2016-06-01

    We utilized whole-genome mapping (WGM) and WGS to characterize 12 clinical carbapenem-resistant Klebsiella pneumoniae strains (TGH1-TGH12). All strains were screened for carbapenemase genes by PCR, and typed by MLST, PFGE (XbaI) and WGM (AflII) (OpGen, USA). WGS (Illumina) was performed on TGH8 and TGH10. Reads were de novo assembled and annotated [SPAdes, Rapid Annotation Subsystem Technology (RAST)]. Contigs were aligned directly, and after in silico AflII restriction, with corresponding WGMs (MapSolver, OpGen; BioNumerics, Applied Maths). All 12 strains were ST383. Of the 12 strains, 11 were carbapenem resistant, 7 harboured blaKPC-2 and 11 harboured blaVIM-19. Varying the parameters for assigning WGM clusters showed that these were comparable to STs and to the eight PFGE types or subtypes (difference of three or more bands). A 95% similarity coefficient assigned all 12 WGMs to a single cluster, whereas a 99% similarity coefficient (or ≥10 unmatched-fragment difference) assigned the 12 WGMs to eight (sub)clusters. Based on a difference of three or more bands between PFGE profiles, the Simpson's diversity indices (SDIs) of WGM (0.94, Jackknife pseudo-values CI: 0.883-0.996) and PFGE (0.93, Jackknife pseudo-values CI: 0.828-1.000) were similar (P = 0.649). However, the discriminatory power of WGM was significantly higher (SDI: 0.94, Jackknife pseudo-values CI: 0.883-0.996) than that of PFGE profiles typed on a difference of seven or more bands (SDI: 0.53, Jackknife pseudo-values CI: 0.212-0.849) (P = 0.007). This study demonstrates the application of WGM to understanding the epidemiology of hospital-associated K. pneumoniae. Utilizing a combination of WGM and WGS, we also present here the first longitudinal genomic characterization of the highly dynamic carbapenem-resistant ST383 K. pneumoniae clone that is rapidly gaining importance in Europe. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial
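
    The Simpson's diversity index (SDI) used above to compare typing methods can be computed from the number of isolates assigned to each type. The sketch below uses the Hunter-Gaston form of the index, which is an assumption about the variant applied in the study, and the split of 12 isolates into eight types is hypothetical, since the abstract does not give per-type counts. The jackknife confidence intervals reported in the abstract are not reproduced.

```python
def simpson_diversity(cluster_sizes):
    """Simpson's index of diversity (Hunter-Gaston form).

    Returns the probability that two isolates drawn at random without
    replacement belong to different types.
    """
    n_total = sum(cluster_sizes)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(n * (n - 1) for n in cluster_sizes)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical assignment of 12 isolates to eight types (sizes are illustrative).
hypothetical_sizes = [2, 2, 2, 2, 1, 1, 1, 1]
print(round(simpson_diversity(hypothetical_sizes), 3))
```

    A value near 1 means the method separates almost every isolate, while a value near 0 means nearly all isolates fall into one type.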

  3. Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data

    Nathan D. Olson

    2017-09-01

    High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of 1 in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods.

  4. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Pimlapas Leekitcharoenphon

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  5. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  6. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    . Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...

  7. Direct DNA Extraction from Mycobacterium tuberculosis Frozen Stocks as a Reculture-Independent Approach to Whole-Genome Sequencing

    Bjorn-Mortensen, K; Zallet, J; Lillebaek, T

    2015-01-01

    Culturing before DNA extraction represents a major time-consuming step in whole-genome sequencing of slow-growing bacteria, such as Mycobacterium tuberculosis. We report a workflow to extract DNA from frozen isolates without reculturing. Prepared libraries and sequence data were comparable...... with results from recultured aliquots of the same stocks....

  8. Whole-Genome Sequence of Pseudomonas graminis Strain UASWS1507, a Potential Biological Control Agent and Biofertilizer Isolated in Switzerland.

    Crovadore, Julien; Calmin, Gautier; Chablais, Romain; Cochard, Bastien; Schulz, Torsten; Lefort, François

    2016-10-06

    We report here the whole-genome shotgun sequence of the strain UASWS1507 of the species Pseudomonas graminis, isolated in Switzerland from an apple tree. This is the first genome registered for this species, which is considered as a potential and valuable resource of biological control agents and biofertilizers for agriculture. Copyright © 2016 Crovadore et al.

  9. Whole-genome amplified DNA from stored dried blood spots is reliable in high resolution melting curve and sequencing analysis

    Winkel, Bo G; Hollegaard, Mads V; Olesen, Morten S

    2011-01-01

    BACKGROUND: The use of dried blood spot (DBS) samples in genomic workup has been limited by the relatively low amounts of genomic DNA (gDNA) they contain. It remains to be proven that whole genome amplified DNA (wgaDNA) from stored DBS samples constitutes a reliable alternative to gDNA. We wanted...

  10. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  11. Whole-genome pyrosequencing of an epidemic multidrug-resistant Acinetobacter baumannii strain belonging to the European clone II group

    Iacono, M.; Villa, L.; Fortini, D.

    2008-01-01

    The whole-genome sequence of an epidemic, multidrug-resistant Acinetobacter baumannii strain (strain ACICU) belonging to the European clone II group and carrying the plasmid-mediated bla(OXA-58) carbapenem resistance gene was determined. The A. baumannii ACICU genome was compared with the genomes...

  12. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea, subjected to whole genome sequencing on the HiSeq platform and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  13. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and ...

  14. Screening of whole genome sequences identified high-impact variants for stallion fertility.

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-04-14

    Stallion fertility is an economically important trait due to the increase of artificial insemination in horses. The availability of whole genome sequence data facilitates identification of rare high-impact variants contributing to stallion fertility. The aim of our study was to genotype rare high-impact variants retrieved from next-generation sequencing (NGS) data of 11 horses in order to unravel harmful genetic variants in large samples of stallions. Gene ontology (GO) terms and search results from public databases were used to obtain a comprehensive list of human and mouse genes predicted to participate in the regulation of male reproduction. The corresponding equine orthologous genes were searched in whole genome sequence data of seven stallions and four mares and filtered for high-impact genetic variants using SnpEff, SIFT and PolyPhen-2 software. All genetic variants with the missing homozygous mutant genotype were genotyped on 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP. Mixed linear model analysis was employed for an association analysis with de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). We screened next-generation sequencing data of whole genomes from 11 horses for equine genetic variants in 1194 human and mouse genes involved in male fertility and linked through common gene ontology (GO) with male reproductive processes. Variants were filtered for high impact on protein structure and validated through SIFT and PolyPhen-2. Only those genetic variants for which the homozygous mutant genotype was missing in the detection sample of 11 horses were followed up. After this filtering process, 17 single nucleotide polymorphisms (SNPs) were left. These SNPs were genotyped in 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP. An association analysis in 216 Hanoverian stallions revealed a significant association of the splice-site disruption variant

  15. Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.

    van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J

    2017-10-01

    Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially (P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer. Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is

  16. Novel degenerate PCR method for whole genome amplification applied to Peru Margin (ODP Leg 201) subsurface samples

    Amanda Martino

    2012-01-01

    A degenerate PCR-based method of whole-genome amplification, designed to work fluidly with 454 sequencing technology, was developed and tested for use on deep marine subsurface DNA samples. The method, which we have called Random Amplification Metagenomic PCR (RAMP), involves the use of specific primers from Roche 454 amplicon sequencing, modified by the addition of a degenerate region at the 3’ end. It utilizes a PCR reaction, which resulted in no amplification from blanks, even after 50 cycles of PCR. After efforts to optimize experimental conditions, the method was tested with DNA extracted from cultured E. coli cells, and genome coverage was estimated after sequencing on three different occasions. Coverage did not vary greatly with the different experimental conditions tested, and was around 62% with a sequencing effort equivalent to a theoretical genome coverage of 14.10X. The GC content of the sequenced amplification product was within 2% of the predicted values for this strain of E. coli. The method was also applied to DNA extracted from marine subsurface samples from ODP Leg 201 site 1229 (Peru Margin), and results of a taxonomic analysis revealed microbial communities dominated by Proteobacteria, Chloroflexi, Firmicutes, Euryarchaeota, and Crenarchaeota, among others. These results were similar to those obtained previously for those samples; however, variations in the proportions of taxa show that community analysis can be sensitive to both the amplification technique used and the method of assigning sequences to taxonomic groups. Overall, we find that RAMP represents a valid methodology for amplifying metagenomes from low biomass samples.

  17. Novel Degenerate PCR Method for Whole-Genome Amplification Applied to Peru Margin (ODP Leg 201) Subsurface Samples

    Martino, Amanda J.; Rhodes, Matthew E.; Biddle, Jennifer F.; Brandt, Leah D.; Tomsho, Lynn P.; House, Christopher H.

    2011-01-01

    A degenerate polymerase chain reaction (PCR)-based method of whole-genome amplification, designed to work fluidly with 454 sequencing technology, was developed and tested for use on deep marine subsurface DNA samples. While optimized here for use with Roche 454 technology, the general framework presented may be applicable to other next generation sequencing systems as well (e.g., Illumina, Ion Torrent). The method, which we have called random amplification metagenomic PCR (RAMP), involves the use of specific primers from Roche 454 amplicon sequencing, modified by the addition of a degenerate region at the 3′ end. It utilizes a PCR reaction, which resulted in no amplification from blanks, even after 50 cycles of PCR. After efforts to optimize experimental conditions, the method was tested with DNA extracted from cultured E. coli cells, and genome coverage was estimated after sequencing on three different occasions. Coverage did not vary greatly with the different experimental conditions tested, and was around 62% with a sequencing effort equivalent to a theoretical genome coverage of 14.10×. The GC content of the sequenced amplification product was within 2% of the predicted values for this strain of E. coli. The method was also applied to DNA extracted from marine subsurface samples from ODP Leg 201 site 1229 (Peru Margin), and results of a taxonomic analysis revealed microbial communities dominated by Proteobacteria, Chloroflexi, Firmicutes, Euryarchaeota, and Crenarchaeota, among others. These results were similar to those obtained previously for those samples; however, variations in the proportions of taxa identified illustrate well the generally accepted view that community analysis is sensitive to both the amplification technique used and the method of assigning sequences to taxonomic groups. Overall, we find that RAMP represents a valid methodology for amplifying metagenomes from low-biomass samples. PMID:22319519
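    The gap between the 62% observed coverage and the 14.10× theoretical effort can be put in perspective with the standard Lander-Waterman (Poisson) approximation, in which uniformly random reads at depth c cover an expected fraction 1 - e^(-c) of the genome. The sketch below is an illustration only, not part of the RAMP study; it assumes uniform random sampling, which amplified libraries do not satisfy, and the function names are hypothetical.

```python
import math

def expected_fraction_covered(depth: float) -> float:
    """Expected genome fraction covered at least once, assuming reads land
    uniformly at random (Lander-Waterman / Poisson approximation)."""
    return 1.0 - math.exp(-depth)

def equivalent_uniform_depth(fraction_covered: float) -> float:
    """Depth of ideal uniform sequencing that would give the same breadth."""
    return -math.log(1.0 - fraction_covered)

theoretical_depth = 14.10   # from the abstract
observed_breadth = 0.62     # from the abstract

ideal = expected_fraction_covered(theoretical_depth)
equivalent = equivalent_uniform_depth(observed_breadth)
print(f"ideal breadth at {theoretical_depth}x: {ideal:.6f}")
print(f"uniform depth matching {observed_breadth:.0%} breadth: {equivalent:.2f}x")
```

    Under this idealized model, 62% breadth corresponds to roughly 1× of uniform sequencing, which gives a rough sense of how unevenly the amplified product is distributed across the genome.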

  18. Whole Genome Sequencing of Mycobacterium africanum Strains from Mali Provides Insights into the Mechanisms of Geographic Restriction.

    Winglee, Kathryn; Manson McGuire, Abigail; Maiga, Mamoudou; Abeel, Thomas; Shea, Terrance; Desjardins, Christopher A; Diarra, Bassirou; Baya, Bocar; Sanogo, Moumine; Diallo, Souleymane; Earl, Ashlee M; Bishai, William R

    2016-01-01

    Mycobacterium africanum, made up of lineages 5 and 6 within the Mycobacterium tuberculosis complex (MTC), causes up to half of all tuberculosis cases in West Africa, but is rarely found outside of this region. The reasons for this geographical restriction remain unknown. Possible reasons include a geographically restricted animal reservoir, a unique preference for hosts of West African ethnicity, and an inability to compete with other lineages outside of West Africa. These latter two hypotheses could be caused by loss of fitness or altered interactions with the host immune system. We sequenced 92 MTC clinical isolates from Mali, including two lineage 5 and 24 lineage 6 strains. Our genome sequencing assembly, alignment, phylogeny and average nucleotide identity analyses enabled us to identify features that typify lineages 5 and 6 and made clear that these lineages do not constitute a distinct species within the MTC. We found that in Mali, lineage 6 and lineage 4 strains have similar levels of diversity and evolve drug resistance through similar mechanisms. In the process, we identified a putative novel streptomycin resistance mutation. In addition, we found evidence of person-to-person transmission of lineage 6 isolates and showed that lineage 6 is not enriched for mutations in virulence-associated genes. This is the largest collection of lineage 5 and 6 whole genome sequences to date, and our assembly and alignment data provide valuable insights into what distinguishes these lineages from other MTC lineages. Lineages 5 and 6 do not appear to be geographically restricted due to an inability to transmit between West African hosts or to an elevated number of mutations in virulence-associated genes. However, lineage-specific mutations, such as mutations in cell wall structure, secretion systems and cofactor biosynthesis, provide alternative mechanisms that may lead to host specificity.

  19. Phenotypic H-Antigen Typing by Mass Spectrometry Combined with Genetic Typing of H Antigens, O Antigens, and Toxins by Whole-Genome Sequencing Enhances Identification of Escherichia coli Isolates.

    Cheng, Keding; Chui, Huixia; Domish, Larissa; Sloan, Angela; Hernandez, Drexler; McCorrister, Stuart; Robinson, Alyssia; Walker, Matthew; Peterson, Lorea A M; Majcher, Miles; Ratnam, Sam; Haldane, David J M; Bekal, Sadjia; Wylie, John; Chui, Linda; Tyler, Shaun; Xu, Bianli; Reimer, Aleisha; Nadon, Celine; Knox, J David; Wang, Gehua

    2016-08-01

    Mass spectrometry-based phenotypic H-antigen typing (MS-H) combined with whole-genome-sequencing-based genetic identification of H antigens, O antigens, and toxins (WGS-HOT) was used to type 60 clinical Escherichia coli isolates, 43 of which were previously identified as nonmotile, H type undetermined, or O rough by serotyping or having shown discordant MS-H and serotyping results. Whole-genome sequencing confirmed that MS-H was able to provide more accurate data regarding H antigen expression than serotyping. Further, enhanced and more confident O antigen identification resulted from gene cluster based typing in combination with conventional typing based on the gene pair comprising wzx and wzy and that comprising wzm and wzt. The O antigen was identified in 94.6% of the isolates when the two genetic O typing approaches (gene pair and gene cluster) were used in conjunction, in comparison to 78.6% when the gene pair database was used alone. In addition, 98.2% of the isolates showed the existence of genes for various toxins and/or virulence factors, among which verotoxins (Shiga toxin 1 and/or Shiga toxin 2) were 100% concordant with conventional PCR based testing results. With more applications of mass spectrometry and whole-genome sequencing in clinical microbiology laboratories, this combined phenotypic and genetic typing platform (MS-H plus WGS-HOT) should be ideal for pathogenic E. coli typing. Copyright © 2016 Cheng et al.

  20. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples.

    Minsoung Rhee

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.
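    The stated equivalence between 0.1 pg/μL and about 3/1000 of a genome per droplet can be checked with back-of-the-envelope arithmetic. The sketch below is an illustration only: it assumes an E. coli genome of about 4.6 Mb and the rule-of-thumb 650 g/mol per base pair, neither of which is given in the abstract, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MOLAR_MASS = 650.0      # g/mol per base pair (rule-of-thumb assumption)
ECOLI_GENOME_BP = 4.6e6    # assumed genome size, not stated in the abstract

def genome_mass_pg(genome_bp: float) -> float:
    """Mass of one double-stranded genome copy in picograms."""
    return genome_bp * BP_MOLAR_MASS / AVOGADRO * 1e12

def genomes_per_microliter(conc_pg_per_ul: float, genome_bp: float) -> float:
    """Genome copies per microliter at a given DNA concentration."""
    return conc_pg_per_ul / genome_mass_pg(genome_bp)

def implied_droplet_volume_pl(genomes_per_droplet: float,
                              conc_pg_per_ul: float,
                              genome_bp: float) -> float:
    """Droplet volume (picoliters) that yields the given genome load."""
    volume_ul = genomes_per_droplet / genomes_per_microliter(conc_pg_per_ul, genome_bp)
    return volume_ul * 1e6  # 1 microliter = 1e6 picoliters

copies = genomes_per_microliter(0.1, ECOLI_GENOME_BP)
volume = implied_droplet_volume_pl(3 / 1000, 0.1, ECOLI_GENOME_BP)
print(f"{copies:.1f} genome copies per microliter")
print(f"implied droplet volume: {volume:.0f} pL")
```

    Under these assumptions the implied droplet volume is on the order of a hundred picoliters, consistent with the abstract's description of sub-nanoliter droplets.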

  1. Whole genome sequencing reveals a de novo SHANK3 mutation in familial autism spectrum disorder.

    Sergio I Nemirovsky

    Clinical genomics promises to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder.

  2. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing.

    Margaret Staton

    Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence.
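    As an illustration of what identifying genomic SSRs involves, the following minimal sketch scans a sequence for perfect di-, tri- and tetranucleotide repeats with a regular expression. It is not the pipeline used in the study; the minimum repeat counts and the function names are hypothetical choices made for the example.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}

def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit
    (e.g. 'ATAT' is really the dinucleotide 'AT', and 'AA' a homopolymer)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d)
                   for d in range(1, n))

def find_ssrs(seq: str):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in MIN_REPEATS.items():
        # One captured motif followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

example = "GGC" + "AT" * 8 + "CCGTT" + "GAA" * 5 + "TC"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

    Flanking primers would then be designed against the sequence on either side of each reported start and end position, a step this sketch leaves out.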

  3. Use of Whole Genome Sequencing for Diagnosis and Discovery in the Cancer Genetics Clinic

    Samantha B. Foley

    2015-01-01

    Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks.
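    The PPV definition in this abstract (nonsynonymous, allele frequency below 1% in ESP6500, located in a gene of interest) amounts to a three-part filter. The sketch below shows one way such a filter might look. The record layout, the consequence labels, the toy variants and the decision to treat variants absent from ESP6500 as rare are all assumptions made for illustration, not details taken from the study.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str                # hypothetical annotation label
    esp6500_af: Optional[float]     # None = not observed in ESP6500

# Hypothetical set of labels counted as nonsynonymous for this example.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "splice_site"}

def is_ppv(variant: Variant, panel_genes: set, max_af: float = 0.01) -> bool:
    """Apply the three-part PPV filter; unobserved variants count as rare."""
    rare = variant.esp6500_af is None or variant.esp6500_af < max_af
    return (variant.gene in panel_genes
            and variant.consequence in NONSYNONYMOUS
            and rare)

# Toy records, invented purely to exercise the filter.
panel = {"BRCA1", "BRCA2", "TP53"}
variants = [
    Variant("BRCA2", "missense", None),      # novel missense: kept
    Variant("TP53", "synonymous", 0.0002),   # not nonsynonymous: dropped
    Variant("BRCA1", "missense", 0.04),      # too common: dropped
    Variant("TTN", "frameshift", 0.0001),    # gene not on panel: dropped
]
ppvs = [v for v in variants if is_ppv(v, panel)]
print([v.gene for v in ppvs])
```

    Widening the analysis, as the abstract describes for the 3209 ClinVar genes, would correspond here to passing a larger gene set as the panel.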

  4. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Benjamin Georgi

    2014-03-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  5. Whole-genome sequencing reveals a potential causal mutation for dwarfism in the Miniature Shetland pony.

    Metzger, Julia; Gast, Alana Christina; Schrimpf, Rahel; Rau, Janina; Eikelberg, Deborah; Beineke, Andreas; Hellige, Maren; Distl, Ottmar

    2017-04-01

    The Miniature Shetland pony represents a horse breed with an extremely small body size. Clinical examination of a dwarf Miniature Shetland pony revealed a lowered size at the withers, malformed skull and brachygnathia superior. Computed tomography (CT) showed a shortened maxilla and a cleft of the hard and soft palate which protruded into the nasal passage leading to breathing difficulties. Pathological examination confirmed these findings but did not reveal histopathological signs of premature ossification in limbs or cranial sutures. Whole-genome sequencing of this dwarf Miniature Shetland pony and comparative sequence analysis using 26 reference equids from NCBI Sequence Read Archive revealed three probably damaging missense variants which could be exclusively found in the affected foal. Validation of these three missense mutations in 159 control horses from different horse breeds and five donkeys revealed only the aggrecan (ACAN)-associated g.94370258G>C variant as homozygous wild-type in all control samples. The dwarf Miniature Shetland pony had the homozygous mutant genotype C/C of the ACAN:g.94370258G>C variant and the normal parents were heterozygous G/C. An unaffected full sib and 3/5 unaffected half-sibs were heterozygous G/C for the ACAN:g.94370258G>C variant. In summary, we could demonstrate a dwarf phenotype in a miniature pony breed perfectly associated with a missense mutation within the ACAN gene.

  6. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    Yuen, Ryan KC; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D’Abate, Lia; Chan, Ada JS; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson WL; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W

    2017-01-01

    We are performing whole genome sequencing (WGS) of families with Autism Spectrum Disorder (ASD) to build a resource, named MSSNG, to enable the sub-categorization of phenotypes and underlying genetic factors involved. Here, we report WGS of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible in a cloud platform, and through an internet portal with controlled access. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertion/deletions (indels) or copy number variations (CNVs) per ASD subject. We identified 18 new candidate ASD-risk genes such as MED13 and PHF3, and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (p = 6 × 10⁻⁴). In 294/2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried CNV/chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD. PMID:28263302

  7. Supersize me: how whole-genome sequencing and big data are transforming epidemiology.

    Kao, Rowland R; Haydon, Daniel T; Lycett, Samantha J; Murcia, Pablo R

    2014-05-01

    In epidemiology, the identification of 'who infected whom' allows us to quantify key characteristics such as incubation periods, heterogeneity in transmission rates, duration of infectiousness, and the existence of high-risk groups. Although invaluable, the existence of many plausible infection pathways makes this difficult, and epidemiological contact tracing either uncertain, logistically prohibitive, or both. The recent advent of next-generation sequencing technology allows the identification of traceable differences in the pathogen genome that are transforming our ability to understand high-resolution disease transmission, sometimes even down to the host-to-host scale. We review recent examples of the use of pathogen whole-genome sequencing for the purpose of forensic tracing of transmission pathways, focusing on the particular problems where evolutionary dynamics must be supplemented by epidemiological information on the most likely timing of events as well as possible transmission pathways. We also discuss potential pitfalls in the over-interpretation of these data, and highlight the manner in which a confluence of this technology with sophisticated mathematical and statistical approaches has the potential to produce a paradigm shift in our understanding of infectious disease transmission and control. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Mechanisms of Linezolid Resistance among Coagulase-Negative Staphylococci Determined by Whole-Genome Sequencing

    Tewhey, Ryan; Gu, Bing; Kelesidis, Theodoros; Charlton, Carmen; Bobenchik, April; Hindler, Janet; Schork, Nicholas J.

    2014-01-01

    Linezolid resistance is uncommon among staphylococci, but approximately 2% of clinical isolates of coagulase-negative staphylococci (CoNS) may exhibit resistance to linezolid (MIC, ≥8 µg/ml). We performed whole-genome sequencing (WGS) to characterize the resistance mechanisms and genetic backgrounds of 28 linezolid-resistant CoNS (21 Staphylococcus epidermidis isolates and 7 Staphylococcus haemolyticus isolates) obtained from blood cultures at a large teaching health system in California between 2007 and 2012. The following well-characterized mutations associated with linezolid resistance were identified in the 23S rRNA: G2576U, G2447U, and U2504A, along with the mutation C2534U. Mutations in the L3 and L4 riboproteins, at sites previously associated with linezolid resistance, were also identified in 20 isolates. The majority of isolates harbored more than one mutation in the 23S rRNA and L3 and L4 genes. In addition, the cfr methylase gene was found in almost half (48%) of S. epidermidis isolates. cfr had been only rarely identified in staphylococci in the United States prior to this study. Isolates of the same sequence type were identified with unique mutations associated with linezolid resistance, suggesting independent acquisition of linezolid resistance in each isolate. PMID:24915435

  9. Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data

    Wei Du

    2013-01-01

    Phylogenetic trees are used to represent the evolutionary relationships among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple kinds of genomic information is proposed. The method, called CGCPhy, is based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes potentially involved in horizontal gene transfer (HGT) events are eliminated; these are taken to be genes that are highly conserved across different species and genes located on fragments with an abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has practical potential for phylogenetic analysis.

  10. Molecular analysis of single oocyst of Eimeria by whole genome amplification (WGA) based nested PCR.

    Wang, Yunzhou; Tao, Geru; Cui, Yujuan; Lv, Qiyao; Xie, Li; Li, Yuan; Suo, Xun; Qin, Yinghe; Xiao, Lihua; Liu, Xianyong

    2014-09-01

    PCR-based molecular tools are widely used for the identification and characterization of protozoa. Here we report the molecular analysis of Eimeria species using combined methods of whole genome amplification (WGA) and nested PCR. A single oocyst of Eimeria stiedai or Eimeria media was directly used for random amplification of the genomic DNA with either primer extension preamplification (PEP) or multiple displacement amplification (MDA), and then the WGA product was used as template in nested PCR with species-specific primers for ITS-1, 18S rDNA and 23S rDNA of E. stiedai and E. media. WGA-based PCR was successful for the amplification of these genes from a single oocyst. For the species identification of single oocysts isolated from mixed E. stiedai or E. media, the results from WGA-based PCR were exactly in accordance with those from morphological identification, suggesting the applicability of this method in molecular analysis of eimerian parasites at the single oocyst level. The WGA-based PCR method can also be applied for the identification and genetic characterization of other protists. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data.

    Al-Nakeeb, Kosai; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-11-21

    Whole-genome sequencing (WGS) projects provide short read nucleotide sequences from nuclear and possibly organelle DNA depending on the source of origin. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference or partial seed sequences for assembling. Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences in the range from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal .
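    The core idea described here, that k-mers from high-copy organelle DNA occur far more often than nuclear k-mers and can be used to pull out the reads that carry them, can be sketched in a few lines. This is a simplified illustration and not Norgal's actual implementation: the fixed count threshold, the function names and the toy reads are hypothetical, and reverse complements are ignored for brevity.

```python
from collections import Counter

def kmers(read: str, k: int):
    """Yield every overlapping k-mer of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))

def select_high_depth_reads(reads, k: int, min_count: int):
    """Keep reads containing at least one k-mer seen min_count or more times."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    frequent = {km for km, n in counts.items() if n >= min_count}
    return [read for read in reads
            if any(km in frequent for km in kmers(read, k))]

# Toy data: one sequence present at high copy number, two at single copy.
reads = ["ACGTACGGTCA"] * 5 + ["TTTTGGGGCCCA", "GATTACAGATTC"]
selected = select_high_depth_reads(reads, k=5, min_count=4)
print(len(selected), "of", len(reads), "reads kept")
```

    The reads kept by such a filter would then be passed to a de novo assembler; choosing the frequency cutoff from the data, rather than fixing it by hand as done here, is the part a real tool has to solve.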

  12. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    Ruibang Luo

    2014-06-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA’s speed is rooted at its parallel algorithms to effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
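
    The depth figure quoted in this abstract can be sanity-checked with simple arithmetic: mean depth is total sequenced bases divided by genome size. The sketch below rests on two assumptions that are not stated in the abstract: that "750 million paired-end reads" means 750 million read pairs, and that the human genome is roughly 3.0 Gb.

```python
def fold_coverage(read_pairs, read_length_bp, genome_size_bp):
    """Mean depth of coverage for paired-end data (two mates per pair)."""
    total_bases = read_pairs * 2 * read_length_bp
    return total_bases / genome_size_bp


# ASSUMPTIONS: 750 million read *pairs*, genome size ~3.0e9 bp.
depth = fold_coverage(read_pairs=750_000_000,
                      read_length_bp=100,
                      genome_size_bp=3.0e9)
print(f"approximately {depth:.0f}-fold")
```

    Under these assumptions the result agrees with the stated 50-fold depth; counting 750 million individual reads instead of pairs would give about half that.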

  13. Publisher Correction: Whole genome sequencing in psychiatric disorders: the WGSPD consortium.

    Sanders, Stephan J; Neale, Benjamin M; Huang, Hailiang; Werling, Donna M; An, Joon-Yong; Dong, Shan; Abecasis, Goncalo; Arguello, P Alexander; Blangero, John; Boehnke, Michael; Daly, Mark J; Eggan, Kevin; Geschwind, Daniel H; Glahn, David C; Goldstein, David B; Gur, Raquel E; Handsaker, Robert E; McCarroll, Steven A; Ophoff, Roel A; Palotie, Aarno; Pato, Carlos N; Sabatti, Chiara; State, Matthew W; Willsey, A Jeremy; Hyman, Steven E; Addington, Anjene M; Lehner, Thomas; Freimer, Nelson B

    2018-03-16

    In the version of this article initially published, the consortium authorship and corresponding authors were not presented correctly. In the PDF and print versions, the Whole Genome Sequencing for Psychiatric Disorders (WGSPD) consortium was missing from the author list at the beginning of the paper, where it should have appeared as the seventh author; it was present in the author list at the end of the paper, but the footnote directing readers to the Supplementary Note for a list of members was missing. In the HTML version, the consortium was listed as the last author instead of as the seventh, and the line directing readers to the Supplementary Note for a list of members appeared at the end of the paper under Author Information but not in association with the consortium name itself. Also, this line stated that both member names and affiliations could be found in the Supplementary Note; in fact, only names are given. In all versions of the paper, the corresponding author symbols were attached to A. Jeremy Willsey, Steven E. Hyman, Anjene M. Addington and Thomas Lehner; they should have been attached, respectively, to Steven E. Hyman, Anjene M. Addington, Thomas Lehner and Nelson B. Freimer. As a result of this shift, the respective contact links in the HTML version did not lead to the indicated individuals. The errors have been corrected in the HTML and PDF versions of the article.

  14. Nested radiations and the pulse of angiosperm diversification: increased diversification rates often follow whole genome duplications.

    Tank, David C; Eastman, Jonathan M; Pennell, Matthew W; Soltis, Pamela S; Soltis, Douglas E; Hinchliff, Cody E; Brown, Joseph W; Sessa, Emily B; Harmon, Luke J

    2015-07-01

    Our growing understanding of the plant tree of life provides a novel opportunity to uncover the major drivers of angiosperm diversity. Using a time-calibrated phylogeny, we characterized hot and cold spots of lineage diversification across the angiosperm tree of life by modeling evolutionary diversification using stepwise AIC (MEDUSA). We also tested the whole-genome duplication (WGD) radiation lag-time model, which postulates that increases in diversification tend to lag behind established WGD events. Diversification rates have been incredibly heterogeneous throughout the evolutionary history of angiosperms and reveal a pattern of 'nested radiations' - increases in net diversification nested within other radiations. This pattern in turn generates a negative relationship between clade age and diversity across both families and orders. We suggest that stochastically changing diversification rates across the phylogeny explain these patterns. Finally, we demonstrate significant statistical support for the WGD radiation lag-time model. Across angiosperms, nested shifts in diversification led to an overall increasing rate of net diversification and declining relative extinction rates through time. These diversification shifts are only rarely perfectly associated with WGD events, but commonly follow them after a lag period. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  15. Comprehensive Phylogenetic Analysis of Bovine Non-aureus Staphylococci Species Based on Whole-Genome Sequencing

    Naushad, Sohail; Barkema, Herman W.; Luby, Christopher; Condas, Larissa A. Z.; Nobrega, Diego B.; Carson, Domonique A.; De Buck, Jeroen

    2016-01-01

    Non-aureus staphylococci (NAS), a heterogeneous group of a large number of species and subspecies, are the most frequently isolated pathogens from intramammary infections in dairy cattle. Phylogenetic relationships among bovine NAS species are controversial and have mostly been determined based on single-gene trees. Herein, we analyzed phylogeny of bovine NAS species using whole-genome sequencing (WGS) of 441 distinct isolates. In addition, evolutionary relationships among bovine NAS were estimated from multilocus data of 16S rRNA, hsp60, rpoB, sodA, and tuf genes and sequences from these and numerous other single genes/proteins. All phylogenies were created with FastTree, Maximum-Likelihood, Maximum-Parsimony, and Neighbor-Joining methods. Regardless of methodology, WGS-trees clearly separated bovine NAS species into five monophyletic coherent clades. Furthermore, there were consistent interspecies relationships within clades in all WGS phylogenetic reconstructions. Except for the Maximum-Parsimony tree, multilocus data analysis similarly produced five clades. There were large variations in determining clades and interspecies relationships in single gene/protein trees, under different methods of tree constructions, highlighting limitations of using single genes for determining bovine NAS phylogeny. However, based on WGS data, we established a robust phylogeny of bovine NAS species, unaffected by method or model of evolutionary reconstructions. Therefore, it is now possible to determine associations between phylogeny and many biological traits, such as virulence, antimicrobial resistance, environmental niche, geographical distribution, and host specificity. PMID:28066335

  16. DNA-based identification of spices: DNA isolation, whole genome amplification, and polymerase chain reaction.

    Focke, Felix; Haase, Ilka; Fischer, Markus

    2011-01-26

    Usually spices are identified morphologically using simple methods like magnifying glasses or microscopic instruments. On the other hand, molecular biological methods like the polymerase chain reaction (PCR) enable accurate and specific detection even in complex matrices. Generally, the origins of spices are plants with diverse genetic backgrounds and relationships. The processing methods used for the production of spices are complex and individual. Consequently, the development of a reliable DNA-based method for spice analysis is a challenging task. However, once established, this method will be easily adapted to less difficult food matrices. In the current study, several alternative methods for the isolation of DNA from spices have been developed and evaluated in detail with regard to (i) its purity (photometric), (ii) yield (fluorimetric methods), and (iii) its amplifiability (PCR). Whole genome amplification methods were used to preamplify isolates to improve the ratio between amplifiable DNA and inhibiting substances. Specific primer sets were designed, and the PCR conditions were optimized to detect 18 spices selectively. Assays of self-made spice mixtures were performed to prove the applicability of the developed methods.

  17. Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing

    Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012

  18. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.

    Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G

    2012-12-07

    MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads of various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation reads (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences, which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, allowing their comparative analysis together with user samples, namely datasets from the Russian Metagenome project, MetaHIT and the Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL, Python, with all major browsers supported.

  19. Improved acid tolerance of Lactobacillus pentosus by error-prone whole genome amplification.

    Ye, Lidan; Zhao, Hua; Li, Zhi; Wu, Jin Chuan

    2013-05-01

    Acid tolerance of Lactobacillus pentosus ATCC 8041 was improved by error-prone amplification of its genomic DNA using random primers and Taq DNA polymerase. The resulting amplification products were transferred into wild-type L. pentosus by electroporation and the transformants were screened for growth on low-pH agar plates. After only one round of mutation, one mutant (MT3) was identified that was able to completely consume 20 g/L of glucose to produce lactic acid at a yield of 95% in 1L MRS medium at pH 3.8 within 36 h, whereas no growth or lactic acid production was observed for the wild-type strain under the same conditions. The acid tolerance of mutant MT3 remained genetically stable for at least 25 subcultures. Therefore, the error-prone whole genome amplification technique is a very powerful tool for improving phenotypes of this lactic acid bacterium and may also be applicable for other microorganisms. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  1. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU.

    Luo, Ruibang; Wong, Yiu-Lun; Law, Wai-Chun; Lee, Lap-Kei; Cheung, Jeanno; Liu, Chi-Man; Lam, Tak-Wah

    2014-01-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted at its parallel algorithms to effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.

  2. Whole-genome sequencing of a laboratory-evolved yeast strain

    Dunham Maitreya J

    2010-02-01

    Background: Experimental evolution of microbial populations provides a unique opportunity to study evolutionary adaptation in response to controlled selective pressures. However, until recently it has been difficult to identify the precise genetic changes underlying adaptation at a genome-wide scale. New DNA sequencing technologies now allow the genome of parental and evolved strains of microorganisms to be rapidly determined. Results: We sequenced >93.5% of the genome of a laboratory-evolved strain of the yeast Saccharomyces cerevisiae and its ancestor at >28× depth. Both single nucleotide polymorphisms and copy number amplifications were found, with specific gains over array-based methodologies previously used to analyze these genomes. Applying a segmentation algorithm to quantify structural changes, we determined the approximate genomic boundaries of a 5× gene amplification. These boundaries guided the recovery of breakpoint sequences, which provide insights into the nature of a complex genomic rearrangement. Conclusions: This study suggests that whole-genome sequencing can provide a rapid approach to uncover the genetic basis of evolutionary adaptations, with further applications in the study of laboratory selections and mutagenesis screens. In addition, we show how single-end, short read sequencing data can provide detailed information about structural rearrangements, and generate predictions about the genomic features and processes that underlie genome plasticity.

  3. A Proposed Clinical Decision Support Architecture Capable of Supporting Whole Genome Sequence Information

    Brandon M. Welch

    2014-04-01

    Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine.

  4. A proposed clinical decision support architecture capable of supporting whole genome sequence information.

    Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen; Kawamoto, Kensaku

    2014-04-04

    Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine.

  5. Using whole genome sequencing to study American foulbrood epidemiology in honeybees.

    Joakim Ågren

    American foulbrood (AFB), caused by Paenibacillus larvae, is a devastating disease in honeybees. In most countries, the disease is controlled through compulsory burning of symptomatic colonies causing major economic losses in apiculture. The pathogen is endemic to honeybees world-wide and is readily transmitted via the movement of hive equipment or bees. Molecular epidemiology of AFB currently largely relies on placing isolates in one of four ERIC-genotypes. However, a more powerful alternative is multi-locus sequence typing (MLST) using whole-genome sequencing (WGS), which allows for high-resolution studies of disease outbreaks. To evaluate WGS as a tool for AFB-epidemiology, we applied core genome MLST (cgMLST) on isolates from a recent outbreak of AFB in Sweden. The high resolution of the cgMLST allowed different bacterial clones involved in the disease outbreak to be identified and to trace the source of infection. The source was found to be a beekeeper who had sold bees to two other beekeepers, proving the epidemiological link between them. No such conclusion could have been made using conventional MLST or ERIC-typing. This is the first time that WGS has been used to study the epidemiology of AFB. The results show that the technique is very powerful for high-resolution tracing of AFB-outbreaks.
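
    To illustrate why cgMLST gives finer resolution than a four-category genotype, the sketch below compares allele profiles locus by locus and counts differences. It is a generic illustration, not the method used in the study: the isolate names, locus names and allele numbers are invented, and loci with a missing call in either isolate are simply skipped.

```python
def allele_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ between two isolates.

    Each profile maps locus name -> allele number (None = no call).
    Only loci called in both profiles are compared.
    """
    shared = [
        locus for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    return sum(profile_a[locus] != profile_b[locus] for locus in shared)


# Hypothetical profiles over four core-genome loci.
isolates = {
    "source_apiary": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "buyer_A":       {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "unrelated":     {"locus1": 3, "locus2": 4, "locus3": 5, "locus4": None},
}

for name in ("buyer_A", "unrelated"):
    d = allele_distance(isolates["source_apiary"], isolates[name])
    print(f"source_apiary vs {name}: {d} differing loci")
```

    A real cgMLST scheme compares many more loci, so isolates from the same transmission chain stand out as having few or no allele differences.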

  6. Automated whole-genome multiple alignment of rat, mouse, and human

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  7. Living laboratory: whole-genome sequencing as a learning healthcare enterprise.

    Angrist, M; Jamal, L

    2015-04-01

    With the proliferation of affordable large-scale human genomic data come profound and vexing questions about management of such data and their clinical uncertainty. These issues challenge the view that genomic research on human beings can (or should) be fully segregated from clinical genomics, either conceptually or practically. Here, we argue that the sharp distinction between clinical care and research is especially problematic in the context of large-scale genomic sequencing of people with suspected genetic conditions. Core goals of both enterprises (e.g. understanding genotype-phenotype relationships; generating an evidence base for genomic medicine) are more likely to be realized at a population scale if both those ordering and those undergoing sequencing for diagnostic reasons are routinely and longitudinally studied. Rather than relying on expensive and lengthy randomized clinical trials and meta-analyses, we propose leveraging nascent clinical-research hybrid frameworks into a broader, more permanent instantiation of exploratory medical sequencing. Such an investment could enlighten stakeholders about the real-life challenges posed by whole-genome sequencing, such as establishing the clinical actionability of genetic variants, returning 'off-target' results to families, developing effective service delivery models and monitoring long-term outcomes. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

    Dehal, Paramvir; Boore, Jeffrey L.

    2005-04-12

    The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.

  9. Whole Genome Sequence Analysis of Pig Respiratory Bacterial Pathogens with Elevated Minimum Inhibitory Concentrations for Macrolides.

    Dayao, Denise Ann Estarez; Seddon, Jennifer M; Gibson, Justine S; Blackall, Patrick J; Turni, Conny

    2016-10-01

    Macrolides are often used to treat and control bacterial pathogens causing respiratory disease in pigs. This study analyzed the whole genome sequences of one clinical isolate of Actinobacillus pleuropneumoniae, Haemophilus parasuis, Pasteurella multocida, and Bordetella bronchiseptica, all isolated from Australian pigs to identify the mechanism underlying the elevated minimum inhibitory concentrations (MICs) for erythromycin, tilmicosin, or tulathromycin. The H. parasuis assembled genome had a nucleotide transition at position 2059 (A to G) in the six copies of the 23S rRNA gene. This mutation has previously been associated with macrolide resistance but this is the first reported mechanism associated with elevated macrolide MICs in H. parasuis. There was no known macrolide resistance mechanism identified in the other three bacterial genomes. However, strA and sul2, aminoglycoside and sulfonamide resistance genes, respectively, were detected in one contiguous sequence (contig 1) of A. pleuropneumoniae assembled genome. This contig was identical to plasmids previously identified in Pasteurellaceae. This study has provided one possible explanation of elevated MICs to macrolides in H. parasuis. Further studies are necessary to clarify the mechanism causing the unexplained macrolide resistance in other Australian pig respiratory pathogens including the role of efflux systems, which were detected in all analyzed genomes.

  10. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.

  11. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach to plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model that uses whole-genome single nucleotide polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from CIMMYT’s (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, the DBN outperformed the other methods, namely reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. The DBN achieved a correlation of 0.579 on a scale from -1 to 1.
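
    The evaluation metric named in this abstract, Pearson correlation between observed and predicted trait values, can be computed as in the minimal sketch below. The phenotype values are made up for illustration and are unrelated to the maize dataset or to the reported 0.579.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical observed phenotypes and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(observed, predicted)
print(f"prediction accuracy (Pearson r): {r:.3f}")
```

    Values near 1 mean the model ranks and scales individuals much as the observed phenotypes do; values near 0 mean the predictions carry little information.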

  12. Whole genome duplication of intra- and inter-chromosomes in the tomato genome.

    Song, Chi; Guo, Juan; Sun, Wei; Wang, Ying

    2012-07-20

    Whole genome duplication (WGD) events have been proven to occur in the evolutionary history of most angiosperms. Tomato is considered a model species of the Solanaceae family. In this study, we describe the details of the evolutionary process of the tomato genome by detecting collinearity blocks and dating the WGD events on the tree of life by combining two different methods: synonymous substitution rates (Ks) and phylogenetic trees. In total, 593 collinearity blocks were discovered out of 12 pseudo-chromosomes constructed. It was evident that chromosome 2 had experienced an intra-chromosomal duplication event. Major inter-chromosomal duplication occurred among all the pseudo-chromosomes. We calculated the Ks value of these collinearity blocks. Two peaks of Ks distribution were found, corresponding to two WGD events occurring approximately 36-82 million years ago (MYA) and 148-205 MYA. Additionally, the results of phylogenetic trees suggested that the more recent WGD event may have occurred after the divergence of the rosid-asterid clade, but before the major diversification in Solanaceae. The older WGD event was shown to have occurred before the divergence of the rosid-asterid clade and after the divergence of rice-Arabidopsis (monocot-dicot). Copyright © 2012. Published by Elsevier Ltd.
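
    Dating a duplication from a Ks peak usually relies on the relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below assumes a constant rate; the function name is hypothetical and the rate in the example is a placeholder for illustration, not the value used by the authors.

```python
def ks_to_age_mya(ks, rate_per_site_per_year):
    """Convert a Ks value to an age in million years: T = Ks / (2 * r)."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    # The factor 2 accounts for substitutions accumulating on both duplicate copies.
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Placeholder rate chosen only to make the arithmetic easy to follow.
example_age = ks_to_age_mya(0.65, 6.5e-9)
```

    Because the estimate scales inversely with r, uncertainty in the assumed rate translates directly into a range of ages, which is one reason such dates are reported as intervals.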

  13. Immunocytological and biochemical analysis of the mode of action of bis (tri-n-butyltin) tri-oxide (TBTO) in Jurkat cells

    Katika, M.R.; Hendriksen, P.J.M.; Ruijter, de N.C.A.; Loveren, van H.; Peijnenburg, A.A.C.M.

    2012-01-01

    Bis (tri-n-butyltin) oxide (TBTO) is one of the organotin compounds known to induce immunosuppression. Previously, we examined the effect of TBTO on whole-genome mRNA expression in the human T lymphocyte cell line Jurkat, which led to the hypothesis that induction of endoplasmic reticulum (ER)

  14. Changes in Gene Expression of Arabidopsis Thaliana Cell Cultures Upon Exposure to Real and Simulated Partial-g Forces

    Fengler, Svenja; Spirer, Ina; Neef, Maren; Ecke, Margret; Hauslage, Jens; Hampp, Rüdiger

    2016-06-01

    Cell cultures of the plant model organism Arabidopsis thaliana were exposed to partial-g forces during parabolic flight and clinostat experiments (0.16 g, 0.38 g and 0.5 g were tested). In order to investigate gravity-dependent alterations in gene expression, samples were metabolically quenched by the fixative RNAlater® to stabilize nucleic acids and used for whole-genome microarray analysis. An attempt to identify the potential threshold acceleration for the gravity-dependent response showed that the smaller the experienced g-force, the greater was the susceptibility of the cell cultures. Compared to short-term μg during a parabolic flight, the number of differentially expressed genes under partial-g was lower. In addition, the effect on the alteration of amounts of transcripts decreased during partial-g parabolic flight due to the sequence of the different parabolas (0.38 g, 0.16 g and μg). A time-dependent analysis under simulated 0.5 g indicates that adaptation occurs within minutes. Differentially expressed genes (at least 2-fold up- or down-regulated in expression) under real flight conditions were to some extent identical with those affected by clinorotation. The highest number of homologous genes was detected within seconds of exposure to 0.38 g (both flight and clinorotation). A considerable proportion of these genes relate to cell wall properties. Additionally, responses specific for clinorotation were observed.
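
    The criterion quoted above (at least 2-fold up- or down-regulation) can be expressed as a simple filter on log2 fold changes. This is a minimal sketch under the assumption that expression values are positive, already normalised signal intensities; the gene names, values, and function name are hypothetical.

```python
import math


def differentially_expressed(control, treated, min_fold=2.0):
    """Return {gene: log2 fold change} for genes changed at least min_fold up or down."""
    threshold = math.log2(min_fold)
    hits = {}
    for gene, base in control.items():
        value = treated.get(gene)
        if value is None or base <= 0 or value <= 0:
            continue  # skip genes without a usable signal in both conditions
        log2_fc = math.log2(value / base)
        if abs(log2_fc) >= threshold:
            hits[gene] = log2_fc
    return hits


# Hypothetical normalised intensities for three genes.
control = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0}
treated = {"geneA": 250.0, "geneB": 90.0, "geneC": 20.0}
hits = differentially_expressed(control, treated)
```

    Working on the log2 scale makes up- and down-regulation symmetric: a 2-fold increase is +1 and a 2-fold decrease is -1.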

  15. Whole-genome transcriptional analysis of Escherichia coli during heat inactivation processes related to industrial cooking.

    Guernec, A; Robichaud-Rincon, P; Saucier, L

    2013-08-01

    Escherichia coli K-12 was grown to the stationary phase, for maximum physiological resistance, in brain heart infusion (BHI) broth at 37°C. Cells were then heated at 58°C or 60°C to reach a process lethality value \(\left({F^{o}}_{70}^{10}\right)\) of 2 or 3 or to a core temperature of 71°C (control industrial cooking temperature). Growth recovery and cell membrane integrity were evaluated immediately after heating, and a global transcription analysis was performed using gene expression microarrays. Only cells heated at 58°C with F(o) = 2 were still able to grow on liquid or solid BHI broth after heat treatment. However, their transcriptome did not differ from that of bacteria heated at 58°C with F(o) = 3 (P value for the false discovery rate [P-FDR] > 0.01), where no growth recovery was observed posttreatment. Genome-wide transcriptomic data obtained at 71°C were distinct from those of the other treatments without growth recovery. Quantification of heat shock gene expression by real-time PCR revealed that dnaK and groEL mRNA levels decreased significantly above 60°C to reach levels similar to those of control cells at 37°C (P citE, glyS, oppB, and asd, whose expression was upregulated at 71°C, may be worth investigating as good biomarkers for accurately determining the efficiency of heat treatments, especially when cells are too injured to be enumerated using growth media.
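
    A process lethality value is conventionally accumulated as F = ∫ 10^((T − T_ref)/z) dt. The sketch below assumes that the subscript 70 and superscript 10 in the abstract's notation denote a reference temperature of 70°C and a z-value of 10°C, and that F is expressed in minutes; the abstract does not describe how the authors computed it, and the function names are hypothetical.

```python
def lethal_rate(temp_c, ref_temp_c=70.0, z_c=10.0):
    """Lethality contributed per minute at temp_c relative to the reference temperature."""
    return 10.0 ** ((temp_c - ref_temp_c) / z_c)


def process_lethality(temps_c, dt_min, ref_temp_c=70.0, z_c=10.0):
    """Approximate F with a rectangle rule over equally spaced temperature readings."""
    return sum(lethal_rate(t, ref_temp_c, z_c) * dt_min for t in temps_c)


# Hypothetical log: twenty one-minute readings held at a constant 60 degrees C.
readings = [60.0] * 20
f_value = process_lethality(readings, dt_min=1.0)
```

    Under these assumptions one minute at 60°C contributes 0.1 minute of equivalent lethality, so reaching F = 2 would take about 20 minutes at 60°C and longer at 58°C.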

  16. A whole genome association study to detect additive and dominant single nucleotide polymorphisms for growth and carcass traits in Korean native cattle, Hanwoo

    Yi Li

    2017-01-01

    Objective: A whole genome association study was conducted to identify single nucleotide polymorphisms (SNPs) with additive and dominant effects for growth and carcass traits in Korean native cattle, Hanwoo. Methods: The data set comprised 61 sires and their 486 Hanwoo steers that were born between spring of 2005 and fall of 2007. The steers were genotyped with the 35,968 SNPs that were embedded in the Illumina bovine SNP 50K beadchip, and six growth and carcass quality traits were measured for the steers. A series of lack-of-fit tests between the models was applied to classify gene expression pattern as additive or dominant. Results: A total of 18 (0), 15 (3), 12 (8), 15 (18), 11 (7), and 21 (1) SNPs were detected at the 5% chromosome (genome)-wise level for weaning weight (WWT), yearling weight (YWT), carcass weight (CWT), backfat thickness (BFT), longissimus dorsi muscle area (LMA) and marbling score, respectively. Among the 129 significant SNPs, 56 SNPs had additive effects, 20 SNPs dominance effects, and 53 SNPs both additive and dominance effects, suggesting that dominance inheritance mode be considered in genetic improvement for growth and carcass quality in Hanwoo. The significant SNPs were located at 33 quantitative trait locus (QTL) regions on 18 Bos taurus chromosomes (i.e. BTA 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 16, 17, 18, 20, 23, 26, 28, and 29). There is strong evidence that BTA14 is the key chromosome affecting CWT. Also, BTA20 is the key chromosome for almost all traits measured (WWT, YWT, LMA). Conclusion: The application of various additive and dominance SNP models enabled better characterization of SNP inheritance mode for growth and carcass quality traits in Hanwoo, and many of the detected SNPs or QTL had dominance effects, suggesting that dominance be considered for the whole-genome SNPs data and implementation of successive molecular breeding schemes in Hanwoo.
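
    To make the additive/dominance distinction concrete, the textbook decomposition from the three genotype class means is sketched below. This is not the lack-of-fit testing procedure used in the study; the function name and the example means are hypothetical.

```python
def additive_dominance_effects(mean_aa, mean_ab, mean_bb):
    """Return (additive, dominance) effects from the three genotype class means."""
    midpoint = (mean_aa + mean_bb) / 2.0
    additive = (mean_bb - mean_aa) / 2.0   # half the difference between homozygotes
    dominance = mean_ab - midpoint         # heterozygote deviation from the midpoint
    return additive, dominance


# Hypothetical carcass-weight means (kg) for genotypes AA, AB and BB at one SNP.
a_effect, d_effect = additive_dominance_effects(400.0, 430.0, 420.0)
```

    A purely additive SNP has a dominance deviation near zero; a non-zero deviation, as in this made-up example, is what a model with a dominance term is meant to capture.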

  17. The zebrafish maternal-effect gene cellular atoll encodes the centriolar component sas-6 and defects in its paternal function promote whole genome duplication.

    Yabe, Taijiro; Ge, Xiaoyan; Pelegri, Francisco

    2007-12-01

    A female-sterile zebrafish maternal-effect mutation in cellular atoll (cea) results in defects in the initiation of cell division starting at the second cell division cycle. This phenomenon is caused by defects in centrosome duplication, which in turn affect the formation of a bipolar spindle. We show that cea encodes the centriolar coiled-coil protein Sas-6, and that zebrafish Cea/Sas-6 protein localizes to centrosomes. cea also has a genetic paternal contribution, which when mutated results in an arrested first cell division followed by normal cleavage. Our data supports the idea that, in zebrafish, paternally inherited centrosomes are required for the first cell division while maternally derived factors are required for centrosomal duplication and cell divisions in subsequent cell cycles. DNA synthesis ensues in the absence of centrosome duplication, and the one-cycle delay in the first cell division caused by cea mutant sperm leads to whole genome duplication. We discuss the potential implications of these findings with regards to the origin of polyploidization in animal species. In addition, the uncoupling of developmental time and cell division count caused by the cea mutation suggests the presence of a time window, normally corresponding to the first two cell cycles, which is permissive for germ plasm recruitment.

  18. Sequence analysis of the whole genomes of five African human G9 rotavirus strains.

    Nyaga, Martin M; Jere, Khuzwayo C; Peenze, Ina; Mlera, Luwanika; van Dijk, Alberdina A; Seheri, Mapaseka L; Mphahlele, M Jeffrey

    2013-06-01

    The G9 rotaviruses are amongst the most common global rotavirus strains causing severe childhood diarrhoea. However, the whole genomes of only a few G9 rotaviruses have been fully sequenced and characterised of which only one G9P[6] and one G9P[8] are from Africa. We determined the consensus sequence of the whole genomes of five African human group A G9 rotavirus strains, four G9P[8] strains and one G9P[6] strain collected in Cameroon (central Africa), Kenya (eastern Africa), South Africa and Zimbabwe (southern Africa) in 1999, 2009 and 2010. Strain RVA/Human-wt/ZWE/MRC-DPRU1723/2009/G9P[8] from Zimbabwe, RVA/Human-wt/ZAF/MRC-DPRU4677/2010/G9P[8] from South Africa, RVA/Human-wt/CMR/1424/2009/G9P[8] from Cameroon and RVA/Human-wt/KEN/MRC-DPRU2427/2010/G9P[8] from Kenya were on a Wa-like genetic backbone and were genotyped as G9-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1. Strain RVA/Human-wt/ZAF/MRC-DPRU9317/1999/G9P[6] from South Africa was genotyped as G9-P[6]-I2-R2-C2-M2-A2-N1-T2-E2-H2. Rotavirus A strain MRC-DPRU9317 is the second G9 strain to be reported on a DS-1-like genetic backbone, the other being RVA/Human-wt/ZAF/GR10924/1999/G9P[6]. MRC-DPRU9317 was found to be a reassortant between DS-1-like (I2, R2, C2, M2, A2, T2, E2 and H2) and Wa-like (N1) genome segments. All the genome segments of the five strains grouped strictly according to their genotype Wa- or DS-1-like clusters. Within their respective genotypes, the genome segments of the three G9 study strains from southern Africa clustered most closely with rotaviruses from the same geographical origin and with those with the same G and P types. The highest nucleotide identity of genome segments of the study strains from eastern and central Africa regions on a Wa-like backbone was not limited to rotaviruses with G9P[8] genotypes only, they were also closely related to G12P[6], G8P[8], G1P[8] and G11P[25] rotaviruses, indicating a close inter-genotype relationship between the G9 and other rotavirus genotypes

  19. Whole-genome-based Mycobacterium tuberculosis surveillance: a standardized, portable, and expandable approach.

    Kohl, Thomas A; Diel, Roland; Harmsen, Dag; Rothgänger, Jörg; Walter, Karen Meywald; Merker, Matthias; Weniger, Thomas; Niemann, Stefan

    2014-07-01

    Whole-genome sequencing (WGS) allows for effective tracing of Mycobacterium tuberculosis complex (MTBC) (tuberculosis pathogens) transmission. However, it is difficult to standardize and, therefore, is not yet employed for interlaboratory prospective surveillance. To allow its widespread application, solutions for data standardization and storage in an easily expandable database are urgently needed. To address this question, we developed a core genome multilocus sequence typing (cgMLST) scheme for clinical MTBC isolates using the Ridom SeqSphere(+) software, which transfers the genome-wide single nucleotide polymorphism (SNP) diversity into an allele numbering system that is standardized, portable, and not computationally intensive. To test its performance, we performed WGS analysis of 26 isolates with identical IS6110 DNA fingerprints and spoligotyping patterns from a longitudinal outbreak in the federal state of Hamburg, Germany (notified between 2001 and 2010). The cgMLST approach (3,041 genes) discriminated the 26 strains with a resolution comparable to that of SNP-based WGS typing (one major cluster of 22 identical or closely related and four outlier isolates with at least 97 distinct SNPs or 63 allelic variants). Resulting tree topologies are highly congruent and grouped the isolates in both cases analogously. Our data show that SNP- and cgMLST-based WGS analyses facilitate high-resolution discrimination of longitudinal MTBC outbreaks. cgMLST allows for a meaningful epidemiological interpretation of the WGS genotyping data. It enables standardized WGS genotyping for epidemiological investigations, e.g., on the regional public health office level, and the creation of web-accessible databases for global TB surveillance with an integrated early warning system. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
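
    Once genomes are reduced to allele numbers per core-genome locus, comparing two isolates comes down to counting loci whose allele numbers differ. A minimal sketch follows, assuming profiles are stored as dictionaries and that loci missing in either isolate are ignored; the locus names, allele numbers, and function name are hypothetical and do not reflect the SeqSphere+ implementation.

```python
def allelic_distance(profile_a, profile_b):
    """Count shared loci with different allele numbers, ignoring missing calls."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical allele profiles; None marks a locus that could not be called.
isolate_a = {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None}
isolate_b = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 7}
distance = allelic_distance(isolate_a, isolate_b)
```

    Because only integers are compared, profiles produced in different laboratories can be exchanged and compared cheaply, provided both use the same scheme and allele nomenclature.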

  20. Prediction of expected years of life using whole-genome markers.

    Gustavo de los Campos

    Genetic factors are believed to account for 25% of the interindividual differences in Years of Life (YL) among humans. However, the genetic loci that have thus far been found to be associated with YL explain a very small proportion of the expected genetic variation in this trait, perhaps reflecting the complexity of the trait and the limitations of traditional association studies when applied to traits affected by a large number of small-effect genes. Using data from the Framingham Heart Study and statistical methods borrowed largely from the field of animal genetics (whole-genome prediction, WGP), we developed a WGP model for the study of YL and evaluated the extent to which thousands of genetic variants across the genome examined simultaneously can be used to predict interindividual differences in YL. We find that a sizable proportion of differences in YL--which were unexplained by age at entry, sex, smoking and BMI--can be accounted for and predicted using WGP methods. The contribution of genomic information to prediction accuracy was even higher than that of smoking and body mass index (BMI) combined; two predictors that are considered among the most important life-shortening factors. We evaluated the impacts of familial relationships and population structure (as described by the first two marker-derived principal components) and concluded that in our dataset population structure explained partially, but not fully the gains in prediction accuracy obtained with WGP. Further inspection of prediction accuracies by age at death indicated that most of the gains in predictive ability achieved with WGP were due to the increased accuracy of prediction of early mortality, perhaps reflecting the ability of WGP to capture differences in genetic risk to deadly diseases such as cancer, which are most often responsible for early mortality in our sample.

  1. Whole-genome copy number variation analysis in anophthalmia and microphthalmia.

    Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V

    2013-11-01

    Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Whole genome sequencing distinguishes between relapse and reinfection in recurrent leprosy cases.

    Mariane M A Stefani

    2017-06-01

    Since leprosy is both treated and controlled by multidrug therapy (MDT) it is important to monitor recurrent cases for drug resistance and to distinguish between relapse and reinfection as a means of assessing therapeutic efficacy. All three objectives can be reached with single nucleotide resolution using next generation sequencing and bioinformatics analysis of Mycobacterium leprae DNA present in human skin. DNA was isolated by means of optimized extraction and enrichment methods from samples from three recurrent cases in leprosy patients participating in an open-label, randomized, controlled clinical trial of uniform MDT in Brazil (U-MDT/CT-BR). Genome-wide sequencing of M. leprae was performed and the resultant sequence assemblies analyzed in silico. In all three cases, no mutations responsible for resistance to rifampicin, dapsone and ofloxacin were found, thus eliminating drug resistance as a possible cause of disease recurrence. However, sequence differences were detected between the strains from the first and second disease episodes in all three patients. In one case, clear evidence was obtained for reinfection with an unrelated strain whereas in the other two cases, relapse appeared more probable. This is the first report of using M. leprae whole genome sequencing to reveal that treated and cured leprosy patients who remain in endemic areas can be reinfected by another strain. Next generation sequencing can be applied reliably to M. leprae DNA extracted from biopsies to discriminate between cases of relapse and reinfection, thereby providing a powerful tool for evaluating different outcomes of therapeutic regimens and for following disease transmission.

  3. Selective whole genome amplification for resequencing target microbial species from complex natural samples.

    Leichty, Aaron R; Brisson, Dustin

    2014-10-01

    Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which are ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
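
    The core idea of choosing motifs that are common in the target genome but rare in the background can be illustrated by ranking k-mers on their frequency ratio. This is a simplified sketch, not the authors' primer-design procedure: it looks at one strand only, ignores motif spacing and primer chemistry, and uses toy sequences and a hypothetical function name.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer on the given strand."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def rank_selective_motifs(target, background, k=4, top=5):
    """Rank k-mers by target frequency divided by background frequency."""
    target_counts = kmer_counts(target, k)
    background_counts = kmer_counts(background, k)
    target_total = max(sum(target_counts.values()), 1)
    background_total = max(sum(background_counts.values()), 1)
    scored = []
    for motif, count in target_counts.items():
        target_freq = count / target_total
        # A pseudocount of 1 avoids division by zero for motifs absent from the background.
        background_freq = (background_counts.get(motif, 0) + 1) / background_total
        scored.append((target_freq / background_freq, motif))
    scored.sort(reverse=True)
    return [motif for _, motif in scored[:top]]


# Toy sequences standing in for target and background genomes.
target = "ACGTACGTACGTACGT"
background = "TTTTTTTTTTTTGGGGGGGGCCCC"
best = rank_selective_motifs(target, background, k=4, top=1)
```

    With real genomes the same ranking would be run over much longer motifs and both strands, and candidate motifs would still need to be checked experimentally.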

  4. Utility of Whole-Genome Sequencing in Characterizing Acinetobacter Epidemiology and Analyzing Hospital Outbreaks

    Fitzpatrick, Margaret A.; Hauser, Alan R.

    2015-01-01

    Acinetobacter baumannii frequently causes nosocomial infections and outbreaks. Whole-genome sequencing (WGS) is a promising technique for strain typing and outbreak investigations. We compared the performance of conventional methods with WGS for strain typing clinical Acinetobacter isolates and analyzing a carbapenem-resistant A. baumannii (CRAB) outbreak. We performed two band-based typing techniques (pulsed-field gel electrophoresis and repetitive extragenic palindromic-PCR), multilocus sequence type (MLST) analysis, and WGS on 148 Acinetobacter calcoaceticus-A. baumannii complex bloodstream isolates collected from a single hospital from 2005 to 2012. Phylogenetic trees inferred from core-genome single nucleotide polymorphisms (SNPs) confirmed three Acinetobacter species within this collection. Four major A. baumannii clonal lineages (as defined by MLST) circulated during the study, three of which are globally distributed and one of which is novel. WGS indicated that a threshold of 2,500 core SNPs accurately distinguished A. baumannii isolates from different clonal lineages. The band-based techniques performed poorly in assigning isolates to clonal lineages and exhibited little agreement with sequence-based techniques. After applying WGS to a CRAB outbreak that occurred during the study, we identified a threshold of 2.5 core SNPs that distinguished nonoutbreak from outbreak strains. WGS was more discriminatory than the band-based techniques and was used to construct a more accurate transmission map that resolved many of the plausible transmission routes suggested by epidemiologic links. Our study demonstrates that WGS is superior to conventional techniques for A. baumannii strain typing and outbreak analysis. These findings support the incorporation of WGS into health care infection prevention efforts. PMID:26699703
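
    The two thresholds reported above (2,500 core SNPs between clonal lineages and 2.5 core SNPs between outbreak and nonoutbreak strains) suggest a simple pairwise classification. The sketch below assumes aligned core-genome sequences of equal length and treats a pair as related when its distance is at or below a threshold; the direction and inclusiveness of the comparisons, the labels, and the function names are assumptions for illustration.

```python
def core_snp_distance(seq_a, seq_b):
    """Count positions where two aligned core-genome sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"  # ignore gaps and ambiguous calls
    )


def classify_pair(distance, outbreak_threshold=2.5, lineage_threshold=2500):
    """Label an isolate pair using the thresholds quoted in the abstract."""
    if distance <= outbreak_threshold:
        return "outbreak-related"
    if distance <= lineage_threshold:
        return "same clonal lineage"
    return "different clonal lineages"


# Toy aligned fragments differing at a single position.
pair_distance = core_snp_distance("ACGTAC", "ACGTTC")
label = classify_pair(pair_distance)
```

    Since SNP counts are integers, a threshold of 2.5 in this sketch means that pairs differing by two or fewer core SNPs are grouped with the outbreak.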

  5. Whole genome analysis of Leptospira licerasiae provides insight into leptospiral evolution and pathogenicity.

    Jessica N Ricaldi

    The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggest that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010(T) and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for

  6. DNA fingerprinting of Mycobacterium tuberculosis: from phage typing to whole-genome sequencing.

    Schürch, Anita C; van Soolingen, Dick

    2012-06-01

    Current typing methods for Mycobacterium tuberculosis complex evolved from simple phenotypic approaches like phage typing and drug susceptibility profiling to DNA-based strain typing methods, such as IS6110-restriction fragment length polymorphisms (RFLP) and variable number of tandem repeats (VNTR) typing. Examples of the usefulness of molecular typing are source case finding and epidemiological linkage of tuberculosis (TB) cases, international transmission of MDR/XDR-TB, the discrimination between endogenous reactivation and exogenous re-infection as a cause of relapses after curative treatment of tuberculosis, the evidence of multiple M. tuberculosis infections, and the disclosure of laboratory cross-contaminations. Simultaneously, phylogenetic analyses were developed based on single nucleotide polymorphisms (SNPs), genomic deletions usually referred to as regions of difference (RDs) and spoligotyping which served both strain typing and phylogenetic analysis. National and international initiatives that rely on the application of these typing methods have brought significant insight into the molecular epidemiology of tuberculosis. However, current DNA fingerprinting methods have important limitations. They can often not distinguish between genetically closely related strains and the turn-over of these markers is variable. Moreover, the suitability of most DNA typing methods for phylogenetic reconstruction is limited as they show a high propensity of convergent evolution or misinfer genetic distances. In order to fully explore the possibilities of genotyping in the molecular epidemiology of tuberculosis and to study the phylogeny of the causative bacteria reliably, the application of whole-genome sequencing (WGS) analysis for all M. tuberculosis isolates is the optimal, although currently still a costly solution. In the last years WGS for typing of pathogens has been explored and yielded important additional information on strain diversity in comparison to the

  7. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease.

    Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M

    2016-05-01

    To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Case series. A total of 562 patients diagnosed with IRD. We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Diagnostic yield of genomic testing. Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  8. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  9. Impact of whole-genome duplication events on diversification rates in angiosperms.

    Landis, Jacob B; Soltis, Douglas E; Li, Zheng; Marx, Hannah E; Barker, Michael S; Tank, David C; Soltis, Pamela S

    2018-03-01

    Polyploidy or whole-genome duplication (WGD) pervades the evolutionary history of angiosperms. Despite extensive progress in our understanding of WGD, the role of these events in promoting diversification is still not well understood. We seek to clarify the possible association between WGD and diversification rates in flowering plants. Using a previously published phylogeny spanning all land plants (31,749 tips) and WGD events inferred from analyses of the 1000 Plants (1KP) transcriptome data, we analyzed the association of WGDs and diversification rates following numerous WGD events across the angiosperms. We used a stepwise AIC approach (MEDUSA), a Bayesian mixture model approach (BAMM), and state-dependent diversification analyses (MuSSE) to investigate patterns of diversification. Sister-clade comparisons were used to investigate species richness after WGDs. Based on the density of 1KP taxon sampling, 106 WGDs were unambiguously placed on the angiosperm phylogeny. We identified 334-530 shifts in diversification rates. We found that 61 WGD events were tightly linked to changes in diversification rates, and state-dependent diversification analyses indicated higher speciation rates for subsequent rounds of WGD. Additionally, 70 of 99 WGD events showed an increase in species richness compared to the sister clade. Forty-six of the 106 WGDs analyzed appear to be closely associated with upshifts in the rate of diversification in angiosperms. Shifts in diversification do not appear more likely than random within a four-node lag phase following a WGD; however, younger WGD events are more likely to be followed by an upshift in diversification than older WGD events. © 2018 Botanical Society of America.

  10. Prediction of Phenotypic Antimicrobial Resistance Profiles From Whole Genome Sequences of Non-typhoidal Salmonella enterica.

    Neuert, Saskia; Nair, Satheesh; Day, Martin R; Doumith, Michel; Ashton, Philip M; Mellor, Kate C; Jenkins, Claire; Hopkins, Katie L; Woodford, Neil; de Pinna, Elizabeth; Godbole, Gauri; Dallman, Timothy J

    2018-01-01

    Surveillance of antimicrobial resistance (AMR) in non-typhoidal Salmonella enterica (NTS) is essential for monitoring transmission of resistance from the food chain to humans, and for establishing effective treatment protocols. We evaluated the prediction of phenotypic resistance in NTS from genotypic profiles derived from whole genome sequencing (WGS). Genes and chromosomal mutations responsible for phenotypic resistance were sought in WGS data from 3,491 NTS isolates received by Public Health England's Gastrointestinal Bacteria Reference Unit between April 2014 and March 2015. Inferred genotypic AMR profiles were compared with phenotypic susceptibilities determined for fifteen antimicrobials using EUCAST guidelines. Discrepancies between phenotypic and genotypic profiles for one or more antimicrobials were detected for 76 isolates (2.18%) although only 88/52,365 (0.17%) isolate/antimicrobial combinations were discordant. Of the discrepant results, the largest number were associated with streptomycin (67.05%, n = 59). Pan-susceptibility was observed in 2,190 isolates (62.73%). Overall, resistance to tetracyclines was most common (26.27% of isolates, n = 917) followed by sulphonamides (23.72%, n = 828) and ampicillin (21.43%, n = 748). Multidrug resistance (MDR), i.e., resistance to three or more antimicrobial classes, was detected in 848 isolates (24.29%) with resistance to ampicillin, streptomycin, sulphonamides and tetracyclines being the most common MDR profile (n = 231; 27.24%). For isolates with this profile, all but one were S. Typhimurium and 94.81% (n = 219) had the resistance determinants blaTEM-1, strA-strB, sul2 and tet(A). Extended-spectrum β-lactamase genes were identified in 41 isolates (1.17%) and multiple mutations in chromosomal genes associated with ciprofloxacin resistance in 82 isolates (2.35%). This study showed that WGS is suitable as a rapid means of determining AMR patterns of NTS for public health surveillance.
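    The percentages in this abstract follow directly from the quoted counts. As a quick consistency check, the sketch below recomputes a few of them; the counts come from the abstract, while the variable and function names are illustrative assumptions.

```python
# Counts quoted in the abstract; names here are illustrative only.
n_isolates = 3491
n_antimicrobials = 15

def pct(part, whole):
    """Percentage rounded to two decimals, the precision used in the abstract."""
    return round(100 * part / whole, 2)

# Every isolate was tested against every antimicrobial.
n_combinations = n_isolates * n_antimicrobials

print(n_combinations)            # isolate/antimicrobial combinations
print(pct(76, n_isolates))       # isolates with at least one discrepancy
print(pct(88, n_combinations))   # discordant combinations
print(pct(59, 88))               # discrepancies involving streptomycin
print(pct(848, n_isolates))      # multidrug-resistant isolates
print(pct(231, 848))             # most common MDR profile, as a share of MDR isolates
```

    Note that the 27.24% figure for the most common MDR profile is a share of the 848 MDR isolates, not of all 3,491 isolates.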

  11. A whole genome analyses of genetic variants in two Kelantan Malay individuals.

    Wan Juhari, Wan Khairunnisa; Md Tamrin, Nur Aida; Mat Daud, Mohd Hanif Ridzuan; Isa, Hatin Wan; Mohd Nasir, Nurfazreen; Maran, Sathiya; Abdul Rajab, Nur Shafawati; Ahmad Amin Noordin, Khairul Bariah; Nik Hassan, Nik Norliza; Tearle, Rick; Razali, Rozaimi; Merican, Amir Feisal; Zilfalil, Bin Alwi

    2014-12-01

    The sequencing of the genomes of two members of the Royal Kelantan Malay family will provide insights into Kelantan Malay whole genome sequences. The two Kelantan Malay genomes were analyzed for the SNP markers associated with thalassemia and Helicobacter pylori infection. Helicobacter pylori infection has been reported to have a low prevalence in the north-east as compared to the west coast of Peninsular Malaysia, and beta-thalassemia is known to be one of the most common inherited genetic disorders in Malaysia. By combining SNP information from the literature, GWAS studies and NCBI ClinVar, 18 unique SNPs were selected for further analysis. Of these 18 SNPs, 10 came from a previous study of Helicobacter pylori infection among Malay patients, 6 were from NCBI ClinVar and 2 were from GWAS studies. The analysis reveals that both Royal Kelantan Malay genomes shared all 10 SNPs identified by Maran (Single Nucleotide Polymorphisms (SNPs) genotypic profiling of Malay patients with and without Helicobacter pylori infection in Kelantan, 2011) and one SNP from the GWAS studies. In addition, the analysis reveals that both Royal Kelantan Malay genomes shared 3 SNP markers, HBG1 (rs1061234), HBB (rs1609812) and BCL11A (rs766432), all three of which are associated with beta-thalassemia. Our findings suggest that the Royal Kelantan Malays carry SNPs associated with protection against Helicobacter pylori infection. In addition, they carry SNPs associated with beta-thalassemia. These findings are in line with those of other researchers who conducted studies on thalassemia and Helicobacter pylori infection in the non-royal Malay population.

  12. Whole-genome DNA methylation status associated with clinical PTSD measures of OIF/OEF veterans

    Hammamieh, R; Chakraborty, N; Gautam, A; Muhie, S; Yang, R; Donohue, D; Kumar, R; Daigle, B J; Zhang, Y; Amara, D A; Miller, S-A; Srinivasan, S; Flory, J; Yehuda, R; Petzold, L; Wolkowitz, O M; Mellon, S H; Hood, L; Doyle, F J; Marmar, C; Jett, M

    2017-01-01

    Emerging knowledge suggests that post-traumatic stress disorder (PTSD) pathophysiology is linked to the patients’ epigenetic changes, but comprehensive studies examining genome-wide methylation have not been performed. In this study, we examined genome-wide DNA methylation in peripheral whole blood in combat veterans with and without PTSD to ascertain differentially methylated probes. Discovery was initially made in a training sample comprising 48 male Operation Enduring Freedom (OEF)/Operation Iraqi Freedom (OIF) veterans with PTSD and 51 age/ethnicity/gender-matched combat-exposed PTSD-negative controls. Agilent whole-genome array detected ~5600 differentially methylated CpG islands (CpGI) annotated to ~2800 differently methylated genes (DMGs). The majority (84.5%) of these CpGIs were hypermethylated in the PTSD cases. Functional analysis was performed using the DMGs encoding the promoter-bound CpGIs to identify networks related to PTSD. The identified networks were further validated by an independent test set comprising 31 PTSD+/29 PTSD− veterans. Targeted bisulfite sequencing was also used to confirm the methylation status of 20 DMGs shown to be highly perturbed in the training set. To improve the statistical power and mitigate the assay bias and batch effects, a union set combining both training and test set was assayed using a different platform from Illumina. The pathways curated from this analysis confirmed 65% of the pool of pathways mined from training and test sets. The results highlight the importance of assay methodology and use of independent samples for discovery and validation of differentially methylated genes mined from whole blood. Nonetheless, the current study demonstrates that several important epigenetically altered networks may distinguish combat-exposed veterans with and without PTSD. PMID:28696412

  13. Quantification of trace-level DNA by real-time whole genome amplification.

    Kang, Min-Jung; Yu, Hannah; Kim, Sook-Kyung; Park, Sang-Ryoul; Yang, Inchul

    2011-01-01

    Quantification of trace amounts of DNA is a challenge in analytical applications where the concentration of a target DNA is very low or only limited amounts of samples are available for analysis. PCR-based methods including real-time PCR are highly sensitive and widely used for quantification of low-level DNA samples. However, ordinary PCR methods require at least one copy of a specific gene sequence for amplification and may not work for a sub-genomic amount of DNA. We suggest a real-time whole genome amplification method adopting the degenerate oligonucleotide primed PCR (DOP-PCR) for quantification of sub-genomic amounts of DNA. This approach enabled quantification of sub-picogram amounts of DNA independently of their sequences. When the method was applied to the human placental DNA of which amount was accurately determined by inductively coupled plasma-optical emission spectroscopy (ICP-OES), an accurate and stable quantification capability for DNA samples ranging from 80 fg to 8 ng was obtained. In blind tests of laboratory-prepared DNA samples, measurement accuracies of 7.4%, -2.1%, and -13.9% with analytical precisions around 15% were achieved for 400-pg, 4-pg, and 400-fg DNA samples, respectively. A similar quantification capability was also observed for other DNA species from calf, E. coli, and lambda phage. Therefore, when provided with an appropriate standard DNA, the suggested real-time DOP-PCR method can be used as a universal method for quantification of trace amounts of DNA.

  14. Comparison of Control of Clostridium difficile Infection in Six English Hospitals Using Whole-Genome Sequencing.

    Eyre, David W; Fawley, Warren N; Rajgopal, Anu; Settle, Christopher; Mortimer, Kalani; Goldenberg, Simon D; Dawson, Susan; Crook, Derrick W; Peto, Tim E A; Walker, A Sarah; Wilcox, Mark H

    2017-08-01

    Variation in Clostridium difficile infection (CDI) rates between healthcare institutions suggests overall incidence could be reduced if the lowest rates could be achieved more widely. We used whole-genome sequencing (WGS) of consecutive C. difficile isolates from 6 English hospitals over 1 year (2013-14) to compare infection control performance. Fecal samples with a positive initial screen for C. difficile were sequenced. Within each hospital, we estimated the proportion of cases plausibly acquired from previous cases. Overall, 851/971 (87.6%) sequenced samples contained toxin genes, and 451 (46.4%) were fecal-toxin-positive. Of 652 potentially toxigenic isolates >90-days after the study started, 128 (20%, 95% confidence interval [CI] 17-23%) were genetically linked (within ≤2 single nucleotide polymorphisms) to a prior patient's isolate from the previous 90 days. Hospital 2 had the fewest linked isolates, 7/105 (7%, 3-13%), hospital 1, 9/70 (13%, 6-23%), and hospitals 3-6 had similar proportions of linked isolates (22-26%) (P ≤ .002 comparing hospital-2 vs 3-6). Results were similar adjusting for locally circulating ribotypes. Adjusting for hospital, ribotype-027 had the highest proportion of linked isolates (57%, 95% CI 29-81%). Fecal-toxin-positive and toxin-negative patients were similarly likely to be a potential transmission donor, OR = 1.01 (0.68-1.49). There was no association between the estimated proportion of linked cases and testing rates. WGS can be used as a novel surveillance tool to identify varying rates of C. difficile transmission between institutions and therefore to allow targeted efforts to reduce CDI incidence. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America.
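    The abstract reports proportions of genetically linked isolates with 95% confidence intervals but does not say how the intervals were computed. As a minimal sketch, assuming a Wilson score interval (a common choice for binomial proportions, not necessarily the authors' method), the overall figure of 128/652 can be reproduced approximately as follows. The function name is hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Overall count quoted in the abstract: 128 linked isolates out of 652.
low, high = wilson_interval(128, 652)
print(f"{128 / 652:.1%} linked, approx. 95% CI {low:.1%} to {high:.1%}")
```

    Under this assumption the interval rounds to 17-23%, in line with the reported values; an exact (Clopper-Pearson) interval would differ slightly, especially for the smaller per-hospital counts.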

  15. Parents perspectives on whole genome sequencing for their children: qualified enthusiasm?

    Anderson, J A; Meyn, M S; Shuman, C; Zlotnik Shaul, R; Mantella, L E; Szego, M J; Bowdin, S; Monfared, N; Hayeems, R Z

    2017-08-01

    To better understand the consequences of returning whole genome sequencing (WGS) results in paediatrics and facilitate its evidence-based clinical implementation, we studied parents' experiences with WGS and their preferences for the return of adult-onset secondary variants (SVs): medically actionable genomic variants unrelated to their child's current medical condition that predict adult-onset disease. We conducted qualitative interviews with parents whose children were undergoing WGS as part of the SickKids Genome Clinic, a research project that studies the impact of clinical WGS on patients, families, and the healthcare system. Interviews probed parents' experience with and motivation for WGS as well as their preferences related to SVs. Interviews were analysed thematically. Of 83 invited, 23 parents from 18 families participated. These parents supported WGS as a diagnostic test, perceiving clear intrinsic and instrumental value. However, many parents were ambivalent about receiving SVs, conveying a sense of self-imposed obligation to take on the 'weight' of knowing their child's SVs, however unpleasant. Some parents chose to learn about adult-onset SVs for their child but not for themselves. Despite general enthusiasm for WGS as a diagnostic test, many parents felt a duty to learn adult-onset SVs. Analogous to 'inflicted insight', we call this phenomenon 'inflicted ought'. Importantly, not all parents of children undergoing WGS view the best interests of their child in relational terms, thereby challenging an underlying justification for current ACMG guidelines for reporting incidental secondary findings from whole exome and WGS. Published by the BMJ Publishing Group Limited.

  16. Identification of Escherichia coli and Shigella Species from Whole-Genome Sequences.

    Chattaway, Marie A; Schaefer, Ulf; Tewolde, Rediat; Dallman, Timothy J; Jenkins, Claire

    2017-02-01

    Escherichia coli and Shigella species are closely related and genetically constitute the same species. Differentiating between these two pathogens and accurately identifying the four species of Shigella are therefore challenging. The organism-specific bioinformatics whole-genome sequencing (WGS) typing pipelines at Public Health England are dependent on the initial identification of the bacterial species by use of a kmer-based approach. Of the 1,982 Escherichia coli and Shigella sp. isolates analyzed in this study, 1,957 (98.4%) had concordant results by both traditional biochemistry and serology (TB&S) and the kmer identification (ID) derived from the WGS data. Of the 25 mismatches identified, 10 were enteroinvasive E. coli isolates that were misidentified as Shigella flexneri or S. boydii by the kmer ID, and 8 were S. flexneri isolates misidentified by TB&S as S. boydii due to nonfunctional S. flexneri O antigen biosynthesis genes. Analysis of the population structure based on multilocus sequence typing (MLST) data derived from the WGS data showed that the remaining discrepant results belonged to clonal complex 288 (CC288), comprising both S. boydii and S. dysenteriae strains. Mismatches between the TB&S and kmer ID results were explained by the close phylogenetic relationship between the two species and were resolved with reference to the MLST data. Shigella can be differentiated from E. coli and accurately identified to the species level by use of kmer comparisons and MLST. Analysis of the WGS data provided explanations for the discordant results between TB&S and WGS data, revealed the true phylogenetic relationships between different species of Shigella, and identified emerging pathoadapted lineages. © Crown copyright 2017.
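    The abstract describes species identification by a kmer-based approach without giving the details of the pipeline. The toy sketch below illustrates the general idea only: split sequences into overlapping k-mers and pick the reference whose k-mer set is most similar to the query's. The Jaccard similarity, the value of k, the function names and the sequences are all assumptions made for illustration, not the Public Health England implementation.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def closest_reference(query, references, k=5):
    """Name of the reference whose k-mer set is most similar to the query's."""
    q = kmers(query, k)
    return max(references, key=lambda name: jaccard(q, kmers(references[name], k)))

# Made-up toy sequences, not real genomic data.
references = {
    "ref_A": "ATGGCGTACGTTAGCCTAGGCTA",
    "ref_B": "ATGCCCTTTGGGAAACCCTTTGG",
}
query = "GGCGTACGTTAGCCTAG"
print(closest_reference(query, references))
```

    With closely related organisms such as E. coli and Shigella, k-mer sets overlap heavily, which is consistent with the abstract's report that some mismatches had to be resolved with MLST data.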

  17. Microbiota present in cystic fibrosis lungs as revealed by whole genome sequencing.

    Philippe M Hauser

    Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture). So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS) using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1) or Staphylococcus spp. plus Streptococcus spp. (patient 2) together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium) and aerobic bacteria (Gemella, Moraxella, Granulicatella). WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.

  18. Whole Genome Amplification and Reduced-Representation Genome Sequencing of Schistosoma japonicum Miracidia.

    Jonathan A Shortt

    2017-01-01

    In areas where schistosomiasis control programs have been implemented, morbidity and prevalence have been greatly reduced. However, to sustain these reductions and move towards interruption of transmission, new tools for disease surveillance are needed. Genomic methods have the potential to help trace the sources of new infections, and allow us to monitor drug resistance. Large-scale genotyping efforts for schistosome species have been hindered by cost, limited numbers of established target loci, and the small amount of DNA obtained from miracidia, the life stage most readily acquired from humans. Here, we present a method using next generation sequencing to provide high-resolution genomic data from S. japonicum for population-based studies. We applied whole genome amplification followed by double digest restriction site associated DNA sequencing (ddRADseq) to individual S. japonicum miracidia preserved on Whatman FTA cards. We found that we could effectively and consistently survey hundreds of thousands of variants from 10,000 to 30,000 loci from archived miracidia as old as six years. An analysis of variation from eight miracidia obtained from three hosts in two villages in Sichuan showed clear population structuring by village and host even within this limited sample. This high-resolution sequencing approach yields three orders of magnitude more information than microsatellite genotyping methods that have been employed over the last decade, creating the potential to answer detailed questions about the sources of human infections and to monitor drug resistance. Costs per sample range from $50-$200, depending on the amount of sequence information desired, and we expect these costs can be reduced further given continued reductions in sequencing costs, improvement of protocols, and parallelization. This approach provides new promise for using modern genome-scale sampling to S. japonicum surveillance, and could be applied to other schistosome species

  19. Whole Genome Amplification and Reduced-Representation Genome Sequencing of Schistosoma japonicum Miracidia.

    Shortt, Jonathan A; Card, Daren C; Schield, Drew R; Liu, Yang; Zhong, Bo; Castoe, Todd A; Carlton, Elizabeth J; Pollock, David D

    2017-01-01

    In areas where schistosomiasis control programs have been implemented, morbidity and prevalence have been greatly reduced. However, to sustain these reductions and move towards interruption of transmission, new tools for disease surveillance are needed. Genomic methods have the potential to help trace the sources of new infections, and allow us to monitor drug resistance. Large-scale genotyping efforts for schistosome species have been hindered by cost, limited numbers of established target loci, and the small amount of DNA obtained from miracidia, the life stage most readily acquired from humans. Here, we present a method using next generation sequencing to provide high-resolution genomic data from S. japonicum for population-based studies. We applied whole genome amplification followed by double digest restriction site associated DNA sequencing (ddRADseq) to individual S. japonicum miracidia preserved on Whatman FTA cards. We found that we could effectively and consistently survey hundreds of thousands of variants from 10,000 to 30,000 loci from archived miracidia as old as six years. An analysis of variation from eight miracidia obtained from three hosts in two villages in Sichuan showed clear population structuring by village and host even within this limited sample. This high-resolution sequencing approach yields three orders of magnitude more information than microsatellite genotyping methods that have been employed over the last decade, creating the potential to answer detailed questions about the sources of human infections and to monitor drug resistance. Costs per sample range from $50-$200, depending on the amount of sequence information desired, and we expect these costs can be reduced further given continued reductions in sequencing costs, improvement of protocols, and parallelization. This approach provides new promise for using modern genome-scale sampling to S. japonicum surveillance, and could be applied to other schistosome species and other

  20. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.

    Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander

    2014-10-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  1. Whole genome sequencing distinguishes between relapse and reinfection in recurrent leprosy cases

    Bührer-Sékula, Samira; Benjak, Andrej; Loiseau, Chloé; Singh, Pushpendra; Pontes, Maria A. A.; Gonçalves, Heitor S.; Hungria, Emerith M.; Busso, Philippe; Piton, Jérémie; Silveira, Maria I. S.; Cruz, Rossilene; Schetinni, Antônio; Costa, Maurício B.; Virmond, Marcos C. L.; Diorio, Suzana M.; Dias-Baptista, Ida M. F.; Rosa, Patricia S.; Matsuoka, Masanori; Penna, Maria L. F.; Cole, Stewart T.; Penna, Gerson O.

    2017-01-01

    Background: Since leprosy is both treated and controlled by multidrug therapy (MDT) it is important to monitor recurrent cases for drug resistance and to distinguish between relapse and reinfection as a means of assessing therapeutic efficacy. All three objectives can be reached with single nucleotide resolution using next generation sequencing and bioinformatics analysis of Mycobacterium leprae DNA present in human skin. Methodology: DNA was isolated by means of optimized extraction and enrichment methods from samples from three recurrent cases in leprosy patients participating in an open-label, randomized, controlled clinical trial of uniform MDT in Brazil (U-MDT/CT-BR). Genome-wide sequencing of M. leprae was performed and the resultant sequence assemblies analyzed in silico. Principal findings: In all three cases, no mutations responsible for resistance to rifampicin, dapsone and ofloxacin were found, thus eliminating drug resistance as a possible cause of disease recurrence. However, sequence differences were detected between the strains from the first and second disease episodes in all three patients. In one case, clear evidence was obtained for reinfection with an unrelated strain whereas in the other two cases, relapse appeared more probable. Conclusions/Significance: This is the first report of using M. leprae whole genome sequencing to reveal that treated and cured leprosy patients who remain in endemic areas can be reinfected by another strain. Next generation sequencing can be applied reliably to M. leprae DNA extracted from biopsies to discriminate between cases of relapse and reinfection, thereby providing a powerful tool for evaluating different outcomes of therapeutic regimens and for following disease transmission. PMID:28617800

  2. Whole Genome Sequences of Three Treponema pallidum ssp. pertenue Strains: Yaws and Syphilis Treponemes Differ in Less than 0.2% of the Genome Sequence

    Chen, Lei; Pospíšilová, Petra; Strouhal, Michal; Qin, Xiang; Mikalová, Lenka; Norris, Steven J.; Muzny, Donna M.; Gibbs, Richard A.; Fulton, Lucinda L.; Sodergren, Erica; Weinstock, George M.; Šmajs, David

    2012-01-01

    Background: The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. Methodology/Principal Findings: To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (dA) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function. Conclusions/Significance: Hypothetical genes, with genetic differences, consistently found between TPE and TPA strains are candidates for syphilitic treponemes virulence factors. Seventeen TPE genes were predicted under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics. PMID:22292095
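    The abstract compares between-subspecies divergence (dA) with within-subspecies nucleotide diversity (π) but does not spell out the estimators. The sketch below assumes the textbook definitions: π as the mean pairwise proportion of differing sites within a group, and net divergence dA = dXY - (πX + πY)/2, where dXY is the mean between-group distance. The sequences are made-up toy alignments and the function names are hypothetical.

```python
from itertools import combinations, product

def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def nucleotide_diversity(seqs):
    """pi: mean pairwise p-distance within one group of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def net_divergence(group_x, group_y):
    """dA = dXY - (pi_X + pi_Y) / 2, with dXY the mean between-group distance."""
    pairs = list(product(group_x, group_y))
    d_xy = sum(p_distance(a, b) for a, b in pairs) / len(pairs)
    within = (nucleotide_diversity(group_x) + nucleotide_diversity(group_y)) / 2
    return d_xy - within

# Toy aligned sequences standing in for two groups of strains (not real data).
group_x = ["AAAAAAAAAA", "AAAAAAAAAT"]
group_y = ["AAAAACCCAA", "AAAAACCCAG"]

pi_x = nucleotide_diversity(group_x)
d_a = net_divergence(group_x, group_y)
print(f"pi_X = {pi_x:.3f}, dA = {d_a:.3f}, ratio = {d_a / pi_x:.2f}")
```

    A ratio well above 1, like the 4.7 and 4.8 reported in the abstract, indicates that the two groups differ from each other more than strains differ within a group.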

  3. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea.

    Han, Joon-Hee; Chon, Jae-Kyung; Ahn, Jong-Hwa; Choi, Ik-Young; Lee, Yong-Hwan; Kim, Kyoung Su

    2016-06-01

    Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000.

  4. The use of mycobacterial interspersed repetitive unit typing and whole genome sequencing to inform tuberculosis prevention and control activities.

    Gilbert, Gwendolyn L; Sintchenko, Vitali

    2013-07-01

    Molecular strain typing of Mycobacterium tuberculosis has been possible for only about 20 years; it has significantly improved our understanding of the evolution and epidemiology of Mycobacterium tuberculosis and tuberculosis disease. Mycobacterial interspersed repetitive unit typing, based on 24 variable number tandem repeat unit loci, is highly discriminatory, relatively easy to perform and interpret and is currently the most widely used molecular typing system for tuberculosis surveillance. Nevertheless, clusters identified by mycobacterial interspersed repetitive unit typing sometimes cannot be confirmed or adequately defined by contact tracing and additional methods are needed. Recently, whole genome sequencing has been used to identify single nucleotide polymorphisms and other mutations, between genotypically indistinguishable isolates from the same cluster, to more accurately trace transmission pathways. Rapidly increasing speed and quality and reduced costs will soon make large scale whole genome sequencing feasible, combined with the use of sophisticated bioinformatics tools, for epidemiological surveillance of tuberculosis.

  5. Optical Whole-Genome Restriction Mapping as a Tool for Rapidly Distinguishing and Identifying Bacterial Contaminants in Clinical Samples

    2015-08-01

    Dates covered: Oct 2011 – Aug 2012. ...multiple bacteria could be uniquely identified within mixtures. In the first set of experiments, three unique organisms (Bacillus subtilis subsp. globigii...be useful in monitoring nosocomial outbreaks in neonatal and intensive care wards, or even as an initial screen for antibiotic resistant strains

  6. Identification and Whole Genome Sequencing of the First Case of Kosakonia radicincitans Causing a Human Bloodstream Infection

    Bhatti, Micah D.; Kalia, Awdhesh; Sahasrabhojane, Pranoti; Kim, Jiwoong; Greenberg, David E.; Shelburne, Samuel A.

    2017-01-01

    The taxonomy of Enterobacter species is rapidly changing. Herein we report a bloodstream infection isolate originally identified as Enterobacter cloacae by Vitek2 methodology that we found to be Kosakonia radicincitans using genetic means. Comparative whole genome sequencing of our isolate and other published Kosakonia genomes revealed these organisms lack the AmpC β-lactamase present on the chromosome of Enterobacter sp. A fimbriae operon primarily found in Escherichia coli O157:H7 isolates ...

  7. Whole-Genome Sequences of Two Carbapenem-Resistant Klebsiella quasipneumoniae Strains Isolated from a Tertiary Hospital in Johor, Malaysia.

    Gan, Han Ming; Rajasekaram, Ganeswrie; Eng, Wilhelm Wei Han; Kaniappan, Priyatharisni; Dhanoa, Amreeta

    2017-08-10

    We report the whole-genome sequences of two carbapenem-resistant clinical isolates of Klebsiella quasipneumoniae subsp. similipneumoniae obtained from two different patients. Both strains contained three different extended-spectrum β-lactamase genes and showed strikingly high pairwise average nucleotide identity of 99.99% despite being isolated 3 years apart from the same hospital. Copyright © 2017 Gan et al.

  8. Whole-Genome DNA Methylation Status Associated with Clinical PTSD Measures of OIF/OEF Veterans (Open Access)

    2017-07-11

    OIF) veterans with PTSD and 51 age/ethnicity/gender-matched combat-exposed PTSD-negative controls. Agilent whole-genome array detected ~5600...exclusion criteria were used [19,20] to identify a training set comprising 48 male veterans with PTSD (PTSD+) and 51 age-/ethnicity-/gender-matched controls...

  9. Real-time whole-genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli.

    Joensen, Katrine Grimstrup; Scheutz, Flemming; Lund, Ole; Hasman, Henrik; Kaas, Rolf Sommer; Nielsen, Eva M.; Aarestrup, Frank Møller

    2014-01-01

    Fast and accurate identification and typing of pathogens are essential for effective surveillance and outbreak detection. The current routine procedure is based on a variety of techniques, making the procedure laborious, time-consuming, and expensive. With whole-genome sequencing (WGS) becoming cheaper, it has huge potential in both diagnostics and routine surveillance. The aim of this study was to perform a real-time evaluation of WGS for routine typing and surveillance of verocytotoxin-prod...

  10. Real-Time Whole-Genome Sequencing for Routine Typing, Surveillance, and Outbreak Detection of Verotoxigenic Escherichia coli

    Joensen, Katrine Grimstrup; Scheutz, Flemming; Lund, Ole; Hasman, Henrik; Kaas, Rolf S.; Nielsen, Eva M.; Aarestrup, Frank M.

    2014-01-01

    Fast and accurate identification and typing of pathogens are essential for effective surveillance and outbreak detection. The current routine procedure is based on a variety of techniques, making the procedure laborious, time-consuming, and expensive. With whole-genome sequencing (WGS) becoming cheaper, it has huge potential in both diagnostics and routine surveillance. The aim of this study was to perform a real-time evaluation of WGS for routine typing and surveillance of verocytotoxin-prod...

  11. Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.

    Cai, Binghuang; Li, Biao; Kiga, Nikki; Thusberg, Janita; Bergquist, Timothy; Chen, Yun-Ching; Niknafs, Noushin; Carter, Hannah; Tokheim, Collin; Beleva-Guthrie, Violeta; Douville, Christopher; Bhattacharya, Rohit; Yeo, Hui Ting Grace; Fan, Jean; Sengupta, Sohini; Kim, Dewey; Cline, Melissa; Turner, Tychele; Diekhans, Mark; Zaucha, Jan; Pal, Lipika R; Cao, Chen; Yu, Chen-Hsin; Yin, Yizhou; Carraro, Marco; Giollo, Manuel; Ferrari, Carlo; Leonardi, Emanuela; Tosatto, Silvio C E; Bobe, Jason; Ball, Madeleine; Hoskins, Roger A; Repo, Susanna; Church, George; Brenner, Steven E; Moult, John; Gough, Julian; Stanke, Mario; Karchin, Rachel; Mooney, Sean D

    2017-09-01

    The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features. © 2017 Wiley Periodicals, Inc.
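    One of the assessment measures named above, ROC AUC, can be illustrated with a minimal sketch. It uses the pairwise (Mann-Whitney) definition of AUC; the function name and the toy data are illustrative assumptions and are not taken from the CAGI evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 if the individual truly has the trait, 0 otherwise.
    scores: predicted probability (or any ranking score) for the trait.
    AUC equals the fraction of (positive, negative) pairs in which the
    positive is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers): two carriers, two non-carriers.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

    The quadratic loop is chosen for clarity; for large submissions a rank-based formulation gives the same value more efficiently.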

  12. Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework.

    Li, Miaoxin; Li, Jiang; Li, Mulin Jun; Pan, Zhicheng; Hsu, Jacob Shujui; Liu, Dajiang J; Zhan, Xiaowei; Wang, Junwen; Song, Youqiang; Sham, Pak Chung

    2017-05-19

    Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for a comprehensive downstream analysis. In the present study, we first proposed three novel algorithms (sequence gap-filled gene feature annotation, bit-block encoded genotypes and sectional fast access to text lines) to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In the tests with whole genome sequencing data from 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SnpEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles and calculated genotypic correlation over 1000 times faster. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
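    The space saving from bit-level genotype encoding can be sketched as follows. This is a hypothetical 2-bit packing for biallelic, unphased calls, written only to illustrate the general idea; it is not KGGSeq's actual bit-block format, and the function names and the missing-value code are assumptions.

```python
MISSING = 3  # 2-bit code reserved for a missing call (assumption)


def pack_genotypes(genotypes):
    """Pack alt-allele counts (0, 1, 2) or None (missing) at 4 calls per byte."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if g is None:
            code = MISSING
        elif g in (0, 1, 2):
            code = g
        else:
            raise ValueError(f"unsupported genotype: {g!r}")
        # Call i occupies bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        packed[i // 4] |= code << (2 * (i % 4))
    return packed


def unpack_genotypes(packed, n):
    """Recover the first n calls from a packed bytearray."""
    calls = []
    for i in range(n):
        code = (packed[i // 4] >> (2 * (i % 4))) & 0b11
        calls.append(None if code == MISSING else code)
    return calls


calls = [0, 1, 2, None, 2, 2, 0, 1, 1, 0]
blob = pack_genotypes(calls)  # 10 calls fit in 3 bytes
```

    Storing four calls per byte is what makes bitwise operations over many subjects at once possible, which is one plausible reason such formats speed up correlation calculations.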

  13. WGSSAT: A High-Throughput Computational Pipeline for Mining and Annotation of SSR Markers From Whole Genomes.

    Pandey, Manmohan; Kumar, Ravindra; Srivastava, Prachi; Agarwal, Suyash; Srivastava, Shreya; Nagpure, Naresh S; Jena, Joy K; Kushwaha, Basdeo

    2018-03-16

    Mining and characterization of Simple Sequence Repeat (SSR) markers from whole genomes provide valuable information about biological significance of SSR distribution and also facilitate development of markers for genetic analysis. Whole genome sequencing (WGS)-SSR Annotation Tool (WGSSAT) is a graphical user interface pipeline developed using Java Netbeans and Perl scripts which facilitates in simplifying the process of SSR mining and characterization. WGSSAT takes input in FASTA format and automates the prediction of genes, noncoding RNA (ncRNA), core genes, repeats and SSRs from whole genomes followed by mapping of the predicted SSRs onto a genome (classified according to genes, ncRNA, repeats, exonic, intronic, and core gene region) along with primer identification and mining of cross-species markers. The program also generates a detailed statistical report along with visualization of mapped SSRs, genes, core genes, and RNAs. The features of WGSSAT were demonstrated using Takifugu rubripes data. This yielded a total of 139 057 SSR, out of which 113 703 SSR primer pairs were uniquely amplified in silico onto a T. rubripes (fugu) genome. Out of 113 703 mined SSRs, 81 463 were from coding region (including 4286 exonic and 77 177 intronic), 7 from RNA, 267 from core genes of fugu, whereas 105 641 SSR and 601 SSR primer pairs were uniquely mapped onto the medaka genome. WGSSAT is tested under Ubuntu Linux. The source code, documentation, user manual, example dataset and scripts are available online at https://sourceforge.net/projects/wgssat-nbfgr.
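    The core SSR-mining step can be illustrated with a short regular-expression sketch. It finds perfect tandem repeats only and is not WGSSAT's implementation; the function name and the default thresholds (motif length 2 to 6, at least 4 repeats) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    and the backreference \\1 requires exact copies of that motif.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Toy sequence: (AC)5 and (AGC)4 embedded in non-repetitive flanks.
toy = "GGTT" + "AC" * 5 + "GATTACA" + "AGC" * 4 + "TT"
ssrs = find_ssrs(toy)
```

    A production pipeline would additionally handle imperfect and compound repeats and map each hit to genomic features, as the abstract describes.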

  14. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    2014-01-01

    Background Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions Phylogenetic analysis shows that the HIV-1 population undergoes selection driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  15. High Resolution Typing by Whole Genome Mapping Enables Discrimination of LA-MRSA (CC398) Strains and Identification of Transmission Events

    Bosch, Thijs; Verkade, Erwin; van Luit, Martijn; Pot, Bruno; Vauterin, Paul; Burggrave, Ronald; Savelkoul, Paul; Kluytmans, Jan; Schouls, Leo

    2013-01-01

    After its emergence in 2003, a livestock-associated (LA-)MRSA clade (CC398) has caused an impressive increase in the number of isolates submitted for the Dutch national MRSA surveillance and now comprises 40% of all isolates. The currently used molecular typing techniques have limited discriminatory power for this MRSA clade, which hampers studies on the origin and transmission routes. Recently, a new molecular analysis technique named whole genome mapping was introduced. This method creates high-resolution, ordered whole genome restriction maps that may have potential for strain typing. In this study, we assessed and validated the capability of whole genome mapping to differentiate LA-MRSA isolates. Multiple validation experiments showed that whole genome mapping produced highly reproducible results. Assessment of the technique on two well-documented MRSA outbreaks showed that whole genome mapping was able to confirm one outbreak, but revealed major differences between the maps of a second, indicating that not all isolates belonged to this outbreak. Whole genome mapping of LA-MRSA isolates that were epidemiologically unlinked provided a much higher discriminatory power than spa-typing or MLVA. In contrast, maps created from LA-MRSA isolates obtained during a proven LA-MRSA outbreak were nearly indistinguishable, showing that transmission of LA-MRSA can be detected by whole genome mapping. Finally, whole genome maps of LA-MRSA isolates originating from two unrelated veterinarians and their household members showed that veterinarians may carry and transmit different LA-MRSA strains at the same time. No such conclusions could be drawn based on spa-typing and MLVA. Although PFGE seems to be suitable for molecular typing of LA-MRSA, WGM provides a much higher discriminatory power. Furthermore, whole genome mapping can provide a comparison with other maps within 2 days after the bacterial culture is received, making it suitable to investigate transmission events and

  16. Heterogeneity of estrogen receptor expression in circulating tumor cells from metastatic breast cancer patients.

    Anna Babayan

    BACKGROUND: Endocrine treatment is the most preferable systemic treatment in metastatic breast cancer patients that have had an estrogen receptor (ER) positive primary tumor or metastatic lesions, however, approximately 20% of these patients do not benefit from the therapy and demonstrate further metastatic progress. One reason for failure of endocrine therapy might be the heterogeneity of ER expression in tumor cells spreading from the primary tumor to distant sites which is reflected in detectable circulating tumor cells (CTCs). METHODS: A sensitive and specific staining protocol for ER, keratin 8/18/19, CD45 was established. Peripheral blood from 35 metastatic breast cancer patients with ER-positive primary tumors was tested for the presence of CTCs. Keratin 8/18/19 and DAPI positive but CD45 negative cells were classified as CTCs and evaluated for ER staining. Subsequently, eight individual CTCs from four index patients (2 CTCs per patient) were isolated and underwent whole genome amplification and ESR1 gene mutation analysis. RESULTS: CTCs were detected in blood of 16 from 35 analyzed patients (46%), with a median of 3 CTCs/7.5 ml. In total, ER-negative CTCs were detected in 11/16 (69%) of the CTC positive cases, including blood samples with only ER-negative CTCs (19%) and samples with both ER-positive and ER-negative CTCs (50%). No correlation was found between the intensity and/or percentage of ER staining in the primary tumor with the number and ER status of CTCs of the same patient. ESR1 gene mutations were not found. CONCLUSION: CTCs frequently lack ER expression in metastatic breast cancer patients with ER-positive primary tumors and show a considerable intra-patient heterogeneity, which may reflect a mechanism to escape endocrine therapy. Provided single cell analysis did not support a role of ESR1 mutations in this process.

  17. Heterogeneity of Estrogen Receptor Expression in Circulating Tumor Cells from Metastatic Breast Cancer Patients

    Babayan, Anna; Hannemann, Juliane; Spötter, Julia; Müller, Volkmar

    2013-01-01

    Background Endocrine treatment is the most preferable systemic treatment in metastatic breast cancer patients that have had an estrogen receptor (ER) positive primary tumor or metastatic lesions, however, approximately 20% of these patients do not benefit from the therapy and demonstrate further metastatic progress. One reason for failure of endocrine therapy might be the heterogeneity of ER expression in tumor cells spreading from the primary tumor to distant sites which is reflected in detectable circulating tumor cells (CTCs). Methods A sensitive and specific staining protocol for ER, keratin 8/18/19, CD45 was established. Peripheral blood from 35 metastatic breast cancer patients with ER-positive primary tumors was tested for the presence of CTCs. Keratin 8/18/19 and DAPI positive but CD45 negative cells were classified as CTCs and evaluated for ER staining. Subsequently, eight individual CTCs from four index patients (2 CTCs per patient) were isolated and underwent whole genome amplification and ESR1 gene mutation analysis. Results CTCs were detected in blood of 16 from 35 analyzed patients (46%), with a median of 3 CTCs/7.5 ml. In total, ER-negative CTCs were detected in 11/16 (69%) of the CTC positive cases, including blood samples with only ER-negative CTCs (19%) and samples with both ER-positive and ER-negative CTCs (50%). No correlation was found between the intensity and/or percentage of ER staining in the primary tumor with the number and ER status of CTCs of the same patient. ESR1 gene mutations were not found. Conclusion CTCs frequently lack ER expression in metastatic breast cancer patients with ER-positive primary tumors and show a considerable intra-patient heterogeneity, which may reflect a mechanism to escape endocrine therapy. Provided single cell analysis did not support a role of ESR1 mutations in this process. PMID:24058649

  18. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    Ali, Asho; Hasan, Zahra; McNerney, Ruth; Mallard, Kim; Hill-Cawthorne, Grant A.; Coll, Francesc; Nair, Mridul; Pain, Arnab; Clark, Taane G.; Hasan, Rumina

    2015-01-01

    Improved molecular diagnostic methods for detection of drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore local data is required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA 91-94 codons in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes; rpsL codon 43 (42%); rrs 500 region (16%), and gidB (34%) while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycoside (78%) and ethambutol (62%); while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded

  19. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery

    Stothard Paul

    2011-11-01

    Abstract Background One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or genomic regions with phenotypes. The completion of the bovine genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of the genetic variations present in cattle. Here we describe the whole-genome resequencing of two Bos taurus bulls from distinct breeds for the purpose of identifying and annotating novel forms of genetic variation in cattle. Results The genomes of a Black Angus bull and a Holstein bull were sequenced to 22-fold and 19-fold coverage, respectively, using the ABI SOLiD system. Comparisons of the sequences with the Btau4.0 reference assembly yielded 7 million single nucleotide polymorphisms (SNPs), 24% of which were identified in both animals. Of the total SNPs found in Holstein, Black Angus, and in both animals, 81%, 81%, and 75% respectively are novel. In-depth annotations of the data identified more than 16 thousand distinct non-synonymous SNPs (85% novel) between the two datasets. Alignments between the SNP-altered proteins and orthologues from numerous species indicate that many of the SNPs alter well-conserved amino acids. Several SNPs predicted to create or remove stop codons were also found. A comparison between the sequencing SNPs and genotyping results from the BovineHD high-density genotyping chip indicates a detection rate of 91% for homozygous SNPs and 81% for heterozygous SNPs. The false positive rate is estimated to be about 2% for both the Black Angus and Holstein SNP sets, based on follow-up genotyping of 422 and 427 SNPs, respectively. Comparisons of read depth between the two bulls along the reference assembly identified 790 putative copy-number variations (CNVs). Ten

  20. A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing

    Guangtu Gao

    2018-04-01

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery. Of the 49 new samples, 11 were double-haploid lines from Washington State University (WSU) and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1), which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the double haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs) and multi-sequence variants (MSVs). Overall, 30,302,087 SNPs were identified on the 29 chromosomes of the rainbow trout genome and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25). The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-population polymorphism assessment revealed high level of polymorphism and

  1. Routine Whole-Genome Sequencing for Outbreak Investigations of Staphylococcus aureus in a National Reference Center

    Geraldine Durand

    2018-03-01

    The French National Reference Center for Staphylococci currently uses DNA arrays and spa typing for the initial epidemiological characterization of Staphylococcus aureus strains. We here describe the use of whole-genome sequencing (WGS) to investigate retrospectively four distinct and virulent S. aureus lineages [clonal complexes (CCs): CC1, CC5, CC8, CC30] involved in hospital and community outbreaks or sporadic infections in France. We used a WGS bioinformatics pipeline based on de novo assembly (reference-free approach), single nucleotide polymorphism analysis, and on the inclusion of epidemiological markers. We examined the phylogeographic diversity of the French dominant hospital-acquired CC8-MRSA (methicillin-resistant S. aureus) Lyon clone through WGS analysis, which did not demonstrate evidence of large-scale geographic clustering. We analyzed sporadic cases along with two outbreaks of a CC1-MSSA (methicillin-susceptible S. aureus) clone containing the Panton–Valentine leukocidin (PVL), and results showed that two sporadic cases were closely related. We investigated an outbreak of PVL-positive CC30-MSSA in a school environment and were able to reconstruct the transmission history between eight families. We explored different outbreaks among newborns due to the CC5-MRSA Geraldine clone and we found evidence of an unsuspected link between two otherwise distinct outbreaks. Here, WGS provides the resolving power to disprove transmission events indicated by conventional methods (same sequence type, spa type, toxin profile, and antibiotic resistance profile) and, most importantly, WGS can reveal unsuspected transmission events. Therefore, WGS allows outbreaks and the (inter)national dissemination of S. aureus lineages to be better described and understood. Our findings underscore the importance of adding WGS for (inter)national surveillance of infections caused by virulent clones of S. aureus but also substantiate the fact that technological optimization at

  2. The MedSeq Project: a randomized trial of integrating whole genome sequencing into clinical medicine.

    Vassy, Jason L; Lautenbach, Denise M; McLaughlin, Heather M; Kong, Sek Won; Christensen, Kurt D; Krier, Joel; Kohane, Isaac S; Feuerman, Lindsay Z; Blumenthal-Barby, Jennifer; Roberts, J Scott; Lehmann, Lisa Soleymani; Ho, Carolyn Y; Ubel, Peter A; MacRae, Calum A; Seidman, Christine E; Murray, Michael F; McGuire, Amy L; Rehm, Heidi L; Green, Robert C

    2014-03-20

    Whole genome sequencing (WGS) is already being used in certain clinical and research settings, but its impact on patient well-being, health-care utilization, and clinical decision-making remains largely unstudied. It is also unknown how best to communicate sequencing results to physicians and patients to improve health. We describe the design of the MedSeq Project: the first randomized trials of WGS in clinical care. This pair of randomized controlled trials compares WGS to standard of care in two clinical contexts: (a) disease-specific genomic medicine in a cardiomyopathy clinic and (b) general genomic medicine in primary care. We are recruiting 8 to 12 cardiologists, 8 to 12 primary care physicians, and approximately 200 of their patients. Patient participants in both the cardiology and primary care trials are randomly assigned to receive a family history assessment with or without WGS. Our laboratory delivers a genome report to physician participants that balances the needs to enhance understandability of genomic information and to convey its complexity. We provide an educational curriculum for physician participants and offer them a hotline to genetics professionals for guidance in interpreting and managing their patients' genome reports. Using varied data sources, including surveys, semi-structured interviews, and review of clinical data, we measure the attitudes, behaviors and outcomes of physician and patient participants at multiple time points before and after the disclosure of these results. The impact of emerging sequencing technologies on patient care is unclear. We have designed a process of interpreting WGS results and delivering them to physicians in a way that anticipates how we envision genomic medicine will evolve in the near future. That is, our WGS report provides clinically relevant information while communicating the complexity and uncertainty of WGS results to physicians and, through physicians, to their patients. This project will not only

  3. Whole genome duplications and expansion of the vertebrate GATA transcription factor gene family

    Bowerman Bruce

    2009-08-01

    Abstract Background GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456. Results We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae) and a hemichordate (Saccoglossus kowalevskii). We also have confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, and molecular phylogenetic analyses of these deuterostome GATA genes support their origin from two ancestral deuterostome genes, one GATA123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons), providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events. Conclusion From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication) invertebrate deuterostome ancestor also had two GATA genes, one of each class. Synteny analyses identify duplications of paralogous chromosomal regions (paralogons) from single ancestral vertebrate GATA123 and GATA456

  4. In vivo capsular switch in Streptococcus pneumoniae--analysis by whole genome sequencing.

    Fen Z Hu

    Two multidrug resistant strains of Streptococcus pneumoniae - SV35-T23 (capsular type 23F) and SV36-T3 (capsular type 3) were recovered from the nasopharynx of two adult patients during an outbreak of pneumococcal disease in a New York hospital in 1996. Both strains belonged to the pandemic lineage PMEN1 but they differed strikingly in virulence when tested in the mouse model of IP infection: as few as 1000 CFU of SV36 killed all mice within 24 hours after inoculation while SV35-T23 was avirulent. Whole genome sequencing (WGS) of the two isolates was performed (i) to test if these two isolates belonging to the same clonal type and recovered from an identical epidemiological scenario only differed in their capsular genes and (ii) to test if the vast difference in virulence between the strains was mostly - or exclusively - due to the type III capsule. WGS demonstrated extensive differences between the two isolates including over 2500 single nucleotide polymorphisms in core genes and also differences in 36 genetic determinants: 25 of which were unique to SV35-T23 and 11 unique to strain SV36-T3. Nineteen of these differences were capsular genes and 9 bacteriocin genes. Using genetic transformation in the laboratory, the capsular region of SV35-T23 was replaced by the type 3 capsular genes from SV36-T3 to generate the recombinant SV35-T3* which was as virulent as the parental strain SV36-T3* in the murine model and the type 3 capsule was the major virulence factor in the chinchilla model as well. On the other hand, a careful comparison of strains SV36-T3 and the laboratory constructed SV35-T3* in the chinchilla model suggested that some additional determinants present in SV36 but not in the laboratory recombinant may also contribute to the progression of middle ear disease. The nature of these determinants remains to be identified.

  5. Variable selection models for genomic selection using whole-genome sequence data and singular value decomposition.

    Meuwissen, Theo H E; Indahl, Ulf G; Ødegård, Jørgen

    2017-12-27

    Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. The BayesC model assumes a priori that markers have normally distributed effects with probability π and no effect with probability (1 - π). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction. For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP
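
    The abstract above mentions computing, for each marker, the posterior probability of a non-zero effect from an effect estimate and its prediction error variance. The exact formulas are not given in this record, so the following is only a minimal sketch of Bayes' rule under an assumed two-component normal mixture. The function names, the parameter `pi` (prior probability of a non-zero effect), and the variance model are illustrative assumptions, not the authors' method.

```python
import math


def normal_pdf(x, variance):
    """Density at x of a zero-mean normal distribution with the given variance."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)


def posterior_nonzero(effect_estimate, error_variance, effect_variance, pi):
    """Posterior probability that a marker has a non-zero effect.

    Assumed model (illustrative, not taken from the paper):
      estimate | non-zero effect ~ N(0, effect_variance + error_variance)
      estimate | zero effect     ~ N(0, error_variance)
    with prior probability pi of a non-zero effect.
    """
    like_nonzero = normal_pdf(effect_estimate, effect_variance + error_variance)
    like_zero = normal_pdf(effect_estimate, error_variance)
    weighted_nonzero = pi * like_nonzero
    return weighted_nonzero / (weighted_nonzero + (1.0 - pi) * like_zero)


# Hypothetical inputs: an estimate of exactly zero versus a large estimate.
small = posterior_nonzero(0.0, 1.0, 1.0, 0.5)
large = posterior_nonzero(3.0, 1.0, 1.0, 0.5)
print(small, large)
```

    Under these assumptions, a larger estimate relative to its error variance shifts the posterior toward the non-zero component, which is the behaviour a variable selection model relies on.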

  6. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    Ali, Asho

    2015-02-26

    Improved molecular diagnostic methods for detecting drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore, local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA 91-94 codons in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycoside (78%) and ethambutol (62%); while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded

  7. Public attitudes in Japan toward participation in whole genome sequencing studies.

    Okita, Taketoshi; Ohashi, Noriko; Kabata, Daijiro; Shintani, Ayumi; Kato, Kazuto

    2018-04-13

    Recent innovations in gene analysis technology have allowed for rapid and inexpensive sequencing of entire genomes. Thus, both conducting a study using whole genome sequencing (WGS) in a large population and the clinical application of research findings from such studies are currently feasible. However, to promote WGS studies, understanding and voluntary participation by the general public is needed. Therefore, it is essential to investigate the general public's attitude toward and understanding of WGS studies. The primary goal of our research is to investigate these issues and to discover how they relate to research participation in WGS studies. A survey of awareness regarding WGS and studies using WGS was conducted with a sample of 2000 or more participants using a self-administered questionnaire posted on the Internet between February 20 and 21, 2015. Prior to the survey, we briefly explained WGS and WGS study-related issues to the respondents in order to provide them with the minimum knowledge required to answer the questionnaire. We then conducted an analysis, including cross-classification. For the question regarding interest in WGS, 46.6% of participants responded "Yes." 70.7% of all respondents said that they were interested in some kinds of findings that could be obtained from WGS studies. Regarding participation in WGS studies, 29.0% were interested in participating. The demographic factors significantly related to attitudes toward research participation were age, level of education, and employment status. The results also suggest that concerns about WGS have a positive effect on people's willingness to participate. Furthermore, it was shown that for people who were not interested in their gene-related information, concerns about WGS negatively impacted their willingness to participate. However, for people who were interested in their gene-related information, their concerns might not have impacted their willingness to participate. This research has shown

  8. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    2012-01-01

    Background: The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world’s poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and breast muscle development. Commercial breeding with small effective population sizes and epistasis can result in loss of genetic diversity, which in turn can lead to reduced individual fitness and reduced response to selection. The presence of genomic diversity in domestic livestock species, therefore, is of great importance and a prerequisite for rapid and accurate genetic improvement of selected breeds in various environments, as well as to facilitate rapid adaptation to potential changes in breeding goals. Genomic selection requires a large number of genetic markers such as single nucleotide polymorphisms (SNPs), the most abundant source of genetic variation within the genome. Results: Alignment of next generation sequencing data of 32 individual turkeys from different populations was used for the discovery of 5.49 million SNPs, which subsequently were used for the analysis of genetic diversity among the different populations. All of the commercial lines branched from a single node relative to the heritage varieties and the South Mexican turkey population. Heterozygosity of all individuals from the different turkey populations ranged from 0.17-2.73 SNPs/Kb, while heterozygosity of populations ranged from 0.73-1.64 SNPs/Kb. The average frequency of heterozygous SNPs in individual turkeys was 1.07 SNPs/Kb. Five genomic regions with very low nucleotide variation were identified in domestic turkeys that showed a state of fixation towards alleles different from wild alleles. Conclusion: The turkey genome is much less diverse with a relatively low frequency of heterozygous SNPs as compared to other livestock species like chicken and pig. The whole genome SNP discovery
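
    Heterozygosity in the record above is reported as heterozygous SNPs per kilobase. As a rough illustration of that unit, the sketch below counts heterozygous diploid genotypes and divides by the number of callable kilobases. The VCF-style genotype strings, the function names, and the idea of a "callable" base count are assumptions made for this example; the abstract does not describe the authors' pipeline.

```python
def is_heterozygous(genotype):
    """True for a diploid genotype string with two different called alleles."""
    alleles = genotype.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return False  # skip missing or non-diploid calls
    return alleles[0] != alleles[1]


def het_snps_per_kb(genotypes, callable_bp):
    """Heterozygous SNP count divided by the callable sequence length in kb."""
    if callable_bp <= 0:
        raise ValueError("callable_bp must be positive")
    het_count = sum(1 for genotype in genotypes if is_heterozygous(genotype))
    return het_count / (callable_bp / 1000.0)


# Hypothetical genotypes for one individual over 2,000 callable bases.
example_rate = het_snps_per_kb(["0/0", "0/1", "1|0", "1/1", "./."], 2000)
print(example_rate)
```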

  9. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    Aslam Muhammad L

    2012-08-01

    whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.

  10. Cell Free DNA of Tumor Origin Induces a 'Metastatic' Expression Profile in HT-29 Cancer Cell Line.

    István Fűri

    Epithelial cells in malignant conditions release DNA into the extracellular compartment. Cell free DNA of tumor origin may act as a ligand of DNA sensing mechanisms and mediate changes in epithelial-stromal interactions. To evaluate and compare the potential autocrine and paracrine regulatory effect of normal and malignant epithelial cell-related DNA on TLR9 and STING mediated pathways in HT-29 human colorectal adenocarcinoma cells and normal fibroblasts. DNA isolated from normal and tumorous colonic epithelia of fresh frozen surgically removed tissue samples was used for 24 and 6 hour treatment of HT-29 colon carcinoma and HDF-α fibroblast cells. Whole genome mRNA expression analysis and qRT-PCR were performed for the elements/members of TLR9 signaling pathway. Immunocytochemistry was performed for epithelial markers (i.e. CK20 and E-cadherin), DNA methyltransferase 3a (DNMT3a) and NFκB (for treated HDFα cells). Administration of tumor derived DNA on HT29 cells resulted in significant (p<0.05) mRNA level alteration in 118 genes (logFc≥1, p≤0.05), including overexpression of metallothionein genes (i.e. MT1H, MT1X, MT1P2, MT2A), metastasis-associated genes (i.e. TACSTD2, MACC1, MALAT1), tumor biomarker (CEACAM5), metabolic genes (i.e. INSIG1, LIPG), messenger molecule genes (i.e. DAPP, CREB3L2). Increased protein levels of CK20, E-cadherin, and DNMT3a were observed after tumor DNA treatment in HT-29 cells. Healthy DNA treatment affected mRNA expression of 613 genes (logFc≥1, p≤0.05), including increased expression of key adaptor molecules of TLR9 pathway (e.g. MYD88, IRAK2, NFκB, IL8, IL-1β), STING pathway (ADAR, IRF7, CXCL10, CASP1) and the FGF2 gene. DNA from tumorous colon epithelium, but not from the normal epithelial cells acts as a pro-metastatic factor to HT-29 cells through the overexpression of pro-metastatic genes through TLR9/MYD88 independent pathway. In contrast, DNA derived from healthy colonic epithelium induced TLR9 and STING signaling
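
    The gene counts in the record above come from applying two thresholds (logFC at least 1 and p at most 0.05) to expression results. A minimal sketch of such a filter follows. The tuple layout, the function name, and the placeholder gene names and values are hypothetical, and the abstract does not say whether the threshold was applied to the signed or the absolute log fold change, so the signed value is assumed here.

```python
def select_altered(results, min_log_fc=1.0, max_p=0.05):
    """Return gene names whose log fold change and p-value pass both thresholds.

    `results` is a list of (gene_name, log_fc, p_value) tuples. The signed
    log fold change is compared, which is an assumption of this sketch.
    """
    return [
        name
        for name, log_fc, p_value in results
        if log_fc >= min_log_fc and p_value <= max_p
    ]


# Placeholder rows, not data from the study.
placeholder_results = [
    ("geneA", 1.8, 0.01),
    ("geneB", 0.4, 0.001),
    ("geneC", 2.3, 0.20),
    ("geneD", 1.0, 0.05),
]
selected = select_altered(placeholder_results)
print(selected)
```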

  11. Gene and miRNA expression signature of Lewis lung carcinoma LLC1 cells in extracellular matrix enriched microenvironment

    Stankevicius, Vaidotas; Vasauskas, Gintautas; Bulotiene, Danute; Butkyte, Stase; Jarmalaite, Sonata; Rotomskis, Ricardas; Suziedelis, Kestutis

    2016-01-01

    The extracellular matrix (ECM), one of the key components of tumor microenvironment, has a tremendous impact on cancer development and highly influences tumor cell features. ECM affects vital cellular functions such as cell differentiation, migration, survival and proliferation. Gene and protein expression levels are regulated in a cell-ECM interaction-dependent manner as well. The rate of unsuccessful clinical trials, based on cell culture research models lacking the ECM microenvironment, indicates the need for alternative models and determines the shift to three-dimensional (3D) laminin-rich ECM models, better simulating tissue organization. Recognized advantages of 3D models suggest the development of new anticancer treatment strategies. This is among the most promising directions of 3D cell culture application. However, detailed analysis at the molecular level of 2D/3D cell cultures and tumors in vivo is still needed to elucidate cellular pathways most promising for the development of targeted therapies. In order to elucidate which biological pathways are altered during the microenvironmental shift, we analyzed whole genome mRNA and miRNA expression differences in LLC1 cells cultured in 2D or 3D culture conditions. In our study we used DNA microarrays for whole genome analysis of mRNA and miRNA expression differences in LLC1 cells cultivated in 2D or 3D culture conditions. Next, we identified the most commonly enriched functional categories using KEGG pathway enrichment analysis. Finally, we validated the microarray data by quantitative PCR in LLC1 cells cultured under 2D or 3D conditions or LLC1 tumors implanted in experimental animals. Microarray gene expression analysis revealed that 1884 genes and 77 miRNAs were significantly altered in LLC1 cells after 48 h cell growth under 2D and ECM-based 3D cell growth conditions. Pathway enrichment results indicated metabolic pathway, MAP kinase, cell adhesion and immune response as the most significantly altered

  12. Molecular characterization of c-Abl/c-Src kinase inhibitors targeted against murine tumour progenitor cells that express stem cell markers.

    Thomas Kruewel

    BACKGROUND: The non-receptor tyrosine kinases c-Abl and c-Src are overexpressed in various solid human tumours. Inhibition of their hyperactivity represents a molecular rationale in the combat of cancerous diseases. Here we examined the effects of a new family of pyrazolo[3,4-d]pyrimidines on a panel of 11 different murine lung tumour progenitor cell lines that express stem cell markers, as well as on the human lung adenocarcinoma cell line A549, the human hepatoma cell line HepG2 and the human colon cancer cell line CaCo2 to obtain insight into the mode of action of these experimental drugs. METHODOLOGY/PRINCIPAL FINDINGS: Treatment with the dual kinase inhibitors blocked c-Abl and c-Src kinase activity efficiently in the nanomolar range, induced apoptosis, reduced cell viability and caused cell cycle arrest predominantly at G0/G1 phase while western blot analysis confirmed repressed protein expression of c-Abl and c-Src as well as the interacting partners p38 mitogen activated protein kinase, heterogeneous ribonucleoprotein K, cyclin dependent kinase 1 and further proteins that are crucial for tumour progression. Importantly, a significant repression of the epidermal growth factor receptor was observed while whole genome gene expression analysis evidenced regulation of many cell cycle regulated genes as well as integrin and focal adhesion kinase (FAK) signalling to impact cytoskeleton dynamics, migration, invasion and metastasis. CONCLUSIONS/SIGNIFICANCE: Our experiments and recently published in vivo engraftment studies with various tumour cell lines revealed the dual kinase inhibitors to be efficient in their antitumour activity.

  13. Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.

    Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon

    2015-11-01

    The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.

  14. Whole genome sequencing and assembly of Eukaryotic microbes isolated from ISS environmental surface Kirovograd region soil Chernobyl Nuclear Power Plant and Chernobyl Exclusion Zone

    National Aeronautics and Space Administration — The whole-genome sequences of eight fungal strains that were selected for exposure to microgravity at the International Space Station are presented here. These...

  15. Clinical utilisation of a rapid low-pass whole genome sequencing technique for the diagnosis of aneuploidy in human embryos prior to implantation.

    Wells, Dagan; Kaur, Kulvinder; Grifo, Jamie; Glassner, Michael; Taylor, Jenny C; Fragouli, Elpida; Munne, Santiago

    2014-08-01

    The majority of human embryos created using in vitro fertilisation (IVF) techniques are aneuploid. Comprehensive chromosome screening methods, applicable to single cells biopsied from preimplantation embryos, allow reliable identification and transfer of euploid embryos. Recently, randomised trials using such methods have indicated that aneuploidy screening improves IVF success rates. However, the high cost of testing has restricted the availability of this potentially beneficial strategy. This study aimed to harness next-generation sequencing (NGS) technology, with the intention of lowering the costs of preimplantation aneuploidy screening. Embryo biopsy, whole genome amplification and semiconductor sequencing. A rapid (...... cost only two-thirds that of the most widely used method for embryo aneuploidy detection. Validation involved blinded analysis of 54 cells from cell lines or biopsies from human embryos. Sensitivity and specificity were 100%. The method was applied clinically, assisting in the selection of euploid embryos in two IVF cycles, producing healthy children in both cases. The NGS approach was also able to reveal specified mutations in the nuclear or mitochondrial genomes in parallel with chromosome assessment. Interestingly, elevated mitochondrial DNA content was associated with aneuploidy (p...... cost diagnosis of aneuploidy in cells from human preimplantation embryos and is rapid enough to allow testing without embryo cryopreservation. The method described also has the potential to shed light on other aspects of embryo genetics of relevance to health and viability. Published by the BMJ Publishing Group Limited.
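
    The validation in the record above reports sensitivity and specificity from a blinded comparison of calls against known status. For readers unfamiliar with the two measures, the sketch below computes them from paired truth and call labels. The boolean encoding (True meaning aneuploid), the function name, and the example labels are assumptions for illustration; the abstract does not give the per-sample breakdown of the 54 cells.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i] is the known status and calls[i] the test result for sample i,
    with True meaning the condition (here, aneuploidy) is present.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")
    pairs = list(zip(truth, calls))
    true_pos = sum(1 for t, c in pairs if t and c)
    false_neg = sum(1 for t, c in pairs if t and not c)
    true_neg = sum(1 for t, c in pairs if not t and not c)
    false_pos = sum(1 for t, c in pairs if not t and c)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical labels: a perfect test, then one missed positive.
known = [True, True, False, False, True]
perfect = sensitivity_specificity(known, known)
one_missed = sensitivity_specificity(
    [True, True, False, False], [True, False, False, False]
)
print(perfect, one_missed)
```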

  16. Expansion of banana (Musa acuminata) gene families involved in ethylene biosynthesis and signalling after lineage-specific whole-genome duplications.

    Jourda, Cyril; Cardi, Céline; Mbéguié-A-Mbéguié, Didier; Bocs, Stéphanie; Garsmeur, Olivier; D'Hont, Angélique; Yahiaoui, Nabila

    2014-05-01

    Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening. Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed. Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them. We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling. © 2014 CIRAD New Phytologist © 2014 New Phytologist Trust.

  17. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

    Mark R Wilson

    Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shotgun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts

  18. Whole-Genome Sequence Analysis of Antimicrobial Resistance Genes in Streptococcus uberis and Streptococcus dysgalactiae Isolates from Canadian Dairy Herds

    Julián Reyes Vélez

    2017-05-01

    The objective of this study was to determine the occurrence of antimicrobial resistance (AMR) genes using whole-genome sequences (WGS) of Streptococcus uberis (S. uberis) and Streptococcus dysgalactiae (S. dysgalactiae) isolates recovered from dairy cows in the Canadian Maritime Provinces. A secondary objective was to explore the association between phenotypic AMR and genomic characteristics (genome size, guanine–cytosine content, and occurrence of unique gene sequences). Initially, 91 isolates were sequenced, and of these isolates, 89 were assembled. Furthermore, 16 isolates were excluded due to larger than expected genomic sizes (>2.3 bp × 1,000 bp). In the final analysis, 73 isolates with complete WGS and minimum inhibitory concentration records were used; these were part of the previous phenotypic AMR study, representing 18 dairy herds from the Maritime region of Canada (1). A total of 23 unique AMR gene sequences were found in the bacterial genomes, with a mean number of 8.1 (minimum: 5; maximum: 13) per genome. Overall, there were 10 AMR genes [ANT(6), TEM-127, TEM-163, TEM-89, TEM-95, Linb, Lnub, Ermb, Ermc, and TetS] present only in S. uberis genomes and 2 genes (EF-TU and TEM-71) unique to the S. dysgalactiae genomes; 11 AMR genes [APH(3′), TEM-1, TEM-136, TEM-157, TEM-47, TetM, bl2b, gyrA, parE, phoP, and rpoB] were found in both bacterial species. Two-way tabulations showed association between the phenotypic susceptibility to lincosamides and the presence of linB (P = 0.002) and lnuB (P < 0.001) genes and between the presence of tetM (P = 0.015) and tetS (P = 0.064) genes and phenotypic resistance to tetracyclines only for the S. uberis isolates. The logistic model showed that the odds of resistance (to any of the phenotypically tested antimicrobials) were 4.35 times higher when there were >11 AMR genes present in the genome, compared with <7 AMR genes (P < 0.001). The odds of resistance were lower for S

  19. Rho GTPase expression in human myeloid cells.

    Suzanne F G van Helden

    Myeloid cells are critical for innate immunity and the initiation of adaptive immunity. Strict regulation of the adhesive and migratory behavior is essential for proper functioning of these cells. Rho GTPases are important regulators of adhesion and migration; however, it is unknown which Rho GTPases are expressed in different myeloid cells. Here, we use a qPCR-based approach to investigate Rho GTPase expression in myeloid cells. We found that the mRNAs encoding Cdc42, RhoQ, Rac1, Rac2, RhoA and RhoC are the most abundant. In addition, RhoG, RhoB, RhoF and RhoV are expressed at low levels or only in specific cell types. More differentiated cells along the monocyte lineage display lower levels of Cdc42 and RhoV, while RhoC mRNA is more abundant. In addition, the Rho GTPase expression profile changes during dendritic cell maturation with Rac1 being upregulated and Rac2 downregulated. Finally, GM-CSF stimulation, during macrophage and osteoclast differentiation, leads to high expression of Rac2, while M-CSF induces high levels of RhoA, showing that these cytokines induce a distinct pattern. Our data uncover cell type specific modulation of the Rho GTPase expression profile in hematopoietic stem cells and in more differentiated cells of the myeloid lineage.

  20. Rapid and Easy In Silico Serotyping of Escherichia coli Isolates by Use of Whole-Genome Sequencing Data

    Joensen, Katrine Grimstrup; Tetzschner, Anna M. M.; Iguchi, Atsushi

    2015-01-01

    typing and surveillance. The aim of this study was to establish a valid and publicly available tool for WGS-based in silico serotyping of E. coli applicable for routine typing and surveillance. A FASTA database of specific O-antigen processing system genes for O typing and flagellin genes for H typing...... tool. SerotypeFinder was evaluated on 682 E. coli genomes, 108 of which were sequenced for this study, where both the whole genome and the serotype were available. In total, 601 and 509 isolates were included for O and H typing, respectively. The O-antigen genes wzx, wzy, wzm, and wzt and the flagellin...

  1. Bacterial whole genome-based phylogeny: construction of a new benchmarking dataset and assessment of some existing methods.

    Ahrenfeldt, Johanne; Skaarup, Carina; Hasman, Henrik; Pedersen, Anders Gorm; Aarestrup, Frank Møller; Lund, Ole

    2017-01-05

    Whole genome sequencing (WGS) is increasingly used in diagnostics and surveillance of infectious diseases. A major application for WGS is to use the data for identifying outbreak clusters, and there is therefore a need for methods that can accurately and efficiently infer phylogenies from sequencing reads. In the present study we describe a new dataset that we have created for the purpose of benchmarking such WGS-based methods for epidemiological data, and also present an analysis where we use the data to compare the performance of some current methods. Our aim was to create a benchmark data set that mimics sequencing data of the sort that might be collected during an outbreak of an infectious disease. This was achieved by letting an E. coli hypermutator strain grow in the lab for 8 consecutive days, each day splitting the culture in two while also collecting samples for sequencing. The result is a data set consisting of 101 whole genome sequences with known phylogenetic relationship. Among the sequenced samples 51 correspond to internal nodes in the phylogeny because they are ancestral, while the remaining 50 correspond to leaves. We also used the newly created data set to compare three different methods, available online, that infer phylogenies from whole-genome sequencing reads: NDtree, CSI Phylogeny and REALPHY. One complication when comparing the output of these methods with the known phylogeny is that phylogenetic methods typically build trees where all observed sequences are placed as leaves, even though some of them are in fact ancestral. We therefore devised a method for post processing the inferred trees by collapsing short branches (thus relocating some leaves to internal nodes), and also present two new measures of tree similarity that take into account the identity of both internal and leaf nodes. Based on this analysis we find that, among the investigated methods, CSI Phylogeny had the best performance, correctly identifying 73% of all branches in the
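
    The post-processing step described above, collapsing short branches so that some leaves are relocated to internal nodes, can be illustrated with a small recursive function. This is only a sketch under stated assumptions: the dictionary-based tree, the `make_node` helper, the `min_length` threshold, and the rule of moving a short leaf's sample names onto its parent are hypothetical choices, not the authors' published procedure.

```python
def make_node(length, names=None, children=None):
    """Build a tree node: branch length to parent, sample names, child nodes."""
    return {
        "length": length,
        "names": list(names or []),
        "children": list(children or []),
    }


def collapse_short_leaves(node, min_length):
    """Return a copy of the tree with short leaf branches collapsed.

    Children are processed first. Any child that is then a leaf with a branch
    shorter than min_length is removed and its sample names are attached to
    the parent, which marks those samples as sitting on an internal node.
    """
    names = list(node["names"])
    kept_children = []
    for child in node["children"]:
        collapsed = collapse_short_leaves(child, min_length)
        is_leaf = not collapsed["children"]
        if is_leaf and collapsed["length"] < min_length:
            names.extend(collapsed["names"])
        else:
            kept_children.append(collapsed)
    return make_node(node["length"], names, kept_children)


# Hypothetical inferred tree: samples A and C sit on zero-length branches.
inferred = make_node(0.0, children=[
    make_node(0.0, names=["A"]),
    make_node(5.0, children=[
        make_node(3.0, names=["B"]),
        make_node(0.0, names=["C"]),
    ]),
])
result = collapse_short_leaves(inferred, min_length=0.5)
print(result)
```

    In this example, samples A and C end up labelling the root and the inner node respectively, while B remains a leaf. Note that the collapse cascades: a node that loses all its children and itself has a short branch is merged into its own parent.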

  2. Therapeutics of Ebola hemorrhagic fever: whole-genome transcriptional analysis of successful disease mitigation.

    Yen, Judy Y; Garamszegi, Sara; Geisbert, Joan B; Rubins, Kathleen H; Geisbert, Thomas W; Honko, Anna; Xia, Yu; Connor, John H; Hensley, Lisa E

    2011-11-01

    The mechanisms of Ebola (EBOV) pathogenesis are only partially understood, but the dysregulation of normal host immune responses (including destruction of lymphocytes, increases in circulating cytokine levels, and development of coagulation abnormalities) is thought to play a major role. Accumulating evidence suggests that much of the observed pathology is not the direct result of virus-induced structural damage but rather is due to the release of soluble immune mediators from EBOV-infected cells. It is therefore essential to understand how the candidate therapeutic may be interrupting the disease process and/or targeting the infectious agent. To identify genetic signatures that are correlates of protection, we used a DNA microarray-based approach to compare the host genome-wide responses of EBOV-infected nonhuman primates (NHPs) responding to candidate therapeutics. We observed that, although the overall circulating immune response was similar in the presence and absence of coagulation inhibitors, surviving NHPs clustered together. Noticeable differences in coagulation-associated genes appeared to correlate with survival, which revealed a subset of distinctly differentially expressed genes, including chemokine ligand 8 (CCL8/MCP-2), that may provide possible targets for early-stage diagnostics or future therapeutics. These analyses will assist us in understanding the pathogenic mechanisms of EBOV infection and in identifying improved therapeutic strategies.

  3. ENCODE whole-genome data in the UCSC genome browser (2011 update).

    Raney, Brian J; Cline, Melissa S; Rosenbloom, Kate R; Dreszer, Timothy R; Learned, Katrina; Barber, Galt P; Meyer, Laurence R; Sloan, Cricket A; Malladi, Venkat S; Roskin, Krishna M; Suh, Bernard B; Hinrichs, Angie S; Clawson, Hiram; Zweig, Ann S; Kirkup, Vanessa; Fujita, Pauline A; Rhead, Brooke; Smith, Kayla E; Pohl, Andy; Kuhn, Robert M; Karolchik, Donna; Haussler, David; Kent, W James

    2011-01-01

    The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access.

  4. Whole-genome analysis of genetic recombination of hepatitis delta virus: molecular domain in delta antigen determining trans-activating efficiency.

    Chao, Mei; Lin, Chia-Chi; Lin, Feng-Ming; Li, Hsin-Pai; Iang, Shan-Bei

    2015-12-01

    Hepatitis delta virus (HDV) is the only animal RNA virus that has an unbranched rod-like genome with ribozyme activity and is replicated by host RNA polymerase. HDV RNA recombination was previously demonstrated in patients and in cultured cells by analysis of a region corresponding to the C terminus of the delta antigen (HDAg), the only viral-encoded protein. Here, a whole-genome recombination map of HDV was constructed using an experimental system in which two HDV-1 sequences were co-transfected into cultured cells and the recombinants were analysed by sequencing of cloned reverse transcription-PCR products. Fifty homologous recombinants with 60 crossovers mapping to 22 junctions were identified from 200 analysed clones. Small HDAg chimeras harbouring a junction newly detected in the recombination map were then constructed. The results further indicated that the genome-replication level of HDV was sensitive to the sixth amino acid within the N-terminal 22 aa of HDAg. Therefore, the recombination map established in this study provided a tool for not only understanding HDV RNA recombination, but also elucidating the related mechanisms, such as molecular elements responsible for the trans-activation levels of the small HDAg.

  5. Use of whole genome deep sequencing to define emerging minority variants in virus envelope genes in herpesvirus treated with novel antimicrobial K21.

    Tweedy, Joshua G; Prusty, Bhupesh K; Gompels, Ursula A

    2017-10-01

    New antivirals are required to prevent rising antimicrobial resistance from replication inhibitors. The aim of this study was to analyse the range of emerging mutations in herpesvirus by whole genome deep sequencing. We tested human herpesvirus 6 treatment with novel antiviral K21, where evidence indicated distinct effects on virus envelope proteins. We treated BACmid cloned virus in order to analyse mechanisms and candidate targets for resistance. Illumina based next generation sequencing technology enabled analyses of mutations in 85 genes to depths of 10,000 per base detecting low prevalent minority variants (<1%). After four passages in tissue culture the untreated virus accumulated mutations in infected cells giving an emerging mixed population (45-73%) of non-synonymous SNPs in six genes including two envelope glycoproteins. Strikingly, treatment with K21 did not accumulate the passage mutations; instead a high frequency mutation was selected in envelope protein gQ2, part of the gH/gL complex essential for herpesvirus infection. This introduced a stop codon encoding a truncation mutation previously observed in increased virion production. There was reduced detection of the glycoprotein complex in infected cells. This supports a novel pathway for K21 targeting virion envelopes distinct from replication inhibition. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  6. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Alexander C Outhred

    Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.

  7. KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses.

    Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong; Jang, Jinho; Jun, JeHoon; Cho, Yun Sung; Kim, Hak-Min; Kim, Hyunho; Kim, Yumi; Chung, OkSung; Kim, Chang Geun; Lee, HyeJin; Kim, Byung Chul; Han, Kyudong; Koh, InSong; Chae, Kyun Shik; Lee, Semin; Edwards, Jeremy S; Bhak, Jong

    2018-04-04

    High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variations.

  8. Whole genome sequencing of Mycobacterium bovis to obtain molecular fingerprints in human and cattle isolates from Baja California, Mexico.

    Sandoval-Azuara, Sarai Estrella; Muñiz-Salazar, Raquel; Perea-Jacobo, Ricardo; Robbe-Austerman, Suelee; Perera-Ortiz, Alejandro; López-Valencia, Gilberto; Bravo, Doris M; Sanchez-Flores, Alejandro; Miranda-Guzmán, Daniela; Flores-López, Carlos Alberto; Zenteno-Cuevas, Roberto; Laniado-Laborín, Rafael; de la Cruz, Fabiola Lafarga; Stuber, Tod P

    2017-10-01

    To determine genetic diversity by comparing the whole genome sequences of cattle and human Mycobacterium bovis isolates from Baja California. A whole genome sequencing strategy was used to obtain the molecular fingerprints of 172 isolates of M. bovis obtained from Baja California, Mexico; 155 isolates were from cattle and 17 isolates were from humans. Spoligotypes were characterized in silico and single nucleotide polymorphism (SNP) differences between the isolates were evaluated. A total of 12 M. bovis spoligotype patterns were identified in cattle and humans. Two predominant spoligotype patterns were seen in both cattle and humans: SB0145 and SB1040. The SB0145 spoligotype represented 59% of cattle isolates (n=91) and 65% of human isolates (n=11), while the SB1040 spoligotype represented 30% of cattle isolates (n=47) and 30% of human isolates (n=5). When evaluating SNP differences, the human isolates were intimately intertwined with the cattle isolates. All isolates from humans had spoligotype patterns that matched those observed in the cattle isolates, and all human isolates shared common ancestors with cattle in Baja California based on SNP analysis. This suggests that most human tuberculosis caused by M. bovis in Baja California is derived from M. bovis circulating in Baja California cattle. These results reinforce the importance of bovine tuberculosis surveillance and control in this region. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  9. Comparing whole-genome sequencing with Sanger sequencing for spa typing of methicillin-resistant Staphylococcus aureus.

    Bartels, Mette Damkjær; Petersen, Andreas; Worning, Peder; Nielsen, Jesper Boye; Larner-Svensson, Hanna; Johansen, Helle Krogh; Andersen, Leif Percival; Jarløv, Jens Otto; Boye, Kit; Larsen, Anders Rhod; Westh, Henrik

    2014-12-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most cases due to the lack of 24-bp repeats in the whole-genome-sequenced isolates. These related but incorrect spa types should have no consequence in outbreak investigations, since all epidemiologically linked isolates, regardless of spa type, will be included in the single nucleotide polymorphism (SNP) analysis. This will reveal the close relatedness of the spa types. In conclusion, our data show that WGS is a reliable method to determine the spa type of MRSA. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  10. Global gene expression analysis of peripheral blood mononuclear cells in rhesus monkey infants with CA16 infection-induced HFMD.

    Song, Jie; Hu, Yajie; Hu, Yunguang; Wang, Jingjing; Zhang, Xiaolong; Wang, Lichun; Guo, Lei; Wang, Yancui; Ning, Ruotong; Liao, Yun; Zhang, Ying; Zheng, Huiwen; Shi, Haijing; He, Zhanlong; Li, Qihan; Liu, Longding

    2016-03-02

    Coxsackievirus A16 (CA16) is a dominant pathogen that results in hand, foot, and mouth disease and causes outbreaks worldwide, particularly in the Asia-Pacific region. However, the underlying molecular mechanisms remain unclear. Our previous study demonstrated that the basic CA16 pathogenic process was successfully mimicked in rhesus monkey infants. The present study focused on the global gene expression changes in peripheral blood mononuclear cells of rhesus monkey infants with hand, foot, and mouth disease induced by CA16 infection at different time points. Genome-wide expression analysis was performed with Agilent whole-genome microarrays and established bioinformatics tools. Nine hundred and forty-eight significantly differentially expressed genes were identified; they were associated with 5 gene ontology categories, including cell communication, cell cycle, immune system process, regulation of transcription and metabolic process. Subsequently, the mapping of genes related to the immune system process by PANTHER pathway analysis revealed the predominance of inflammation mediated by chemokine and cytokine signaling pathways and the interleukin signaling pathway. Ultimately, co-expressed genes and their networks were analyzed. The results revealed the gene expression profile of the immune system in response to CA16 in rhesus monkey infants and suggested that such an immune response was generated as a result of the positive mobilization of the immune system. This initial microarray study will provide insights into the molecular mechanism of CA16 infection and will facilitate the identification of biomarkers for the evaluation of vaccines against this virus. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. Foxp3 expression in human cancer cells

    Gourgoulianis Konstantinos I

    2008-04-01

    Objective: Transcription factor forkhead box protein 3 (Foxp3) specifically characterizes the thymically derived naturally occurring regulatory T cells (Tregs). Limited evidence indicates that it is also expressed, albeit to a lesser extent, in tissues other than thymus and spleen, while, very recently, it was shown that Foxp3 is expressed by pancreatic carcinoma. This study was designed to investigate whether expression of Foxp3 transcripts and mature protein occurs constitutively in various tumor types. Materials and methods: Twenty-five tumor cell lines of different tissue origins (lung cancer, colon cancer, breast cancer, melanoma, erythroid leukemia, acute T-cell leukemia) were studied. Detection of Foxp3 mRNA was performed using both conventional RT-PCR and quantitative real-time PCR, while protein expression was assessed by immunocytochemistry and flow cytometry, using different antibody clones. Results: Foxp3 mRNA as well as Foxp3 protein was detected in all tumor cell lines, albeit at variable levels, not related to the tissue of origin. This expression correlated with the expression levels of IL-10 and TGFb1. Conclusion: We offer evidence that Foxp3 expression characterizes tumor cells of various tissue origins. The biological significance of these findings warrants further investigation in the context of tumor immune escape, especially in light of current anti-cancer efforts interfering with Foxp3 expression.

  12. Decorin expression in quiescent myogenic cells

    Nishimura, Takanori; Nozu, Kenjiro; Kishioka, Yasuhiro; Wakamatsu, Jun-ichi; Hattori, Akihito

    2008-01-01

    Satellite cells are quiescent muscle stem cells that promote postnatal muscle growth and repair. When satellite cells are activated by myotrauma, they proliferate, migrate, differentiate, and ultimately fuse to existing myofibers. The remainder of these cells do not differentiate, but instead return to quiescence and remain in a quiescent state until activation begins the process again. This ability to maintain their own population is important for skeletal muscle to maintain the capability to repair during postnatal life. However, the mechanisms by which satellite cells return to quiescence and maintain the quiescent state are still unclear. Here, we demonstrated that decorin mRNA expression was high in cell cultures containing a higher ratio of quiescent satellite cells when satellite cells were stimulated with various concentrations of hepatocyte growth factor. This result suggests that quiescent satellite cells express decorin at a high level compared to activated satellite cells. Furthermore, we examined the expression of decorin in reserve cells, which were undifferentiated myoblasts remaining after induction of differentiation by serum deprivation. Decorin mRNA levels in reserve cells were higher than those in differentiated myotubes and growing myoblasts. These results suggest that decorin participates in the quiescence of myogenic cells.

  13. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  14. A whole genome screening and RNA interference identify a juvenile hormone esterase-like gene of the diamondback moth, Plutella xylostella.

    Gu, Xiaojun; Kumar, Sunil; Kim, Eunjin; Kim, Yonggyun

    2015-09-01

    Juvenile hormone (JH) plays a crucial role in preventing precocious metamorphosis and stimulating reproduction. Thus, its hemolymph titer should be under tight control. As a negative controller, juvenile hormone esterase (JHE) rapidly breaks down residual JH in the hemolymph during the last instar to induce larval-to-pupal metamorphosis. The whole genome of the diamondback moth (DBM), Plutella xylostella, has been annotated, and 11 JHE candidates have been proposed. Sequence analysis using conserved motifs commonly found in other JHEs proposed a putative JHE (Px004817). Px004817 (64.61 kDa, pI=5.28) exhibited a characteristic JHE expression pattern, showing a high peak at the early last instar, at which JHE enzyme activity was also at a maximal level. RNA interference of Px004817 reduced JHE activity and interrupted pupal development with a significant increase of the larval period. This study identifies Px004817 as a JHE-like gene of P. xylostella. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Inspecting Targeted Deep Sequencing of Whole Genome Amplified DNA Versus Fresh DNA for Somatic Mutation Detection: A Genetic Study in Myelodysplastic Syndrome Patients.

    Palomo, Laura; Fuster-Tormo, Francisco; Alvira, Daniel; Ademà, Vera; Armengol, María Pilar; Gómez-Marzo, Paula; de Haro, Nuri; Mallo, Mar; Xicoy, Blanca; Zamora, Lurdes; Solé, Francesc

    2017-08-01

    Whole genome amplification (WGA) has become an invaluable method for preserving limited samples of precious stock material and has been used during the past years as an alternative tool to increase the amount of DNA before library preparation for next-generation sequencing. Myelodysplastic syndromes (MDS) are a group of clonal hematopoietic stem cell disorders characterized by presenting somatic mutations in several myeloid-related genes. In this work, targeted deep sequencing has been performed on four paired fresh DNA and WGA DNA samples from bone marrow of MDS patients, to assess the feasibility of using WGA DNA for detecting somatic mutations. The results of this study highlighted that, in general, the sequencing and alignment statistics of fresh DNA and WGA DNA samples were similar. However, after variant calling and when considering variants detected at all frequencies, there was a high level of discordance between fresh DNA and WGA DNA (overall, a higher number of variants was detected in WGA DNA). After proper filtering, a total of three somatic mutations were detected in the cohort. All somatic mutations detected in fresh DNA were also identified in WGA DNA and validated by whole exome sequencing.

  16. Whole-genome sequence analysis of the Mycobacterium avium complex and proposal of the transfer of Mycobacterium yongonense to Mycobacterium intracellulare subsp. yongonense subsp. nov.

    Castejon, Maria; Menéndez, Maria Carmen; Comas, Iñaki; Vicente, Ana; Garcia, Maria J

    2018-06-01

    Bacterial whole-genome sequences contain informative features of their evolutionary pathways. Comparison of whole-genome sequences has become the method of choice for classification of prokaryotes, thus allowing the identification of bacteria from an evolutionary perspective, and providing data to resolve some current controversies. Currently, controversy exists about the assignment of members of the Mycobacterium avium complex, as in the cases of Mycobacterium yongonense and 'Mycobacterium indicus pranii'. These two mycobacteria, closely related to Mycobacterium intracellulare on the basis of standard phenotypic and single-gene sequence comparisons, were not considered members of that species on the basis of some particular differences displayed by a single strain. Whole-genome sequence comparison procedures, namely the average nucleotide identity and the genome distance, showed that those two mycobacteria should be considered members of the species M. intracellulare. The results were confirmed with other supplementary whole-genome comparison methods. According to the data provided, Mycobacterium yongonense and 'Mycobacterium indicus pranii' should be renamed and included as members of M. intracellulare. This study highlights the problems caused when a novel species is accepted on the basis of a single strain, as was the case for M. yongonense. Based mainly on whole-genome sequence analysis, we conclude that M. yongonense should be reclassified as a subspecies of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. yongonense, and that 'Mycobacterium indicus pranii' should be classified in the same subspecies as the type strain of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. intracellulare.

  17. Development and validation of concurrent preimplantation genetic diagnosis for single gene disorders and comprehensive chromosomal aneuploidy screening without whole genome amplification.

    Zimmerman, Rebekah S; Jalas, Chaim; Tao, Xin; Fedick, Anastasia M; Kim, Julia G; Pepe, Russell J; Northrop, Lesley E; Scott, Richard T; Treff, Nathan R

    2016-02-01

    Objective: To develop a novel and robust protocol for multifactorial preimplantation genetic testing of trophectoderm biopsies using quantitative polymerase chain reaction (qPCR). Design: Prospective and blinded. Setting: Not applicable. Patients: Couples indicated for preimplantation genetic diagnosis (PGD). Interventions: None. Main outcome measures: Allele dropout (ADO) and failed amplification rate, genotyping consistency, chromosome screening success rate, and clinical outcomes of qPCR-based screening. Results: The ADO frequency on a single cell from a fibroblast cell line was 1.64% (18/1,096). When two or more cells were tested, the ADO frequency dropped to 0.02% (1/4,426). The rate of amplification failure was 1.38% (55/4,000) overall, with 2.5% (20/800) for single cells and 1.09% (35/3,200) for samples that had two or more cells. Among 152 embryos tested in 17 cases by qPCR-based PGD and CCS, 100% were successfully given a diagnosis, with 0% ADO or amplification failure. Genotyping consistency with reference laboratory results was >99%. Another 304 embryos from 43 cases were included in the clinical application of qPCR-based PGD and CCS, for which 99.7% (303/304) of the embryos were given a definitive diagnosis, with only 0.3% (1/304) having an inconclusive result owing to recombination. In patients receiving a transfer with follow-up, the pregnancy rate was 82% (27/33). Conclusions: This study demonstrates that the use of qPCR for PGD testing delivers consistent and more reliable results than existing methods and that single gene disorder PGD can be run concurrently with CCS without the need for additional embryo biopsy or whole genome amplification. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
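
    The percentages quoted in this abstract follow directly from the counts given in parentheses. As a small worked check, the sketch below recomputes them; the `percent` helper and the dictionary labels are illustrative names, while the counts are copied from the abstract.

```python
def percent(count, total):
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total


# (count, total) pairs as quoted in the abstract above.
reported = {
    "ADO, single cell": (18, 1096),
    "ADO, two or more cells": (1, 4426),
    "amplification failure, overall": (55, 4000),
    "amplification failure, single cells": (20, 800),
    "amplification failure, two or more cells": (35, 3200),
    "definitive diagnosis, clinical series": (303, 304),
    "pregnancy rate after transfer": (27, 33),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count}/{total} = {percent(count, total):.2f}%")
```

    Note that the overall amplification-failure count is the sum of the two subgroups (20 + 35 = 55 out of 800 + 3,200 = 4,000 samples).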

  18. Integrative genome-wide gene expression profiling of clear cell renal cell carcinoma in Czech Republic and in the United States.

    Magdalena B Wozniak

    Gene expression microarray and next generation sequencing efforts on conventional, clear cell renal cell carcinoma (ccRCC) have been mostly performed in North American and Western European populations, while the highest incidence rates are found in Central/Eastern Europe. We conducted whole-genome expression profiling on 101 pairs of ccRCC tumours and adjacent non-tumour renal tissue from Czech patients recruited within the "K2 Study", using the Illumina HumanHT-12 v4 Expression BeadChips to explore the molecular variations underlying the biological and clinical heterogeneity of this cancer. Differential expression analysis identified 1650 significant probes (fold change ≥2 and false discovery rate <0.05) mapping to 630 up- and 720 down-regulated unique genes. We performed similar statistical analysis on the RNA sequencing data of 65 ccRCC cases from the Cancer Genome Atlas (TCGA) project and identified 60% (402) of the downregulated and 74% (469) of the upregulated genes found in the K2 series. The biological characterization of the significantly deregulated genes demonstrated involvement of downregulated genes in metabolic and catabolic processes, excretion, oxidation reduction, ion transport and response to chemical stimulus, while simultaneously upregulated genes were associated with immune and inflammatory responses, response to hypoxia, stress, wounding, vasculature development and cell activation. Furthermore, genome-wide DNA methylation analysis of 317 TCGA ccRCC/adjacent non-tumour renal tissue pairs indicated that deregulation of approximately 7% of genes could be explained by epigenetic changes. Finally, survival analysis conducted on 89 K2 and 464 TCGA cases identified 8 genes associated with differential prognostic outcomes. In conclusion, a large proportion of ccRCC molecular characteristics were common to the two populations and several may have clinical implications when validated further through large clinical cohorts.
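
    The selection rule quoted in this abstract (fold change of at least 2 and false discovery rate below 0.05) can be sketched as follows. The abstract does not say how the false discovery rate was estimated; this sketch assumes the Benjamini-Hochberg adjustment, and the function names, probe names and values are made up purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_probes(probes, min_fold=2.0, max_fdr=0.05):
    """Split probes into up- and down-regulated lists.

    `probes` is a list of (name, fold_change, p_value) tuples, where
    fold_change is the tumour / non-tumour expression ratio (assumed layout).
    """
    qvalues = benjamini_hochberg([p for _, _, p in probes])
    up, down = [], []
    for (name, fold, _), q in zip(probes, qvalues):
        if q >= max_fdr:
            continue                      # not significant after adjustment
        if fold >= min_fold:
            up.append(name)               # at least 2-fold higher in tumour
        elif fold <= 1.0 / min_fold:
            down.append(name)             # at least 2-fold lower in tumour
    return up, down


# Hypothetical probes, not data from the study.
example = [
    ("probe_a", 4.0, 0.001),
    ("probe_b", 0.25, 0.002),
    ("probe_c", 1.2, 0.001),
    ("probe_d", 3.0, 0.4),
]
up, down = select_probes(example)
```

    In this toy input only probe_a and probe_b pass both thresholds: probe_c is significant but changes too little, and probe_d changes enough but is not significant.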

  19. Different responsiveness to a high-fat/cholesterol diet in two inbred mice and underlying genetic factors: a whole genome microarray analysis

    Jin Gang

    2009-10-01

    Background: To investigate different responses to a high-fat/cholesterol diet and uncover their underlying genetic factors between C57BL/6J (B6) and DBA/2J (D2) inbred mice. Methods: B6 and D2 mice were fed a high-fat/cholesterol diet for a series of time-points. Serum and bile lipid profiles, bile acid yields, hepatic apoptosis, gallstones and atherosclerosis formation were measured. Furthermore, a whole genome microarray was performed to screen hepatic gene expression profiles. Quantitative real-time PCR, western blot and TUNEL assay were conducted to validate the microarray data. Results: After being fed the high-fat/cholesterol diet, serum and bile total cholesterol, serum cholesterol esters, HDL cholesterol and non-HDL cholesterol levels were altered in B6 but not significantly changed in D2; meanwhile, biliary bile acid was decreased in B6 but increased in D2. At the same time, hepatic apoptosis, gallstones and atherosclerotic lesions occurred in B6 but not in D2. The hepatic microarray analysis revealed distinctly different gene expression patterns between B6 and D2 mice. Their functional pathway groups included lipid metabolism, oxidative stress, immune/inflammation response and apoptosis. Quantitative real-time PCR, TUNEL assay and western blot results were consistent with the microarray analysis. Conclusion: Different gene expression patterns between B6 and D2 mice might provide a genetic basis for their distinctive responses to a high-fat/cholesterol diet, and give us an opportunity to identify novel pharmaceutical targets in related diseases in the future.

  20. Whole genome sequencing of multidrug-resistant Salmonella enterica serovar Typhimurium isolated from humans and poultry in Burkina Faso.

    Kagambèga, Assèta; Lienemann, Taru; Frye, Jonathan G; Barro, Nicolas; Haukka, Kaisa

    2018-01-01

    Multidrug-resistant Salmonella is an important cause of morbidity and mortality in developing countries. The aim of this study was to characterize and compare multidrug-resistant Salmonella enterica serovar Typhimurium isolates from patients and poultry feces. Salmonella strains were isolated from poultry and patients using standard bacteriological methods described in previous studies. The strains were serotyped according to the Kauffmann-White scheme and tested for antibiotic susceptibility to 12 different antimicrobial agents using the disk diffusion method. The whole genome of the S. Typhimurium isolates was analyzed using Illumina technology and compared with 20 isolates of S. Typhimurium for which the ST has been deposited in a global MLST database. The ResFinder Web server was used to find the antibiotic resistance genes from whole genome sequencing (WGS) data. For comparative genomics, publicly available complete and draft genomes of different S. Typhimurium laboratory-adapted strains were downloaded from GenBank. All the tested Salmonella serotype Typhimurium isolates were multiresistant to five commonly used antibiotics (ampicillin, chloramphenicol, streptomycin, sulfonamide, and trimethoprim). The multilocus sequence type ST313 was detected in all the strains. Our sequences were very similar to S. Typhimurium ST313 strain D23580 isolated from a patient with invasive non-typhoid Salmonella (NTS) infection in Malawi, also located in sub-Saharan Africa. The use of the ResFinder web server on the whole genome of the strains showed resistance to aminoglycosides associated with carriage of the following resistance genes: strA, strB, and aadA1; resistance to β-lactams associated with carriage of the blaTEM-1B gene; resistance to phenicols associated with carriage of the catA1 gene; resistance to sulfonamide associated with carriage of the sul1 and sul2 genes; resistance to tetracycline associated with carriage of the tet(B) gene; and resistance to trimethoprim associated with the dfrA1 gene.

  1. Whole-genome sequencing of Bacillus velezensis LS69, a strain with a broad inhibitory spectrum against pathogenic bacteria.

    Liu, Guoqiang; Kong, Yingying; Fan, Yajing; Geng, Ce; Peng, Donghai; Sun, Ming

    2017-05-10

    Bacillus velezensis LS69 was found to exhibit antagonistic activity against a diverse spectrum of pathogenic bacteria. It has one circular chromosome of 3,917,761 bp with 3,643 open reading frames. Genome analysis identified ten gene clusters involved in nonribosomal synthesis of polyketides (macrolactin, bacillaene and difficidin), lipopeptides (surfactin, fengycin, bacilysin and iturin A) and bacteriocins (amylolysin and amylocyclicin). In addition, B. velezensis LS69 was found to contain a series of genes involved in enhancing plant growth and triggering plant immunity. Whole genome sequencing of Bacillus velezensis LS69 will provide a basis for elucidation of its biocontrol mechanisms and facilitate its applications in the future.

  2. Whole-Genome Characterization and Strain Comparison of VT2f-Producing Escherichia coli Causing Hemolytic Uremic Syndrome

    Michelacci, Valeria; Bondì, Roslen; Gigliucci, Federica; Franz, Eelco; Badouei, Mahdi Askari; Schlager, Sabine; Minelli, Fabio; Tozzoli, Rosangela; Caprioli, Alfredo; Morabito, Stefano

    2016-01-01

    Verotoxigenic Escherichia coli infections in humans cause disease ranging from uncomplicated intestinal illnesses to bloody diarrhea and systemic sequelae, such as hemolytic uremic syndrome (HUS). Previous research indicated that pigeons may be a reservoir for a population of verotoxigenic E. coli producing the VT2f variant. We used whole-genome sequencing to characterize a set of VT2f-producing E. coli strains from human patients with diarrhea or HUS and from healthy pigeons. We describe a phage conveying the vtx2f genes and provide evidence that the strains causing milder diarrheal disease may be transmitted to humans from pigeons. The strains causing HUS could derive from VT2f phage acquisition by E. coli strains with a virulence gene repertoire resembling that of typical HUS-associated verotoxigenic E. coli. PMID:27584691

  3. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare genetic variants...... of developing SZ. However, these studies are designed to examine only “the common variant” proportion of the genomic landscape of SZ. Due to increased genetic drift during founding and potential bottlenecks, followed by population expansion, isolated populations may be particularly useful in identifying rare...... disease variants that may appear at higher frequencies and/or within a more clearly distinct haplotype structure compared to outbred populations. Small isolated populations also typically show reduced phenotypic, genetic and environmental heterogeneity, thus making them advantageous in studies aiming...

  4. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City.

  5. Processes Underlying Rabies Virus Incursions across US-Canada Border as Revealed by Whole-Genome Phylogeography.

    Trewby, Hannah; Nadin-Davis, Susan A; Real, Leslie A; Biek, Roman

    2017-09-01

    Disease control programs aim to constrain and reduce the spread of infection. Human disease interventions such as wildlife vaccination play a major role in determining the limits of a pathogen's spatial distribution. Over the past few decades, a raccoon-specific variant of rabies virus (RRV) has invaded large areas of eastern North America. Although expansion into Canada has been largely prevented through vaccination along the US border, several outbreaks have occurred in Canada. Applying phylogeographic approaches to 289 RRV whole-genome sequences derived from isolates collected in Canada and adjacent US states, we examined the processes underlying these outbreaks. RRV incursions were attributable predominantly to systematic virus leakage of local strains across areas along the border where vaccination has been conducted but also to single stochastic events such as long-distance translocations. These results demonstrate the utility of phylogeographic analysis of pathogen genomes for understanding transboundary outbreaks.

  6. Whole genome sequence to decipher the resistome of Shewanella algae, a multidrug-resistant bacterium responsible for pneumonia, Marseille, France.

    Cimmino, Teresa; Olaitan, Abiola Olumuyiwa; Rolain, Jean-Marc

    2016-01-01

    We characterize and decipher the resistome and the virulence factors of Shewanella algae MARS 14, a multidrug-resistant clinical strain, using the whole genome sequencing (WGS) strategy. The bacterium was isolated from the bronchoalveolar lavage of a patient hospitalized in the Timone Hospital in Marseille, France, who developed pneumonia after plunging into the Mediterranean Sea. The genome size of S. algae MARS 14 was 5,005,710 bp with 52.8% guanine-cytosine (GC) content. The resistome includes members of class C and D beta-lactamases and numerous multidrug-efflux pumps. We also found several hemolysin genes, a complete flagellum system gene cluster and genes responsible for biofilm formation. Moreover, we report for the first time in a clinical strain of Shewanella spp. the presence of a bacteriocin (marinocin). The WGS analysis of this pathogen provides insight into its virulence factors and resistance to antibiotics.
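
    The guanine-cytosine content quoted above is the share of G and C bases among all called bases. A minimal Python sketch of that calculation follows; the function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not the authors' pipeline.

```python
def gc_content(sequence):
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are left out of the denominator
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total
```

    For a multi-contig assembly, the same function can be applied to the concatenation of all contigs so that longer contigs carry proportionally more weight.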

  7. Using Growing Self-Organising Maps to Improve the Binning Process in Environmental Whole-Genome Shotgun Sequencing

    Chan, Chon-Kit Kenneth; Hsu, Arthur L.; Tang, Sen-Lin; Halgamuge, Saman K.

    2008-01-01

    Metagenomic projects using whole-genome shotgun (WGS) sequencing produce many unassembled DNA sequences and small contigs. The step of clustering these sequences, based on biological and molecular features, is called binning. A reported strategy for binning that combines oligonucleotide frequency and self-organising maps (SOM) shows high potential. We improve this strategy by identifying suitable training features, implementing a better clustering algorithm, and defining quantitative measures for assessing results. We investigated the suitability of each of di-, tri-, tetra-, and pentanucleotide frequencies. The results show that dinucleotide frequency is not a sufficiently strong signature for binning 10 kb long DNA sequences, compared to the other three. Furthermore, we observed that increasing the order of oligonucleotide frequency may worsen the assignment result in some cases, which indicates the possible existence of an optimal species-specific oligonucleotide frequency. We replaced SOM with growing self-organising map (GSOM), which gives comparable results while gaining a 7%–15% speed improvement. PMID:18288261
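
    The oligonucleotide-frequency features used for binning can be sketched as a normalized k-mer count vector, with k = 2 to 5 corresponding to the di- to pentanucleotide frequencies compared in the abstract. The Python function below is a hedged illustration: its name, the lexicographic ordering of k-mers, and the decision to skip windows containing ambiguous bases are assumptions, and the abstract does not say whether reverse-complement k-mers were merged, so they are kept separate here.

```python
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return the normalized k-mer frequency vector in lexicographic k-mer order."""
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of length k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]
```

    Each sequence fragment then becomes a vector of length 4^k (256 for tetranucleotides) that a clustering method such as a SOM or GSOM can take as input.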

  8. Inability of 'Whole Genome Amplification' to Improve Success Rates for the Biomolecular Detection of Tuberculosis in Archaeological Samples.

    Jannine Forst

    We assessed the ability of whole genome amplification (WGA) to improve the efficiency of downstream polymerase chain reactions (PCRs) directed at ancient DNA (aDNA) of members of the Mycobacterium tuberculosis complex (MTBC). Using extracts from a variety of bones and a tooth from human skeletons with or without lesions indicative of tuberculosis, from multiple time periods, we obtained inconsistent results. We conclude that WGA does not provide any advantage in studies of MTBC aDNA. The sporadic nature of our results is probably due to the fact that WGA is itself a PCR-based procedure which, although designed to deal with fragmented DNA, might be inefficient with the low concentration of templates in an aDNA extract. As such, WGA is subject to similar, if not the same, restrictions as PCR when applied to aDNA.

  9. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.

  10. Whole-genome comparison of urinary pathogenic Escherichia coli and faecal isolates of UTI patients and healthy controls

    Nielsen, Karen Leth; Stegger, Marc; Kiil, Kristoffer

    2017-01-01

    The faecal flora is a common reservoir for urinary tract infection (UTI), and Escherichia coli (E. coli) is frequently found in this reservoir without causing extraintestinal infection. We investigated these E. coli reservoirs by whole-genome sequencing a large collection of E. coli from healthy...... controls (faecal), who had never previously had UTI, and from UTI patients (faecal and urinary) sampled from the same geographical area. We compared MLST types, phylogenetic relationship, accessory genome content and FimH type between patient and control faecal isolates as well as between UTI and faecal......-only isolates, respectively. Comparison of the accessory genome of UTI isolates to faecal isolates revealed 35 gene families which were significantly more prevalent in the UTI isolates compared to the faecal isolates, although none of these were unique to one of the two groups. Of these 35, 22 belonged...

  11. Whole-genome SNP association in the horse: identification of a deletion in myosin Va responsible for Lavender Foal Syndrome.

    Samantha A Brooks

    2010-04-01

    Lavender Foal Syndrome (LFS) is a lethal inherited disease of horses with a suspected autosomal recessive mode of inheritance. LFS has been primarily diagnosed in a subgroup of the Arabian breed, the Egyptian Arabian horse. The condition is characterized by multiple neurological abnormalities and a dilute coat color. Candidate genes based on comparative phenotypes in mice and humans include the ras-associated protein RAB27a (RAB27A) and myosin Va (MYO5A). Here we report mapping of the locus responsible for LFS using a small set of 36 horses segregating for LFS. These horses were genotyped using a newly available single nucleotide polymorphism (SNP) chip containing 56,402 discriminatory elements. The whole genome scan identified an associated region containing these two functional candidate genes. Exon sequencing of the MYO5A gene from an affected foal revealed a single base deletion in exon 30 that changes the reading frame and introduces a premature stop codon. A PCR-based Restriction Fragment Length Polymorphism (PCR-RFLP) assay was designed and used to investigate the frequency of the mutant gene. All affected horses tested were homozygous for this mutation. Heterozygous carriers were detected at high frequency in families segregating for this trait, and the frequency of carriers in unrelated Egyptian Arabians was 10.3%. The mapping and discovery of the LFS mutation represent the first successful use of whole-genome SNP scanning in the horse for any trait. The RFLP assay can be used to assist breeders in avoiding carrier-to-carrier matings and thus in preventing the birth of affected foals.
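
    To illustrate why avoiding carrier-to-carrier matings matters, the reported 10.3% carrier frequency can be turned into a rough expected rate of affected foals. The Python sketch below rests on assumptions that are not stated in the abstract: random mating among unrelated Egyptian Arabians, fully recessive inheritance, and no affected animals among the parents (plausible because the disease is lethal). The function name is hypothetical.

```python
def affected_foal_probability(carrier_frequency):
    """Chance that a random mating yields an affected foal under the stated assumptions."""
    # Both parents must be heterozygous carriers.
    both_parents_carriers = carrier_frequency ** 2
    # A carrier-to-carrier mating gives a homozygous mutant foal with probability 1/4.
    return both_parents_carriers * 0.25


# With the 10.3% carrier frequency from the abstract, the result is roughly 0.27%.
risk = affected_foal_probability(0.103)
```

    Under the same assumptions, testing both parents with the RFLP assay and avoiding matings between two carriers would bring this probability to zero.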

  12. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  13. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  14. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    Wei Yee Wee

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7 Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense.

  15. Whole genome sequencing and evolutionary analysis of human respiratory syncytial virus A and B from Milwaukee, WI 1998-2010.

    Cecilia Rebuffo-Scheer

    BACKGROUND: Respiratory Syncytial Virus (RSV) is the leading cause of lower respiratory-tract infections in infants and young children worldwide. Despite this, only six complete genome sequences of original strains have been previously published, the most recent of which dates back 35 and 26 years for RSV group A and group B respectively. METHODOLOGY/PRINCIPAL FINDINGS: We present a semi-automated sequencing method allowing for the sequencing of four RSV whole genomes simultaneously. We were able to sequence the complete coding sequences of 13 RSV A and 4 RSV B strains from Milwaukee collected from 1998-2010. Another 12 RSV A and 5 RSV B strains sequenced in this study cover the majority of the genome. All RSV A and RSV B sequences were analyzed by neighbor-joining, maximum parsimony and Bayesian phylogeny methods. Genetic diversity was high among RSV A viruses in Milwaukee, including the circulation of multiple genotypes (GA1, GA2, GA5, GA7), with GA2 persisting throughout the 13 years of the study. However, RSV B genomes showed little variation, with all belonging to the BA genotype. For RSV A, the same evolutionary patterns and clades were seen consistently across the whole genome, including all intergenic, coding, and non-coding region sequences. CONCLUSIONS/SIGNIFICANCE: The sequencing strategy presented in this work allows for RSV A and B genomes to be sequenced simultaneously in two working days and at a low cost. We have significantly increased the amount of genomic data that is available for both RSV A and B, providing the basic molecular characteristics of RSV strains circulating in Milwaukee over the last 13 years. This information can be used for comparative analysis with strains circulating in other communities around the world, which should also help with the development of new strategies for control of RSV, specifically vaccine development and improvement of RSV diagnostics.

  16. Maximum likelihood phylogenetic reconstruction from high-resolution whole-genome data and a tree of 68 eukaryotes.

    Lin, Yu; Hu, Fei; Tang, Jijun; Moret, Bernard M E

    2013-01-01

    The rapid accumulation of whole-genome data has renewed interest in the study of the evolution of genomic architecture, under such events as rearrangements, duplications, losses. Comparative genomics, evolutionary biology, and cancer research all require tools to elucidate the mechanisms, history, and consequences of those evolutionary events, while phylogenetics could use whole-genome data to enhance its picture of the Tree of Life. Current approaches in the area of phylogenetic analysis are limited to very small collections of closely related genomes using low-resolution data (typically a few hundred syntenic blocks); moreover, these approaches typically do not include duplication and loss events. We describe a maximum likelihood (ML) approach for phylogenetic analysis that takes into account genome rearrangements as well as duplications, insertions, and losses. Our approach can handle high-resolution genomes (with 40,000 or more markers) and can use in the same analysis genomes with very different numbers of markers. Because our approach uses a standard ML reconstruction program (RAxML), it scales up to large trees. We present the results of extensive testing on both simulated and real data showing that our approach returns very accurate results very quickly. In particular, we analyze a dataset of 68 high-resolution eukaryotic genomes, with from 3,000 to 42,000 genes, from the eGOB database; the analysis, including bootstrapping, takes just 3 hours on a desktop system and returns a tree in agreement with all well supported branches, while also suggesting resolutions for some disputed placements.

  17. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense.

  18. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    Phelan, Jody

    2016-01-01

    Background: Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods: To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results: The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drug-binding sites. Conclusions: Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel

  19. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    Phelan, Jody

    2016-03-23

    Background: Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods: To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results: The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drug-binding sites. Conclusions: Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel resistance

  20. CNPase Expression in Olfactory Ensheathing Cells

    Christine Radtke

    2011-01-01

    A large body of work supports the proposal that transplantation of olfactory ensheathing cells (OECs) into nerve or spinal cord injuries can promote axonal regeneration and remyelination. Yet, some investigators have questioned whether the transplanted OECs associate with axons and form peripheral myelin, or if they recruit endogenous Schwann cells that form myelin. Olfactory bulbs from transgenic mice expressing the enhanced green fluorescent protein (eGFP) under the control of the 2′,3′-cyclic nucleotide 3′-phosphodiesterase (CNPase) promoter were studied. CNPase is expressed in myelin-forming cells throughout their lineage. We examined CNPase expression both in situ in the olfactory bulb and in vitro to determine if OECs express CNPase commensurate with their myelination potential. eGFP was observed in the outer nerve layer of the olfactory bulb. Dissociated OECs maintained in culture had both intense eGFP expression and CNPase immunostaining. OECs transplanted into transected peripheral nerve associated longitudinally with the regenerated axons. These data indicate that OECs in the outer nerve layer of the olfactory bulb of CNPase transgenic mice express CNPase. Thus, while OECs do not normally form myelin on olfactory nerve axons, their expression of CNPase is commensurate with their potential to form myelin when transplanted into injured peripheral nerve.

  1. Differential expression of cell adhesion genes

    Stein, Wilfred D; Litman, Thomas; Fojo, Tito

    2005-01-01

    that compare cells grown in suspension to similar cells grown attached to one another as aggregates have suggested that it is adhesion to the extracellular matrix of the basal membrane that confers resistance to apoptosis and, hence, resistance to cytotoxins. The genes whose expression correlates with poor...... in cell adhesion and the cytoskeleton. If the proteins involved in tethering cells to the extracellular matrix are important in conferring drug resistance, it may be possible to improve chemotherapy by designing drugs that target these proteins....

  2. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis and a simpler genotype, with WT tumor protein p53, relatively few SNVs (average of 5.9 per megabase), and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  3. A flexible whole-genome microarray for transcriptomics in three-spine stickleback (Gasterosteus aculeatus)

    Primmer Craig R

    2009-09-01

    Background: The use of microarray technology for describing changes in mRNA expression to address ecological and evolutionary questions is becoming increasingly popular. Since three-spine stickleback are an important ecological and evolutionary model species as well as an emerging model for eco-toxicology, a functional and flexible microarray platform for transcriptome studies will greatly enhance the research potential in these areas. Results: We designed 43,392 unique oligonucleotide probes representing 19,274 genes (93% of the estimated total gene number), and tested the hybridization performance of both DNA and RNA from different populations to determine the efficacy of probe design for transcriptome analysis using the Agilent array platform. The majority of probes were functional as evidenced by the DNA hybridization success, and 30,946 probes (14,615 genes) had a signal that was significantly above background for RNA isolated from liver tissue. Genes identified as being expressed in liver tissue were grouped into functional categories for each of the three Gene Ontology groups: biological process, molecular function, and cellular component. As expected, the highest proportions of functional categories belonged to those associated with metabolic functions: metabolic process, binding, catabolism, and organelles. Conclusion: The probe and microarray design presented here is an important step in facilitating transcriptomics research for this important research organism, providing a set of over 43,000 probes whose hybridization success and specificity to liver expression have been demonstrated. Probes can easily be added or removed from the current design to tailor the array to specific experiments, and additional flexibility lies in the ability to perform either one-color or two-color hybridizations.

  4. Transitional cell carcinoma express vitamin D receptors

    Hermann, G G; Andersen, C B

    1997-01-01

    Recently, vitamin D analogues have shown antineoplastic effect in several diseases. Vitamin D analogues exert their effect by interacting with the vitamin D receptor (VDR). Studies of VDR in transitional cell carcinoma (TCC) have not been reported. The purpose of the present study was therefore.......05). Similarly, tumor grade also appeared to be related to the number of cells expressing the receptor. Normal urothelium also expressed VDR but only with low intensity. Our study shows that TCC cells possess the VDR, which may make them capable of responding to stimulation with vitamin D, but functional...... studies of vitamin D's effect on TCC cells in vitro are necessary before the efficacy of treatment with vitamin D analogues in TCC can be evaluated in patients....

  5. Whole Genome Sequencing and Multiplex qPCR Methods to Identify Campylobacter jejuni Encoding cst-II or cst-III Sialyltransferase

    Jason M. Neal-McKinney

    2018-03-01

    Campylobacter jejuni causes more than 2 million cases of gastroenteritis annually in the United States, and is also linked to the autoimmune sequela Guillain–Barré syndrome (GBS). GBS often results in flaccid paralysis, as the myelin sheaths of nerve cells are degraded by the adaptive immune response. Certain strains of C. jejuni modify their lipooligosaccharide (LOS) with the addition of neuraminic acid, resulting in LOS moieties that are structurally similar to gangliosides present on nerve cells. This can trigger GBS in a susceptible host, as antibodies generated against C. jejuni can cross-react with gangliosides, leading to demyelination of nerves and a loss of signal transduction. The goal of this study was to develop a quantitative PCR (qPCR) method and use whole genome sequencing data to detect the Campylobacter sialyltransferase (cst) genes responsible for the addition of neuraminic acid to LOS. The qPCR method was used to screen a library of 89 C. jejuni field samples collected by the Food and Drug Administration Pacific Northwest Lab (PNL) as well as clinical isolates transferred to PNL. In silico analysis was used to screen 827 C. jejuni genomes in the FDA GenomeTrakr SRA database. The results indicate that a majority of C. jejuni strains could produce LOS with ganglioside mimicry, as 43.8% of PNL isolates and 46.9% of the GenomeTrakr isolates lacked the cst genes. The methods described in this study can be used by public health laboratories to rapidly determine whether a C. jejuni isolate has the potential to induce GBS. Based on these results, a majority of C. jejuni in the PNL collection and submitted to GenomeTrakr have the potential to produce LOS that mimics human gangliosides.

  6. A 10-year follow-up of a child with mild case of xeroderma pigmentosum complementation group D diagnosed by whole-genome sequencing.

    Ono, Ryusuke; Masaki, Taro; Mayca Pozo, Franklin; Nakazawa, Yuka; Swagemakers, Sigrid M A; Nakano, Eiji; Sakai, Wataru; Takeuchi, Seiji; Kanda, Fumio; Ogi, Tomoo; van der Spek, Peter J; Sugasawa, Kaoru; Nishigori, Chikako

    2016-07-01

    Most patients with xeroderma pigmentosum complementation group D (XP-D) from Western countries suffer from neurological symptoms, whereas Japanese patients display only skin manifestations without neurological symptoms. We have previously suggested that these differences in clinical manifestations in XP-D patients are attributed partly to a predominant mutation in ERCC2, and the allele frequency of S541R is highest in Japan. We diagnosed a child with a mild case of XP-D by the evaluation of DNA repair activity and whole-genome sequencing, and followed her for ten years. Skin cancer, mental retardation, and neurological symptoms were not observed. Her minimal erythema dose was 41 mJ/cm², which was slightly lower than that of healthy Japanese volunteers. The patient's cells showed sixfold hypersensitivity to UV in comparison with normal cells. Post-UV unscheduled DNA synthesis was 20.4%, and post-UV recovery of RNA synthesis was 58% of non-irradiated samples, which was lower than that of normal fibroblasts. Genome sequence analysis indicated that the patient harbored a compound heterozygous mutation of c.1621A>C and c.591_594del, resulting in p.S541R and p.Y197* in ERCC2; the patient was therefore diagnosed with XP-D. Y197* has not been described before. Her mild skin manifestations might be attributed to the mutational site on her genome and daily strict sun protection. c.1621A>C might be a founder mutation of ERCC2 among Japanese XP-D patients, as it was identified most frequently in Japanese XP-D patients and it has not been found outside Japan. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Diversification and evolution of the SDG gene family in Brassica rapa after the whole genome triplication.

    Dong, Heng; Liu, Dandan; Han, Tianyu; Zhao, Yuxue; Sun, Ji; Lin, Sue; Cao, Jiashu; Chen, Zhong-Hua; Huang, Li

    2015-11-24

    Histone lysine methylation, controlled by the SET Domain Group (SDG) gene family, is part of the histone code that regulates chromatin function and epigenetic control of gene expression. Analyzing the SDG gene family in Brassica rapa for their gene structure, domain architecture, subcellular localization, rate of molecular evolution and gene expression pattern revealed common occurrences of subfunctionalization and neofunctionalization in BrSDGs. In comparison with Arabidopsis thaliana, the BrSDG gene family was found to be more divergent than AtSDGs, which might partly explain the rich variety of morphotypes in B. rapa. In addition, a new evolutionary pattern of the four main groups of SDGs was presented, in which the Trx group and the SUVR subgroup evolved faster than the E(z), Ash groups and the SUVH subgroup. These differences in evolutionary rate among the four main groups of SDGs are perhaps due to the complexity and variability of the regions that bind with biomacromolecules, which guide SDGs to their target loci.

  8. Investigation of archived formalin-fixed paraffin-embedded pancreatic tissue with whole-genome gene expression microarray

    Michelsen, Nete Vinstrup; Brusgaard, Klaus; Tan, Qihua

    2011-01-01

    The use of formalin-fixed, paraffin-embedded (FFPE) tissue overcomes the most prominent issues related to research on relatively rare diseases: limited sample size, availability of control tissue, and time frame. The use of FFPE pancreatic tissue in GEM may be especially challenging due to its very...

  9. Tropomyosin Receptor Kinase A Expression on Merkel Cell Carcinoma Cells.

    Wehkamp, Ulrike; Stern, Sophie; Krüger, Sandra; Hauschild, Axel; Röcken, Christoph; Egberts, Friederike

    2017-11-01

    Merkel cell carcinoma (MCC) is a malignant neuroendocrine skin tumor frequently associated with the Merkel cell polyomavirus. Immune checkpoint therapy showed remarkable results, although not all patients are responsive to this therapy. Anti-tropomyosin receptor kinase A (TrkA)-targeted treatment has shown promising results in several tumor entities. The objective was to determine TrkA expression in MCC as a rationale for potential targeted therapy. This case series study investigated the MCC specimens of 55 patients treated at the Department of Dermatology, University Hospital of Schleswig-Holstein, Kiel, Germany, from January 1, 2005, through December 31, 2015. Thirty-nine of the 55 samples were suitable for further histopathologic examination. Expression of TrkA was explored by immunohistochemical analysis. Diagnosis of MCC was confirmed by staining positive for cytokeratin 20 (CK20) and synaptophysin. The main outcome was expression of TrkA on the tumor cells. Specimens of 39 patients (21 women and 18 men; mean [SD] age, 75.0 [7.8] years) underwent immunohistochemical investigation. Thirty-eight of 38 specimens expressed CK20 and synaptophysin on the MCC tumor cells (100% expression). Merkel cell polyomavirus was detected in 32 of 38 specimens (84%). Tropomyosin receptor kinase A was found in all 36 evaluable specimens on the tumor cells; 34 (94%) showed a weak and 2 (6%) showed a strong cytoplasmic expression. In addition, strongly positive perinuclear dots were observed in 30 of 36 specimens (83%). Tropomyosin receptor kinase A was expressed on MCC tumor cells in 100% of evaluable specimens. This result may lead to the exploration of new targeted treatment options in MCC, especially for patients who do not respond to anti-programmed cell death protein 1 treatment.

  10. Overexpression of GRβ in colonic mucosal cell line partly reflects altered gene expression in colonic mucosa of patients with inflammatory bowel disease.

    Nagy, Zsolt; Acs, Bence; Butz, Henriett; Feldman, Karolina; Marta, Alexa; Szabo, Peter M; Baghy, Kornelia; Pazmany, Tamas; Racz, Karoly; Liko, Istvan; Patocs, Attila

    2016-01-01

    The glucocorticoid receptor (GR) plays a crucial role in inflammatory responses. GR has several isoforms, of which the most thoroughly studied are GRα and GRβ. Recently it has been suggested that in addition to its negative dominant effect on GRα, GRβ may have a GRα-independent transcriptional activity. The GRβ isoform was found to be frequently overexpressed in various autoimmune diseases, including inflammatory bowel disease (IBD). In this study, we wished to test whether the gene expression profile found in a GRβ-overexpressing intestinal cell line (Caco-2 GRβ) might mimic the gene expression alterations found in patients with IBD. Whole genome microarray analysis was performed in both normal and GRβ-overexpressing Caco-2 cell lines with and without dexamethasone treatment. IBD-related genes were identified from a meta-analysis of 245 microarrays available in online microarray deposits performed on intestinal mucosa samples from patients with IBD and healthy individuals. The differentially expressed genes were further studied using in silico pathway analysis. Overexpression of GRβ altered a large proportion of genes that were not regulated by dexamethasone, suggesting that GRβ may have a GRα-independent role in the regulation of gene expression. About 10% of genes differentially expressed in colonic mucosa samples from IBD patients compared to normal subjects were also detected in the Caco-2 GRβ intestinal cell line. Common genes are involved in cell adhesion and cell proliferation. Overexpression of GRβ in intestinal cells may affect appropriate mucosal repair and intact barrier function. The proposed novel role of GRβ in intestinal epithelium warrants further studies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Oncomirs Expression Profiling in Uterine Leiomyosarcoma Cells

    Bruna Cristine de Almeida

    2017-12-01

    MicroRNAs (miRNAs) are small non-coding RNAs that act as regulators of gene expression at the post-transcriptional level. They play a key role in several biological processes. Their abnormal expression may lead to malignant cell transformation. This study aimed to evaluate the expression profile of 84 miRNAs involved in tumorigenesis in immortalized cells of myometrium (MM), uterine leiomyoma (ULM), and uterine leiomyosarcoma (ULMS). Specific cell lines were cultured and qRT-PCR was performed. Thirteen miRNAs presented different expression profiles in ULM and the same thirteen in ULMS compared to MM. Eight miRNAs were overexpressed, and five were underexpressed in ULM. In ULMS cells, five miRNAs were overexpressed and eight were down-regulated. Six miRNAs (miR-1-3p, miR-130b-3p, miR-140-5p, miR-202-3p, miR-205-5p, and miR-7-5p) presented a similar expression pattern in cell lines compared to patient samples. Of these, only three miRNAs showed significant expression in ULM (miR-1-3p, miR-140-5p, and miR-7-5p) and ULMS (miR-1-3p, miR-202-3p, and miR-7-5p). Our preliminary approach identified 24 oncomirs with an altered expression profile in ULM and ULMS cells. We identified four differentially expressed miRNAs with the same profile when compared with patients’ samples, which strongly interacted with relevant genes, including apoptosis regulator (BCL2), epidermal growth factor receptor (EGFR), vascular endothelial growth factor A (VEGFA), insulin-like growth factor 1 receptor (IGF1R), serine/threonine kinase (RAF1), receptor tyrosine kinase (MET), and bHLH transcription factor (MYCN). This led to alterations in their mRNA-target.

  12. Whole genome-based phylogeny of reptile-associated Helicobacter indicates independent niche adaptation followed by diversification in a poikilothermic host

    Gilbert, Maarten J.; Duim, Birgitta; Timmerman, Arjen J.; Zomer, Aldert L.; Wagenaar, Jaap A.

    2017-01-01

    Reptiles have been shown to host a significant Helicobacter diversity. In order to survive, reptile-associated Helicobacter lineages need to be adapted to the thermally dynamic environment encountered in a poikilothermic host. The whole genomes of reptile-associated Helicobacter lineages can

  13. Whole genome-based phylogeny of reptile-associated Helicobacter indicates independent niche adaptation followed by diversification in a poikilothermic host

    Gilbert, Maarten J; Duim, Birgitta; Timmerman, Arjen J; Zomer, Aldert L; Wagenaar, Jaap A

    2017-01-01

    Reptiles have been shown to host a significant Helicobacter diversity. In order to survive, reptile-associated Helicobacter lineages need to be adapted to the thermally dynamic environment encountered in a poikilothermic host. The whole genomes of reptile-associated Helicobacter lineages can provide

  14. Performance Evaluation of NIPT in Detection of Chromosomal Copy Number Variants Using Low-Coverage Whole-Genome Sequencing of Plasma DNA

    Liu, Hongtai; Gao, Ya; Hu, Zhiyang

    2016-01-01

    , including 33 CNVs samples and 886 normal samples from September 1, 2011 to May 31, 2013, were enrolled in this study. The samples were randomly rearranged and blindly sequenced by low-coverage (about 7M reads) whole-genome sequencing of plasma DNA. Fetal CNVs were detected by Fetal Copy-number Analysis...

  15. Whole-genome sequence of Pseudomonas fluorescens EK007-RG4, a promising biocontrol agent against a broad range of bacteria, including the fire blight bacterium Erwinia amylovora

    Habibi, Roghayeh; Tarighi, Saeed; Behravan, Javad

    2017-01-01

    Here, we report the first draft whole-genome sequence of Pseudomonas fluorescens strain EK007-RG4, which was isolated from the phylloplane of a pear tree. P. fluorescens EK007-RG4 displays strong antagonism against Erwinia amylovora, the causal agent for fire blight disease, in addition to several...

  16. Whole-genome profiling and shotgun sequencing delivers an anchored, gene-decorated, physical map assembly of bread wheat chromosome 6A

    Poursarebani, N.; Nussbaumer, T.; Šimková, Hana; Šafář, Jan; Witsenboer, H.; van Oeveren, J.; Doležel, Jaroslav; Mayer, K. F. X.; Stein, N.; Schnurbusch, T.

    2014-01-01

    Vol. 79, No. 2 (2014), pp. 334-347. ISSN 0960-7412. Institutional support: RVO:61389030. Keywords: bread wheat chromosome 6A * whole-genome profiling * LINEAR TOPOLOGICAL CONTIGS. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 5.972, year: 2014

  17. Influenza A virus evolution and spatio-temporal dynamics in eurasian wild birds: A phylogenetic and phylogeographical study of whole-genome sequence data

    N.S. Lewis (Nicola); J.H. Verhagen (Josanne); Z. Javakhishvili (Zurab); C.A. Russell (Colin); P. Lexmond (Pascal); K.B. Westgeest (Kim); T.M. Bestebroer (Theo); R.A. Halpin (Rebecca); X. Lin (Xudong); A. Ransier (Amy); N.B. Fedorova (Nadia B.); T.B. Stockwell (Timothy B.); N. Latorre-Margalef (Neus); B. Olsen (Björn); G.J.D. Smith (Gavin); J. Bahl (Justin); D.E. Wentworth (David E.); J. Waldenström (Jonas); R.A.M. Fouchier (Ron); M.T. de Graaf (Marieke)

    2015-01-01

    Low pathogenic avian influenza A viruses (IAVs) have a natural host reservoir in wild waterbirds and the potential to spread to other host species. Here, we investigated the evolutionary, spatial and temporal dynamics of avian IAVs in Eurasian wild birds. We used whole-genome sequences

  18. From the Battlefield to the Bedside: Supporting Warfighter and Civilian Health With the "ART" of Whole Genome Sequencing for Antibiotic Resistance and Outbreak Investigations.

    Lesho, Emil; Lin, Xiaoxu; Clifford, Robert; Snesrud, Erik; Onmus-Leone, Fatma; Appalla, Lakshmi; Ong, Ana; Maybank, Rosslyn; Nielsen, Lindsey; Kwak, Yoon; Hinkle, Mary; Turco, John; Marin, Juan A; Hooks, Sally; Matthews, Stacy; Hyland, Stephen; Little, Jered; Waterman, Paige; McGann, Patrick

    2016-07-01

    Awareness, responsiveness, and throughput characterize an approach for enhancing the clinical impact of whole genome sequencing for austere environments and for large geographically dispersed health systems. This Department of Defense approach is informing interagency efforts linking antibiograms of multidrug-resistant organisms to their genome sequences in a public database. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.

  19. Investigation of isoniazid and ethionamide cross-resistance by whole genome sequencing and association with poor treatment outcomes of multidrug-resistant tuberculosis patients in South Africa

    L Malinga

    2016-01-01

    Conclusion: Baseline ETH molecular resistance before second-line treatment is a concern. Unfavorable treatment outcomes of patients with ethA, ethR, and inhA mutations highlight the importance of genotypic testing before initiation of treatment containing ETH. The clinical significance of whole genome analysis for early detection of mutations predictive of treatment failure needs further investigation.

  20. Dissecting stem cell differentiation using single cell expression profiling

    Moignard, Victoria Rachel; Göttgens, Berthold

    2016-01-01

    Many assumptions about the way cells behave are based on analyses of populations. However, it is now widely recognized that even apparently pure populations can display a remarkable level of heterogeneity. This is particularly true in stem cell biology where it hinders our understanding of normal development and the development of strategies for regenerative medicine. Over the past decade technologies facilitating gene expression analysis at the single cell level have become widespread, provi...