WorldWideScience

Sample records for next-generation sequencing ngs

  1. ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases.

    Science.gov (United States)

    Shen, Li; Shao, Ningyi; Liu, Xiaochuan; Nestler, Eric

    2014-04-15

    Understanding the relationship between the millions of functional DNA elements and their protein regulators, and how they work in conjunction to manifest diverse phenotypes, is key to advancing our understanding of the mammalian genome. Next-generation sequencing technology is now used widely to probe these protein-DNA interactions and to profile gene expression at a genome-wide scale. As the cost of DNA sequencing continues to fall, the interpretation of the ever increasing amount of data generated represents a considerable challenge. We have developed ngs.plot - a standalone program to visualize enrichment patterns of DNA-interacting proteins at functionally important regions based on next-generation sequencing data. We demonstrate that ngs.plot is not only efficient but also scalable. We use a few examples to demonstrate that ngs.plot is easy to use and yet very powerful to generate figures that are publication ready. We conclude that ngs.plot is a useful tool to help fill the gap between massive datasets and genomic information in this era of big sequencing data.

  2. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Ravi K Patel

    Next generation sequencing (NGS) technologies provide a high-throughput means to generate large amounts of sequence data. However, quality control (QC) of sequence data generated from these technologies is extremely important for meaningful downstream analysis. Further, highly efficient and fast processing tools are required to handle the large volume of datasets. Here, we have developed an application, NGS QC Toolkit, for quality check and filtering of high-quality data. This toolkit is a standalone and open source application freely available at http://www.nipgr.res.in/ngsqctoolkit.html. All the tools in the application have been implemented in the Perl programming language. The toolkit comprises user-friendly tools for QC of sequencing data generated using Roche 454 and Illumina platforms, and additional tools to aid QC (sequence format converter and trimming tools) and analysis (statistics tools). A variety of options have been provided to facilitate the QC at user-defined parameters. The toolkit is expected to be very useful for the QC of NGS data to facilitate better downstream analysis.
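    The kind of read-level quality filtering described in this record can be illustrated with a minimal sketch. It is not the toolkit's own code (which is written in Perl); the function names, the Phred+33 encoding, and the thresholds (Phred 20, 70% of bases) are assumptions chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# A read is (header, sequence, quality string); hypothetical type for this sketch.
Read = Tuple[str, str, str]


def parse_fastq(text: str) -> Iterator[Read]:
    """Yield (header, sequence, quality) from simple four-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i], lines[i + 1], lines[i + 3]


def high_quality_fraction(quality: str, min_phred: int = 20, offset: int = 33) -> float:
    """Fraction of bases whose Phred score is at least min_phred.

    Assumes Phred+33 encoding (offset=33); older Illumina data may use 64.
    """
    if not quality:
        return 0.0
    good = sum(1 for ch in quality if ord(ch) - offset >= min_phred)
    return good / len(quality)


def filter_reads(reads: Iterable[Read], min_phred: int = 20,
                 min_fraction: float = 0.7) -> List[Read]:
    """Keep reads in which at least min_fraction of bases pass min_phred."""
    return [r for r in reads
            if high_quality_fraction(r[2], min_phred) >= min_fraction]


if __name__ == "__main__":
    sample = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n@r3\nACGT\n+\nIII!\n"
    for header, seq, qual in filter_reads(parse_fastq(sample)):
        print(header, seq, qual)
```

    Both thresholds are exposed as parameters because the record stresses that QC is run at user-defined settings.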

  3. Genetic counselors' (GC) knowledge, awareness, understanding of clinical next-generation sequencing (NGS) genomic testing.

    Science.gov (United States)

    Boland, P M; Ruth, K; Matro, J M; Rainey, K L; Fang, C Y; Wong, Y N; Daly, M B; Hall, M J

    2015-12-01

    Genomic tests are increasingly complex, less expensive, and more widely available with the advent of next-generation sequencing (NGS). We assessed knowledge and perceptions among genetic counselors pertaining to NGS genomic testing via an online survey. Associations between selected characteristics and perceptions were examined. Recent education on NGS testing was common, but practical experience limited. Perceived understanding of clinical NGS was modest, specifically concerning tumor testing. Greater perceived understanding of clinical NGS testing correlated with more time spent in cancer-related counseling, exposure to NGS testing, and NGS-focused education. Substantial disagreement about the role of counseling for tumor-based testing was seen. Finally, a majority of counselors agreed with the need for more education about clinical NGS testing, supporting this approach to optimizing implementation. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  4. Genetic counselors’ (GC) knowledge, awareness, and understanding of clinical next-generation sequencing (NGS) genomic testing

    Science.gov (United States)

    Boland, PM; Ruth, K; Matro, JM; Rainey, KL; Fang, CY; Wong, YN; Daly, MB; Hall, MJ

    2014-01-01

    Genomic tests are increasingly complex, less expensive, and more widely available with the advent of next-generation sequencing (NGS). We assessed knowledge and perceptions among genetic counselors pertaining to NGS genomic testing via an online survey. Associations between selected characteristics and perceptions were examined. Recent education on NGS testing was common, but practical experience limited. Perceived understanding of clinical NGS was modest, specifically concerning tumor testing. Greater perceived understanding of clinical NGS testing correlated with more time spent in cancer-related counseling, exposure to NGS testing, and NGS-focused education. Substantial disagreement about the role of counseling for tumor-based testing was seen. Finally, a majority of counselors agreed with the need for more education about clinical NGS testing, supporting this approach to optimizing implementation. PMID:25523111

  5. KNIME4NGS: a comprehensive toolbox for next generation sequencing analysis.

    Science.gov (United States)

    Hastreiter, Maximilian; Jeske, Tim; Hoser, Jonathan; Kluge, Michael; Ahomaa, Kaarin; Friedl, Marie-Sophie; Kopetzky, Sebastian J; Quell, Jan-Dominik; Mewes, H Werner; Küffner, Robert

    2017-05-15

    Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, Linux-based toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As basis for this actively developed toolbox we use the workflow management software KNIME. See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). Contact: robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.

  6. Next generation sequencing (NGS) technologies and applications

    Energy Technology Data Exchange (ETDEWEB)

    Vuyisich, Momchilo [Los Alamos National Laboratory]

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  7. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities.

    Science.gov (United States)

    Tan, BoonFei; Ng, Charmaine; Nshimyimana, Jean Pierre; Loh, Lay Leng; Gin, Karina Y-H; Thompson, Janelle R

    2015-01-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable regions have allowed identification of signature microbial species that serve as bioindicators for sewage contamination in these environments. Beyond amplicon sequencing, metagenomic and metatranscriptomic analyses of microbial communities in fresh water environments reveal the genetic capabilities and interplay of waterborne microorganisms, shedding light on the mechanisms for production and biodegradation of toxins and other contaminants. This review discusses the challenges and benefits of applying NGS-based methods to water quality research and assessment. We will consider the suitability and biases inherent in the application of NGS as a screening tool for assessment of biological risks and discuss the potential and limitations for direct quantitative interpretation of NGS data. Secondly, we will examine case studies from recent literature where NGS based methods have been applied to topics in water quality assessment, including development of bioindicators for sewage pollution and microbial source tracking, characterizing the distribution of toxin and antibiotic resistance genes in water samples, and investigating mechanisms of biodegradation of harmful pollutants that threaten water quality. Finally, we provide a short review of emerging NGS platforms and their potential applications to the next generation of water quality assessment tools.

  8. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities

    Directory of Open Access Journals (Sweden)

    BoonFei Tan

    2015-09-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable regions have allowed identification of signature microbial species that serve as bioindicators for sewage contamination in these environments. Beyond amplicon sequencing, metagenomic and metatranscriptomic analyses of microbial communities in fresh water environments reveal the genetic capabilities and interplay of waterborne microorganisms, shedding light on the mechanisms for production and biodegradation of toxins and other contaminants. This review discusses the challenges and benefits of applying NGS-based methods to water quality research and assessment. We will consider the suitability and biases inherent in the application of NGS as a screening tool for assessment of biological risks and discuss the potential and limitations for direct quantitative interpretation of NGS data. Secondly, we will examine case studies from recent literature where NGS based methods have been applied to topics in water quality assessment, including development of bioindicators for sewage pollution and microbial source tracking, characterizing the distribution of toxin and antibiotic resistance genes in water samples, and investigating mechanisms of biodegradation of harmful pollutants that threaten water quality. Finally, we provide a short review of emerging NGS platforms and their potential applications to the next generation of water quality assessment tools.

  9. QuickNGS elevates Next-Generation Sequencing data analysis to a new level of automation.

    Science.gov (United States)

    Wagle, Prerana; Nikolić, Miloš; Frommolt, Peter

    2015-07-01

    Next-Generation Sequencing (NGS) has emerged as a widely used tool in molecular biology. While time and cost for the sequencing itself are decreasing, the analysis of the massive amounts of data remains challenging. Since multiple algorithmic approaches for the basic data analysis have been developed, there is now an increasing need to efficiently use these tools to obtain results in reasonable time. We have developed QuickNGS, a new workflow system for laboratories with the need to analyze data from multiple NGS projects at a time. QuickNGS takes advantage of parallel computing resources, a comprehensive back-end database, and a careful selection of previously published algorithmic approaches to build fully automated data analysis workflows. We demonstrate the efficiency of our new software by a comprehensive analysis of 10 RNA-Seq samples which we can finish in only a few minutes of hands-on time. The approach we have taken is suitable to process even much larger numbers of samples and multiple projects at a time. Our approach considerably reduces the barriers that still limit the usability of the powerful NGS technology and finally decreases the time to be spent before proceeding to further downstream analysis and interpretation of the data.

  10. lociNGS: a lightweight alternative for assessing suitability of next-generation loci for evolutionary analysis.

    Directory of Open Access Journals (Sweden)

    Sarah M Hird

    Genomic enrichment methods and next-generation sequencing produce uneven coverage for the portions of the genome (the loci) they target; this information is essential for ascertaining the suitability of each locus for further analysis. lociNGS is a user-friendly accessory program that takes multi-FASTA formatted loci, next-generation sequence alignments and demographic data as input and collates, displays and outputs information about the data. Summary information includes the parameters coverage per locus, coverage per individual and number of polymorphic sites, among others. The program can output the raw sequences used to call loci from next-generation sequencing data. lociNGS also reformats subsets of loci in three commonly used formats for multi-locus phylogeographic and population genetics analyses - NEXUS, IMa2 and Migrate. lociNGS is available at https://github.com/SHird/lociNGS and is dependent on installation of MongoDB (freely available at http://www.mongodb.org/downloads). lociNGS is written in Python and is supported on MacOSX and Unix; it is distributed under a GNU General Public License.

  11. Next Generation Sequencing Plus (NGS+) with Y-chromosomal Markers for Forensic Pedigree Searches.

    Science.gov (United States)

    Qian, Xiaoqin; Hou, Jiayi; Wang, Zheng; Ye, Yi; Lang, Min; Gao, Tianzhen; Liu, Jing; Hou, Yiping

    2017-09-12

    There is high demand for forensic pedigree searches with Y-chromosome short tandem repeat (Y-STR) profiling in large-scale crime investigations. However, when two Y-STR haplotypes have a few mismatched loci, it is difficult to determine if they are from the same male lineage because of the high mutation rate of Y-STRs. Here we design a new strategy to handle cases in which none of pedigree samples shares identical Y-STR haplotype. We combine next generation sequencing (NGS), capillary electrophoresis and pyrosequencing under the term 'NGS+' for typing Y-STRs and Y-chromosomal single nucleotide polymorphisms (Y-SNPs). The high-resolution Y-SNP haplogroup and Y-STR haplotype can be obtained with NGS+. We further developed a new data-driven decision rule, FSindex, for estimating the likelihood for each retrieved pedigree. Our approach enables positive identification of pedigree from mismatched Y-STR haplotypes. It is envisaged that NGS+ will revolutionize forensic pedigree searches, especially when the person of interest was not recorded in forensic DNA database.

  12. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities

    OpenAIRE

    BoonFei Tan; Charmaine Marie Ng; Jean Pierre Nshimyimana; Lay-Leng Loh; Karina Yew-Hoong Gin; Janelle Renee Thompson

    2015-01-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable reg...

  13. What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research

    OpenAIRE

    Bräutigam, Andrea; Gowik, Udo

    2010-01-01

    Next generation sequencing (NGS) technologies have opened fascinating opportunities for the analysis of plants with and without a sequenced genome on a genomic scale. During the last few years, NGS methods have become widely available and cost effective. They can be applied to a wide variety of biological questions, from the sequencing of complete eukaryotic genomes and transcriptomes, to the genome-scale analysis of DNA-protein interactions. In this review, we focus on the use of NGS for pla...

  14. HLA typing: Conventional techniques v. next-generation sequencing

    African Journals Online (AJOL)

    The existing techniques have contributed significantly to our current knowledge of allelic diversity. At present, sequence-based typing (SBT) methods, in particular next-generation sequencing (NGS), provide the highest possible resolution. NGS platforms were initially only used for genomic sequencing, but also showed.

  15. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  16. Next-Generation Sequencing of Tubal Intraepithelial Carcinomas.

    Science.gov (United States)

    McDaniel, Andrew S; Stall, Jennifer N; Hovelson, Daniel H; Cani, Andi K; Liu, Chia-Jen; Tomlins, Scott A; Cho, Kathleen R

    2015-11-01

    High-grade serous carcinoma (HGSC) is the most prevalent and lethal form of ovarian cancer. HGSCs frequently arise in the distal fallopian tubes rather than the ovary, developing from small precursor lesions called serous tubal intraepithelial carcinomas (TICs, or more specifically, STICs). While STICs have been reported to harbor TP53 mutations, detailed molecular characterizations of these lesions are lacking. We performed targeted next-generation sequencing (NGS) on formalin-fixed, paraffin-embedded tissue from 4 women, 2 with HGSC and 2 with uterine endometrioid carcinoma (UEC) who were diagnosed as having synchronous STICs. We detected concordant mutations in both HGSCs with synchronous STICs, including TP53 mutations as well as assumed germline BRCA1/2 alterations, confirming a clonal association between these lesions. Next-generation sequencing confirmed the presence of a STIC clonally unrelated to 1 case of UEC, and NGS of the other tubal lesion diagnosed as a STIC unexpectedly supported the lesion as a micrometastasis from the associated UEC. We demonstrate that targeted NGS can identify genetic alterations in minute lesions, such as TICs, and confirm TP53 mutations as early driving events for HGSC. Next-generation sequencing also demonstrated unexpected associations between presumed STICs and synchronous carcinomas, providing evidence that some TICs are actually metastases rather than HGSC precursors.

  17. Galaxy LIMS for next-generation sequencing

    NARCIS (Netherlands)

    Scholtalbers, J.; Rossler, J.; Sorn, P.; Graaf, J. de; Boisguerin, V.; Castle, J.; Sahin, U.

    2013-01-01

    SUMMARY: We have developed a laboratory information management system (LIMS) for a next-generation sequencing (NGS) laboratory within the existing Galaxy platform. The system provides lab technicians standard and customizable sample information forms, barcoded submission forms, tracking of input

  18. Next-Generation Sequencing of Antibody Display Repertoires

    Directory of Open Access Journals (Sweden)

    Romain Rouet

    2018-02-01

    In vitro selection technology has transformed the development of therapeutic monoclonal antibodies. Using methods such as phage, ribosome, and yeast display, high affinity binders can be selected from diverse repertoires. Here, we review strategies for the next-generation sequencing (NGS) of phage- and other antibody-display libraries, as well as NGS platforms and analysis tools. Moreover, we discuss recent examples relating to the use of NGS to assess library diversity, clonal enrichment, and affinity maturation.

  19. Application of Next-generation Sequencing in Clinical Molecular Diagnostics

    Directory of Open Access Journals (Sweden)

    Morteza Seifi

    2017-05-01

    Next-generation sequencing (NGS) is the catch-all term used to describe several different modern sequencing technologies that allow nucleic acids to be sequenced much more rapidly and cheaply than the formerly used Sanger sequencing, and as such have revolutionized the study of molecular biology and genomics with excellent resolution and accuracy. Over the past years, many companies and academic institutions have continued to make technological advances to expand NGS applications from research to the clinic. In this review, the performance and technical features of current NGS platforms are described. Furthermore, advances in applying NGS technologies to clinical molecular diagnostics are emphasized. General advantages and disadvantages of each sequencing system are summarized and compared to guide the selection of NGS platforms for specific research aims.

  20. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow.

  1. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  2. Next generation sequencing and its applications in forensic genetics

    DEFF Research Database (Denmark)

    Børsting, Claus; Morling, Niels

    2015-01-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs...

  3. Droplet Digital™ PCR Next-Generation Sequencing Library QC Assay.

    Science.gov (United States)

    Heredia, Nicholas J

    2018-01-01

    Digital PCR is a valuable tool to quantify next-generation sequencing (NGS) libraries precisely and accurately. Accurately quantifying NGS libraries enables accurate loading of the libraries onto the sequencer and thus improves sequencing performance by reducing under- and overloading errors. Accurate quantification also benefits users by enabling uniform loading of indexed/barcoded libraries, which in turn greatly improves sequencing uniformity of the indexed/barcoded samples. The advantages gained by employing the Droplet Digital PCR (ddPCR™) library QC assay include precise and accurate quantification in addition to size quality assessment, enabling users to QC their sequencing libraries with confidence.
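    The record does not spell out the calculation, but digital PCR quantification generally rests on a Poisson correction of the fraction of positive partitions. The sketch below illustrates that general principle only; the function names are hypothetical, and the default droplet volume of 0.85 nL is an assumed illustrative value that should be replaced with the figure for the instrument actually used.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean template copies per droplet, from the Poisson relation
    P(droplet is negative) = exp(-lambda)  =>  lambda = -ln(1 - positive/total).
    """
    if total <= 0 or not 0 <= positive < total:
        # If every droplet is positive the estimate is unbounded (saturation).
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def copies_per_microliter(positive: int, total: int,
                          droplet_volume_nl: float = 0.85) -> float:
    """Concentration in the reaction mix, in copies per microliter.

    droplet_volume_nl is an assumed value, not taken from the record above.
    """
    volume_ul = droplet_volume_nl * 1e-3  # 1 nL = 0.001 uL
    return copies_per_droplet(positive, total) / volume_ul


if __name__ == "__main__":
    # Hypothetical run: 5,000 positive droplets out of 15,000 accepted droplets.
    print(round(copies_per_microliter(5000, 15000), 1), "copies/uL")
```

    The Poisson step matters because a positive droplet may hold more than one template molecule, so simply counting positive droplets would underestimate library concentration as occupancy rises.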

  4. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    Directory of Open Access Journals (Sweden)

    Fabio Marroni

    2012-06-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potential. They showed that pooled NGS can provide results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method we will explain in detail the variations in study design and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled next generation sequencing can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity and Tajima’s D. Finally we will discuss applications and future perspectives of the multiplexed NGS approach.
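    As a rough illustration of how per-site allele frequencies feed into the indexes named in this record, the sketch below computes nucleotide diversity and Tajima's D with the textbook formulas. It is not the review's own procedure: the function names are hypothetical, the frequencies are taken as exact, and the extra sampling noise introduced by pooling and read depth (which dedicated pooled-sequencing estimators correct for) is ignored.

```python
import math
from typing import Sequence


def nucleotide_diversity(freqs: Sequence[float], n: int) -> float:
    """Total pairwise diversity (pi) summed over sites.

    freqs: allele frequency at each site; n: number of sampled chromosomes.
    """
    return sum(n / (n - 1) * 2.0 * p * (1.0 - p) for p in freqs)


def tajimas_d(freqs: Sequence[float], n: int) -> float:
    """Tajima's D from site allele frequencies and sample size n (n >= 4)."""
    if n < 4:
        raise ValueError("n must be at least 4")
    s = sum(1 for p in freqs if 0.0 < p < 1.0)  # number of segregating sites
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = nucleotide_diversity(freqs, n)
    theta_w = s / a1  # Watterson's estimator
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    rare = [0.1] * 5    # hypothetical data: all variants rare
    common = [0.5] * 5  # hypothetical data: all variants at intermediate frequency
    print(tajimas_d(rare, 10), tajimas_d(common, 10))
```

    An excess of rare variants pulls D below zero and an excess of intermediate-frequency variants pushes it above zero, which is why accurate frequency estimates for rare alleles matter for this statistic.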

  5. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    Directory of Open Access Journals (Sweden)

    Rama R Gullapalli

    2012-01-01

    The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. We also review the issue of physician education, which is also an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future.

  6. Targeted next-generation sequencing in monogenic dyslipidemias.

    Science.gov (United States)

    Hegele, Robert A; Ban, Matthew R; Cao, Henian; McIntyre, Adam D; Robinson, John F; Wang, Jian

    2015-04-01

    To evaluate the potential clinical translation of high-throughput next-generation sequencing (NGS) methods in diagnosis and management of dyslipidemia. Recent NGS experiments indicate that most causative genes for monogenic dyslipidemias are already known. Thus, monogenic dyslipidemias can now be diagnosed using targeted NGS. Targeting of dyslipidemia genes can be achieved by either: designing custom reagents for a dyslipidemia-specific NGS panel; or performing genome-wide NGS and focusing on genes of interest. Advantages of the former approach are lower cost and limited potential to detect incidental pathogenic variants unrelated to dyslipidemia. However, the latter approach is more flexible because masking criteria can be altered as knowledge advances, with no need for re-design of reagents or follow-up sequencing runs. Also, the cost of genome-wide analysis is decreasing and ethical concerns can likely be mitigated. DNA-based diagnosis is already part of the clinical diagnostic algorithms for familial hypercholesterolemia. Furthermore, DNA-based diagnosis is supplanting traditional biochemical methods to diagnose chylomicronemia caused by deficiency of lipoprotein lipase or its co-factors. The increasing availability and decreasing cost of clinical NGS for dyslipidemia means that its potential benefits can now be evaluated on a larger scale.

  7. Next-Generation Sequencing and Genome Editing in Plant Virology

    Directory of Open Access Journals (Sweden)

    Ahmed Hadidi

    2016-08-01

    Full Text Available Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low cost DNA or RNA high-throughput sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome editing method, the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system, has been successfully used recently to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. Keywords: Next-generation sequencing, NGS, plant virology, plant viruses, viroids, resistance to plant viruses by CRISPR-Cas9

  8. Application of next generation sequencing in clinical microbiology and infection prevention

    NARCIS (Netherlands)

    Deurenberg, Ruud H.; Bathoorn, Erik; Chlebowicz, Monika A.; Couto, Natacha; Ferdous, Mithila; Garcia-Cobos, Silvia; Kooistra-Smid, Anna M. D.; Raangs, Erwin C.; Rosema, Sigrid; Veloo, Alida C. M.; Zhou, Kai; Friedrich, Alexander W.; Rossen, John W. A.

    2017-01-01

    Current molecular diagnostics of human pathogens provide limited information that is often not sufficient for outbreak and transmission investigation. Next generation sequencing (NGS) determines the DNA sequence of a complete bacterial genome in a single sequence run, and from these data,

  9. Next-generation sequencing in NSCLC and melanoma patients : A cost and budget impact analysis

    NARCIS (Netherlands)

    Van Amerongen, Rosa A.; Retèl, Valesca P.; Coupé, Veerle M.H.; Nederlof, Petra M.; Vogel, Maartje J.; Van Harten, Wim H.

    2016-01-01

    Next-generation sequencing (NGS) has reached the molecular diagnostic laboratories. Although the NGS technology aims to improve the effectiveness of therapies by selecting the most promising therapy, concerns are that NGS testing is expensive and that the 'benefits' are not yet in relation to these

  10. Next generation sequencing and its applications in forensic genetics.

    Science.gov (United States)

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  11. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Fabio Marroni; Sara Pinosio; Michele Morgante

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potentiality. They showed that pooled NGS can provide results in excellent agreement with those obt...

  12. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by indiv...

  13. Yleaf: Software for Human Y-Chromosomal Haplogroup Inference from Next-Generation Sequencing Data.

    Science.gov (United States)

    Ralf, Arwin; Montiel González, Diego; Zhong, Kaiyin; Kayser, Manfred

    2018-05-01

    Next-generation sequencing (NGS) technologies offer immense possibilities given the large genomic data they simultaneously deliver. The human Y-chromosome serves as a good example of how NGS benefits various applications in evolution, anthropology, genealogy, and forensics. Prior to NGS, the Y-chromosome phylogenetic tree consisted of a few hundred branches; based on NGS data, it now contains many thousands. The complexity of both the Y tree and NGS data provides challenges for haplogroup assignment. For effective analysis and interpretation of Y-chromosome NGS data, we present Yleaf, a publicly available, automated, user-friendly software for high-resolution Y-chromosome haplogroup inference independently of library and sequencing methods.

  14. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929
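    The two headline metrics in the abstract above, viral reads per total reads (efficiency) and the share of the reference length covered, can be illustrated with a minimal Python sketch. The function names, the sample counts, and the per-position depth list are hypothetical and are not taken from the study; the sketch only shows how such ratios are typically computed.

```python
from statistics import mean, stdev


def viral_read_fraction(viral_reads: int, total_reads: int) -> float:
    """Sequencing efficiency: reads assigned to the virus / all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


def coverage_breadth(depths: list[int], min_depth: int = 1) -> float:
    """Fraction of reference positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


# Hypothetical (viral_reads, total_reads) counts for two samples.
samples = {"sample_A": (9000, 10000), "sample_B": (7000, 10000)}
fractions = [viral_read_fraction(v, t) for v, t in samples.values()]

# Report efficiency as mean and sample standard deviation across samples.
efficiency_mean = mean(fractions)
efficiency_sd = stdev(fractions)

# Hypothetical per-position read depth over a tiny 4-base reference.
breadth = coverage_breadth([0, 3, 5, 1])
```

    Reporting a mean with a standard deviation across samples mirrors the "84.2% ± 15.6%" style used in the abstract; whether the study used the sample or population standard deviation is not stated, so the choice of `stdev` here is an assumption.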

  15. Next-generation sequencing library preparation method for identification of RNA viruses on the Ion Torrent Sequencing Platform.

    Science.gov (United States)

    Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng

    2018-05-09

    Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some methods are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.

  16. Is the $1000 Genome as Near as We Think? A Cost Analysis of Next-Generation Sequencing

    NARCIS (Netherlands)

    Nimwegen, K.J.M. van; Soest, R.A.; Veltman, J.A.; Nelen, M.R.; Wilt, G.J. van der; Peart-Vissers, L.E.L.M.; Grutters, J.P.C.

    2016-01-01

    BACKGROUND: The substantial technological advancements in next-generation sequencing (NGS), combined with dropping costs, have allowed for a swift diffusion of NGS applications in clinical settings. Although several commercial parties report to have broken the $1000 barrier for sequencing an entire

  17. Diagnostics of Primary Immunodeficiencies through Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Vera Gallo

    2016-11-01

    Full Text Available Background: Recently, a growing number of novel genetic defects underlying primary immunodeficiencies (PID) have been identified, increasing the number of PID up to more than 250 well-defined forms. Next-generation sequencing (NGS) technologies and proper filtering strategies greatly contributed to this rapid evolution, providing the possibility to rapidly and simultaneously analyze large numbers of genes or the whole exome. Objective: To evaluate the role of targeted next-generation sequencing and whole exome sequencing in the diagnosis of a case series, characterized by complex or atypical clinical features suggesting a PID, difficult to diagnose using the current diagnostic procedures. Methods: We retrospectively analyzed genetic variants identified through targeted next-generation sequencing or whole exome sequencing in 45 patients with complex PID of unknown etiology. Results: 40 variants were identified using targeted next-generation sequencing, while 5 were identified using whole exome sequencing. Newly identified genetic variants were classified into 4 groups: I variations associated with a well-defined PID; II variations associated with atypical features of a well-defined PID; III functionally relevant variations potentially involved in the immunological features; IV non-diagnostic genotype, in whom the link with phenotype is missing. We reached a conclusive genetic diagnosis in 7/45 patients (~16%). Among them, 4 patients presented with a typical well-defined PID. In the remaining 3 cases, mutations were associated with unexpected clinical features, expanding the phenotypic spectrum of typical PIDs. In addition, we identified 31 variants in 10 patients with complex phenotype, individually not causative per se of the disorder. Conclusion: NGS technologies represent a cost-effective and rapid first-line genetic approach for the evaluation of complex PIDs. Whole exome sequencing, despite a moderately higher cost compared to targeted, is

  18. Next-generation sequencing approaches to understanding the oral microbiome

    NARCIS (Netherlands)

    Zaura, E.

    2012-01-01

    Until recently, the focus in dental research has been on studying a small fraction of the oral microbiome—so-called opportunistic pathogens. With the advent of next-generation sequencing (NGS) technologies, researchers now have the tools that allow for profiling of the microbiomes and metagenomes at

  19. Next-Generation Sequencing for Binary Protein-Protein Interactions

    Directory of Open Access Journals (Sweden)

    Bernhard Suter

    2015-12-01

    Full Text Available The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein-protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine.

  20. Uncovering of Classical Swine Fever Virus adaptive response to vaccination by Next Generation Sequencing

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Orton, Richard; Höper, Dirk

    Next Generation Sequencing (NGS) has rapidly become the preferred technology in nucleotide sequencing, and can be applied to unravel molecular adaptation of RNA viruses such as Classical Swine Fever Virus (CSFV). However, the detection of low frequency variants within viral populations by NGS...... is affected by errors introduced during sample preparation and sequencing, and so far no definitive solution to this problem has been presented....

  1. Visual programming for next-generation sequencing data analytics.

    Science.gov (United States)

    Milicchio, Franco; Rose, Rebecca; Bian, Jiang; Min, Jae; Prosperi, Mattia

    2016-01-01

    High-throughput or next-generation sequencing (NGS) technologies have become an established and affordable experimental framework in biological and medical sciences for all basic and translational research. Processing and analyzing NGS data is challenging. NGS data are big, heterogeneous, sparse, and error prone. Although a plethora of tools for NGS data analysis has emerged in the past decade, (i) software development is still lagging behind data generation capabilities, and (ii) there is a 'cultural' gap between the end user and the developer. Generic software template libraries specifically developed for NGS can help in dealing with the former problem, whilst coupling template libraries with visual programming may help with the latter. Here we scrutinize the state-of-the-art low-level software libraries implemented specifically for NGS and graphical tools for NGS analytics. An ideal developing environment for NGS should be modular (with a native library interface), scalable in computational methods (i.e. serial, multithread, distributed), transparent (platform-independent), interoperable (with external software interface), and usable (via an intuitive graphical user interface). These characteristics should facilitate both the run of standardized NGS pipelines and the development of new workflows based on technological advancements or users' needs. We discuss in detail the potential of a computational framework blending generic template programming and visual programming that addresses all of the current limitations. In the long term, a proper, well-developed (although not necessarily unique) software framework will bridge the current gap between data generation and hypothesis testing. This will eventually facilitate the development of novel diagnostic tools embedded in routine healthcare.

  2. Evaluating next-generation sequencing for direct clinical diagnostics in diarrhoeal disease

    DEFF Research Database (Denmark)

    Joensen, Katrine Grimstrup; Engsbro, A L Ø; Lukjancenko, Oksana

    2017-01-01

    The accurate microbiological diagnosis of diarrhoea involves numerous laboratory tests and, often, the pathogen is not identified in time to guide clinical management. With next-generation sequencing (NGS) becoming cheaper, it has huge potential in routine diagnostics. The aim of this study...... was to evaluate the potential of NGS-based diagnostics through direct sequencing of faecal samples. Fifty-eight clinical faecal samples were obtained from patients with diarrhoea as part of the routine diagnostics at Hvidovre University Hospital, Denmark. Ten samples from healthy individuals were also included...

  3. Using next-generation sequencing to develop molecular diagnostics for Pseudoperonospora cubensis, the cucurbit downy mildew pathogen

    Science.gov (United States)

    Advances in Next Generation Sequencing (NGS) allow for rapid development of genomics resources needed to generate molecular diagnostics assays for infectious agents. NGS approaches are particularly helpful for organisms that cannot be cultured, such as the downy mildew pathogens, a group of biotrop...

  4. Reprint of "Application of next generation sequencing in clinical microbiology and infection prevention"

    NARCIS (Netherlands)

    Deurenberg, Ruud H.; Bathoorn, Erik; Chlebowicz, Monika A.; Monge Gomes do Couto, Natacha; Ferdous, Mithila; Garcia-Cobos, Silvia; Kooistra-Smid, Anna M. D.; Raangs, Erwin C.; Rosema, Sigrid; Veloo, Alida C. M.; Zhou, Kai; Friedrich, Alexander W.; Rossen, John W. A.

    2017-01-01

    Current molecular diagnostics of human pathogens provide limited information that is often not sufficient for outbreak and transmission investigation. Next generation sequencing (NGS) determines the DNA sequence of a complete bacterial genome in a single sequence run, and from these data,

  5. SMITH: a LIMS for handling next-generation sequencing workflows

    OpenAIRE

    Venco, Francesco; Vaskin, Yuriy; Ceol, Arnaud; Muller, Heiko

    2014-01-01

    Background: Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to ...

  6. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...... to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......

  7. Statistical Approaches for Next-Generation Sequencing Data

    OpenAIRE

    Qiao, Dandi

    2012-01-01

    During the last two decades, genotyping technology has advanced rapidly, which enabled the tremendous success of genome-wide association studies (GWAS) in the search of disease susceptibility loci (DSLs). However, only a small fraction of the overall predicted heritability can be explained by the DSLs discovered. One possible explanation for this “missing heritability” phenomenon is that many causal variants are rare. The recent development of high-throughput next-generation sequencing (NGS) ...

  8. Applying Next Generation Sequencing to Skeletal Development and Disease

    OpenAIRE

    Bowen, Margot Elizabeth

    2013-01-01

    Next Generation Sequencing (NGS) technologies have dramatically increased the throughput and lowered the cost of DNA sequencing. In this thesis, I apply these technologies to unresolved questions in skeletal development and disease. Firstly, I use targeted re-sequencing of genomic DNA to identify the genetic cause of the cartilage tumor syndrome, metachondromatosis (MC). I show that the majority of MC patients carry heterozygous loss-of-function mutations in the PTPN11 gene, which encodes a p...

  9. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    Science.gov (United States)

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-04-06

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality.

  10. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges

    Directory of Open Access Journals (Sweden)

    Rajyalakshmi Luthra

    2015-10-01

    Full Text Available The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA)- and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS.

  11. Applications and challenges of next-generation sequencing in Brassica species.

    Science.gov (United States)

    Wei, Lijuan; Xiao, Meili; Hayward, Alice; Fu, Donghui

    2013-12-01

    Next-generation sequencing (NGS) produces numerous (often millions) short DNA sequence reads, typically varying between 25 and 400 bp in length, at a relatively low cost and in a short time. This revolutionary technology is being increasingly applied in whole-genome, transcriptome, epigenome and small RNA sequencing, molecular marker and gene discovery, comparative and evolutionary genomics, and association studies. The Brassica genus comprises some of the most agro-economically important crops, providing abundant vegetables, condiments, fodder, oil and medicinal products. Many Brassica species have undergone the process of polyploidization, which makes their genomes exceptionally complex and can create difficulties in genomics research. NGS injects new vigor into Brassica research, yet also faces specific challenges in the analysis of complex crop genomes and traits. In this article, we review the advantages and limitations of different NGS technologies and their applications and challenges, using Brassica as an advanced model system for agronomically important, polyploid crops. Specifically, we focus on the use of NGS for genome resequencing, transcriptome sequencing, development of single-nucleotide polymorphism markers, and identification of novel microRNAs and their targets. We present trends and advances in NGS technology in relation to Brassica crop improvement, with wide application for sophisticated genomics research into agronomically important polyploid crops.

  12. Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.

    Science.gov (United States)

    Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George

    2017-06-01

    Context: Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. Objective: To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. Data sources: Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. Conclusions: Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.

  13. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    Directory of Open Access Journals (Sweden)

    Giorgio Palù

    2011-11-01

    Full Text Available Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.

  14. The quest for rare variants: pooled multiplexed next generation sequencing in plants.

    Science.gov (United States)

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the possible experimental and analytical approaches and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled NGS can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity, and Tajima's D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
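    As a rough illustration of the population genetics indexes named in the abstract above, the following Python sketch computes nucleotide diversity and Tajima's D from a list of per-site allele frequencies using the classical (Tajima 1989) estimators. It is a minimal sketch under stated assumptions, not the authors' method: the function names are hypothetical, `n` is taken to be the number of chromosomes sampled, and the frequencies are treated as exact, so the depth and pool-size corrections that real pooled NGS data would need are deliberately left out.

```python
from math import sqrt


def nucleotide_diversity(freqs: list[float], n: int) -> float:
    """Sum over sites of the unbiased heterozygosity 2p(1-p) * n/(n-1)."""
    if n < 2:
        raise ValueError("need at least 2 chromosomes")
    return sum(2 * p * (1 - p) for p in freqs) * n / (n - 1)


def tajimas_d(freqs: list[float], n: int) -> float:
    """Classical Tajima's D from allele frequencies at n sampled chromosomes."""
    if n < 4:
        raise ValueError("variance term is zero for n < 4")
    # Only polymorphic sites count as segregating.
    segregating = [p for p in freqs if 0 < p < 1]
    s = len(segregating)
    if s == 0:
        raise ValueError("no segregating sites")

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    pi = nucleotide_diversity(segregating, n)
    theta_w = s / a1  # Watterson's estimator
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical pool of 10 chromosomes: an excess of rare variants gives D < 0,
# an excess of intermediate-frequency variants gives D > 0.
d_rare = tajimas_d([0.1] * 5, n=10)
d_common = tajimas_d([0.5] * 5, n=10)
```

    The sign of D is the useful part of this example: rare variants pull nucleotide diversity below Watterson's estimate, which is the pattern a rare-variant screen would be looking for.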

  15. Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.

    Science.gov (United States)

    Hawkins, Steve F C; Guest, Paul C

    2018-01-01

    The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of an NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are currently available, and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.

  16. Next-generation sequencing: hype and hope for development of personalized radiation therapy?

    International Nuclear Information System (INIS)

    Tinhofer, Ingeborg; Niehr, Franziska; Konschak, Robert; Liebs, Sandra; Munz, Matthias; Stenzinger, Albrecht; Weichert, Wilko; Keilholz, Ulrich; Budach, Volker

    2015-01-01

    The introduction of next-generation sequencing (NGS) in the field of cancer research has boosted worldwide efforts of genome-wide personalized oncology aiming at identifying predictive biomarkers and novel actionable targets. Despite considerable progress in understanding the molecular biology of distinct cancer entities by the use of this revolutionary technology and despite contemporaneous innovations in drug development, translation of NGS findings into improved concepts for cancer treatment remains a challenge. The aim of this article is to briefly describe the NGS platforms for DNA sequencing and, in more detail, key achievements and unresolved hurdles. Special focus is given to potential clinical applications of this innovative technique in the field of radiation oncology.

  17. A microfluidic DNA library preparation platform for next-generation sequencing.

    Science.gov (United States)

    Kim, Hanyoup; Jebrail, Mais J; Sinha, Anupama; Bent, Zachary W; Solberg, Owen D; Williams, Kelly P; Langevin, Stanley A; Renzi, Ronald F; Van De Vreugde, James L; Meagher, Robert J; Schoeniger, Joseph S; Lane, Todd W; Branda, Steven S; Bartsch, Michael S; Patel, Kamlesh D

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  18. A microfluidic DNA library preparation platform for next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Hanyoup Kim

    Full Text Available Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  19. Next Generation Sequencing of Tubal Intraepithelial Carcinomas

    Science.gov (United States)

    McDaniel, Andrew S.; Stall, Jennifer N.; Hovelson, Daniel H.; Cani, Andi K.; Liu, Chia-Jen; Tomlins, Scott A.; Cho, Kathleen R.

    2016-01-01

    Importance: High-grade serous carcinoma (HGSC) is the most prevalent and lethal form of ovarian cancer. HGSCs frequently arise in the distal fallopian tubes rather than the ovary, developing from small precursor lesions called serous tubal intraepithelial carcinomas (TICs or more specifically STICs). While STICs have been reported to harbor TP53 mutations, detailed molecular characterizations of these lesions are lacking. Observations: We performed targeted next generation sequencing (NGS) on formalin-fixed, paraffin-embedded tissue from four women, two with HGSC and two with uterine endometrioid carcinoma (UEC) who were diagnosed with synchronous STICs. We detected concordant mutations in both HGSCs with synchronous STICs, including TP53 mutations as well as assumed germline BRCA1/2 alterations, confirming a clonal relationship between these lesions. NGS confirmed the presence of a STIC clonally unrelated to one case of UEC. NGS of the other tubal lesion diagnosed as a STIC unexpectedly supported the lesion as a micrometastasis from the associated UEC. Conclusions and Relevance: We demonstrate that targeted NGS can identify genetic lesions in minute lesions such as TICs, and confirm TP53 mutations as early driving events for HGSC. NGS also demonstrated unexpected relationships between presumed STICs and synchronous carcinomas, suggesting potential diagnostic and translational research applications. PMID:26181193

  20. Next-Generation Sequencing in Neuropathologic Diagnosis of Infections of the Nervous System (Open Access)

    Science.gov (United States)

    2016-06-13

    nervous system ABSTRACT Objective: To determine the feasibility of next-generation sequencing (NGS) microbiome approaches in the diagnosis of infectious...V, van Doorn HR, Nghia HD, et al. Identification of a new cyclovirus in cerebrospinal fluid of patients with acute central nervous system infections...Kumar, et al. system Next-generation sequencing in neuropathologic diagnosis of infections of the nervous This information is current as of June 13

  1. Next-generation sequence analysis of cancer xenograft models.

    Directory of Open Access Journals (Sweden)

    Fernando J Rossello

    Full Text Available Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.

  2. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  3. Generation of artificial FASTQ files to evaluate the performance of next-generation sequencing pipelines.

    Directory of Open Access Journals (Sweden)

    Matthew Frampton

    Full Text Available Pipelines for the analysis of Next-Generation Sequencing (NGS) data are generally composed of a set of publicly available software tools, configured together in order to map short reads of a genome and call variants. The fidelity of pipelines is variable. We have developed ArtificialFastqGenerator, which takes a reference genome sequence as input and outputs artificial paired-end FASTQ files containing Phred quality scores. Since these artificial FASTQs are derived from the reference genome, they provide a gold standard for read-alignment and variant-calling, thereby enabling the performance of any NGS pipeline to be evaluated. The user can customise DNA template/read length, the modelling of coverage based on GC content, whether to use real Phred base quality scores taken from existing FASTQ files, and whether to simulate sequencing errors. Detailed coverage and error summary statistics are outputted. Here we describe ArtificialFastqGenerator and illustrate its implementation in evaluating a typical bespoke NGS analysis pipeline under different experimental conditions. ArtificialFastqGenerator was released in January 2012. Source code, example files and binaries are freely available under the terms of the GNU General Public License v3.0 from https://sourceforge.net/projects/artfastqgen/.
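    To make the idea of reference-derived artificial reads concrete, the following is a minimal sketch and not ArtificialFastqGenerator itself, which is a separate program with its own options. It samples fragments from a toy reference, emits the two ends as a read pair, and encodes a constant Phred score with the common Phred+33 offset. The function names, the toy reference, and the fixed quality value are assumptions made for illustration only.

```python
import random

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def fastq_record(name, seq, phred_scores):
    """Format one four-line FASTQ record using Phred+33 quality encoding."""
    quals = "".join(chr(q + 33) for q in phred_scores)
    return f"@{name}\n{seq}\n+\n{quals}\n"

def simulate_pairs(reference, n_pairs, read_len=5, fragment_len=12, phred=40, seed=0):
    """Sample fragments uniformly from the reference and emit error-free read pairs.

    Read 1 is the start of the fragment on the forward strand; read 2 is the
    start of the fragment's reverse complement. The true position is stored in
    the read name so that an aligner's output can be checked against it.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(n_pairs):
        start = rng.randrange(len(reference) - fragment_len + 1)
        fragment = reference[start:start + fragment_len]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment)[:read_len]
        quals = [phred] * read_len
        pairs.append((
            fastq_record(f"pair{i}_pos{start}/1", read1, quals),
            fastq_record(f"pair{i}_pos{start}/2", read2, quals),
        ))
    return pairs

toy_reference = "ACGTACGGTTAGCCATGCAA"  # made-up sequence, 20 bases
for record1, record2 in simulate_pairs(toy_reference, 2):
    print(record1 + record2, end="")
```

    Because every read carries its true origin, the fraction of reads an aligner places correctly can be computed directly, which is the sense in which such data act as a gold standard.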

  4. Calculation of Tajima's D and other neutrality test statistics from low depth next-generation sequencing data

    DEFF Research Database (Denmark)

    Korneliussen, Thorfinn Sand; Moltke, Ida; Albrechtsen, Anders

    2013-01-01

    A number of different statistics are used for detecting natural selection using DNA sequencing data, including statistics that are summaries of the frequency spectrum, such as Tajima's D. These statistics are now often being applied in the analysis of Next Generation Sequencing (NGS) data. Howeve......, estimates of frequency spectra from NGS data are strongly affected by low sequencing coverage; the inherent technology dependent variation in sequencing depth causes systematic differences in the value of the statistic among genomic regions....
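    For orientation, the classical definition of Tajima's D can be sketched as below. This is the textbook formula computed from known per-site allele counts; it is not the low-depth estimator developed in the record above, and it assumes exactly the kind of error-free counts that low-coverage NGS data do not provide. The function name and the example counts are illustrative assumptions.

```python
import math

def tajimas_d(allele_counts, n):
    """Classical Tajima's D from per-site derived (or minor) allele counts.

    allele_counts: one count per site, out of n sampled sequences.
    Returns NaN when the statistic is undefined (n < 4 or no segregating site).
    """
    counts = [k for k in allele_counts if 0 < k < n]  # segregating sites only
    s = len(counts)
    if n < 4 or s == 0:
        return float("nan")
    # Nucleotide diversity summed over sites: mean number of pairwise differences.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Made-up counts for 10 sampled sequences.
print(tajimas_d([1] * 10, 10) < 0)  # only singletons: excess of rare variants
print(tajimas_d([5] * 10, 10) > 0)  # only intermediate-frequency variants
```

    An excess of rare variants pulls the statistic negative and an excess of intermediate-frequency variants pushes it positive, which is why miscounted alleles at low sequencing depth can bias it.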

  5. Clinical validation of targeted next-generation sequencing for inherited disorders.

    Science.gov (United States)

    Yohe, Sophia; Hauge, Adam; Bunjer, Kari; Kemmer, Teresa; Bower, Matthew; Schomaker, Matthew; Onsongo, Getiria; Wilson, Jon; Erdmann, Jesse; Zhou, Yi; Deshpande, Archana; Spears, Michael D; Beckman, Kenneth; Silverstein, Kevin A T; Thyagarajan, Bharat

    2015-02-01

    Although next-generation sequencing (NGS) can revolutionize molecular diagnostics, several hurdles remain in the implementation of this technology in clinical laboratories. We sought to validate and implement an NGS panel for genetic diagnosis of more than 100 inherited diseases, such as neurologic conditions, congenital hearing loss and eye disorders, developmental disorders, nonmalignant diseases treated by hematopoietic cell transplantation, familial cancers, connective tissue disorders, metabolic disorders, disorders of sexual development, and cardiac disorders. The diagnostic gene panels ranged from 1 to 54 genes, with most panels containing 10 genes or fewer. We used a liquid hybridization-based, target-enrichment strategy to enrich 10 067 exons in 568 genes, followed by NGS with a HiSeq 2000 sequencing system (Illumina, San Diego, California). We successfully sequenced 97.6% (9825 of 10 067) of the targeted exons to obtain a minimum coverage of 20× at all bases. We demonstrated 100% concordance in detecting 19 pathogenic single-nucleotide variations and 11 pathogenic insertion-deletion mutations ranging in size from 1 to 18 base pairs across 18 samples that were previously characterized by Sanger sequencing. Using 4 pairs of blinded, duplicate samples, we demonstrated a high degree of concordance (>99%) among the blinded, duplicate pairs. We have successfully demonstrated the feasibility of using the NGS platform to multiplex genetic tests for several rare diseases and the use of cloud computing for bioinformatics analysis as a relatively low-cost solution for implementing NGS in clinical laboratories.

  6. Chronic Meningitis Investigated via Metagenomic Next-Generation Sequencing

    Science.gov (United States)

    O’Donovan, Brian D.; Gelfand, Jeffrey M.; Sample, Hannah A.; Chow, Felicia C.; Betjemann, John P.; Shah, Maulik P.; Richie, Megan B.; Gorman, Mark P.; Hajj-Ali, Rula A.; Calabrese, Leonard H.; Zorn, Kelsey C.; Chow, Eric D.; Greenlee, John E.; Blum, Jonathan H.; Green, Gary; Khan, Lillian M.; Banerji, Debarko; Langelier, Charles; Bryson-Cahn, Chloe; Harrington, Whitney; Lingappa, Jairam R.; Shanbhag, Niraj M.; Green, Ari J.; Brew, Bruce J.; Soldatos, Ariane; Strnad, Luke; Doernberg, Sarah B.; Jay, Cheryl A.; Douglas, Vanja; Josephson, S. Andrew; DeRisi, Joseph L.

    2018-01-01

    Importance: Identifying infectious causes of subacute or chronic meningitis can be challenging. Enhanced, unbiased diagnostic approaches are needed. Objective: To present a case series of patients with diagnostically challenging subacute or chronic meningitis using metagenomic next-generation sequencing (mNGS) of cerebrospinal fluid (CSF) supported by a statistical framework generated from mNGS of control samples from the environment and from patients who were noninfectious. Design, Setting, and Participants: In this case series, mNGS data obtained from the CSF of 94 patients with noninfectious neuroinflammatory disorders and from 24 water and reagent control samples were used to develop and implement a weighted scoring metric based on z scores at the species and genus levels for both nucleotide and protein alignments to prioritize and rank the mNGS results. Total RNA was extracted for mNGS from the CSF of 7 participants with subacute or chronic meningitis who were recruited between September 2013 and March 2017 as part of a multicenter study of mNGS pathogen discovery among patients with suspected neuroinflammatory conditions. The neurologic infections identified by mNGS in these 7 participants represented a diverse array of pathogens. The patients were referred from the University of California, San Francisco Medical Center (n = 2), Zuckerberg San Francisco General Hospital and Trauma Center (n = 2), Cleveland Clinic (n = 1), University of Washington (n = 1), and Kaiser Permanente (n = 1). A weighted z score was used to filter out environmental contaminants and facilitate efficient data triage and analysis. Main Outcomes and Measures: Pathogens identified by mNGS and the ability of a statistical model to prioritize, rank, and simplify mNGS results. Results: The 7 participants ranged in age from 10 to 55 years, and 3 (43%) were female. A parasitic worm (Taenia solium, in 2 participants), a virus (HIV-1), and 4 fungi (Cryptococcus neoformans
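    The record describes ranking mNGS hits by z scores computed against control samples but does not spell out the weighting. The sketch below therefore shows only the unweighted core of that idea: each taxon's abundance in a patient sample is compared with its distribution across controls, and taxa are sorted by the resulting z score. The function name, the reads-per-million inputs, the `min_sigma` floor, and all taxon names and numbers are assumptions invented for this example, not values from the study.

```python
from statistics import mean, stdev

def rank_taxa(sample_rpm, control_rpm, n_controls, min_sigma=1.0):
    """Rank taxa in one sample by z score against a panel of controls.

    sample_rpm:  {taxon: reads per million in the patient sample}
    control_rpm: {taxon: [reads per million in each control]}
    Taxa never seen in controls get a background of zeros. min_sigma is a
    hypothetical floor that keeps the z score finite when controls do not vary.
    """
    ranked = []
    for taxon, value in sample_rpm.items():
        background = control_rpm.get(taxon, [0.0] * n_controls)
        mu = mean(background)
        sigma = max(stdev(background), min_sigma)
        ranked.append((taxon, (value - mu) / sigma))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Invented numbers: one taxon absent from controls, one common background taxon.
sample = {"pathogen_x": 50.0, "background_taxon": 12.0}
controls = {"background_taxon": [10.0, 14.0, 9.0, 13.0]}
for taxon, z in rank_taxa(sample, controls, n_controls=4):
    print(taxon, round(z, 2))
```

    A taxon that is abundant in the patient but also abundant in water and reagent controls scores near zero and drops down the list, which is the contaminant-filtering effect the abstract refers to.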

  7. Application of next generation sequencing in clinical microbiology and infection prevention.

    Science.gov (United States)

    Deurenberg, Ruud H; Bathoorn, Erik; Chlebowicz, Monika A; Couto, Natacha; Ferdous, Mithila; García-Cobos, Silvia; Kooistra-Smid, Anna M D; Raangs, Erwin C; Rosema, Sigrid; Veloo, Alida C M; Zhou, Kai; Friedrich, Alexander W; Rossen, John W A

    2017-02-10

    Current molecular diagnostics of human pathogens provide limited information that is often not sufficient for outbreak and transmission investigation. Next generation sequencing (NGS) determines the DNA sequence of a complete bacterial genome in a single sequence run, and from these data, information on resistance and virulence, as well as information for typing is obtained, useful for outbreak investigation. The obtained genome data can be further used for the development of an outbreak-specific screening test. In this review, a general introduction to NGS is presented, including the library preparation and the major characteristics of the most common NGS platforms, such as the MiSeq (Illumina) and the Ion PGM™ (ThermoFisher). An overview is given of the software used for NGS data analysis at the medical microbiology diagnostic laboratory of the University Medical Center Groningen in The Netherlands. Furthermore, applications of NGS in the clinical setting are described, such as outbreak management, molecular case finding, characterization and surveillance of pathogens, rapid identification of bacteria using the 16S-23S rRNA region, taxonomy, metagenomics approaches on clinical samples, and the determination of the transmission of zoonotic micro-organisms from animals to humans. Finally, we share our vision on the use of NGS in personalised microbiology in the near future, pointing out specific requirements. Copyright © 2016 The Author(s). Published by Elsevier B.V. All rights reserved.

  8. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    International Nuclear Information System (INIS)

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S.; Singh, Rajesh R.; Roy-Chowdhuri, Sinchita

    2015-01-01

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  9. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Hui [Department of Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States); Luthra, Rajyalakshmi, E-mail: rluthra@mdanderson.org; Goswami, Rashmi S.; Singh, Rajesh R. [Department of Hematopathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States); Roy-Chowdhuri, Sinchita [Department of Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States)

    2015-08-28

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  10. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    Directory of Open Access Journals (Sweden)

    Hui Chen

    2015-08-01

    Full Text Available Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  11. Next-Generation Sequencing for Typing and Detection of ESBL and MBL E. coli causing UTI

    OpenAIRE

    Nabakishore Nayak; Mahesh Chanda Sahu

    2017-01-01

    Next-generation sequencing (NGS) has the potential to provide typing results and detect resistance genes in a single assay, thus guiding timely treatment decisions and allowing rapid tracking of transmission of resistant clones. We will evaluate the performance of a new NGS assay during an outbreak of sequence type 131 (ST131) Escherichia coli infections in a teaching hospital. The assay will be performed on 100 extended-spectrum beta-lactamase (ESBL) E. coli isolates collected from UTI d...

  12. [Molecular and prenatal diagnosis of a family with Fanconi anemia by next generation sequencing].

    Science.gov (United States)

    Gong, Zhuwen; Yu, Yongguo; Zhang, Qigang; Gu, Xuefan

    2015-04-01

    To provide prenatal diagnosis for a pregnant woman who had given birth to a child with Fanconi anemia with combined next-generation sequencing (NGS) and Sanger sequencing. For the affected child, potential mutations of the FANCA gene were analyzed with NGS. Suspected mutation was verified with Sanger sequencing. For prenatal diagnosis, genomic DNA was extracted from cultured fetal amniotic fluid cells and subjected to analysis of the same mutations. A low-frequency frameshifting mutation c.989_995del7 (p.H330LfsX2, inherited from his father) and a truncating mutation c.3971C>T (p.P1324L, inherited from his mother) have been identified in the affected child and considered to be pathogenic. The two mutations were subsequently verified by Sanger sequencing. Upon prenatal diagnosis, the fetus was found to carry two mutations. The combined next-generation sequencing and Sanger sequencing can reduce the time for diagnosis and identify subtypes of Fanconi anemia and the mutational sites, which has enabled reliable prenatal diagnosis of this disease.

  13. PheoSeq : A Targeted Next-Generation Sequencing Assay for Pheochromocytoma and Paraganglioma Diagnostics

    NARCIS (Netherlands)

    Currás-Freixes, Maria; Piñeiro-Yañez, Elena; Montero-Conde, Cristina; Apellániz-Ruiz, María; Calsina, Bruna; Mancikova, Veronika; Remacha, Laura; Richter, Susan; Ercolino, Tonino; Rogowski-Lehmann, Natalie; Deutschbein, Timo; Calatayud, María; Guadalix, Sonsoles; Álvarez-Escolá, Cristina; Lamas, Cristina; Aller, Javier; Sastre-Marcos, Julia; Lázaro, Conxi; Galofré, Juan C.; Patiño-García, Ana; Meoro-Avilés, Amparo; Balmaña-Gelpi, Judith; De Miguel-Novoa, Paz; Balbín, Milagros; Matías-Guiu, Xavier; Letón, Rocío; Inglada-Pérez, Lucía; Torres-Pérez, Rafael; Roldán-Romero, Juan M.; Rodríguez-Antona, Cristina; Fliedner, Stephanie M J; Opocher, Giuseppe; Pacak, Karel; Korpershoek, Esther; de Krijger, Ronald R.; Vroonen, Laurent; Mannelli, Massimo; Fassnacht, Martin; Beuschlein, Felix; Eisenhofer, Graeme; Cascón, Alberto; Al-Shahrour, Fátima; Robledo, Mercedes

    2017-01-01

    Genetic diagnosis is recommended for all pheochromocytoma and paraganglioma (PPGL) cases, as driver mutations are identified in approximately 80% of the cases. As the list of related genes expands, genetic diagnosis becomes more time-consuming, and targeted next-generation sequencing (NGS) has

  14. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    Science.gov (United States)

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
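    The hierarchical delimiters mentioned above can be illustrated with a short parser. The sketch assumes the delimiter precedence of the published GL String grammar, from outermost to innermost: `^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between alleles in phase on one haplotype, and `/` between ambiguous alleles. The function name and the example string are made up for illustration, and the sketch does not validate allele names or produce HML.

```python
# Delimiters ordered from lowest to highest binding precedence.
DELIMITERS = ["^", "|", "+", "~", "/"]

def parse_gl_string(text, level=0):
    """Split a GL String into nested lists, one nesting level per delimiter.

    The result is always five lists deep:
    loci -> genotypes -> gene copies -> phased alleles -> ambiguous alleles.
    """
    if level == len(DELIMITERS):
        return text
    parts = text.split(DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]

# Illustrative genotype: the first gene copy is ambiguous between two alleles.
example = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"
parsed = parse_gl_string(example)
first_copy = parsed[0][0][0][0]   # locus 0, genotype 0, copy 0, haplotype position 0
second_copy = parsed[0][0][1][0]  # locus 0, genotype 0, copy 1, haplotype position 0
print(first_copy)
print(second_copy)
```

    Keeping every level in the output, even when a delimiter is absent, means downstream code can index the structure the same way for simple and highly ambiguous genotypes.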

  15. Efficiency to Discovery Transgenic Loci in GM Rice Using Next Generation Sequencing Whole Genome Re-sequencing

    Directory of Open Access Journals (Sweden)

    Doori Park

    2015-09-01

    Full Text Available Molecular characterization of genetically modified organisms, as well as the way transgenic biotechnologies are developed, now requires full transparency to assess the risk to living modified and non-modified organisms. Next generation sequencing (NGS) methodology is suggested as an effective means in genome characterization and detection of transgenic insertion locations. In the present study, we applied NGS to detect inserted transgenic loci, specifically the epidermal growth factor (EGF), in genetically modified rice cells. A total of 29.3 Gb (~72× coverage) was sequenced with a 2 × 150 bp paired end method by Illumina HiSeq2500, which was consecutively mapped to the rice genome and T-vector sequence. The compatible pairs of reads were successfully mapped to 10 loci on the rice chromosome and vector sequences were validated to the insertion location by polymerase chain reaction (PCR) amplification. The EGF transgenic site was confirmed only on chromosome 4 by PCR. Results of this study demonstrated the success of NGS data to characterize the rice genome. Bioinformatics analyses must be developed in association with NGS data to identify highly accurate transgenic sites.

  16. Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.

    Science.gov (United States)

    Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo

    2015-01-01

    Genetic screening in families with high risk to develop colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants that would need a small set of Sanger sequencing reactions to be solved. The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve

  17. Next generation sequencing assays for benthic monitoring of the environmental impact associated with salmon farming (pilot study)

    DEFF Research Database (Denmark)

    Pawlowski, Jan; Esling, Philippe; Lejzerowicz, Franck

    2015-01-01

    This report presents the study of foraminiferal and metazoan benthic community based on next-generation sequencing (NGS) of environmental DNA and RNA (eDNA/RNA). The objective of this study was to test the application of NGS assays for benthic monitoring of salmon farms in Norway, in order to ove...

  18. Translational Bioinformatics for Diagnostic and Prognostic Prediction of Prostate Cancer in the Next-Generation Sequencing Era

    Directory of Open Access Journals (Sweden)

    Jiajia Chen

    2013-01-01

    The discovery of prostate cancer biomarkers has been boosted by the advent of next-generation sequencing (NGS) technologies. Nevertheless, many challenges still exist in exploiting the flood of sequence data and translating them into routine diagnostics and prognosis of prostate cancer. Here we review the recent developments in prostate cancer biomarkers by high throughput sequencing technologies. We highlight some fundamental issues of translational bioinformatics and the potential use of cloud computing in NGS data processing for the improvement of prostate cancer treatment.

  19. Cloning and Identification of Recombinant Argonaute-Bound Small RNAs Using Next-Generation Sequencing.

    Science.gov (United States)

    Gangras, Pooja; Dayeh, Daniel M; Mabin, Justin W; Nakanishi, Kotaro; Singh, Guramrit

    2018-01-01

    Argonaute proteins (AGOs) are loaded with small RNAs as guides to recognize target mRNAs. Since the target specificity heavily depends on the base complementarity between two strands, it is important to identify small guide and long target RNAs bound to AGOs. For this purpose, next-generation sequencing (NGS) technologies have extended our appreciation truly to the nucleotide level. However, the identification of RNAs via NGS from scarce RNA samples remains a challenge. Further, most commercial and published methods are compatible with either small RNAs or long RNAs, but are not equally applicable to both. Therefore, a single method that yields quantitative, bias-free NGS libraries to identify small and long RNAs from low levels of input will be of wide interest. Here, we introduce such a procedure that is based on several modifications of two published protocols and allows robust, sensitive, and reproducible cloning and sequencing of small amounts of RNAs of variable lengths. The method was applied to the identification of small RNAs bound to a purified eukaryotic AGO. Following ligation of a DNA adapter to the RNA 3'-end, the key feature of this method is to use the adapter for priming reverse transcription (RT), wherein biotinylated deoxyribonucleotides are specifically incorporated into the extended complementary DNA. Such RT products are enriched on streptavidin beads, circularized while immobilized on beads and directly used for PCR amplification. We provide a stepwise guide to generate RNA-Seq libraries, their purification, quantification, validation, and preparation for next-generation sequencing. We also provide basic steps in post-NGS data analyses using Galaxy, an open-source, web-based platform.

  20. Next generation sequencing of pancreatic ductal adenocarcinoma: right or wrong?

    Science.gov (United States)

    Connor, Ashton A; Gallinger, Steven

    2017-07-01

    Pancreatic ductal adenocarcinoma (PDAC) has the highest mortality rate of all epithelial malignancies and a paradoxically rising incidence rate. Clinical translation of next generation sequencing (NGS) of tumour and germline samples may ameliorate outcomes by identifying prognostic and predictive genomic and transcriptomic features in appreciable fractions of patients, facilitating enrolment in biomarker-matched trials. Areas covered: The literature on precision oncology is reviewed. It is found that outcomes may be improved across various malignancies, and it is suggested that current issues of adequate tissue acquisition, turnaround times, analytic expertise and clinical trial accessibility may lessen as experience accrues. Also reviewed are PDAC genomic and transcriptomic NGS studies, emphasizing discoveries of promising biomarkers, though these require validation, and the fraction of patients that will benefit from these outside of the research setting is currently unknown. Expert commentary: NGS in PDAC should be used in investigational contexts in centers with multidisciplinary expertise in cancer sequencing and pancreatic cancer management. Biomarker directed studies will improve our understanding of actionable genomic variation in PDAC, and improve outcomes for this challenging disease.

  1. Best practice guidelines for the use of next-generation sequencing applications in genome diagnostics: a national collaborative study of Dutch genome diagnostic laboratories

    NARCIS (Netherlands)

    Weiss, Marjan M.; van der Zwaag, Bert; Jongbloed, Jan D. H.; Vogel, Maartje J.; Brüggenwirth, Hennie T.; Lekanne Deprez, Ronald H.; Mook, Olaf; Ruivenkamp, Claudia A. L.; van Slegtenhorst, Marjon A.; van den Wijngaard, Arthur; Waisfisz, Quinten; Nelen, Marcel R.; van der Stoep, Nienke

    2013-01-01

    Next-generation sequencing (NGS) methods are being adopted by genome diagnostics laboratories worldwide. However, implementing NGS-based tests according to diagnostic standards is a challenge for individual laboratories. To facilitate the implementation of NGS in Dutch laboratories, the Dutch

  2. Implementation of Cloud based next generation sequencing data analysis in a clinical laboratory.

    Science.gov (United States)

    Onsongo, Getiria; Erdmann, Jesse; Spears, Michael D; Chilton, John; Beckman, Kenneth B; Hauge, Adam; Yohe, Sophia; Schomaker, Matthew; Bower, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat

    2014-05-23

    The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges remain that limit the widespread adoption of NGS testing into clinical practice. One such difficulty is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive to most clinical diagnostics laboratories. To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides additional flexibility, needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the usage of EBS disk to run a sample. We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.

  3. Next-Generation Mitogenomics: A Comparison of Approaches Applied to Caecilian Amphibian Phylogeny

    OpenAIRE

    Maddock, Simon T.; Briscoe, Andrew G.; Wilkinson, Mark; Waeschenbach, Andrea; San Mauro, Diego; Day, Julia J.; Littlewood, D. Tim J.; Foster, Peter G.; Nussbaum, Ronald A.; Gower, David J.

    2016-01-01

    Mitochondrial genome (mitogenome) sequences are being generated with increasing speed due to the advances of next-generation sequencing (NGS) technology and associated analytical tools. However, detailed comparisons to explore the utility of alternative NGS approaches applied to the same taxa have not been undertaken. We compared a ‘traditional’ Sanger sequencing method with two NGS approaches (shotgun sequencing and non-indexed, multiplex amplicon sequencing) on four different sequencing pla...

  4. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Directory of Open Access Journals (Sweden)

    Yu Cao

    2017-09-01

    The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes which in its genomic essence is the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods.

  5. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Science.gov (United States)

    Cao, Yu; Fanning, Séamus; Proos, Sinéad; Jordan, Kieran; Srikumar, Shabarinath

    2017-01-01

    The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes which in its genomic essence is the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods. PMID:29033905

  6. Quantifying low-frequency revertants in oral poliovirus vaccine using next generation sequencing.

    Science.gov (United States)

    Sarcey, Eric; Serres, Aurélie; Tindy, Fabrice; Chareyre, Audrey; Ng, Siemon; Nicolas, Marine; Vetter, Emmanuelle; Bonnevay, Thierry; Abachin, Eric; Mallet, Laurent

    2017-08-01

    Spontaneous reversion to neurovirulence of live attenuated oral poliovirus vaccine (OPV) serotype 3 (chiefly involving the n.472U>C mutation) must be monitored during production to ensure vaccine safety and consistency. Mutant analysis by polymerase chain reaction and restriction enzyme cleavage (MAPREC) has long been endorsed by the World Health Organization as the preferred in vitro test for this purpose; however, it requires radiolabeling, which is no longer supported by many laboratories. We evaluated the performance and suitability of next generation sequencing (NGS) as an alternative to MAPREC. The linearity of NGS was demonstrated at revertant concentrations equivalent to the study range of 0.25%-1.5%. NGS repeatability and intermediate precision were comparable across all tested samples, and NGS was highly reproducible, irrespective of sequencing platform or analysis software used. NGS was performed on OPV serotype 3 working seed lots and monovalent bulks (n=21) that were previously tested using MAPREC, and which covered the representative range of vaccine production. Percentages of 472-C revertants identified by NGS and MAPREC were comparable and highly correlated (r≥0.80), with a Pearson correlation coefficient of 0.95585 (p<0.0001). NGS demonstrated statistically equivalent performance to that of MAPREC for quantifying low-frequency OPV serotype 3 revertants, and offers a valid alternative to MAPREC. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  7. Targeted Next-generation Sequencing and Bioinformatics Pipeline to Evaluate Genetic Determinants of Constitutional Disease.

    Science.gov (United States)

    Dilliott, Allison A; Farhan, Sali M K; Ghani, Mahdi; Sato, Christine; Liang, Eric; Zhang, Ming; McIntyre, Adam D; Cao, Henian; Racacho, Lemuel; Robinson, John F; Strong, Michael J; Masellis, Mario; Bulman, Dennis E; Rogaeva, Ekaterina; Lang, Anthony; Tartaglia, Carmela; Finger, Elizabeth; Zinman, Lorne; Turnbull, John; Freedman, Morris; Swartz, Rick; Black, Sandra E; Hegele, Robert A

    2018-04-04

    Next-generation sequencing (NGS) is quickly revolutionizing how research into the genetic determinants of constitutional disease is performed. The technique is highly efficient with millions of sequencing reads being produced in a short time span and at relatively low cost. Specifically, targeted NGS is able to focus investigations to genomic regions of particular interest based on the disease of study. Not only does this further reduce costs and increase the speed of the process, but it lessens the computational burden that often accompanies NGS. Although targeted NGS is restricted to certain regions of the genome, preventing identification of potential novel loci of interest, it can be an excellent technique when faced with a phenotypically and genetically heterogeneous disease, for which there are previously known genetic associations. Because of the complex nature of the sequencing technique, it is important to closely adhere to protocols and methodologies in order to achieve sequencing reads of high coverage and quality. Further, once sequencing reads are obtained, a sophisticated bioinformatics workflow is utilized to accurately map reads to a reference genome, to call variants, and to ensure the variants pass quality metrics. Variants must also be annotated and curated based on their clinical significance, which can be standardized by applying the American College of Medical Genetics and Genomics Pathogenicity Guidelines. The methods presented herein will display the steps involved in generating and analyzing NGS data from a targeted sequencing panel, using the ONDRISeq neurodegenerative disease panel as a model, to identify variants that may be of clinical significance.

  8. Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories.

    Science.gov (United States)

    Gargis, Amy S; Kalman, Lisa; Lubin, Ira M

    2016-12-01

    Clinical microbiology and public health laboratories are beginning to utilize next-generation sequencing (NGS) for a range of applications. This technology has the potential to transform the field by providing approaches that will complement, or even replace, many conventional laboratory tests. While the benefits of NGS are significant, the complexities of these assays require an evolving set of standards to ensure testing quality. Regulatory and accreditation requirements, professional guidelines, and best practices that help ensure the quality of NGS-based tests are emerging. This review highlights currently available standards and guidelines for the implementation of NGS in the clinical and public health laboratory setting, and it includes considerations for NGS test validation, quality control procedures, proficiency testing, and reference materials. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  9. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.

    Science.gov (United States)

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which can be daunting for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to quickly process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data. The pipeline is particularly useful for members of the plant genetics and breeding community without computational expertise who want to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  10. Reprint of "Application of next generation sequencing in clinical microbiology and infection prevention".

    Science.gov (United States)

    Deurenberg, Ruud H; Bathoorn, Erik; Chlebowicz, Monika A; Couto, Natacha; Ferdous, Mithila; García-Cobos, Silvia; Kooistra-Smid, Anna M D; Raangs, Erwin C; Rosema, Sigrid; Veloo, Alida C M; Zhou, Kai; Friedrich, Alexander W; Rossen, John W A

    2017-05-20

    Current molecular diagnostics of human pathogens provide limited information that is often not sufficient for outbreak and transmission investigation. Next generation sequencing (NGS) determines the DNA sequence of a complete bacterial genome in a single sequence run, and from these data, information on resistance and virulence, as well as information for typing is obtained, useful for outbreak investigation. The obtained genome data can be further used for the development of an outbreak-specific screening test. In this review, a general introduction to NGS is presented, including the library preparation and the major characteristics of the most common NGS platforms, such as the MiSeq (Illumina) and the Ion PGM™ (ThermoFisher). An overview of the software used for NGS data analyses at the medical microbiology diagnostic laboratory in the University Medical Center Groningen in The Netherlands is given. Furthermore, applications of NGS in the clinical setting are described, such as outbreak management, molecular case finding, characterization and surveillance of pathogens, rapid identification of bacteria using the 16S-23S rRNA region, taxonomy, metagenomics approaches on clinical samples, and the determination of the transmission of zoonotic micro-organisms from animals to humans. Finally, we share our vision on the use of NGS in personalised microbiology in the near future, pointing out specific requirements. Copyright © 2017. Published by Elsevier B.V.

  11. RAMBO-K: Rapid and Sensitive Removal of Background Sequences from Next Generation Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Simon H Tausch

    The assembly of viral or endosymbiont genomes from Next Generation Sequencing (NGS) data is often hampered by the predominant abundance of reads originating from the host organism. These reads increase the memory and CPU time usage of the assembler and can lead to misassemblies. We developed RAMBO-K (Read Assignment Method Based On K-mers), a tool which allows rapid and sensitive removal of unwanted host sequences from NGS datasets. Reaching a speed of 10 Megabases/s on 4 CPU cores and a standard hard drive, RAMBO-K is faster than any tool we tested, while showing a consistently high sensitivity and specificity across different datasets. RAMBO-K rapidly and reliably separates reads from different species without data preprocessing. It is suitable as a straightforward standard solution for workflows dealing with mixed datasets. Binaries and source code (Java and Python) are available from http://sourceforge.net/projects/rambok/.
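The general idea of separating reads by k-mer content can be illustrated with a deliberately simplified sketch. This is not RAMBO-K's actual model or code: it assumes a plain set-overlap score against two reference k-mer indexes, and all function names and labels are hypothetical.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(sequences: Iterable[str], k: int) -> Set[str]:
    """Collect every k-mer seen in a collection of reference sequences."""
    index: Set[str] = set()
    for sequence in sequences:
        index |= kmers(sequence, k)
    return index


def classify_read(read: str, target_index: Set[str],
                  background_index: Set[str], k: int) -> str:
    """Assign a read to whichever index shares more of its k-mers.

    Ties, including reads with no hits at all, are left unassigned.
    """
    read_kmers = kmers(read, k)
    target_hits = len(read_kmers & target_index)
    background_hits = len(read_kmers & background_index)
    if target_hits > background_hits:
        return "target"
    if background_hits > target_hits:
        return "background"
    return "unassigned"


if __name__ == "__main__":
    k = 4
    target = build_index(["ACGTACGTTAGC"], k)        # toy stand-in for a viral reference
    background = build_index(["GGGCCCGGGCCCAA"], k)  # toy stand-in for a host reference
    for read in ("ACGTTAG", "GGCCCGG", "TTTTTTT"):
        print(read, classify_read(read, target, background, k))
```

A production tool would need a statistical model and memory-efficient k-mer storage; the set-based version here only shows the shape of the decision.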

  12. The utility of Next Generation Sequencing for molecular diagnostics in Rett syndrome.

    Science.gov (United States)

    Vidal, Silvia; Brandi, Núria; Pacheco, Paola; Gerotina, Edgar; Blasco, Laura; Trotta, Jean-Rémi; Derdak, Sophia; Del Mar O'Callaghan, Maria; Garcia-Cazorla, Àngels; Pineda, Mercè; Armstrong, Judith

    2017-09-25

    Rett syndrome (RTT) is an early-onset neurodevelopmental disorder that almost exclusively affects girls and is totally disabling. Three genes have been identified that cause RTT: MECP2, CDKL5 and FOXG1. However, the etiology of some RTT patients still remains unknown. Recently, next generation sequencing (NGS) has promoted genetic diagnoses because of the quickness and affordability of the method. To evaluate the usefulness of NGS in genetic diagnosis, we present the genetic study of RTT-like patients using different techniques based on this technology. We studied 1577 patients with RTT-like clinical diagnoses and reviewed patients who were previously studied and thought to have RTT genes by Sanger sequencing. Genetically, 477 of 1577 patients with an RTT-like suspicion have been diagnosed. Positive results were found in 30% by Sanger sequencing, 23% with a custom panel, 24% with a commercial panel and 32% with whole exome sequencing. A genetic study using NGS allows the study of a larger number of genes associated with RTT-like symptoms simultaneously, providing genetic study of a wider group of patients as well as significantly reducing the response time and cost of the study.

  13. Next-Generation Sequencing Workflow for NSCLC Critical Samples Using a Targeted Sequencing Approach by Ion Torrent PGM™ Platform.

    Science.gov (United States)

    Vanni, Irene; Coco, Simona; Truini, Anna; Rusmini, Marta; Dal Bello, Maria Giovanna; Alama, Angela; Banelli, Barbara; Mora, Marco; Rijavec, Erika; Barletta, Giulia; Genova, Carlo; Biello, Federica; Maggioni, Claudia; Grossi, Francesco

    2015-12-03

    Next-generation sequencing (NGS) is a cost-effective technology capable of screening several genes simultaneously; however, its application in a clinical context requires an established workflow to acquire reliable sequencing results. Here, we report an optimized NGS workflow analyzing 22 lung cancer-related genes to sequence critical samples such as DNA from formalin-fixed paraffin-embedded (FFPE) blocks and circulating free DNA (cfDNA). Snap frozen and matched FFPE gDNA from 12 non-small cell lung cancer (NSCLC) patients, whose gDNA fragmentation status was previously evaluated using a multiplex PCR-based quality control, were successfully sequenced with Ion Torrent PGM™. The robust bioinformatic pipeline allowed us to correctly call both Single Nucleotide Variants (SNVs) and indels with a detection limit of 5%, achieving 100% specificity and 96% sensitivity. This workflow was also validated in 13 FFPE NSCLC biopsies. Furthermore, a specific protocol for low input gDNA capable of producing good sequencing data with high coverage, high uniformity, and a low error rate was also optimized. In conclusion, we demonstrate the feasibility of obtaining gDNA from FFPE samples suitable for NGS by performing appropriate quality controls. The optimized workflow, capable of screening low input gDNA, highlights NGS as a potential tool in the detection, disease monitoring, and treatment of NSCLC.

  14. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    Directory of Open Access Journals (Sweden)

    So Mee Kwon

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research.

  15. Next-Generation Sequencing-Based Detection of Germline Copy Number Variations in BRCA1/BRCA2

    DEFF Research Database (Denmark)

    Schmidt, Ane Y; Hansen, Thomas V O; Ahlborn, Lise B

    2017-01-01

    Genetic testing of BRCA1/2 includes screening for single nucleotide variants and small insertions/deletions and for larger copy number variations (CNVs), primarily by Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). With the advent of next-generation sequencing (NGS)...

  16. Cracking the Code of Human Diseases Using Next-Generation Sequencing: Applications, Challenges, and Perspectives

    Directory of Open Access Journals (Sweden)

    Vincenza Precone

    2015-01-01

    Next-generation sequencing (NGS) technologies have greatly impacted every field of molecular research mainly because they reduce costs and increase throughput of DNA sequencing. These features, together with the technology’s flexibility, have opened the way to a variety of applications including the study of the molecular basis of human diseases. Several analytical approaches have been developed to selectively enrich regions of interest from the whole genome in order to identify germinal and/or somatic sequence variants and to study DNA methylation. These approaches are now widely used in research, and they are already being used in routine molecular diagnostics. However, some issues are still controversial, namely, standardization of methods, data analysis and storage, and ethical aspects. Besides providing an overview of the NGS-based approaches most frequently used to study the molecular basis of human diseases at DNA level, we discuss the principal challenges and applications of NGS in the field of human genomics.

  17. Enrichment of megabase-sized DNA molecules for single-molecule optical mapping and next-generation sequencing

    DEFF Research Database (Denmark)

    Łopacińska-Jørgensen, Joanna M; Pedersen, Jonas Nyvold; Bak, Mads

    2017-01-01

    Next-generation sequencing (NGS) has caused a revolution, yet left a gap: long-range genetic information from native, non-amplified DNA fragments is unavailable. It might be obtained by optical mapping of megabase-sized DNA molecules. Frequently only a specific genomic region is of interest, so...

  18. Analysis and Visualization of ChIP-Seq and RNA-Seq Sequence Alignments Using ngs.plot.

    Science.gov (United States)

    Loh, Yong-Hwee Eddie; Shen, Li

    2016-01-01

    The continual maturation and increasing applications of next-generation sequencing technology in scientific research have yielded ever-increasing amounts of data that need to be effectively and efficiently analyzed and innovatively mined for new biological insights. We have developed ngs.plot, a quick and easy-to-use bioinformatics tool that performs visualizations of the spatial relationships between sequencing alignment enrichment and specific genomic features or regions. More importantly, ngs.plot is customizable beyond the use of standard genomic feature databases to allow the analysis and visualization of user-specified regions of interest generated by the user's own hypotheses. In this protocol, we demonstrate and explain the use of ngs.plot using command line executions, as well as a web-based workflow on the Galaxy framework. We replicate the underlying commands used in the analysis of a true biological dataset that we had reported and published earlier and demonstrate how ngs.plot can easily generate publication-ready figures. With ngs.plot, users would be able to efficiently and innovatively mine their own datasets without having to be involved in the technical aspects of sequence coverage calculations and genomic databases.

  19. Next-generation sequencing of multiple individuals per barcoded library by deconvolution of sequenced amplicons using endonuclease fragment analysis

    DEFF Research Database (Denmark)

    Andersen, Jeppe D; Pereira, Vania; Pietroni, Carlotta

    2014-01-01

    The simultaneous sequencing of samples from multiple individuals increases the efficiency of next-generation sequencing (NGS) while also reducing costs. Here we describe a novel and simple approach for sequencing DNA from multiple individuals per barcode. Our strategy relies on the endonuclease digestion of PCR amplicons prior to library preparation, creating a specific fragment pattern for each individual that can be resolved after sequencing. By using both barcodes and restriction fragment patterns, we demonstrate the ability to sequence the human melanocortin 1 receptor (MC1R) genes from 72 individuals using only 24 barcoded libraries.
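Combining two labels multiplies the number of individuals that can be told apart: 72 individuals across 24 barcoded libraries works out to three per library on average. The sketch below only illustrates that bookkeeping; the even allocation, the pattern count of three, and all names are assumptions, since the abstract does not describe how individuals were actually distributed.

```python
from itertools import product
from typing import Dict, Tuple


def assign_labels(n_individuals: int, n_barcodes: int,
                  n_patterns: int) -> Dict[int, Tuple[int, int]]:
    """Give each individual a unique (barcode, fragment pattern) pair.

    Raises ValueError when there are more individuals than label pairs.
    """
    capacity = n_barcodes * n_patterns
    if n_individuals > capacity:
        raise ValueError(
            f"{n_individuals} individuals exceed capacity of {capacity} label pairs"
        )
    label_pairs = list(product(range(n_barcodes), range(n_patterns)))
    return {individual: label_pairs[individual] for individual in range(n_individuals)}


if __name__ == "__main__":
    # Hypothetical even split: 24 barcodes x 3 fragment patterns = 72 label pairs.
    labels = assign_labels(n_individuals=72, n_barcodes=24, n_patterns=3)
    print(len(set(labels.values())), "distinct (barcode, pattern) pairs")
```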

  20. Statistical framework for detection of genetically modified organisms based on Next Generation Sequencing.

    Science.gov (United States)

    Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy

    2016-02-01

    Because the number and diversity of genetically modified (GM) crops have significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers have already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
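    The idea of predicting how many reads are needed can be illustrated with a deliberately simple model. The sketch below is not the authors' framework: it assumes each read independently originates from the target sequence with a fixed probability (`target_fraction`, a hypothetical parameter) and asks how many reads are needed to observe at least one target read with a chosen confidence.

```python
import math


def reads_for_detection(target_fraction, confidence=0.95):
    """Smallest read count N such that P(at least one target read) >= confidence.

    Assumes independent reads, each hitting the target with probability
    `target_fraction` (a simplifying assumption, not the published model):
        1 - (1 - f)^N >= c   =>   N >= ln(1 - c) / ln(1 - f)
    """
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - target_fraction))


if __name__ == "__main__":
    # Illustrative fractions only; real values depend on genome size,
    # transgene length and sample composition.
    for fraction in (1e-4, 1e-5, 1e-6):
        print(fraction, reads_for_detection(fraction))
```

    Under this model the required read count scales roughly inversely with the target fraction, so a tenfold dilution of the GM material needs about ten times as many reads.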

  1. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    Science.gov (United States)

    Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  2. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    Directory of Open Access Journals (Sweden)

    John Archer

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  3. Comparing microarrays and next-generation sequencing technologies for microbial ecology research.

    Science.gov (United States)

    Roh, Seong Woon; Abell, Guy C J; Kim, Kyoung-Ho; Nam, Young-Do; Bae, Jin-Woo

    2010-06-01

    Recent advances in molecular biology have resulted in the application of DNA microarrays and next-generation sequencing (NGS) technologies to the field of microbial ecology. This review aims to examine the strengths and weaknesses of each of the methodologies, including depth and ease of analysis, throughput and cost-effectiveness. It also intends to highlight the optimal application of each of the individual technologies toward the study of a particular environment and identify potential synergies between the two main technologies, whereby both sample number and coverage can be maximized. We suggest that the efficient use of microarray and NGS technologies will allow researchers to advance the field of microbial ecology, and importantly, improve our understanding of the role of microorganisms in their various environments.

  4. Next Generation Sequencing Methods for Diagnosis of Epilepsy Syndromes

    Directory of Open Access Journals (Sweden)

    Paul Dunn

    2018-02-01

    Epilepsy is a neurological disorder characterized by an increased predisposition for seizures. Although this definition suggests that it is a single disorder, epilepsy encompasses a group of disorders with diverse aetiologies and outcomes. A genetic basis for epilepsy syndromes has been postulated for several decades, with several mutations in specific genes identified that have increased our understanding of the genetic influence on epilepsies. With 70-80% of epilepsy cases identified to have a genetic cause, there are now hundreds of genes identified to be associated with epilepsy syndromes which can be analyzed using next generation sequencing (NGS) techniques such as targeted gene panels, whole exome sequencing (WES) and whole genome sequencing (WGS). For effective use of these methodologies, diagnostic laboratories and clinicians require information on the relevant workflows including analysis and sequencing depth to understand the specific clinical application and diagnostic capabilities of these gene sequencing techniques. As epilepsy is a complex disorder, the differences associated with each technique influence the ability to form a diagnosis along with an accurate detection of the genetic etiology of the disorder. In addition, for diagnostic testing, an important parameter is the cost-effectiveness and the specific diagnostic outcome of each technique. Here, we review these commonly used NGS techniques to determine their suitability for application to epilepsy genetic diagnostic testing.

  5. Depletion of Human DNA in Spiked Clinical Specimens for Improvement of Sensitivity of Pathogen Detection by Next-Generation Sequencing

    OpenAIRE

    Hasan, Mohammad R.; Rawat, Arun; Tang, Patrick; Jithesh, Puthen V.; Thomas, Eva; Tan, Rusung; Tilley, Peter

    2016-01-01

    Next-generation sequencing (NGS) technology has shown promise for the detection of human pathogens from clinical samples. However, one of the major obstacles to the use of NGS in diagnostic microbiology is the low ratio of pathogen DNA to human DNA in most clinical specimens. In this study, we aimed to develop a specimen-processing protocol to remove human DNA and enrich specimens for bacterial and viral DNA for shotgun metagenomic sequencing. Cerebrospinal fluid (CSF) and nasopharyngeal aspi...

  6. Implementation of Targeted Next Generation Sequencing in Clinical Diagnostics

    DEFF Research Database (Denmark)

    Larsen, Martin Jakob; Burton, Mark; Thomassen, Mads

    Accurate mutation detection is essential in clinical genetic diagnostics of monogenic hereditary diseases. Targeted next generation sequencing (NGS) provides a promising and cost-effective alternative to Sanger sequencing and MLPA analysis currently used in most diagnostic laboratories. One...... of mutation positive controls previously characterized by Sanger/MLPA analysis. Agilent SureSelect Target-Enrichment kits were used for capturing a set of genes associated with hereditary breast and ovarian cancer syndrome and a compilation of genes involved in multiple rare single gene disorders......, respectively. For diagnostics, the sequencing coverage is essential, wherefore a minimum coverage of 30x per nucleotide in the coding regions was used as our primary quality criterion. For the majority of the included genes, we obtained adequate gene coverage, in which we were able to detect 100% of the known...
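    The minimum-coverage criterion described above (30x per nucleotide in coding regions) can be sketched as a simple check. This is a minimal illustration, not the authors' pipeline: it assumes per-base depths for a region are already available as a list of integers, and the function names are hypothetical.

```python
def positions_below_threshold(depths, min_depth=30):
    """Return 0-based offsets of bases whose depth is below min_depth."""
    return [offset for offset, depth in enumerate(depths) if depth < min_depth]


def passes_coverage(depths, min_depth=30):
    """True only if the region is non-empty and every base reaches min_depth."""
    return len(depths) > 0 and min(depths) >= min_depth


if __name__ == "__main__":
    # Made-up per-base depths for a short coding region.
    exon_depths = [45, 52, 31, 29, 30, 12, 60]
    print(passes_coverage(exon_depths))            # False
    print(positions_below_threshold(exon_depths))  # [3, 5]
```

    Reporting the failing positions, rather than only a pass/fail flag, makes it easier to decide which regions would need follow-up by another method.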

  7. Authentication of Herbal Supplements Using Next-Generation Sequencing.

    Directory of Open Access Journals (Sweden)

    Natalia V Ivanova

    DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should

  8. Authentication of Herbal Supplements Using Next-Generation Sequencing.

    Science.gov (United States)

    Ivanova, Natalia V; Kuzmina, Maria L; Braukmann, Thomas W A; Borisenko, Alex V; Zakharov, Evgeny V

    2016-01-01

    DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an

  9. Computational Approach to Annotating Variants of Unknown Significance in Clinical Next Generation Sequencing.

    Science.gov (United States)

    Schulz, Wade L; Tormey, Christopher A; Torres, Richard

    2015-01-01

    Next generation sequencing (NGS) has become a common technology in the clinical laboratory, particularly for the analysis of malignant neoplasms. However, most mutations identified by NGS are variants of unknown clinical significance (VOUS). Although the approach to define these variants differs by institution, software algorithms that predict variant effect on protein function may be used. However, these algorithms commonly generate conflicting results, potentially adding uncertainty to interpretation. In this review, we examine several computational tools used to predict whether a variant has clinical significance. In addition to describing the role of these tools in clinical diagnostics, we assess their efficacy in analyzing known pathogenic and benign variants in hematologic malignancies. Copyright© by the American Society for Clinical Pathology (ASCP).

  10. How Next-Generation Sequencing Has Aided Our Understanding of the Sequence Composition and Origin of B Chromosomes

    Directory of Open Access Journals (Sweden)

    Alevtina Ruban

    2017-10-01

    Accessory, supernumerary, or, most simply, B chromosomes are found in many eukaryotic karyotypes. These small chromosomes do not follow the usual pattern of segregation, but rather are transmitted at a higher than expected frequency. As is increasingly being demonstrated by next-generation sequencing (NGS), their structure comprises fragments of standard (A) chromosomes, although in some plant species, their sequence also includes contributions from organellar genomes. Transcriptomic analyses of various animal and plant species have revealed that, contrary to what used to be the common belief, some of the B chromosome DNA is protein-encoding. This review summarizes the progress in understanding B chromosome biology enabled by the application of next-generation sequencing technology and state-of-the-art bioinformatics. In particular, a contrast is drawn between a direct sequencing approach and a strategy based on comparative genomics as alternative routes that can be taken towards the identification of B chromosome sequences.

  11. MicroRNA Profiling in Aqueous Humor of Individual Human Eyes by Next-Generation Sequencing.

    Science.gov (United States)

    Wecker, Thomas; Hoffmeier, Klaus; Plötner, Anne; Grüning, Björn Andreas; Horres, Ralf; Backofen, Rolf; Reinhard, Thomas; Schlunck, Günther

    2016-04-01

    Extracellular microRNAs (miRNAs) in aqueous humor were suggested to have a role in transcellular signaling and may serve as disease biomarkers. The authors adopted next-generation sequencing (NGS) techniques to further characterize the miRNA profile in single samples of 60 to 80 μL human aqueous humor. Samples were obtained at the outset of cataract surgery in nine independent, otherwise healthy eyes. Four samples were used to extract RNA and generate sequencing libraries, followed by an adapter-driven amplification step, electrophoretic size selection, sequencing, and data analysis. Five samples were used for quantitative PCR (qPCR) validation of NGS results. Published NGS data on circulating miRNAs in blood were analyzed in comparison. One hundred fifty-eight miRNAs were consistently detected by NGS in all four samples; an additional 59 miRNAs were present in at least three samples. The aqueous humor miRNA profile shows some overlap with published NGS-derived inventories of circulating miRNAs in blood plasma with high prevalence of human miR-451a, -21, and -16. In contrast to blood, miR-184, -4448, -30a, -29a, -29c, -19a, -30d, -205, -24, -22, and -3074 were detected among the 20 most prevalent miRNAs in aqueous humor. Relative expression patterns of miR-451a, -202, and -144 suggested by NGS were confirmed by qPCR. Our data illustrate the feasibility of miRNA analysis by NGS in small individual aqueous humor samples. Intraocular cells as well as blood plasma contribute to the extracellular aqueous humor miRNome. The data suggest possible roles of miRNA in intraocular cell adhesion and signaling by TGF-β and Wnt, which are important in intraocular pressure regulation and glaucoma.

  12. Enrichment of megabase-sized DNA molecules for single-molecule optical mapping and next-generation sequencing

    DEFF Research Database (Denmark)

    Łopacińska-Jørgensen, Joanna M; Pedersen, Jonas Nyvold; Bak, Mads

    2017-01-01

    Next-generation sequencing (NGS) has caused a revolution, yet left a gap: long-range genetic information from native, non-amplified DNA fragments is unavailable. It might be obtained by optical mapping of megabase-sized DNA molecules. Frequently only a specific genomic region is of interest, so......-megabase- to megabase-sized DNA molecules were recovered from the gel and analysed by denaturation-renaturation optical mapping. Size-selected molecules from the same gel were sequenced by NGS. The optically mapped molecules and the NGS reads showed enrichment from regions defined by NotI restriction sites. We...... demonstrate that the unannotated genome can be characterized in a locus-specific manner via molecules partially overlapping with the annotated genome. The method is a promising tool for investigation of structural variants in enriched human genomic regions for both research and diagnostic purposes. Our...

  13. Identification of a Novel De Novo Heterozygous Deletion in the SOX10 Gene in Waardenburg Syndrome Type II Using Next-Generation Sequencing.

    Science.gov (United States)

    Li, Haonan; Jin, Peng; Hao, Qian; Zhu, Wei; Chen, Xia; Wang, Ping

    2017-11-01

    Waardenburg syndrome (WS) is a rare autosomal dominant disorder associated with pigmentation abnormalities and sensorineural hearing loss. In this study, we investigated the genetic cause of WSII in a patient and evaluated the reliability of the targeted next-generation exome sequencing method for the genetic diagnosis of WS. Clinical evaluations were conducted on the patient and targeted next-generation sequencing (NGS) was used to identify the candidate genes responsible for WSII. Multiplex ligation-dependent probe amplification (MLPA) and real-time quantitative polymerase chain reaction (qPCR) were performed to confirm the targeted NGS results. Targeted NGS detected the entire deletion of the coding sequence (CDS) of the SOX10 gene in the WSII patient. MLPA results indicated that all exons of the SOX10 heterozygous deletion were detected; no aberrant copy number in the PAX3 and microphthalmia-associated transcription factor (MITF) genes was found. Real-time qPCR results identified the mutation as a de novo heterozygous deletion. This is the first report of using a targeted NGS method for WS candidate gene sequencing; its accuracy was verified by using the MLPA and qPCR methods. Our research provides a valuable method for the genetic diagnosis of WS.

  14. Identification of parasitic communities within European ticks using next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah Bonnet

    2014-03-01

    Risk assessment of tick-borne and zoonotic disease emergence necessitates sound knowledge of the particular microorganisms circulating within the communities of these major vectors. Assessment of pathogens carried by wild ticks must be performed without a priori assumptions, to allow for the detection of new or unexpected agents. We evaluated the potential of Next-Generation Sequencing (NGS) techniques to produce an inventory of parasites carried by questing ticks. Sequences corresponding to parasites from two distinct genera were recovered in Ixodes ricinus ticks collected in Eastern France: Babesia spp. and Theileria spp. Four Babesia species were identified, three of which were zoonotic: B. divergens, Babesia sp. EU1 and B. microti; and one which infects cattle, B. major. This is the first time that these last two species have been identified in France. This approach also identified new sequences corresponding to as-yet unknown organisms similar to tropical Theileria species. Our findings demonstrate the capability of NGS to produce an inventory of live tick-borne parasites, which could potentially be transmitted by the ticks, and uncover unexpected parasites in Western Europe.

  15. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

    Science.gov (United States)

    Kwok, Hin; Chiang, Alan Kwok Shing

    2016-02-24

    Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  16. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes

    Directory of Open Access Journals (Sweden)

    Hin Kwok

    2016-02-01

    Genomic sequences of Epstein–Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  17. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    Science.gov (United States)

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by an NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide to relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  18. The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

    Science.gov (United States)

    Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

    2015-01-01

    The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices, which consisted of two heritable disease-specific assays, library preparation reagents and an NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.

  19. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    Science.gov (United States)

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, the Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype blasted query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, blast score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses.

  20. Next-generation sequencing in NSCLC and melanoma patients: a cost and budget impact analysis

    Science.gov (United States)

    van Amerongen, Rosa A; Retèl, Valesca P; Coupé, Veerle MH; Nederlof, Petra M; Vogel, Maartje J; van Harten, Wim H

    2016-01-01

    Next-generation sequencing (NGS) has reached the molecular diagnostic laboratories. Although the NGS technology aims to improve the effectiveness of therapies by selecting the most promising therapy, concerns are that NGS testing is expensive and that the ‘benefits’ are not yet in relation to these costs. In this study, we give an estimation of the costs and an institutional and national budget impact of various types of NGS tests in non-small-cell lung cancer (NSCLC) and melanoma patients within The Netherlands. First, an activity-based costing (ABC) analysis has been conducted on the costs of two examples of NGS panels (small- and medium-targeted gene panel (TGP)) based on data of The Netherlands Cancer Institute (NKI). Second, we performed a budget impact analysis (BIA) to estimate the current (2015) and future (2020) budget impact of NGS on molecular diagnostics for NSCLC and melanoma patients in The Netherlands. Literature, expert opinions, and a data set of patients within the NKI (n = 172) have been included in the BIA. Based on our analysis, we expect that the NGS test cost concerns will be limited. In the current situation, NGS can indeed result in higher diagnostic test costs, which is mainly related to required additional tests besides the small TGP. However, in the future, we expect that the use of whole-genome sequencing (WGS) will increase, for which it is expected that additional tests can be (partly) avoided. Although the current clinical benefits are expected to be limited, the research potentials of NGS are already an important advantage. PMID:27899957

  1. Identification of a disease-causing mutation in a Chinese patient with retinitis pigmentosa by targeted next-generation sequencing

    DEFF Research Database (Denmark)

    Xiao, Jianping; Guo, Xueqin; Wang, Yong

    2017-01-01

    Purpose: To identify disease-causing mutations in a Chinese patient with retinitis pigmentosa (RP). Methods: A detailed clinical examination was performed on the proband. Targeted next-generation sequencing (NGS) combined with bioinformatics analysis was performed on the proband to detect candidate...

  2. Discovery and validation of Barrett's esophagus microRNA transcriptome by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Ajay Bansal

    Full Text Available Barrett's esophagus (BE) is a transition from squamous to columnar mucosa as a result of gastroesophageal reflux disease (GERD). The role of microRNA during this transition has not been systematically studied. For initial screening, total RNA from 5 GERD and 6 BE patients was size fractionated. RNA <70 nucleotides was subjected to SOLiD 3 library preparation and next generation sequencing (NGS). Bioinformatics analysis was performed using R package "DEseq". A p value <0.05 adjusted for a false discovery rate of 5% was considered significant. NGS-identified miRNA were validated using qRT-PCR in an independent group of 40 GERD and 27 BE patients. MicroRNA expression of human BE tissues was also compared with three BE cell lines. NGS detected 19.6 million raw reads per sample. 53.1% of filtered reads mapped to miRBase version 18. NGS analysis followed by qRT-PCR validation found 10 differentially expressed miRNA; several are novel (-708-5p, -944, -224-5p and -3065-5p). Up- or down-regulation predicted by NGS was matched by qRT-PCR in every case. Human BE tissues and BE cell lines showed a high degree of concordance (70-80%) in miRNA expression. Prediction analysis identified targets that mapped to developmental signaling pathways such as TGFβ and Notch and inflammatory pathways such as toll-like receptor signaling and TGFβ. Cluster analysis found similarly regulated (up or down) miRNA to share common targets, suggesting coordination between miRNA. Using highly sensitive next-generation sequencing, we have performed a comprehensive genome wide analysis of microRNA in BE and GERD patients. Differentially expressed miRNA between BE and GERD have been further validated. Expression of miRNA between BE human tissues and BE cell lines are highly correlated. These miRNA should be studied in biological models to further understand BE development.
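
    The significance criterion in this record (p value below 0.05 after adjustment for a 5% false discovery rate) can be illustrated with a small sketch. The abstract does not name the adjustment procedure; the Benjamini-Hochberg step-up method is assumed here, and the p-values are invented purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five miRNAs (not taken from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(example_p)
significant = [p < 0.05 for p in adjusted]
print(list(zip(example_p, adjusted, significant)))
```

    In this toy input the third and fourth raw p-values are below 0.05 but no longer pass after adjustment, which is the effect the FDR control is meant to have.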

  3. Next-generation Sequencing-based genomic profiling: Fostering innovation in cancer care?

    Directory of Open Access Journals (Sweden)

    Gustavo S. Fernandes

    Full Text Available OBJECTIVES: With the development of next-generation sequencing (NGS) technologies, DNA sequencing has been increasingly utilized in clinical practice. Our goal was to investigate the impact of genomic evaluation on treatment decisions for heavily pretreated patients with metastatic cancer. METHODS: We analyzed metastatic cancer patients from a single institution whose cancers had progressed after all available standard-of-care therapies and whose tumors underwent next-generation sequencing analysis. We determined the percentage of patients who received any therapy directed by the test, and its efficacy. RESULTS: From July 2013 to December 2015, 185 consecutive patients were tested using a commercially available next-generation sequencing-based test, and 157 patients were eligible. Sixty-six patients (42.0%) were female, and 91 (58.0%) were male. The mean age at diagnosis was 52.2 years, and the mean number of pre-test lines of systemic treatment was 2.7. One hundred and seventy-seven patients (95.6%) had at least one identified gene alteration. Twenty-four patients (15.2%) underwent systemic treatment directed by the test result. Of these, one patient had a complete response, four (16.7%) had partial responses, two (8.3%) had stable disease, and 17 (70.8%) had disease progression as the best result. The median progression-free survival time with matched therapy was 1.6 months, and the median overall survival was 10 months. CONCLUSION: We identified a high prevalence of gene alterations using a next-generation sequencing test. Although some benefit was associated with the matched therapy, most of the patients had disease progression as the best response, indicating the limited biological potential and unclear clinical relevance of this practice.

  4. Genetic architecture of retinal and macular degenerative diseases: the promise and challenges of next-generation sequencing

    Science.gov (United States)

    2013-01-01

    Inherited retinal degenerative diseases (RDDs) display wide variation in their mode of inheritance, underlying genetic defects, age of onset, and phenotypic severity. Molecular mechanisms have not been delineated for many retinal diseases, and treatment options are limited. In most instances, genotype-phenotype correlations have not been elucidated because of extensive clinical and genetic heterogeneity. Next-generation sequencing (NGS) methods, including exome, genome, transcriptome and epigenome sequencing, provide novel avenues towards achieving comprehensive understanding of the genetic architecture of RDDs. Whole-exome sequencing (WES) has already revealed several new RDD genes, whereas RNA-Seq and ChIP-Seq analyses are expected to uncover novel aspects of gene regulation and biological networks that are involved in retinal development, aging and disease. In this review, we focus on the genetic characterization of retinal and macular degeneration using NGS technology and discuss the basic framework for further investigations. We also examine the challenges of NGS application in clinical diagnosis and management. PMID:24112618

  5. BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment

    Science.gov (United States)

    Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy

    2016-01-01

    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in-vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome. PMID:27461955
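
    The idea of computing a mutagenesis efficiency per sample, for many samples at once, can be sketched as follows. This is not BATCH-GE's code or interface: the abstract does not give the formula, so the efficiency is assumed here to be the percentage of reads covering the target site that carry an edit, and all names and counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleCounts:
    """Read counts at the target site for one sample (hypothetical structure)."""
    name: str
    total_reads: int
    edited_reads: int


def mutagenesis_efficiency(sample: SampleCounts) -> Optional[float]:
    """Percentage of reads carrying an edit; None when no reads cover the site."""
    if sample.total_reads == 0:
        return None
    return 100.0 * sample.edited_reads / sample.total_reads


# Invented counts for two samples processed in one batch.
samples = [
    SampleCounts("sample_A", total_reads=2000, edited_reads=500),
    SampleCounts("sample_B", total_reads=0, edited_reads=0),
]
report = {s.name: mutagenesis_efficiency(s) for s in samples}
print(report)
```

    Returning None for an uncovered site keeps a failed sample from being reported as 0% efficiency.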

  6. Targeted next generation sequencing reveals a novel intragenic deletion of the TPO gene in a family with intellectual disability

    NARCIS (Netherlands)

    Iqbal, Z.; Neveling, K.; Razzaq, A.; Shahzad, M.; Zahoor, M.Y.; Qasim, M.; Gilissen, C.F.H.A.; Wieskamp, N.; Kwint, M.P.; Gijsen, S.; de Brouwer, A.P.; Veltman, J.A.; Riazuddin, S.; Bokhoven, J.H.L.M. van

    2012-01-01

    BACKGROUNDS AND AIMS: Next generation sequencing (NGS) approaches have revolutionized the identification of mutations underlying genetic disorders. This technology is particularly useful for the identification of mutations in known and new genes for conditions with extensive genetic heterogeneity.

  7. Creative Destruction: Next Generation Sequencing in Drug Development, Formulary Evaluations and Pricing

    Directory of Open Access Journals (Sweden)

    Paul C Langley

    2016-11-01

    Full Text Available Next generation sequencing (NGS) has the potential to disrupt not only the accepted process of drug development but also the hurdles a drug manufacturer would be expected to face in securing formulary approval and a possible premium price for the new compound. The purpose of this commentary is to consider the role of NGS in this process, one which is characterized as a process of creative destruction, where adoption of NGS in personalized medicine sets in train a mechanism of incessant product and process review, a mechanism driven by continuing modifications and extensions to NGS platforms as our understanding of the role of mutations and mutation load in therapy choice expands. At the same time this mechanism has significant implications for the continued revision of treatment guidelines and their adoption of NGS as integral parts of the treatment pathway. There are, however, a number of unresolved issues which have to be addressed. These include the choice of NGS platform, barriers to integrating evidence to support NGS-based therapy choices in treatment guidelines, the implications of NGS for drug development and the modification or rejection of current trial structures, the integration of comorbid disease states and the standards that formulary committees should adopt to evaluate NGS claims. The overarching theme, however, is the need to invest in a robust and credible evidence base. While we are a long way from achieving this, the focus must be on putting claims for therapy choice forward that are credible, evaluable and replicable. Type: Commentary

  8. Investigation of next-generation sequencing data of Klebsiella pneumoniae using web-based tools.

    Science.gov (United States)

    Brhelova, Eva; Antonova, Mariya; Pardy, Filip; Kocmanova, Iva; Mayer, Jiri; Racil, Zdenek; Lengerova, Martina

    2017-11-01

    Rapid identification and characterization of multidrug-resistant Klebsiella pneumoniae strains is necessary due to the increasing frequency of severe infections in patients. The decreasing cost of next-generation sequencing enables us to obtain a comprehensive overview of genetic information in one step. The aim of this study is to demonstrate and evaluate the utility and scope of the application of web-based databases to next-generation sequenced (NGS) data. The whole genomes of 11 clinical Klebsiella pneumoniae isolates were sequenced using Illumina MiSeq. Selected web-based tools were used to identify a variety of genetic characteristics, such as acquired antimicrobial resistance genes, multilocus sequence types and plasmid replicons, and to identify virulence factors, such as virulence genes, cps clusters, urease-nickel clusters and efflux systems. Using web-based tools hosted by the Center for Genomic Epidemiology, we detected resistance to 8 main antimicrobial groups with at least 11 acquired resistance genes. The isolates were divided into eight sequence types (ST11, 23, 37, 323, 433, 495 and 562, and a new one, ST1646). All of the isolates carried replicons of large plasmids. Capsular types, virulence factors and genes coding AcrAB and OqxAB efflux pumps were detected using BIGSdb-Kp, whereas the selected virulence genes, identified in almost all of the isolates, were detected using CLC Genomic Workbench software. Applying appropriate web-based online tools to NGS data enables the rapid extraction of comprehensive information that can be used for more efficient diagnosis and treatment of patients, while data processing is free of charge, easy and time-efficient.

  9. A machine learning model to determine the accuracy of variant calls in capture-based next generation sequencing.

    Science.gov (United States)

    van den Akker, Jeroen; Mishne, Gilad; Zimmer, Anjali D; Zhou, Alicia Y

    2018-04-17

    Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still require orthogonal confirmation exists. For this reason, many clinical laboratories confirm NGS results using orthogonal technologies such as Sanger sequencing. Here, we report the development of a deterministic machine-learning-based model to differentiate between these two types of variant calls: those that do not require confirmation using an orthogonal technology (high confidence), and those that require additional quality testing (low confidence). This approach allows reliable NGS-based calling in a clinical setting by identifying the few important variant calls that require orthogonal confirmation. We developed and tested the model using a set of 7179 variants identified by a targeted NGS panel and re-tested by Sanger sequencing. The model incorporated several signals of sequence characteristics and call quality to determine if a variant was identified at high or low confidence. The model was tuned to eliminate false positives, defined as variants that were called by NGS but not confirmed by Sanger sequencing. The model achieved very high accuracy: 99.4% (95% confidence interval: +/- 0.03%). It categorized 92.2% (6622/7179) of the variants as high confidence, and 100% of these were confirmed to be present by Sanger sequencing. Among the variants that were categorized as low confidence, defined as NGS calls of low quality that are likely to be artifacts, 92.1% (513/557) were found to be not present by Sanger sequencing.
This work shows that NGS data contains sufficient characteristics for a machine-learning-based model to
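
    The proportions quoted in this record follow directly from the reported counts. The short check below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Counts reported in the abstract.
total_variants = 7179
high_confidence = 6622
low_conf_not_confirmed = 513

# Variants not labelled high confidence fall into the low-confidence group.
low_confidence = total_variants - high_confidence

# Share of all variants labelled high confidence.
pct_high = round(100 * high_confidence / total_variants, 1)
# Share of low-confidence calls that Sanger sequencing did not confirm.
pct_low_absent = round(100 * low_conf_not_confirmed / low_confidence, 1)

print(low_confidence, pct_high, pct_low_absent)
```

    The low-confidence group size obtained by subtraction agrees with the denominator of 557 given in the abstract.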

  10. Mutation Detection in Patients with Retinal Dystrophies Using Targeted Next Generation Sequencing.

    Directory of Open Access Journals (Sweden)

    Nicole Weisschuh

    Full Text Available Retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different nonsyndromic and syndromic forms of RD can be attributed to mutations in more than 200 genes. Consequently, next generation sequencing (NGS) technologies are among the most promising approaches to identify mutations in RD. We screened a large cohort of patients comprising 89 independent cases and families with various subforms of RD applying different NGS platforms. While mutation screening in 50 cases was performed using a RD gene capture panel, 47 cases were analyzed using whole exome sequencing. One family was analyzed using whole genome sequencing. A detection rate of 61% was achieved including mutations in 34 known and two novel RD genes. A total of 69 distinct mutations were identified, including 39 novel mutations. Notably, genetic findings in several families were not consistent with the initial clinical diagnosis. Clinical reassessment resulted in refinement of the clinical diagnosis in some of these families and confirmed the broad clinical spectrum associated with mutations in RD genes.

  11. No more non-model species: the promise of next generation sequencing for comparative immunology.

    Science.gov (United States)

    Dheilly, Nolwenn M; Adema, Coen; Raftos, David A; Gourbal, Benjamin; Grunau, Christoph; Du Pasquier, Louis

    2014-07-01

    Next generation sequencing (NGS) allows for the rapid, comprehensive and cost effective analysis of entire genomes and transcriptomes. NGS provides approaches for immune response gene discovery, profiling gene expression over the course of parasitosis, studying mechanisms of diversification of immune receptors and investigating the role of epigenetic mechanisms in regulating immune gene expression and/or diversification. NGS will allow meaningful comparisons to be made between organisms from different taxa in an effort to understand the selection of diverse strategies for host defence under different environmental pathogen pressures. At the same time, it will reveal the shared and unique components of the immunological toolkit and basic functional aspects that are essential for immune defence throughout the living world. In this review, we argue that NGS will revolutionize our understanding of immune responses throughout the animal kingdom because the depth of information it provides will circumvent the need to concentrate on a few "model" species. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. How next-generation sequencing and multiscale data analysis will transform infectious disease management.

    Science.gov (United States)

    Pak, Theodore R; Kasarskis, Andrew

    2015-12-01

    Recent reviews have examined the extent to which routine next-generation sequencing (NGS) on clinical specimens will improve the capabilities of clinical microbiology laboratories in the short term, but do not explore integrating NGS with clinical data from electronic medical records (EMRs), immune profiling data, and other rich datasets to create multiscale predictive models. This review introduces a range of "omics" and patient data sources relevant to managing infections and proposes 3 potentially disruptive applications for these data in the clinical workflow. The combined threats of healthcare-associated infections and multidrug-resistant organisms may be addressed by multiscale analysis of NGS and EMR data that is ideally updated and refined over time within each healthcare organization. Such data and analysis should form the cornerstone of future learning health systems for infectious disease. © The Author 2015. Published by Oxford University Press on behalf of the Infectious Diseases Society of America.

  13. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.

    Science.gov (United States)

    Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron

    2012-02-01

    Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.
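
    The minimum depths reported here (40× for rDNA, 30× for plastomes) translate into read counts through the usual relation depth = reads × read length / target length. The sketch below assumes that relation; the plastome size, read length and on-target fraction are illustrative values, not figures from the study.

```python
import math


def reads_needed(target_depth, target_length_bp, read_length_bp,
                 fraction_on_target=1.0):
    """Number of reads required to reach a mean depth over a target.

    fraction_on_target is the share of all sequenced reads that derive
    from the target (for example, plastid reads in a total-DNA library).
    """
    bases_required = target_depth * target_length_bp
    return math.ceil(bases_required / (read_length_bp * fraction_on_target))


# Assumed values: a 150 kb plastome and 100 bp reads.
plastome_reads = reads_needed(30, 150_000, 100)
# Same target when only a quarter of the library is plastid-derived.
library_reads = reads_needed(30, 150_000, 100, fraction_on_target=0.25)
print(plastome_reads, library_reads)
```

    Because only part of a genome-skimming library comes from the plastome, the total number of reads to sequence scales inversely with that fraction.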

  14. Construction of a high-density genetic map for grape using next generation restriction-site associated DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wang Nian

    2012-08-01

    Full Text Available Abstract Background: Genetic mapping and QTL detection are powerful methodologies in plant improvement and breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production of superior grapes to meet human demand. High throughput and low cost of the recently developed next generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. Results: An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the published Vitis vinifera genome revealed both conservation and variations. Conclusions: The applicability of next generation RAD sequencing for genotyping a grape F1 population was demonstrated, leading to the successful development of a genetic map with high density and quality using our designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety of genome investigations, such as QTL detection, sequence assembly and genome comparison.
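
    In-silico digestion-site prediction, mentioned in the Results, amounts to locating an enzyme's recognition motif in a reference sequence and looking at the resulting fragments. The sketch below is a minimal illustration; the abstract does not name the enzyme that was chosen, so the EcoRI site (GAATTC, cut after the first base) and the toy sequence are assumptions.

```python
def find_sites(sequence, motif):
    """Return the 0-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def fragment_lengths(sequence, motif, cut_offset):
    """Lengths of the fragments produced by cutting at each motif occurrence.

    cut_offset is the number of bases from the motif start to the cut point.
    """
    cuts = [p + cut_offset for p in find_sites(sequence, motif)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy sequence containing two EcoRI sites.
toy_sequence = "AA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
sites = find_sites(toy_sequence, "GAATTC")
fragments = fragment_lengths(toy_sequence, "GAATTC", cut_offset=1)
print(sites, fragments)
```

    Comparing site counts and fragment-size distributions across candidate enzymes is one way such a prediction could guide the choice of enzyme for a RAD library.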

  15. Statistical modelling of spatio-temporal dependencies in NGS data

    NARCIS (Netherlands)

    Ranciati, Saverio

    2016-01-01

    Next-generation sequencing (NGS) has rapidly established itself as the current standard in genetic analysis. This shift from microarrays to NGS requires new statistical strategies to address the research questions. First, NGS data consist of discrete observations, usually

  16. Molecular diagnosis of Usher syndrome: application of two different next generation sequencing-based procedures.

    Directory of Open Access Journals (Sweden)

    Danilo Licastro

    Full Text Available Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of which were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our result highlights the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need of a strong coverage and to the correct interpretation of sequence variations with a non obvious, pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH that may be also caused by the presence of additional causative genes yet to be identified.

  17. Molecular Diagnosis of Usher Syndrome: Application of Two Different Next Generation Sequencing-Based Procedures

    Science.gov (United States)

    Licastro, Danilo; Mutarelli, Margherita; Peluso, Ivana; Neveling, Kornelia; Wieskamp, Nienke; Rispoli, Rossella; Vozzi, Diego; Athanasakis, Emmanouil; D'Eustacchio, Angela; Pizzo, Mariateresa; D'Amico, Francesca; Ziviello, Carmela; Simonelli, Francesca; Fabretto, Antonella; Scheffer, Hans; Gasparini, Paolo; Banfi, Sandro; Nigro, Vincenzo

    2012-01-01

    Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of which were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our result highlights the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need of a strong coverage and to the correct interpretation of sequence variations with a non obvious, pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH that may be also caused by the presence of additional causative genes yet to be identified. PMID:22952768
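
    The comparison between enrichment methods rests on the share of target regions whose coverage exceeds 25×. A minimal sketch of that metric follows; the per-region depths are invented and the function name is hypothetical.

```python
def fraction_above(mean_depths, threshold):
    """Fraction of regions whose mean depth is strictly above the threshold."""
    if not mean_depths:
        return 0.0
    return sum(depth > threshold for depth in mean_depths) / len(mean_depths)


# Invented mean depths for four target regions.
toy_depths = {"region_1": 40.0, "region_2": 12.5, "region_3": 26.0, "region_4": 25.0}
covered = fraction_above(list(toy_depths.values()), threshold=25)
print(f"{covered:.0%} of regions exceed 25x")
```

    A region at exactly 25× is not counted, matching the "higher than 25×" wording of the abstract.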

  18. Interpretation of custom designed Illumina genotype cluster plots for targeted association studies and next-generation sequence validation

    Directory of Open Access Journals (Sweden)

    Tindall Elizabeth A

    2010-02-01

    Full Text Available Abstract Background: High-throughput custom designed genotyping arrays are a valuable resource for biologically focused research studies and increasingly for validation of variation predicted by next-generation sequencing (NGS) technologies. We investigate the Illumina GoldenGate chemistry using custom designed VeraCode and sentrix array matrix (SAM) assays for each of these applications, respectively. We highlight applications for interpretation of Illumina generated genotype cluster plots to maximise data inclusion and reduce genotyping errors. Findings: We illustrate the dramatic effect of outliers in genotype calling and data interpretation, as well as suggest simple means to avoid genotyping errors. Furthermore we present this platform as a successful method for two-cluster rare or non-autosomal variant calling. The success of high-throughput technologies to accurately call rare variants will become an essential feature for future association studies. Finally, we highlight additional advantages of the Illumina GoldenGate chemistry in generating unusually segregated cluster plots that identify potential NGS generated sequencing error resulting from minimal coverage. Conclusions: We demonstrate the importance of visually inspecting genotype cluster plots generated by the Illumina software and issue warnings regarding commonly accepted quality control parameters. In addition to suggesting applications to minimise data exclusion, we propose that the Illumina cluster plots may be helpful in identifying potential input sequence errors, particularly important for studies to validate NGS generated variation.

  19. ChronQC: a quality control monitoring system for clinical next generation sequencing.

    Science.gov (United States)

    Tawari, Nilesh R; Seow, Justine Jia Wen; Perumal, Dharuman; Ow, Jack L; Ang, Shimin; Devasia, Arun George; Ng, Pauline C

    2018-05-15

    ChronQC is a quality control (QC) tracking system for clinical implementation of next-generation sequencing (NGS). ChronQC generates time series plots for various QC metrics to allow comparison of current runs to historical runs. ChronQC has multiple features for tracking QC data including Westgard rules for clinical validity, laboratory-defined thresholds and historical observations within a specified time period. Users can record their notes and corrective actions directly onto the plots for long-term recordkeeping. ChronQC facilitates regular monitoring of clinical NGS to enable adherence to high quality clinical standards. ChronQC is freely available on GitHub (https://github.com/nilesh-tawari/ChronQC), Docker (https://hub.docker.com/r/nileshtawari/chronqc/) and the Python Package Index. ChronQC is implemented in Python and runs on all common operating systems (Windows, Linux and Mac OS X). Contact: tawari.nilesh@gmail.com or pauline.c.ng@gmail.com. Supplementary data are available at Bioinformatics online.
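
    Westgard rules flag QC values that drift too far from an established mean. The sketch below is not ChronQC's code or interface, and the abstract does not say which rules the tool applies; it illustrates two common rules, 1_3s (one value more than 3 SD from the mean) and 2_2s (two consecutive values more than 2 SD on the same side), on invented data.

```python
def westgard_flags(values, mean, sd):
    """Return, for each run, the list of Westgard rules it violates."""
    z_scores = [(value - mean) / sd for value in values]
    flags = []
    for i, z in enumerate(z_scores):
        rules = []
        # 1_3s: a single observation beyond 3 standard deviations.
        if abs(z) > 3:
            rules.append("1_3s")
        # 2_2s: this and the previous observation beyond 2 SD on the same side.
        if i > 0:
            previous = z_scores[i - 1]
            if (z > 2 and previous > 2) or (z < -2 and previous < -2):
                rules.append("2_2s")
        flags.append(rules)
    return flags


# Invented QC metric over five consecutive runs; historical mean 100, SD 10.
run_values = [100, 103, 125, 123, 131]
flags = westgard_flags(run_values, mean=100, sd=10)
print(flags)
```

    Plotting such flags against run date would give the kind of time-series view the record describes.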

  20. Unravelling the Complexity of Inherited Retinal Dystrophies Molecular Testing: Added Value of Targeted Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Isabella Bernardis

    2016-01-01

    Full Text Available To assess the clinical utility of targeted Next-Generation Sequencing (NGS) for the diagnosis of Inherited Retinal Dystrophies (IRDs), a total of 109 subjects were enrolled in the study, including 88 IRD affected probands and 21 healthy relatives. Clinical diagnoses included Retinitis Pigmentosa (RP), Leber Congenital Amaurosis (LCA), Stargardt Disease (STGD), Best Macular Dystrophy (BMD), Usher Syndrome (USH), and other IRDs with undefined clinical diagnosis. Participants underwent a complete ophthalmologic examination followed by genetic counseling. A custom AmpliSeq™ panel of 72 IRD-related genes was designed for the analysis and tested using Ion semiconductor Next-Generation Sequencing (NGS). Potential disease-causing mutations were identified in 59.1% of probands, comprising mutations in 16 genes. The highest diagnostic yields were achieved for BMD, LCA, USH, and STGD patients, whereas RP confirmed its high genetic heterogeneity. Causative mutations were identified in 17.6% of probands with undefined diagnosis. Revision of the initial diagnosis was performed for 9.6% of genetically diagnosed patients. This study demonstrates that NGS represents a comprehensive cost-effective approach for IRDs molecular diagnosis. The identification of the genetic alterations underlying the phenotype enabled the clinicians to achieve a more accurate diagnosis. The results emphasize the importance of molecular diagnosis coupled with clinic information to unravel the extensive phenotypic heterogeneity of these diseases.

  1. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Science.gov (United States)

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not been used in viral integration site analysis of gene therapy applications. Here, we combined a targeted sequence capture array and next generation sequencing to address the whole genome profiling of viral integration sites. Human 293T and K562 cells were transduced with a HIV-1 derived vector. A custom-made DNA probe set targeting the pLVTHM vector was used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. In both cell lines, 7,484 human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences met the criteria of at least 98% identity and a minimum length of 50 bp. In total, 203 unique integration sites were identified. The integrations in both cell lines were distant from the CpG islands and from the transcription start sites and preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.
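
    The filtering step described here (keep flanking sequences with at least 98% identity over at least 50 bp, then count unique integration sites) can be sketched as below. The record structure, the definition of a unique site as a chromosome/position/strand triple, and the example alignments are all assumptions made for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment of a flanking sequence to the human genome (hypothetical)."""
    chrom: str
    position: int
    strand: str
    identity: float   # percent identity of the alignment
    aligned_bp: int   # aligned length in base pairs


def unique_integration_sites(hits, min_identity=98.0, min_bp=50):
    """Filter alignments by the stated criteria and collapse duplicates."""
    kept = [h for h in hits
            if h.identity >= min_identity and h.aligned_bp >= min_bp]
    return sorted({(h.chrom, h.position, h.strand) for h in kept})


# Invented alignments: one duplicate site, one too short, one below 98% identity.
example_hits = [
    Hit("chr1", 1_000_500, "+", 99.2, 120),
    Hit("chr1", 1_000_500, "+", 98.0, 75),
    Hit("chr7", 55_000_100, "-", 99.9, 40),
    Hit("chrX", 2_300_000, "+", 97.5, 200),
    Hit("chr12", 9_800_250, "-", 100.0, 50),
]
sites = unique_integration_sites(example_hits)
print(sites)
```

    Collapsing to a set is what turns many supporting reads into a count of distinct sites, as in the step from 7,484 sequences to 203 sites in the abstract.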

  2. Building a Robust Tumor Profiling Program: Synergy between Next-Generation Sequencing and Targeted Single-Gene Testing.

    Directory of Open Access Journals (Sweden)

    Matthew C Hiemenz

    Full Text Available Next-generation sequencing (NGS) is a powerful platform for identifying cancer mutations. Routine clinical adoption of NGS requires optimized quality control metrics to ensure accurate results. To assess the robustness of our clinical NGS pipeline, we analyzed the results of 304 solid tumor and hematologic malignancy specimens tested simultaneously by NGS and one or more targeted single-gene tests (EGFR, KRAS, BRAF, NPM1, FLT3, and JAK2). For samples that passed our validated tumor percentage and DNA quality and quantity thresholds, there was perfect concordance between NGS and targeted single-gene tests with the exception of two FLT3 internal tandem duplications that fell below the stringent pre-established reporting threshold but were readily detected by manual inspection. In addition, NGS identified clinically significant mutations not covered by single-gene tests. These findings confirm NGS as a reliable platform for routine clinical use when appropriate quality control metrics, such as tumor percentage and DNA quality cutoffs, are in place. Based on our findings, we suggest a simple workflow that should facilitate adoption of clinical oncologic NGS services at other institutions.

  3. Identification and characterization of Highlands J virus from a Mississippi sandhill crane using unbiased next-generation sequencing

    Science.gov (United States)

    Ip, Hon S.; Wiley, Michael R.; Long, Renee; Gustavo, Palacios; Shearn-Bochsler, Valerie; Whitehouse, Chris A.

    2014-01-01

    Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.

  4. Multicenter validation of cancer gene panel-based next-generation sequencing for translational research and molecular diagnostics.

    Science.gov (United States)

    Hirsch, B; Endris, V; Lassmann, S; Weichert, W; Pfarr, N; Schirmacher, P; Kovaleva, V; Werner, M; Bonzheim, I; Fend, F; Sperveslage, J; Kaulich, K; Zacher, A; Reifenberger, G; Köhrer, K; Stepanow, S; Lerke, S; Mayr, T; Aust, D E; Baretton, G; Weidner, S; Jung, A; Kirchner, T; Hansmann, M L; Burbat, L; von der Wall, E; Dietel, M; Hummel, M

    2018-04-01

    The simultaneous detection of multiple somatic mutations in the context of molecular diagnostics of cancer is frequently performed by means of amplicon-based targeted next-generation sequencing (NGS). However, only a few studies are available comparing multicenter testing of different NGS platforms and gene panels. Therefore, seven partner sites of the German Cancer Consortium (DKTK) performed a multicenter interlaboratory trial for targeted NGS using the same formalin-fixed, paraffin-embedded (FFPE) specimen of molecularly pre-characterized tumors (n = 15; each n = 5 cases of Breast, Lung, and Colon carcinoma) and a colorectal cancer cell line DNA dilution series. Detailed information regarding pre-characterized mutations was not disclosed to the partners. Commercially available and custom-designed cancer gene panels were used for library preparation and subsequent sequencing on several devices of two different NGS platforms. For every case, centrally extracted DNA and FFPE tissue sections for local processing were delivered to each partner site to be sequenced with the commercial gene panel and local bioinformatics. For cancer-specific panel-based sequencing, only centrally extracted DNA was analyzed at seven sequencing sites. Subsequently, local data were compiled and bioinformatics was performed centrally. We were able to demonstrate that all pre-characterized mutations were re-identified correctly, irrespective of NGS platform or gene panel used. However, locally processed FFPE tissue sections disclosed that the DNA extraction method can affect the detection of mutations with a trend in favor of magnetic bead-based DNA extraction methods. In conclusion, targeted NGS is a very robust method for simultaneous detection of various mutations in FFPE tissue specimens if certain pre-analytical conditions are carefully considered.

  5. Clinical pharmacogenomics testing in the era of next generation sequencing: challenges and opportunities for precision medicine.

    Science.gov (United States)

    Ji, Yuan; Si, Yue; McMillin, Gwendolyn A; Lyon, Elaine

    2018-04-23

    The rapid development and dramatic decrease in cost of sequencing techniques have ushered in the implementation of genomic testing in patient care. Next generation DNA sequencing (NGS) techniques have been used increasingly in clinical laboratories to scan the whole or part of the human genome in order to facilitate diagnosis and/or prognostics of genetic disease. Despite many hurdles and debates, pharmacogenomics (PGx) is believed to be an area of genomic medicine where precision medicine could have immediate impact in the near future. Areas covered: This review focuses on lessons learned through early attempts at clinically implementing PGx testing, and on the challenges and opportunities that PGx testing brings to precision medicine in the era of NGS. Expert commentary: Replacing the targeted analysis approach with NGS for PGx testing is currently neither technically feasible nor necessary, owing to several technical limitations and the uncertainty involved in interpreting variants of uncertain significance for PGx variants. However, reporting PGx variants out of clinical whole exome or whole genome sequencing (WES/WGS) might represent additional benefits for patients who are tested by WES/WGS.

  6. Understanding Cancer Genome and Its Evolution by Next Generation Sequencing

    DEFF Research Database (Denmark)

    Hou, Yong

    Cancer will cause 13 million deaths by the year 2030, ranking as the second leading cause of death worldwide. Previous studies indicate that most cancers originate from cells that acquired somatic mutations and evolved according to Darwinian theory. Ten biological insights of cancer have been summarized...... recently. Cutting-edge technologies like next generation sequencing (NGS) enable exploring cancer genome and evolution much more efficiently. However, integrated cancer genome sequencing studies showed great inter-/intra-tumoral heterogeneity (ITH) and complex evolution patterns beyond the cancer biological...... knowledge we previously know. There is very limited knowledge of East Asia lung cancer genome except enrichment of EGFR mutations and lack of KRAS mutations. We carried out integrated genomic, transcriptomic and methylomic analysis of 335 primary Chinese lung adenocarcinomas (LUAD) and 35 corresponding...

  7. Comparison of Four Human Papillomavirus Genotyping Methods: Next-generation Sequencing, INNO-LiPA, Electrochemical DNA Chip, and Nested-PCR.

    Science.gov (United States)

    Nilyanimit, Pornjarim; Chansaenroj, Jira; Poomipak, Witthaya; Praianantathavorn, Kesmanee; Payungporn, Sunchai; Poovorawan, Yong

    2018-03-01

    Human papillomavirus (HPV) infection causes cervical cancer, thus necessitating early detection by screening. Rapid and accurate HPV genotyping is crucial both for the assessment of patients with HPV infection and for surveillance studies. Fifty-eight cervicovaginal samples were tested for HPV genotypes using four methods in parallel: nested-PCR followed by conventional sequencing, INNO-LiPA, electrochemical DNA chip, and next-generation sequencing (NGS). Seven HPV genotypes (16, 18, 31, 33, 45, 56, and 58) were identified by all four methods. Nineteen HPV genotypes were detected by NGS, but not by nested-PCR, INNO-LiPA, or electrochemical DNA chip. Although NGS is relatively expensive and complex, it may serve as a sensitive HPV genotyping method. Because of its highly sensitive detection of multiple HPV genotypes, NGS may serve as an alternative for diagnostic HPV genotyping in certain situations. © The Korean Society for Laboratory Medicine

  8. Am I My Family's Keeper? Disclosure Dilemmas in Next-Generation Sequencing.

    Science.gov (United States)

    Wouters, Roel H P; Bijlsma, Rhodé M; Ausems, Margreet G E M; van Delden, Johannes J M; Voest, Emile E; Bredenoord, Annelien L

    2016-12-01

    Ever since genetic testing became possible for specific mutations, ethical debate has been sparked on the question of whether professionals have a duty to warn not only patients but also their relatives who might be at risk for hereditary diseases. As next-generation sequencing (NGS) swiftly finds its way into clinical practice, the question of who is responsible for conveying unsolicited findings to family members becomes increasingly urgent. Traditionally, there is a strong emphasis on the duties of the professional in this debate. But what is the role of the patient and her family? In this article, we discuss the question of whose duty it is to convey relevant genetic risk information concerning hereditary diseases that can be cured or prevented to the relatives of patients undergoing NGS. We argue in favor of a shared responsibility for professionals and patients and present a strategy that reconciles these roles: a moral accountability nudge. Incorporated into informed consent and counseling services such as letters and online tools, this nudge aims to create awareness of specific patient responsibilities. Commitment of all parties is needed to ensure adequate dissemination of results in the NGS era. © 2016 WILEY PERIODICALS, INC.

  9. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Science.gov (United States)

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
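
    The depth figures above can be related to read counts with the usual coverage relation, depth = (number of reads × read length) / genome size. The sketch below is only an illustration of that arithmetic: the genome sizes are the ones quoted in the abstract, while the 100 bp read length and the function name are assumptions, not values from the study.

```python
import math


def reads_for_depth(genome_size_bp, read_length_bp, depth):
    """Reads needed so that depth = reads * read_length / genome_size."""
    return math.ceil(depth * genome_size_bp / read_length_bp)


# Genome sizes as quoted in the abstract (in base pairs).
GENOMES = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

READ_LENGTH = 100  # assumed read length, not stated in the abstract

if __name__ == "__main__":
    for name, size in GENOMES.items():
        print(name, reads_for_depth(size, READ_LENGTH, 50))
```

    Under these assumptions, a fixed target depth scales the required read count linearly with genome size, which is the quantity that matters when deciding how many samples to multiplex on one run.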

  10. A Review of Study Designs and Statistical Methods for Genomic Epidemiology Studies using Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Qian Wang

    2015-04-01

    Full Text Available Results from numerous linkage and association studies have greatly deepened scientists’ understanding of the genetic basis of many human diseases, yet some important questions remain unanswered. For example, although a large number of disease-associated loci have been identified from genome-wide association studies (GWAS) in the past 10 years, it is challenging to interpret these results as most disease-associated markers have no clear functional roles in disease etiology, and all the identified genomic factors only explain a small portion of disease heritability. With the help of next-generation sequencing (NGS), diverse types of genomic and epigenetic variations can be detected with high accuracy. More importantly, instead of using linkage disequilibrium to detect association signals based on a set of pre-set probes, NGS allows researchers to directly study all the variants in each individual, and therefore promises opportunities for identifying functional variants and a more comprehensive dissection of disease heritability. Although the current scale of NGS studies is still limited due to the high cost, the success of several recent studies suggests the great potential for applying NGS in genomic epidemiology, especially as the cost of sequencing continues to drop. In this review, we discuss several pioneer applications of NGS, summarize scientific discoveries for rare and complex diseases, and compare various study designs including targeted sequencing and whole-genome sequencing using population-based and family-based cohorts. Finally, we highlight recent advancements in statistical methods proposed for sequencing analysis, including group-based association tests, meta-analysis techniques, and annotation tools for variant prioritization.

  11. gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.

    Science.gov (United States)

    Olejnik, Michael; Steuwer, Michel; Gorlatch, Sergei; Heider, Dominik

    2014-11-15

    Next-generation sequencing (NGS) has a large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, although highly accurate, these computational models lack the computational efficiency to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model named gCUP, parallelized and optimized for GPU, is highly accurate and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step to enable NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction, but can also be easily adapted to other settings, e.g. drug resistance prediction. The source code can be downloaded at http://www.heiderlab.de. Contact: d.heider@wz-straubing.de. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. Next Generation Sequencing and ALS: known genes, different phenotyphes.

    Science.gov (United States)

    Campopiano, Rosa; Ryskalin, Larisa; Giardina, Emiliano; Zampatti, Stefania; Busceti, Carla L; Biagioni, Francesca; Ferese, Rosangela; Storto, Marianna; Gambardella, Stefano; Fornai, Francesco

    2017-12-01

    Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease clinically characterized by upper and lower motor neuron dysfunction resulting in rapidly progressive paralysis and death from respiratory failure. Most cases appear to be sporadic, but 5-10% of cases have a family history of the disease, and over the last decade, identification of mutations in about 20 genes predisposing to these disorders has provided the means to better understand their pathogenesis. Next Generation sequencing (NGS) is an advanced high-throughput DNA sequencing technology which has rapidly accelerated the discovery of genetic risk factors for both familial and sporadic neurological and neurodegenerative diseases. These strategies have allowed the rapid identification of disease-associated variants and genetic risk factors for both familial (fALS) and sporadic ALS (sALS), strongly contributing to the knowledge of the genetic architecture of ALS. Moreover, as the number of ALS genes grows, many of the proteins they encode are found to act in intracellular processes shared with other known diseases, suggesting an overlap of clinical and pathological features between different diseases. To emphasize this concept, the review focuses on genes coding for Valosin-containing protein (VCP) and two Heterogeneous nuclear RNA-binding proteins (HNRNPA1 and hnRNPA2B1), recently identified through NGS, where different mutations have been associated with both ALS and other neurological and neurodegenerative diseases.

  13. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms.

    Science.gov (United States)

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-12-03

    Next-generation sequencing (NGS) is revolutionising marker development and the rapidly increasing amount of transcriptomes published across a wide variety of taxa is providing valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study, we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (EST) for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and 2 distinct ecotypes.

  14. Next-generation sequencing and culture-based techniques offer complementary insights into fungi and prokaryotes in beach sands.

    Science.gov (United States)

    Romão, Daniela; Staley, Christopher; Ferreira, Filipa; Rodrigues, Raquel; Sabino, Raquel; Veríssimo, Cristina; Wang, Ping; Sadowsky, Michael; Brandão, João

    2017-06-15

    A next-generation sequencing (NGS) approach, in conjunction with culture-based methods, was used to examine fungal and prokaryotic communities for the presence of potential pathogens in beach sands throughout Portugal. Culture-based fungal enumeration revealed low and variable concentrations of the species targeted (yeasts and dermatophytes), which were underrepresented in the community characterized by NGS targeting the ITS1 region. Conversely, NGS indicated that the potentially pathogenic species Purpureocillium liliacinum comprised nearly the entire fungal community. Culturable fecal indicator bacterial concentrations were low throughout the study and unrelated to communities characterized by NGS. Notably, the prokaryotic communities characterized revealed a considerable abundance of archaea. Results highlight differences in communities between methods in beach sand monitoring but indicate the techniques offer complementary insights. Thus, there is a need to leverage culture-based methods with NGS methods, using a toolbox approach, to determine appropriate targets and metrics for beach sand monitoring to adequately protect public health. Copyright © 2017. Published by Elsevier Ltd.

  15. Plastid: nucleotide-resolution analysis of next-generation sequencing and genomics data.

    Science.gov (United States)

    Dunn, Joshua G; Weissman, Jonathan S

    2016-11-22

    Next-generation sequencing (NGS) informs many biological questions with unprecedented depth and nucleotide resolution. These assays have created a need for analytical tools that enable users to manipulate data nucleotide-by-nucleotide robustly and easily. Furthermore, because many NGS assays encode information jointly within multiple properties of read alignments - for example, in ribosome profiling, the locations of ribosomes are jointly encoded in alignment coordinates and length - analytical tools are often required to extract the biological meaning from the alignments before analysis. Many assay-specific pipelines exist for this purpose, but there remains a need for user-friendly, generalized, nucleotide-resolution tools that are not limited to specific experimental regimes or analytical workflows. Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data. As such, Plastid is designed to extract assay-specific information from read alignments while retaining generality and extensibility to novel NGS assays. Plastid represents NGS and other biological data as arrays of values associated with genomic or transcriptomic positions, and contains configurable tools to convert data from a variety of sources to such arrays. Plastid also includes numerous tools to manipulate even discontinuous genomic features, such as spliced transcripts, with nucleotide precision. Plastid automatically handles conversion between genomic and feature-centric coordinates, accounting for splicing and strand, freeing users of burdensome accounting. Finally, Plastid's data models use consistent and familiar biological idioms, enabling even beginners to develop sophisticated analytical workflows with minimal effort. Plastid is a versatile toolkit that has been used to analyze data from multiple NGS assays, including RNA-seq, ribosome profiling, and DMS-seq. It forms the genomic engine of our ORF annotation tool, ORF-RATER, and is readily

  16. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS).

    Directory of Open Access Journals (Sweden)

    Amanda D Barbosa

    Full Text Available Infections with Trypanosoma spp. have been associated with poor health and decreased survival of koalas (Phascolarctos cinereus), particularly in the presence of concurrent pathogens such as Chlamydia and koala retrovirus. The present study describes the application of a next-generation sequencing (NGS)-based assay to characterise the prevalence and genetic diversity of trypanosome communities in koalas and two native species of ticks (Ixodes holocyclus and I. tasmani) removed from koala hosts. Among 168 koalas tested, 32.2% (95% CI: 25.2-39.8%) were positive for at least one Trypanosoma sp. Previously described Trypanosoma spp. from koalas were identified, including T. irwini (32.1%, 95% CI: 25.2-39.8%), T. gilletti (25%, 95% CI: 18.7-32.3%), T. copemani (27.4%, 95% CI: 20.8-34.8%) and T. vegrandis (10.1%, 95% CI: 6.0-15.7%). Trypanosoma noyesi was detected for the first time in koalas, although at a low prevalence (0.6% 95% CI: 0-3.3%), and a novel species (Trypanosoma sp. AB-2017) was identified at a prevalence of 4.8% (95% CI: 2.1-9.2%). Mixed infections with up to five species were present in 27.4% (95% CI: 21-35%) of the koalas, which was significantly higher than the prevalence of single infections 4.8% (95% CI: 2-9%). Overall, a considerably higher proportion (79.7%) of the Trypanosoma sequences isolated from koala blood samples were identified as T. irwini, suggesting this is the dominant species. Co-infections involving T. gilletti, T. irwini, T. copemani, T. vegrandis and Trypanosoma sp. AB-2017 were also detected in ticks, with T. gilletti and T. copemani being the dominant species within the invertebrate hosts. Direct Sanger sequencing of Trypanosoma 18S rRNA gene amplicons was also performed and results revealed that this method was only able to identify the genotypes with greater amount of reads (according to NGS) within koala samples, which highlights the advantages of NGS in detecting mixed infections. 
The present study provides new insights

  17. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS).

    Science.gov (United States)

    Barbosa, Amanda D; Gofton, Alexander W; Paparini, Andrea; Codello, Annachiara; Greay, Telleasha; Gillett, Amber; Warren, Kristin; Irwin, Peter; Ryan, Una

    2017-01-01

    Infections with Trypanosoma spp. have been associated with poor health and decreased survival of koalas (Phascolarctos cinereus), particularly in the presence of concurrent pathogens such as Chlamydia and koala retrovirus. The present study describes the application of a next-generation sequencing (NGS)-based assay to characterise the prevalence and genetic diversity of trypanosome communities in koalas and two native species of ticks (Ixodes holocyclus and I. tasmani) removed from koala hosts. Among 168 koalas tested, 32.2% (95% CI: 25.2-39.8%) were positive for at least one Trypanosoma sp. Previously described Trypanosoma spp. from koalas were identified, including T. irwini (32.1%, 95% CI: 25.2-39.8%), T. gilletti (25%, 95% CI: 18.7-32.3%), T. copemani (27.4%, 95% CI: 20.8-34.8%) and T. vegrandis (10.1%, 95% CI: 6.0-15.7%). Trypanosoma noyesi was detected for the first time in koalas, although at a low prevalence (0.6% 95% CI: 0-3.3%), and a novel species (Trypanosoma sp. AB-2017) was identified at a prevalence of 4.8% (95% CI: 2.1-9.2%). Mixed infections with up to five species were present in 27.4% (95% CI: 21-35%) of the koalas, which was significantly higher than the prevalence of single infections 4.8% (95% CI: 2-9%). Overall, a considerably higher proportion (79.7%) of the Trypanosoma sequences isolated from koala blood samples were identified as T. irwini, suggesting this is the dominant species. Co-infections involving T. gilletti, T. irwini, T. copemani, T. vegrandis and Trypanosoma sp. AB-2017 were also detected in ticks, with T. gilletti and T. copemani being the dominant species within the invertebrate hosts. Direct Sanger sequencing of Trypanosoma 18S rRNA gene amplicons was also performed and results revealed that this method was only able to identify the genotypes with greater amount of reads (according to NGS) within koala samples, which highlights the advantages of NGS in detecting mixed infections. 
The present study provides new insights on the
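
    The prevalence figures in this record are reported as proportions of 168 koalas with 95% confidence intervals. As a hedged illustration of how such an interval can be computed, the sketch below implements an exact (Clopper-Pearson) binomial interval with the standard library only. The abstract does not state which interval method the authors used, and the count of 54 positives is an assumption back-calculated from 32.1% of 168.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target):
    """Bisection for f(p) = target, where f decreases as p goes from 0 to 1."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # 54 of 168 is an assumed count corresponding to roughly 32.1%.
    low, high = clopper_pearson(54, 168)
    print(f"{54 / 168:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

    Other interval methods (Wilson, normal approximation) give slightly different bounds, so small discrepancies from published values would not by themselves indicate an error.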

  18. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses

    Directory of Open Access Journals (Sweden)

    Hironobu Yanagisawa

    2016-03-01

    Full Text Available The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as “DECS-C,” is a powerful method for detecting novel plant viruses.

  19. A novel procedure on next generation sequencing data analysis using text mining algorithm.

    Science.gov (United States)

    Zhao, Weizhong; Chen, James J; Perkins, Roger; Wang, Yuping; Liu, Zhichao; Hong, Huixiao; Tong, Weida; Zou, Wen

    2016-05-13

    Next-generation sequencing (NGS) technologies have provided researchers with vast possibilities in various biological and biomedical research areas. Efficient data mining strategies are in high demand for large scale comparative and evolutionary studies to be performed on the large amounts of data derived from NGS projects. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. We report a novel procedure to analyse NGS data using topic modeling. It consists of four major steps: NGS data retrieval, preprocessing, topic modeling, and data mining using Latent Dirichlet Allocation (LDA) topic outputs. The NGS data set of the Salmonella enterica strains was used as a case study to show the workflow of this procedure. The perplexity measurement of the topic numbers and the convergence efficiencies of Gibbs sampling were calculated and discussed for achieving the best result from the proposed procedure. The output topics by LDA algorithms could be treated as features of Salmonella strains to accurately describe the genetic diversity of the fliC gene in various serotypes. The results of a two-way hierarchical clustering and data matrix analysis on LDA-derived matrices successfully classified Salmonella serotypes based on the NGS data. The implementation of topic modeling in the NGS data analysis procedure provides a new way to elucidate genetic information from NGS data, and identify the gene-phenotype relationships and biomarkers, especially in the era of biological and medical big data.
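
    Topic models such as LDA take a document-term count matrix as input, so sequences first have to be turned into "documents" made of "words". The abstract does not say how that preprocessing was done; the sketch below assumes, purely for illustration, that overlapping k-mers play the role of words. The function names and the choice of k are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in one sequence (the 'words' of a document)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def document_term_matrix(seqs, k=3):
    """Return (vocabulary, matrix) with one row of k-mer counts per sequence."""
    counts = [kmer_counts(s, k) for s in seqs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocab] for c in counts]
    return vocab, matrix


if __name__ == "__main__":
    vocab, matrix = document_term_matrix(["AAAC", "ACG"], k=2)
    print(vocab)
    print(matrix)
```

    A matrix of this shape is what a topic-modeling implementation would consume; the resulting per-sequence topic proportions could then serve as features for clustering, as the abstract describes.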

  20. Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants

    Directory of Open Access Journals (Sweden)

    Sima Taheri

    2018-02-01

    Microsatellites, or simple sequence repeats (SSRs), are one of the most informative and multi-purpose genetic markers exploited in plant functional genomics. However, the discovery and development of SSRs using traditional methods are laborious, time-consuming, and costly. Recently, the availability of high-throughput sequencing technologies has enabled researchers to identify a substantial number of microsatellites at less cost and effort than traditional approaches. Illumina is a noteworthy transcriptome sequencing technology that is currently used in SSR marker development. Although 454 pyrosequencing datasets can be used for SSR development, this type of sequencing is no longer supported. This review aims to present an overview of next generation sequencing, with a focus on the efficient use of de novo transcriptome sequencing (RNA-Seq) and related tools for mining and development of microsatellites in plants.
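    To make the idea of SSR mining concrete, the following is a minimal sketch of how perfect microsatellites can be located in an assembled sequence with a regular expression. It is not any tool discussed in the review; the function name `find_ssrs` and the minimum repeat counts per motif length are hypothetical values chosen for illustration.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults below are illustrative assumptions, not values from the review.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATAT" is really the dinucleotide "AT").
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGC" + "AG" * 8 + "TTC" + "CAT" * 5 + "GG"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

    Real pipelines also handle compound and imperfect repeats and design flanking primers; this sketch covers only perfect tandem repeats.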

  1. Accelerating next generation sequencing data analysis with system level optimizations.

    Science.gov (United States)

    Kathiresan, Nagarajan; Temanni, Ramzi; Almabrazi, Hakeem; Syed, Najeeb; Jithesh, Puthen V; Al-Ali, Rashid

    2017-08-22

    Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features in modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system level parameters before running the application. We studied the GATK-HaplotypeCaller, which is part of common NGS workflows and consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked and the execution time of HaplotypeCaller was optimized by tuning various system level parameters, which included: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing, (ii) architecture-specific tuning in the PairHMM library for vectorization, (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer, and (iv) over-clocking the default 'on-demand' mode of CPU frequency by using 'performance-mode' to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7 respectively.
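    A percentage reduction in execution time is easy to misread as a speedup factor. The short sketch below converts one into the other, using the two HaplotypeCaller reductions quoted in the abstract as inputs; the function name is hypothetical.

```python
def speedup_from_reduction(percent_reduction):
    """Convert a percentage reduction in run time into a speedup factor.

    If the new time is (1 - r) of the old time, the speedup is 1 / (1 - r).
    """
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)

if __name__ == "__main__":
    # Reductions reported for HaplotypeCaller in GATK 3.3 and GATK 3.7.
    for version, reduction in (("GATK 3.3", 82.66), ("GATK 3.7", 42.61)):
        print(version, round(speedup_from_reduction(reduction), 2))
```

    Under this reading, an 82.66% reduction corresponds to roughly a 5.8-fold speedup and a 42.61% reduction to roughly 1.7-fold.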

  2. Next generation sequencing identifies abnormal Y chromosome and candidate causal variants in premature ovarian failure patients.

    Science.gov (United States)

    Lee, Yujung; Kim, Changshin; Park, YoungJoon; Pyun, Jung-A; Kwack, KyuBum

    2016-12-01

    Premature ovarian failure (POF) is characterized by heterogeneous genetic causes such as chromosomal abnormalities and variants in causal genes. Recent technical developments have made it possible to use next generation sequencing (NGS) to detect genome-wide variants, including chromosomal abnormalities. Among 37 Korean POF patients, an XY karyotype with distal deletions of the Y chromosome, Yp11.32-31 and Yp12 end part, was observed in two patients through NGS. Six deleterious variants in POF genes were also detected which might explain the pathogenesis of POF with abnormalities in the sex chromosomes. Additionally, the two POF patients had no mutation in SRY, but three non-synonymous variants were detected in genes related to sex reversal. These findings suggest candidate causes of POF and sex reversal and show the suitability of NGS for approaching the heterogeneous pathogenesis of POF. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Stepwise threshold clustering: a new method for genotyping MHC loci using next-generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    William E Stutz

    Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a "gray zone" where low frequency alleles and high frequency artifacts can be difficult to disentangle and 2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci--Stepwise Threshold Clustering (STC)--that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.

  4. Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome.

    Science.gov (United States)

    Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J

    2016-11-01

    Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11), 1119-1127.

  5. AQME: A forensic mitochondrial DNA analysis tool for next-generation sequencing data.

    Science.gov (United States)

    Sturk-Andreaggi, Kimberly; Peck, Michelle A; Boysen, Cecilie; Dekker, Patrick; McMahon, Timothy P; Marshall, Charla K

    2017-11-01

    The feasibility of generating mitochondrial DNA (mtDNA) data has expanded considerably with the advent of next-generation sequencing (NGS), specifically in the generation of entire mtDNA genome (mitogenome) sequences. However, the analysis of these data has emerged as the greatest challenge to implementation in forensics. To address this need, a custom toolkit for use in the CLC Genomics Workbench (QIAGEN, Hilden, Germany) was developed through a collaborative effort between the Armed Forces Medical Examiner System - Armed Forces DNA Identification Laboratory (AFMES-AFDIL) and QIAGEN Bioinformatics. The AFDIL-QIAGEN mtDNA Expert, or AQME, generates an editable mtDNA profile that employs forensic conventions and includes the interpretation range required for mtDNA data reporting. AQME also integrates an mtDNA haplogroup estimate into the analysis workflow, which provides the analyst with phylogenetic nomenclature guidance and a profile quality check without the use of an external tool. Supplemental AQME outputs such as nucleotide-per-position metrics, configurable export files, and an audit trail are produced to assist the analyst during review. AQME is applied to standard CLC outputs and thus can be incorporated into any mtDNA bioinformatics pipeline within CLC regardless of sample type, library preparation or NGS platform. An evaluation of AQME was performed to demonstrate its functionality and reliability for the analysis of mitogenome NGS data. The study analyzed Illumina mitogenome data from 21 samples (including associated controls) of varying quality and sample preparations with the AQME toolkit. A total of 211 tool edits were automatically applied to 130 of the 698 total variants reported in an effort to adhere to forensic nomenclature. Although additional manual edits were required for three samples, supplemental tools such as mtDNA haplogroup estimation assisted in identifying and guiding these necessary modifications to the AQME-generated profile. 
Along

  6. Next-Generation Sequencing for Infectious Disease Diagnosis and Management: A Report of the Association for Molecular Pathology.

    Science.gov (United States)

    Lefterova, Martina I; Suarez, Carlos J; Banaei, Niaz; Pinsky, Benjamin A

    2015-11-01

    Next-generation sequencing (NGS) technologies are increasingly being used for diagnosis and monitoring of infectious diseases. Herein, we review the application of NGS in clinical microbiology, focusing on genotypic resistance testing, direct detection of unknown disease-associated pathogens in clinical specimens, investigation of microbial population diversity in the human host, and strain typing. We have organized the review into three main sections: i) applications in clinical virology, ii) applications in clinical bacteriology, mycobacteriology, and mycology, and iii) validation, quality control, and maintenance of proficiency. Although NGS holds enormous promise for clinical infectious disease testing, many challenges remain, including automation, standardizing technical protocols and bioinformatics pipelines, improving reference databases, establishing proficiency testing and quality control measures, and reducing cost and turnaround time, all of which would be necessary for widespread adoption of NGS in clinical microbiology laboratories. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  7. Capture-based next-generation sequencing reveals multiple actionable mutations in cancer patients failed in traditional testing.

    Science.gov (United States)

    Xie, Jing; Lu, Xiongxiong; Wu, Xue; Lin, Xiaoyi; Zhang, Chao; Huang, Xiaofang; Chang, Zhili; Wang, Xinjing; Wen, Chenlei; Tang, Xiaomei; Shi, Minmin; Zhan, Qian; Chen, Hao; Deng, Xiaxing; Peng, Chenghong; Li, Hongwei; Fang, Yuan; Shao, Yang; Shen, Baiyong

    2016-05-01

    Targeted therapies including monoclonal antibodies and small molecule inhibitors have dramatically changed the treatment of cancer over the past 10 years. Their therapeutic advantages are greater tumor specificity and fewer side effects. For precisely tailoring available targeted therapies to each individual or a subset of cancer patients, next-generation sequencing (NGS) has been utilized as a promising diagnostic tool with its advantages of accuracy, sensitivity, and high throughput. We developed and validated an NGS-based cancer genomic diagnosis targeting 115 prognosis- and therapeutics-relevant genes on multiple specimen types including blood, tumor tissue, and body fluid from 10 patients with different cancer types. The sequencing data were then analyzed by clinically applicable analytical pipelines developed in house. We assessed the analytical sensitivity, specificity, and accuracy of the NGS-based molecular diagnosis. Our analytical pipelines were also capable of detecting base substitutions, indels, and gene copy number variations (CNVs). For instance, several actionable mutations of EGFR, PIK3CA, TP53, and KRAS were detected that indicate drug susceptibility and resistance in cases of lung cancer. Our study has shown that NGS-based molecular diagnosis is more sensitive and comprehensive in detecting genomic alterations in cancer, and supports a direct clinical use for guiding targeted therapy.

  8. Integration of microbiological, epidemiological and next generation sequencing technologies data for the managing of nosocomial infections

    Directory of Open Access Journals (Sweden)

    Matteo Brilli

    2018-02-01

    At its core, the work of clinical microbiologists consists of retrieving a few bytes of information (species identification; metabolic capacities; staining and antigenic properties; antibiotic resistance profiles, etc.) from pathogenic agents. The development of next generation sequencing technologies (NGS), and the possibility of determining the entire genome of bacterial pathogens, fungi and protozoans, will likely introduce a breakthrough in the amount of information generated by clinical microbiology laboratories: from bytes to Megabytes of information, for a single isolate. In parallel, the development of novel informatics tools, designed for the management and analysis of the so-called Big Data, offers the possibility to search for patterns in databases collecting genomic and microbiological information on the pathogens, as well as epidemiological data and information on the clinical parameters of the patients. Nosocomial infections and antibiotic resistance will likely represent major challenges for clinical microbiologists in the next decades. In this paper, we describe how bacterial genomics based on NGS, integrated with novel informatic tools, could contribute to the control of hospital infections and multi-drug resistant pathogens.

  9. Characterization of the cutaneous mycobiota in healthy and allergic cats using next generation sequencing.

    Science.gov (United States)

    Meason-Smith, Courtney; Diesel, Alison; Patterson, Adam P; Older, Caitlin E; Johnson, Timothy J; Mansell, Joanne M; Suchodolski, Jan S; Rodrigues Hoffmann, Aline

    2017-02-01

    Next generation sequencing (NGS) studies have demonstrated a diverse skin-associated microbiota and microbial dysbiosis associated with atopic dermatitis in people and in dogs. The skin of cats has yet to be investigated using NGS techniques. We hypothesized that the fungal microbiota of healthy feline skin would be similar to that of dogs, with a predominance of environmental fungi, and that fungal dysbiosis would be present on the skin of allergic cats. The study included eleven healthy cats and nine cats diagnosed with one or more cutaneous hypersensitivity disorders, including flea bite, food-induced and nonflea nonfood-induced hypersensitivity. Healthy cats were sampled at twelve body sites and allergic cats at six sites. DNA was isolated and Illumina sequencing was performed targeting the internal transcribed spacer region of fungi. Sequences were processed using the bioinformatics software QIIME. The most abundant fungal sequences from the skin of all cats were classified as Cladosporium and Alternaria. The mucosal sites, including nostril, conjunctiva and reproductive tracts, had the fewest fungi, whereas the pre-aural space had the most. Allergic feline skin had significantly greater amounts of Agaricomycetes and Sordariomycetes, and significantly less Epicoccum compared to healthy feline skin. The skin of healthy cats appears to have a more diverse fungal microbiota compared to previous studies, and a fungal dysbiosis is noted in the skin of allergic cats. Future studies assessing the temporal stability of the skin microbiota in cats will be useful in determining whether the microbiota sequenced using NGS are colonizers or transient microbes. © 2016 ESVD and ACVD.

  10. Next-Generation Mitogenomics: A Comparison of Approaches Applied to Caecilian Amphibian Phylogeny.

    Science.gov (United States)

    Maddock, Simon T; Briscoe, Andrew G; Wilkinson, Mark; Waeschenbach, Andrea; San Mauro, Diego; Day, Julia J; Littlewood, D Tim J; Foster, Peter G; Nussbaum, Ronald A; Gower, David J

    2016-01-01

    Mitochondrial genome (mitogenome) sequences are being generated with increasing speed due to the advances of next-generation sequencing (NGS) technology and associated analytical tools. However, detailed comparisons to explore the utility of alternative NGS approaches applied to the same taxa have not been undertaken. We compared a 'traditional' Sanger sequencing method with two NGS approaches (shotgun sequencing and non-indexed, multiplex amplicon sequencing) on four different sequencing platforms (Illumina's HiSeq and MiSeq, Roche's 454 GS FLX, and Life Technologies' Ion Torrent) to produce seven (near-) complete mitogenomes from six species that form a small radiation of caecilian amphibians from the Seychelles. The fastest, most accurate method of obtaining mitogenome sequences that we tested was direct sequencing of genomic DNA (shotgun sequencing) using the MiSeq platform. Bayesian inference and maximum likelihood analyses using seven different partitioning strategies were unable to resolve compellingly all phylogenetic relationships among the Seychelles caecilian species, indicating the need for additional data in this case.

  11. Next-Generation Mitogenomics: A Comparison of Approaches Applied to Caecilian Amphibian Phylogeny.

    Directory of Open Access Journals (Sweden)

    Simon T Maddock

    Mitochondrial genome (mitogenome) sequences are being generated with increasing speed due to the advances of next-generation sequencing (NGS) technology and associated analytical tools. However, detailed comparisons to explore the utility of alternative NGS approaches applied to the same taxa have not been undertaken. We compared a 'traditional' Sanger sequencing method with two NGS approaches (shotgun sequencing and non-indexed, multiplex amplicon sequencing) on four different sequencing platforms (Illumina's HiSeq and MiSeq, Roche's 454 GS FLX, and Life Technologies' Ion Torrent) to produce seven (near-) complete mitogenomes from six species that form a small radiation of caecilian amphibians from the Seychelles. The fastest, most accurate method of obtaining mitogenome sequences that we tested was direct sequencing of genomic DNA (shotgun sequencing) using the MiSeq platform. Bayesian inference and maximum likelihood analyses using seven different partitioning strategies were unable to resolve compellingly all phylogenetic relationships among the Seychelles caecilian species, indicating the need for additional data in this case.

  12. Next-generation sequencing identifies a novel compound heterozygous mutation in MYO7A in a Chinese patient with Usher Syndrome 1B.

    Science.gov (United States)

    Wei, Xiaoming; Sun, Yan; Xie, Jiansheng; Shi, Quan; Qu, Ning; Yang, Guanghui; Cai, Jun; Yang, Yi; Liang, Yu; Wang, Wei; Yi, Xin

    2012-11-20

    Targeted enrichment and next-generation sequencing (NGS) have been employed for detection of genetic diseases. The purpose of this study was to validate the accuracy and sensitivity of our method for comprehensive mutation detection of hereditary hearing loss, and identify inherited mutations involved in human deafness accurately and economically. To make genetic diagnosis of hereditary hearing loss simple and timesaving, we designed a 0.60 MB array-based chip containing 69 nuclear genes and the mitochondrial genome responsible for human deafness and performed NGS on ten patients with five known mutations and a Chinese family with hearing loss (never genetically investigated). The ten patients with five known mutations were sequenced to validate the sensitivity of the method. We identified four known mutations in two nuclear deafness-causing genes (GJB2 and SLC26A4) and one in mitochondrial DNA. We then applied this method to analyze the variants in a Chinese family with hearing loss and identified compound heterozygosity for two novel mutations in the gene MYO7A. The compound heterozygosity identified in MYO7A causes Usher Syndrome 1B with severe phenotypes. The results support that the combination of enrichment of targeted genes and next-generation sequencing is a valuable molecular diagnostic tool for hereditary deafness and suitable for clinical application. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. Next-generation sequencing meets genetic diagnostics: development of a comprehensive workflow for the analysis of BRCA1 and BRCA2 genes

    Science.gov (United States)

    Feliubadaló, Lídia; Lopez-Doriga, Adriana; Castellsagué, Ester; del Valle, Jesús; Menéndez, Mireia; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Gómez, Carolina; Campos, Olga; Pineda, Marta; González, Sara; Moreno, Victor; Brunet, Joan; Blanco, Ignacio; Serra, Eduard; Capellá, Gabriel; Lázaro, Conxi

    2013-01-01

    Next-generation sequencing (NGS) is changing genetic diagnosis due to its huge sequencing capacity and cost-effectiveness. The aim of this study was to develop an NGS-based workflow for routine diagnostics for hereditary breast and ovarian cancer syndrome (HBOCS), to improve genetic testing for BRCA1 and BRCA2. An NGS-based workflow was designed using BRCA MASTR kit amplicon libraries followed by GS Junior pyrosequencing. Data analysis combined Variant Identification Pipeline freely available software and ad hoc R scripts, including a cascade of filters to generate coverage and variant calling reports. A BRCA homopolymer assay was performed in parallel. A research scheme was designed in two parts. A Training Set of 28 DNA samples containing 23 unique pathogenic mutations and 213 other variants (33 unique) was used. The workflow was validated in a set of 14 samples from HBOCS families in parallel with the current diagnostic workflow (Validation Set). The NGS-based workflow developed permitted the identification of all pathogenic mutations and genetic variants, including those located in or close to homopolymers. The use of NGS for detecting copy-number alterations was also investigated. The workflow meets the sensitivity and specificity requirements for the genetic diagnosis of HBOCS and improves on the cost-effectiveness of current approaches. PMID:23249957

  14. The Role of Next-Generation Sequencing in the Diagnosis of Lysosomal Storage Disorders

    Directory of Open Access Journals (Sweden)

    Katalin Komlosi MD, PhD

    2016-10-01

    Next-generation sequencing (NGS) panels are used widely in clinical diagnostics to identify genetic causes of various monogenic disease groups including neurometabolic disorders and, more recently, lysosomal storage disorders (LSDs). Many new challenges have been introduced through these new technologies, both at the laboratory level and at the bioinformatics level, with consequences including new requirements for interpretation of results, and for genetic counseling. We review some recent examples of the application of NGS technologies, with purely diagnostic and with both diagnostic and research aims, for establishing a rapid genetic diagnosis in LSDs. Given that NGS can be applied in a way that takes into account the many issues raised by international consensus guidelines, it can have a significant role even early in the course of the diagnostic process, in combination with biochemical and clinical data. Besides decreasing the delay in diagnosis for many patients, a precise molecular diagnosis is extremely important as new therapies are becoming available within the LSD spectrum for patients who share specific types of mutations. A genetic diagnosis is also the prerequisite for genetic counseling, family planning, and the individual choice of reproductive options in affected families.

  15. Molecular Diagnostics of Gliomas Using Next Generation Sequencing of a Glioma-Tailored Gene Panel.

    Science.gov (United States)

    Zacher, Angela; Kaulich, Kerstin; Stepanow, Stefanie; Wolter, Marietta; Köhrer, Karl; Felsberg, Jörg; Malzkorn, Bastian; Reifenberger, Guido

    2017-03-01

    Current classification of gliomas is based on histological criteria according to the World Health Organization (WHO) classification of tumors of the central nervous system. Over the past years, characteristic genetic profiles have been identified in various glioma types. These can refine tumor diagnostics and provide important prognostic and predictive information. We report on the establishment and validation of gene panel next generation sequencing (NGS) for the molecular diagnostics of gliomas. We designed a glioma-tailored gene panel covering 660 amplicons derived from 20 genes frequently aberrant in different glioma types. Sensitivity and specificity of glioma gene panel NGS for detection of DNA sequence variants and copy number changes were validated by single gene analyses. NGS-based mutation detection was optimized for application on formalin-fixed paraffin-embedded tissue specimens including small stereotactic biopsy samples. NGS data obtained in a retrospective analysis of 121 gliomas allowed for their molecular classification into distinct biological groups, including (i) isocitrate dehydrogenase gene (IDH) 1 or 2 mutant astrocytic gliomas with frequent α-thalassemia/mental retardation syndrome X-linked (ATRX) and tumor protein p53 (TP53) gene mutations, (ii) IDH mutant oligodendroglial tumors with 1p/19q codeletion, telomerase reverse transcriptase (TERT) promoter mutation and frequent Drosophila homolog of capicua (CIC) gene mutation, as well as (iii) IDH wildtype glioblastomas with frequent TERT promoter mutation, phosphatase and tensin homolog (PTEN) mutation and/or epidermal growth factor receptor (EGFR) amplification. Oligoastrocytic gliomas were genetically assigned to either of these groups. Our findings implicate gene panel NGS as a promising diagnostic technique that may facilitate integrated histological and molecular glioma classification. © 2016 International Society of Neuropathology.

  16. [Diagnosis of mitochondrial disorders in children with next generation sequencing].

    Science.gov (United States)

    Liu, Zhimei; Fang, Fang; Ding, Changhong; Zhang, Weihua; Li, Jiuwei; Yang, Xinying; Wang, Xiaohui; Wu, Yun; Wang, Hongmei; Liu, Liying; Han, Tongli; Wang, Xu; Chen, Chunhong; Lyu, Junlan; Wu, Husheng

    2015-10-01

    This study explored the application value of next generation sequencing (NGS) in the diagnosis of mitochondrial disorders. Genomic DNA was extracted using a standard procedure from peripheral venous blood of patients who met mitochondrial disease criteria for suspected mitochondrial disease and were seen at the neurological department of Beijing Children's Hospital Affiliated to Capital Medical University between October 2012 and February 2014. Targeted NGS was used to capture and sequence the entire mtDNA and the exons of 1,000 nuclear genes related to mitochondrial structure and function. Clinical data were collected from patients diagnosed at a molecular level, then clinical features and the relationship between genotype and phenotype were analyzed. Mutations were detected in 21 of 70 patients with suspected mitochondrial disease, of whom 10 harbored mtDNA mutations and 11 harbored nuclear DNA (nDNA) mutations. Of the 21 patients, 1 was diagnosed with congenital myasthenic syndrome with episodic apnea due to a homozygous CHAT gene p.I187T mutation, and 20 were diagnosed with mitochondrial disease: 10 with Leigh syndrome, 4 with mitochondrial encephalomyopathy with lactic acidosis and stroke-like episodes syndrome, 3 with Leber hereditary optic neuropathy (LHON) and LHON plus, 2 with mitochondrial DNA depletion syndrome, and 1 of unknown type. All the mtDNA mutations were point mutations, namely A3243G, G3460A, G11778A, T14484C, T14502C and T14487C. Ten mitochondrial disease patients harbored homozygous or compound heterozygous mutations in 5 genes previously shown to cause disease: SURF1, PDHA1, NDUFV1, SUCLA2 and SUCLG1. Of the 14 mutations found in these genes, 7 have not been reported previously. NGS has application value in the diagnosis of mitochondrial diseases, especially in Leigh syndrome, atypical mitochondrial syndromes, and rare mitochondrial disorders.

  17. Quality control of next-generation sequencing library through an integrative digital microfluidic platform.

    Science.gov (United States)

    Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D

    2012-12-01

    We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Next-generation sequencing approaches for improvement of lactic acid bacteria-fermented plant-based beverages

    Directory of Open Access Journals (Sweden)

    Jordyn Bergsveinson

    2017-01-01

    Plant-based beverages and milk alternatives produced from cereals and legumes have grown in popularity in recent years due to a range of consumer concerns over dairy products. These plant-based products can often have undesirable physicochemical properties related to flavour, texture, and nutrient availability and/or deficiencies. Lactic acid bacteria (LAB) fermentation offers potential remediation for many of these issues, and allows consumers to retain their perception of the resultant products as natural and additive-free. Using next-generation sequencing (NGS) or omics approaches to characterize LAB isolates and find those that will improve properties of plant-based beverages is the most direct route to product improvement. Although NGS/omics approaches have been extensively used for selection of LAB for use in the dairy industry, a comparable effort has not occurred for selecting LAB for fermenting plant raw substrates, save those used in producing wine and certain types of beer. Here we review the few and recent applications of NGS/omics to profile and improve LAB fermentation of various plant-based substrates for beverage production. We also identify specific issues in the production of various LAB fermented plant-based beverages that such NGS/omics applications have the power to resolve.

  19. Validation of a next-generation sequencing assay for clinical molecular oncology.

    Science.gov (United States)

    Cottrell, Catherine E; Al-Kateb, Hussam; Bredemeyer, Andrew J; Duncavage, Eric J; Spencer, David H; Abel, Haley J; Lockwood, Christina M; Hagemann, Ian S; O'Guin, Stephanie M; Burcea, Lauren C; Sawyer, Christopher S; Oschwald, Dayna M; Stratman, Jennifer L; Sher, Dorie A; Johnson, Mark R; Brown, Justin T; Cliften, Paul F; George, Bijoy; McIntosh, Leslie D; Shrivastava, Savita; Nguyen, Tudung T; Payton, Jacqueline E; Watson, Mark A; Crosby, Seth D; Head, Richard D; Mitra, Robi D; Nagarajan, Rakesh; Kulkarni, Shashikant; Seibert, Karen; Virgin, Herbert W; Milbrandt, Jeffrey; Pfeifer, John D

    2014-01-01

    Currently, oncology testing includes molecular studies and cytogenetic analysis to detect genetic aberrations of clinical significance. Next-generation sequencing (NGS) allows rapid analysis of multiple genes for clinically actionable somatic variants. The WUCaMP assay uses targeted capture for NGS analysis of 25 cancer-associated genes to detect mutations at actionable loci. We present clinical validation of the assay and a detailed framework for design and validation of similar clinical assays. Deep sequencing of 78 tumor specimens (≥ 1000× average unique coverage across the capture region) achieved high sensitivity for detecting somatic variants at low allele fraction (AF). Validation revealed sensitivities and specificities of 100% for detection of single-nucleotide variants (SNVs) within coding regions, compared with SNP array sequence data (95% CI = 83.4-100.0 for sensitivity and 94.2-100.0 for specificity) or whole-genome sequencing (95% CI = 89.1-100.0 for sensitivity and 99.9-100.0 for specificity) of HapMap samples. Sensitivity for detecting variants at an observed 10% AF was 100% (95% CI = 93.2-100.0) in HapMap mixes. Analysis of 15 masked specimens harboring clinically reported variants yielded concordant calls for 13/13 variants at AF of ≥ 15%. The WUCaMP assay is a robust and sensitive method to detect somatic variants of clinical significance in molecular oncology laboratories, with reduced time and cost of genetic analysis allowing for strategic patient management. Copyright © 2014 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
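
    Several of the intervals above have an upper limit of 100.0 because every expected variant was detected. The abstract does not say how the intervals were computed; assuming an exact (Clopper-Pearson) binomial interval, the lower limit in that all-detected case has a simple closed form, sketched below. The function name is hypothetical and not part of the WUCaMP assay.

```python
def exact_lower_bound_all_detected(n, confidence=0.95):
    """Lower limit of the exact (Clopper-Pearson) interval when n of n events are detected.

    With x = n successes the upper limit is 1.0 and the lower limit
    solves p**n = alpha / 2, i.e. p = (alpha / 2) ** (1 / n).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    return (alpha / 2.0) ** (1.0 / n)


# The lower limit tightens as more true variants are tested.
for n in (10, 50, 100):
    print(n, round(100 * exact_lower_bound_all_detected(n), 1))
```

    The sketch shows why a reported sensitivity of 100% still carries a lower confidence limit that depends on how many known variants were available for validation.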

  20. Current Opportunities and Challenges of Next Generation Sequencing (NGS) of DNA; Determining Health and Diseases

    NARCIS (Netherlands)

    Robert, Vincent Albert Romain Ghislain

    2016-01-01

    Many publications have demonstrated the huge potential of NGS methods in terms of new species discovery, environment monitoring, ecological studies, etc. [24,35,92,97,103]. Undoubtedly, NGS will become one of the major tools for species identification and for routine diagnostic use. While read lengths

  1. Next-generation DNA sequencing of HEXA: a step in the right direction for carrier screening

    OpenAIRE

    Hoffman, Jodi D; Greger, Valerie; Strovel, Erin T; Blitzer, Miriam G; Umbarger, Mark A; Kennedy, Caleb; Bishop, Brian; Saunders, Patrick; Porreca, Gregory J; Schienda, Jaclyn; Davie, Jocelyn; Hallam, Stephanie; Towne, Charles

    2013-01-01

    Tay-Sachs disease (TSD) is the prototype for ethnic-based carrier screening, with a carrier rate of ∼1/27 in Ashkenazi Jews and French Canadians. HexA enzyme analysis is the current gold standard for TSD carrier screening (detection rate ∼98%), but has technical limitations. We compared DNA analysis by next-generation DNA sequencing (NGS) plus an assay for the 7.6 kb deletion to enzyme analysis for TSD carrier screening using 74 samples collected from participants at a TSD family conference. ...

  2. Houston Methodist variant viewer: An application to support clinical laboratory interpretation of next-generation sequencing data for cancer

    Directory of Open Access Journals (Sweden)

    Paul A Christensen

    2017-01-01

    Introduction: Next-generation sequencing (NGS) is increasingly used in clinical and research protocols for patients with cancer. NGS assays are routinely used in clinical laboratories to detect mutations bearing on cancer diagnosis, prognosis and personalized therapy. A typical assay may interrogate 50 or more gene targets that encompass many thousands of possible gene variants. Analysis of NGS data in cancer is a labor-intensive process that can become overwhelming to the molecular pathologist or research scientist. Although commercial tools for NGS data analysis and interpretation are available, they are often costly, lack key functionality or cannot be customized by the end user. Methods: To facilitate NGS data analysis in our clinical molecular diagnostics laboratory, we created a custom bioinformatics tool termed Houston Methodist Variant Viewer (HMVV). HMVV is a Java-based solution that integrates sequencing instrument output, bioinformatics analysis, storage resources and end user interface. Results: Compared to the predicate method used in our clinical laboratory, HMVV markedly simplifies the bioinformatics workflow for the molecular technologist and facilitates the variant review by the molecular pathologist. Importantly, HMVV reduces time spent researching the biological significance of the variants detected, standardizes the online resources used to perform the variant investigation and assists generation of the annotated report for the electronic medical record. HMVV also maintains a searchable variant database, including the variant annotations generated by the pathologist, which is useful for downstream quality improvement and research projects. Conclusions: HMVV is a clinical grade, low-cost, feature-rich, highly customizable platform that we have made available for continued development by the pathology informatics community.

  3. Efficient error correction for next-generation sequencing of viral amplicons.

    Science.gov (United States)

    Skums, Pavel; Dimitrova, Zoya; Campo, David S; Vaughan, Gilberto; Rossi, Livia; Forbi, Joseph C; Yokosawa, Jonny; Zelikovsky, Alex; Khudyakov, Yury

    2012-06-25

    Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm.
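
    The general idea behind k-mer-based error detection can be illustrated with a minimal sketch. This is not the published KEC implementation; it only shows the principle that k-mers seen rarely across many reads of the same amplicon are likely to overlap a sequencing error. The function names, the value of k and the count threshold are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def suspicious_positions(read, counts, k, min_count):
    """Return start positions of k-mers in `read` seen fewer than `min_count` times."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]


# Toy data: nine identical reads and one read with a substitution at index 4.
reads = ["ACGTACGTAC"] * 9 + ["ACGTTCGTAC"]
counts = kmer_counts(reads, k=4)
flagged = suspicious_positions(reads[-1], counts, k=4, min_count=2)
print(flagged)  # start positions of the rare k-mers that overlap the substitution
```

    In this toy case the rare k-mers are exactly those that cover the substituted base, which is the signal a real corrector would use to localize and repair the error.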

  4. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    Science.gov (United States)

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have

  5. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson’s Disease

    Directory of Open Access Journals (Sweden)

    Dániel Németh

    2016-01-01

    Objective. Wilson’s disease (WD) is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing (NGS) provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to the traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine the ATP7B gene. Materials and Methods. We used the Ion Torrent Personal Genome Machine in four heterozygous patients to identify the other mutations, and also in two patients with no known mutation. One patient with acute-on-chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson’s disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found in addition to eight known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. Our results show that next-generation sequencing is a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson’s disease in selected cases.

  6. Applying Ancestry and Sex Computation as a Quality Control Tool in Targeted Next-Generation Sequencing.

    Science.gov (United States)

    Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H

    2016-03-01

    To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
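
    The classification step described above can be sketched as follows, assuming the principal-component coordinates of each sample have already been computed. The coordinates, group labels and function names are hypothetical; the abstract does not specify the value of k or the distance measure, so Euclidean distance and k = 3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def is_possible_swap(predicted, self_reported):
    """Flag a sample whose computed ancestry disagrees with self-report."""
    return predicted != self_reported


# Hypothetical reference samples in the space of the first two principal components.
reference_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
reference_labels = ["group_A", "group_A", "group_A",
                    "group_B", "group_B", "group_B"]

predicted = knn_predict(reference_points, reference_labels, query=(0.1, 0.1))
print(predicted, is_possible_swap(predicted, self_reported="group_B"))
```

    A disagreement between the predicted and self-reported group does not prove a sample swap; it only marks the sample for review, which matches the screening role described in the abstract.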

  7. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Science.gov (United States)

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize the transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  8. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Directory of Open Access Journals (Sweden)

    Satish K Guttikonda

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize the transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  9. Estimating inbreeding coefficients from NGS data

    DEFF Research Database (Denmark)

    Garrett Vieira, Filipe Jorge; Fumagalli, Matteo; Albrechtsen, Anders

    2013-01-01

    Most methods for Next-Generation Sequencing (NGS) data analyses incorporate information regarding allele frequencies using the assumption of Hardy-Weinberg Equilibrium (HWE) as a prior. However, many organisms including domesticated, partially selfing or with asexual life cycles show strong...
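
    The difference between a Hardy-Weinberg prior and one that allows for inbreeding can be sketched with the standard population-genetics parameterization of genotype frequencies by an inbreeding coefficient F. This is a textbook formula offered as an illustration; the abstract is truncated and does not confirm that the authors use this exact form. The function name is hypothetical.

```python
def genotype_priors(p, f=0.0):
    """Prior probabilities (AA, Aa, aa) for allele-A frequency p and inbreeding coefficient f.

    With f = 0 this reduces to Hardy-Weinberg proportions (p^2, 2pq, q^2);
    with f = 1 all heterozygotes are removed.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return (p * p + f * p * q,
            2.0 * p * q * (1.0 - f),
            q * q + f * p * q)


print(genotype_priors(0.3))         # Hardy-Weinberg proportions
print(genotype_priors(0.3, f=0.5))  # fewer heterozygotes than HWE predicts
```

    A genotype caller that assumes f = 0 would overstate the prior probability of heterozygous genotypes in a partially selfing population, which is the kind of bias the abstract alludes to.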

  10. A Web-Hosted R Workflow to Simplify and Automate the Analysis of 16S NGS Data

    Science.gov (United States)

    Next-Generation Sequencing (NGS) produces large data sets that include tens-of-thousands of sequence reads per sample. For analysis of bacterial diversity, 16S NGS sequences are typically analyzed in a workflow that contains best-of-breed bioinformatics packages that may levera...

  11. Comparison of three human papillomavirus DNA detection methods: Next generation sequencing, multiplex-PCR and nested-PCR followed by Sanger based sequencing.

    Science.gov (United States)

    da Fonseca, Allex Jardim; Galvão, Renata Silva; Miranda, Angelica Espinosa; Ferreira, Luiz Carlos de Lima; Chen, Zigui

    2016-05-01

    To compare the diagnostic performance for HPV infection using three laboratory techniques. Ninety-five cervicovaginal samples were randomly selected; each was tested for HPV DNA and genotypes using three methods in parallel: Multiplex-PCR, Nested-PCR followed by Sanger sequencing, and next-generation sequencing (NGS) with two assays (NGS-A1, NGS-A2). The study was approved by the Brazilian National IRB (CONEP protocol 16,800). The prevalence of HPV by the NGS assays was higher than that using the Multiplex-PCR (64.2% vs. 45.2%, respectively; P = 0.001) and the Nested-PCR (64.2% vs. 49.5%, respectively; P = 0.003). NGS also showed better performance in detecting high-risk HPV (HR-HPV) and HPV16. There was weak interobserver agreement between the results of Multiplex-PCR and Nested-PCR in relation to NGS for the diagnosis of HPV infection, and a moderate correlation for HR-HPV detection. Both NGS assays showed a strong correlation for detection of HPVs (k = 0.86), HR-HPVs (k = 0.91), HPV16 (k = 0.92) and HPV18 (k = 0.91). NGS is more sensitive than the traditional Sanger sequencing and the Multiplex PCR to genotype HPVs, with promising ability to detect multiple infections, and may have the potential to establish an alternative method for the diagnosis and genotyping of HPV. © 2015 Wiley Periodicals, Inc.
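
    The k values quoted above read as kappa agreement statistics. Assuming they are Cohen's kappa between paired calls from two assays, the calculation can be sketched as below. The sample calls are invented for illustration and are not data from the study.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of samples on which both assays agree.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    categories = set(calls_a) | set(calls_b)
    expected = sum((calls_a.count(c) / n) * (calls_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        return 1.0  # both assays gave the same single category for every sample
    return (observed - expected) / (1.0 - expected)


# Invented calls for ten samples from two hypothetical assays.
assay_1 = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos", "neg"]
assay_2 = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos", "neg"]
kappa = cohens_kappa(assay_1, assay_2)
print(round(kappa, 2))
```

    Here the assays agree on nine of ten samples, but half of that agreement is expected by chance given the marginal frequencies, so kappa is lower than the raw 90% agreement.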

  12. BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.

    Science.gov (United States)

    Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun

    2015-01-15

    It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called the BioVLAB-MMIA-NGS, deployed on both Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationships between miRNAs and target genes with high accuracy. Availability: http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Preparation of Small RNA NGS Libraries from Biofluids.

    Science.gov (United States)

    Etheridge, Alton; Wang, Kai; Baxter, David; Galas, David

    2018-01-01

    Next generation sequencing (NGS) is a powerful method for transcriptome analysis. Unlike other gene expression profiling methods, such as microarrays, NGS provides additional information such as splicing variants, sequence polymorphisms, and novel transcripts. For this reason, NGS is well suited for comprehensive profiling of the wide range of extracellular RNAs (exRNAs) in biofluids. ExRNAs are of great interest because of their possible biological role in cell-to-cell communication and for their potential use as biomarkers or for therapeutic purposes. Here, we describe a modified protocol for preparation of small RNA libraries for NGS analysis. This protocol has been optimized for use with low-input exRNA-containing samples, such as plasma or serum, and has modifications designed to reduce the sequence-specific bias typically encountered with commercial small RNA library construction kits.

  14. Next-generation DNA sequencing of HEXA: a step in the right direction for carrier screening

    Science.gov (United States)

    Hoffman, Jodi D; Greger, Valerie; Strovel, Erin T; Blitzer, Miriam G; Umbarger, Mark A; Kennedy, Caleb; Bishop, Brian; Saunders, Patrick; Porreca, Gregory J; Schienda, Jaclyn; Davie, Jocelyn; Hallam, Stephanie; Towne, Charles

    2013-01-01

    Tay-Sachs disease (TSD) is the prototype for ethnic-based carrier screening, with a carrier rate of ∼1/27 in Ashkenazi Jews and French Canadians. HexA enzyme analysis is the current gold standard for TSD carrier screening (detection rate ∼98%), but has technical limitations. We compared DNA analysis by next-generation DNA sequencing (NGS) plus an assay for the 7.6 kb deletion to enzyme analysis for TSD carrier screening using 74 samples collected from participants at a TSD family conference. Fifty-one of 74 participants had positive enzyme results (46 carriers, five late-onset Tay-Sachs [LOTS]), 16 had negative, and seven had inconclusive results. NGS + 7.6 kb del screening of HEXA found a pathogenic mutation, pseudoallele, or variant of unknown significance (VUS) in 100% of the enzyme-positive or obligate carrier/enzyme-inconclusive samples. NGS detected the B1 allele in two enzyme-negative obligate carriers. Our data indicate that NGS can be used as a TSD clinical carrier screening tool. We demonstrate that NGS can be superior in detecting TSD carriers compared to traditional enzyme and genotyping methodologies, which are limited by false-positive and false-negative results and ethnically focused, limited mutation panels, respectively, but is not ready for sole use due to lack of information regarding some VUS. PMID:24498621

  15. Spectrum of benzo[a]pyrene-induced mutations in the Pig-a gene of L5178YTk+/- cells identified with next generation sequencing.

    Science.gov (United States)

    Revollo, Javier; Wang, Yiying; McKinzie, Page; Dad, Azra; Pearce, Mason; Heflich, Robert H; Dobrovolsky, Vasily N

    2017-12-01

    We used Sanger sequencing and next generation sequencing (NGS) for analysis of mutations in the endogenous X-linked Pig-a gene of clonally expanded L5178YTk+/- cells. The clones developed from single cells that were sorted on a flow cytometer based upon the expression pattern of the GPI-anchored marker, CD90, on their surface. CD90-deficient and CD90-proficient cells were sorted from untreated cultures and CD90-deficient cells were sorted from cultures treated with benzo[a]pyrene (B[a]P). Pig-a mutations were identified in all clones developed from CD90-deficient cells; no Pig-a mutations were found in clones of CD90-proficient cells. The spectrum of B[a]P-induced Pig-a mutations was dominated by basepair substitutions, small insertions and deletions at G:C, or at sequences rich in G:C content. We observed high concordance between Pig-a mutations determined by Sanger sequencing and by NGS, but NGS was able to identify mutations in samples that were difficult to analyze by Sanger sequencing (e.g., mixtures of two mutant clones). Overall, the NGS method is a cost- and labor-efficient, high-throughput approach for analysis of a large number of mutant clones. Published by Elsevier B.V.

  16. Beta-Binomial Model for the Detection of Rare Mutations in Pooled Next-Generation Sequencing Experiments.

    Science.gov (United States)

    Jakaitiene, Audrone; Avino, Mariano; Guarracino, Mario Rosario

    2017-04-01

    Despite diminishing costs, next-generation sequencing (NGS) remains expensive for studies with a large number of individuals. As a cost-saving measure, pools containing multiple samples can be sequenced. Many software tools are currently available for the detection of single-nucleotide polymorphisms (SNPs). Their sensitivity and specificity depend on the model used and the data analyzed, indicating that all of them have room for improvement. We use a beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with the ability to be specific up to two patients affected by neuromuscular disorders (NMD). We assessed the results by comparing them with the Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying the beta-binomial model is accurate in predicting rare mutations at a 0.01 fraction. Finally, we explored the concordance of mutations between the model and the software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
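
    How a beta-binomial model can separate a rare variant from sequencing noise at a pooled site can be sketched as follows. This is a generic illustration of the distribution, not the authors' multireference framework; the shape parameters of the error model, the read depth and the function names are assumptions.

```python
import math


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """Probability of k alternate reads out of n under a beta-binomial(n, a, b)."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def upper_tail(k, n, a, b):
    """Probability of observing k or more alternate reads under the error model."""
    return sum(betabinom_pmf(i, n, a, b) for i in range(k, n + 1))


# Hypothetical error model with mean error rate a / (a + b) = 0.001.
ERR_A, ERR_B = 1.0, 999.0
depth = 1000
for alt_reads in (2, 10):
    print(alt_reads, upper_tail(alt_reads, depth, ERR_A, ERR_B))
```

    A site would be reported as a candidate variant when the tail probability under the error-only model falls below a chosen cutoff; the extra dispersion of the beta-binomial, compared with a plain binomial, makes that call more conservative when error rates vary between sites.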

  17. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  18. CFSAN SNP Pipeline: an automated method for constructing SNP matrices from next-generation sequence data

    Directory of Open Access Journals (Sweden)

    Steve Davis

    2015-08-01

    The analysis of next-generation sequence (NGS) data is often a fragmented step-wise process. For example, multiple pieces of software are typically needed to map NGS reads, extract variant sites, and construct a DNA sequence matrix containing only single nucleotide polymorphisms (i.e., a SNP matrix) for a set of individuals. The management and chaining of these software pieces and their outputs can often be a cumbersome and difficult task. Here, we present CFSAN SNP Pipeline, which combines into a single package the mapping of NGS reads to a reference genome with Bowtie2, processing of those mapping (BAM) files using SAMtools, identification of variant sites using VarScan, and production of a SNP matrix using custom Python scripts. We also introduce a Python package (CFSAN SNP Mutator) that when given a reference genome will generate variants of known position against which we validate our pipeline. We created 1,000 simulated Salmonella enterica sp. enterica Serovar Agona genomes at 100× and 20× coverage, each containing 500 SNPs, 20 single-base insertions and 20 single-base deletions. For the 100× dataset, the CFSAN SNP Pipeline recovered 98.9% of the introduced SNPs and had a false positive rate of 1.04 × 10⁻⁶; for the 20× dataset 98.8% of SNPs were recovered and the false positive rate was 8.34 × 10⁻⁷. Based on these results, CFSAN SNP Pipeline is a robust and accurate tool that is among the first to combine into a single executable the myriad steps required to produce a SNP matrix from NGS data. Such a tool is useful to those working in an applied setting (e.g., food safety traceback investigations) as well as for those interested in evolutionary questions.
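
    The final step, turning per-sample variant calls into a SNP matrix, can be sketched as below. This is not code from CFSAN SNP Pipeline; it assumes that mapping and variant calling have already produced, for each sample, a dictionary from 0-based reference position to the called base. The data layout and function name are hypothetical.

```python
def build_snp_matrix(reference, sample_calls):
    """Build a SNP matrix from per-sample substitution calls.

    reference:    reference sequence as a string
    sample_calls: dict mapping sample name -> {0-based position: called base}

    Returns the sorted list of positions that vary in at least one sample and
    a dict mapping each sample to its bases at those positions. Samples with
    no call at a position are given the reference base.
    """
    positions = sorted({pos for calls in sample_calls.values() for pos in calls})
    matrix = {
        sample: "".join(calls.get(pos, reference[pos]) for pos in positions)
        for sample, calls in sample_calls.items()
    }
    return positions, matrix


reference = "ACGTACGT"
calls = {
    "sample_1": {1: "T"},
    "sample_2": {1: "T", 5: "A"},
    "sample_3": {},
}
positions, matrix = build_snp_matrix(reference, calls)
print(positions)
for name, row in matrix.items():
    print(name, row)
```

    Filling uncalled positions with the reference base is a simplification; a production pipeline would also need to distinguish a confident reference call from missing coverage.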

  19. Next generation sequencing and molecular analysis of artichoke Italian latent virus.

    Science.gov (United States)

    Elbeaino, Toufic; Belghacem, Imen; Mascia, Tiziana; Gallitelli, Donato; Digiaro, Michele

    2017-06-01

    Next-generation sequencing (NGS) allowed the assembly of the complete RNA-1 and RNA-2 sequences of a grapevine isolate of artichoke Italian latent virus (AILV). RNA-1 and RNA-2 are 7,338 and 4,630 nucleotides in length excluding the 3' terminal poly(A) tail, and encode two putative polyproteins of 255.8 kDa (p1) and 149.6 kDa (p2), respectively. All conserved motifs and predicted cleavage sites, typical for nepovirus polyproteins, were found in p1 and p2. AILV p1 and p2 share high amino acid identity with their homologues in beet ringspot virus (p1, 81% and p2, 71%), tomato black ring virus (p1, 79% and p2, 63%), grapevine Anatolian ringspot virus (p1, 65% and p2, 63%), and grapevine chrome mosaic virus (p1, 60% and p2, 54%), and to a lesser extent with other grapevine nepoviruses of subgroups A and C. Phylogenetic and sequence analyses all confirmed the strict relationship of AILV with members classified in subgroup B of genus Nepovirus.

  20. Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

    Science.gov (United States)

    Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

    2016-05-01

    Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. Copyright © 2016 Elsevier B.V. All rights reserved.
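
    The effect of the similarity threshold on OTU assignment can be illustrated with a small greedy clustering sketch. It is not the pipeline described above: it assumes equal-length, pre-aligned sequences and measures identity as the fraction of matching positions, and the sequences are invented.

```python
def identity(seq_a, seq_b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def greedy_otus(sequences, threshold):
    """Assign each sequence to the first OTU whose representative it matches.

    The first member of each OTU serves as its representative; a sequence that
    matches no representative at or above `threshold` starts a new OTU.
    """
    otus = []
    for seq in sequences:
        for members in otus:
            if identity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            otus.append([seq])
    return otus


contigs = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",  # one mismatch in 20 positions: 95% identity to the first
    "TTTTTTTTTTGGGGGGGGGG",
]
print(len(greedy_otus(contigs, threshold=0.97)))
print(len(greedy_otus(contigs, threshold=0.95)))
```

    Lowering the threshold merges the two near-identical contigs into one OTU, so the number of OTUs reported depends directly on the cutoff chosen.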

  1. SMITH: a LIMS for handling next-generation sequencing workflows.

    Science.gov (United States)

    Venco, Francesco; Vaskin, Yuriy; Ceol, Arnaud; Muller, Heiko

    2014-01-01

    Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, and to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). SMITH is a web application with a MySQL server at the backend. SMITH was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description with each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The workflows are available
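    The attribute-value table mentioned in this abstract can be sketched as follows. This is an illustration only, not SMITH's actual schema: the table names, column names, and sample values are hypothetical, and Python's built-in sqlite3 module stands in for the MySQL backend the abstract describes.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a sample table and an attribute-value table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- One row per (sample, attribute) pair; attributes are free text,
        -- so new kinds of metadata need no schema change.
        CREATE TABLE sample_attribute (
            sample_id INTEGER NOT NULL REFERENCES sample(id),
            attribute TEXT NOT NULL,
            value     TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO sample (id, name) VALUES (?, ?)",
        [(1, "S1"), (2, "S2")],
    )
    conn.executemany(
        "INSERT INTO sample_attribute (sample_id, attribute, value) VALUES (?, ?, ?)",
        [
            (1, "application", "ChIP-Seq"),
            (1, "antibody", "H3K4me3"),
            (2, "application", "RNA-Seq"),
        ],
    )
    conn.commit()
    return conn


def find_samples(conn, attribute, value):
    """Return the names of samples carrying the given attribute/value pair."""
    rows = conn.execute(
        "SELECT s.name FROM sample AS s "
        "JOIN sample_attribute AS a ON a.sample_id = s.id "
        "WHERE a.attribute = ? AND a.value = ? "
        "ORDER BY s.name",
        (attribute, value),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(find_samples(db, "application", "RNA-Seq"))
```

    The design trade-off is the usual one for attribute-value layouts: arbitrary metadata can be attached and searched without altering the schema, at the cost of weaker typing and more joins per query.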

  2. SMITH: a LIMS for handling next-generation sequencing workflows

    Science.gov (United States)

    2014-01-01

    Background: Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, and to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods: SMITH is a web application with a MySQL server at the backend. SMITH was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description with each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results: SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The

  3. NGSCheckMate: software for validating sample identity in next-generation sequencing studies within and across data types.

    Science.gov (United States)

    Lee, Sejoon; Lee, Soohyun; Ouellette, Scott; Park, Woong-Yang; Lee, Eunjung A; Park, Peter J

    2017-06-20

    In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies. https://github.com/parklab/NGSCheckMate. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
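    The idea of comparing allele read fractions at known SNPs can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not NGSCheckMate's implementation: it computes a plain Pearson correlation of variant allele fractions and omits the depth-dependent model the abstract describes. The SNP identifiers and read counts are hypothetical.

```python
from math import sqrt


def allele_fractions(counts, min_depth=1):
    """Map SNP id -> alternate-allele read fraction, skipping SNPs below min_depth.

    `counts` maps SNP id -> (reference_reads, alternate_reads).
    """
    return {
        snp: alt / (ref + alt)
        for snp, (ref, alt) in counts.items()
        if ref + alt >= min_depth
    }


def vaf_correlation(sample_a, sample_b, min_depth=1):
    """Pearson correlation of allele fractions over SNPs covered in both samples."""
    fa = allele_fractions(sample_a, min_depth)
    fb = allele_fractions(sample_b, min_depth)
    shared = sorted(set(fa) & set(fb))
    if len(shared) < 2:
        raise ValueError("need at least two shared SNPs")
    xs = [fa[s] for s in shared]
    ys = [fb[s] for s in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("allele fractions show no variation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical read counts: two datasets from one subject, one from another.
    tumor = {"snp1": (10, 0), "snp2": (5, 5), "snp3": (0, 10), "snp4": (10, 0)}
    normal = {"snp1": (20, 0), "snp2": (9, 11), "snp3": (1, 19), "snp4": (19, 1)}
    other = {"snp1": (0, 10), "snp2": (10, 0), "snp3": (5, 5), "snp4": (5, 5)}
    print(vaf_correlation(tumor, normal))
    print(vaf_correlation(tumor, other))
```

    Datasets from the same subject should give a correlation near 1, while unrelated samples give a much lower value. A real tool also has to account for how this separation degrades at low sequencing depth.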

  4. Challenges in the Setup of Large-scale Next-Generation Sequencing Analysis Workflows

    Directory of Open Access Journals (Sweden)

    Pranav Kulkarni

    While Next-Generation Sequencing (NGS) can now be considered an established analysis technology for research applications across the life sciences, the analysis workflows still require substantial bioinformatics expertise. Typical challenges include the appropriate selection of analytical software tools, the speedup of the overall procedure using HPC parallelization and acceleration technology, the development of automation strategies, data storage solutions and finally the development of methods for full exploitation of the analysis results across multiple experimental conditions. Recently, NGS has begun to expand into clinical environments, where it facilitates diagnostics enabling personalized therapeutic approaches, but is also accompanied by new technological, legal and ethical challenges. There are probably as many overall concepts for the analysis of the data as there are academic research institutions. Among these concepts are, for instance, complex IT architectures developed in-house, ready-to-use technologies installed on-site as well as comprehensive Everything as a Service (XaaS) solutions. In this mini-review, we summarize the key points to consider in the setup of the analysis architectures, mostly for scientific rather than diagnostic purposes, and provide an overview of the current state of the art and challenges of the field.

  5. Use of next-generation sequencing to detect LDLR gene copy number variation in familial hypercholesterolemia

    Science.gov (United States)

    Iacocca, Michael A.; Wang, Jian; Dron, Jacqueline S.; Robinson, John F.; McIntyre, Adam D.; Cao, Henian

    2017-01-01

    Familial hypercholesterolemia (FH) is a heritable condition of severely elevated LDL cholesterol, caused predominantly by autosomal codominant mutations in the LDL receptor gene (LDLR). In providing a molecular diagnosis for FH, the current procedure often includes targeted next-generation sequencing (NGS) panels for the detection of small-scale DNA variants, followed by multiplex ligation-dependent probe amplification (MLPA) in LDLR for the detection of whole-exon copy number variants (CNVs). The latter is essential because ∼10% of FH cases are attributed to CNVs in LDLR; accounting for them decreases false negative findings. Here, we determined the potential of replacing MLPA with bioinformatic analysis applied to NGS data, which uses depth-of-coverage analysis as its principal method to identify whole-exon CNV events. In analysis of 388 FH patient samples, there was 100% concordance in LDLR CNV detection between these two methods: 38 reported CNVs identified by MLPA were also successfully detected by our NGS method, while 350 samples negative for CNVs by MLPA were also negative by NGS. This result suggests that MLPA can be removed from the routine diagnostic screening for FH, significantly reducing associated costs, resources, and analysis time, while promoting more widespread assessment of this important class of mutations across diagnostic laboratories. PMID:28874442

  6. Barriers to clinical adoption of next generation sequencing: Perspectives of a policy Delphi panel

    Directory of Open Access Journals (Sweden)

    Donna A. Messner

    2016-09-01

    This research aims to inform policymakers by engaging expert stakeholders to identify, prioritize, and deliberate the most important and tractable policy barriers to the clinical adoption of next generation sequencing (NGS). A 4-round Delphi policy study was done with a multi-stakeholder panel of 48 experts. The first 2 rounds of online questionnaires (reported here) assessed the importance and tractability of 28 potential barriers to clinical adoption of NGS across 3 major policy domains: intellectual property, coverage and reimbursement, and FDA regulation. We found that: 1) proprietary variant databases are seen as a key challenge, and a potentially intractable one; 2) payer policies were seen as a frequent barrier, especially a perceived inconsistency in standards for coverage; 3) relative to other challenges considered, FDA regulation was not strongly perceived as a barrier to clinical use of NGS. Overall the results indicate a perceived need for policies to promote data-sharing, and a desire for consistent payer coverage policies that maintain reasonably high standards of evidence for clinical utility, limit testing to that needed for clinical care decisions, and yet also flexibly allow for clinician discretion to use genomic testing in uncertain circumstances of high medical need.

  7. Use of next-generation sequencing to detect LDLR gene copy number variation in familial hypercholesterolemia.

    Science.gov (United States)

    Iacocca, Michael A; Wang, Jian; Dron, Jacqueline S; Robinson, John F; McIntyre, Adam D; Cao, Henian; Hegele, Robert A

    2017-11-01

    Familial hypercholesterolemia (FH) is a heritable condition of severely elevated LDL cholesterol, caused predominantly by autosomal codominant mutations in the LDL receptor gene (LDLR). In providing a molecular diagnosis for FH, the current procedure often includes targeted next-generation sequencing (NGS) panels for the detection of small-scale DNA variants, followed by multiplex ligation-dependent probe amplification (MLPA) in LDLR for the detection of whole-exon copy number variants (CNVs). The latter is essential because ∼10% of FH cases are attributed to CNVs in LDLR; accounting for them decreases false negative findings. Here, we determined the potential of replacing MLPA with bioinformatic analysis applied to NGS data, which uses depth-of-coverage analysis as its principal method to identify whole-exon CNV events. In analysis of 388 FH patient samples, there was 100% concordance in LDLR CNV detection between these two methods: 38 reported CNVs identified by MLPA were also successfully detected by our NGS method, while 350 samples negative for CNVs by MLPA were also negative by NGS. This result suggests that MLPA can be removed from the routine diagnostic screening for FH, significantly reducing associated costs, resources, and analysis time, while promoting more widespread assessment of this important class of mutations across diagnostic laboratories. Copyright © 2017 by the American Society for Biochemistry and Molecular Biology, Inc.
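    Depth-of-coverage analysis for whole-exon CNVs can be sketched as a ratio test against reference samples. The Python sketch below is a generic illustration, not the authors' method: the normalization by total sample depth, the median baseline, the 0.7 and 1.3 ratio cut-offs, and the five-exon toy data are all assumptions.

```python
from statistics import median


def normalize(depths):
    """Scale per-exon read depths so they sum to 1 within a sample."""
    total = sum(depths.values())
    if total == 0:
        raise ValueError("sample has no coverage")
    return {exon: depth / total for exon, depth in depths.items()}


def call_exon_cnvs(sample, references, loss=0.7, gain=1.3):
    """Label each exon by its depth ratio relative to the reference median.

    A heterozygous whole-exon deletion is expected near a ratio of 0.5 and a
    duplication near 1.5; the `loss` and `gain` cut-offs are illustrative.
    """
    sample_norm = normalize(sample)
    refs_norm = [normalize(ref) for ref in references]
    calls = {}
    for exon, value in sample_norm.items():
        baseline = median(ref[exon] for ref in refs_norm)
        ratio = value / baseline
        if ratio < loss:
            calls[exon] = "deletion"
        elif ratio > gain:
            calls[exon] = "duplication"
        else:
            calls[exon] = "normal"
    return calls


if __name__ == "__main__":
    exons = ["exon1", "exon2", "exon3", "exon4", "exon5"]
    # Hypothetical mean depths: two CNV-negative references and one patient
    # whose exon3 has roughly half the expected coverage.
    references = [dict.fromkeys(exons, 100), dict.fromkeys(exons, 200)]
    patient = {"exon1": 100, "exon2": 100, "exon3": 50, "exon4": 100, "exon5": 100}
    print(call_exon_cnvs(patient, references))
```

    Normalizing within each sample removes differences in overall sequencing yield, so only relative drops or gains at individual exons are flagged.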

  8. Performance of next-generation sequencing on small tumor specimens and/or low tumor content samples using a commercially available platform.

    Directory of Open Access Journals (Sweden)

    Scott Morris

    Next generation sequencing (NGS) tests are usually performed on relatively small core biopsy or fine needle aspiration (FNA) samples. Data are limited on what amount of tumor by volume or minimum number of FNA passes is needed to yield sufficient material for running NGS. We sought to identify the amount of tumor needed for running the PCDx NGS platform. 2,723 consecutive tumor tissues of all cancer types were queried and reviewed for inclusion. Information on tumor volume, success of performing NGS, and results of NGS was compiled. Assessment of sequence analysis, mutation calling and sensitivity, quality control, drug associations, and data aggregation and analysis was performed. 6.4% of samples were rejected from all testing due to insufficient tumor quantity. The number of genes with insufficient sensitivity to make definitive mutation calls increased as the percentage of tumor decreased, reaching statistical significance below 5% tumor content. The number of drug associations also decreased with a lower percentage of tumor, but this difference only became significant between 1-3%. The number of drug associations did decrease with smaller tissue size as expected. Neither specimen size nor percentage of tumor affected the ability to pass mRNA quality control. A tumor area of 10 mm² provides a good margin of error for specimens to yield adequate drug association results. Specimen suitability remains a major obstacle to clinical NGS testing. We determined that PCR-based library creation methods allow the use of smaller specimens, and those with a lower percentage of tumor cells, to be run on the PCDx NGS platform.

  9. PVT: an efficient computational procedure to speed up next-generation sequence analysis.

    Science.gov (United States)

    Maji, Ranjan Kumar; Sarkar, Arijita; Khatua, Sunirmal; Dasgupta, Subhasis; Ghosh, Zhumur

    2014-06-04

    High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates very large datasets, which poses a major challenge for scientists seeking an efficient, cost- and time-effective way to analyse them. Further, the different types of NGS data share certain challenging analysis steps. Spliced alignment is one such fundamental step in NGS data analysis, and it is extremely computationally intensive as well as time consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not utilize computational resources efficiently in terms of CPU utilization and memory. Here we introduce PVT (Pipelined Version of TopHat), in which we take a modular approach by breaking TopHat's serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the shortcomings of TopHat so as to analyze large NGS data efficiently. We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during 'spliced alignment' and breaks the job into a pipeline of multiple stages (each comprising different step(s)) to improve its resource utilization, thus reducing the execution time. PVT provides an improvement over TopHat for spliced alignment in NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed for paired end reads showed an
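    The general idea of turning a serial job into a pipeline of concurrent stages can be sketched in Python as below. This is a generic illustration, not PVT or TopHat code: the stage functions, the one-thread-per-stage layout, and the queue-based hand-off are assumptions, and error handling inside stages is omitted for brevity.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the input stream


def _run_stage(func, inbox, outbox):
    """Apply `func` to every item from `inbox` and forward results to `outbox`."""
    while True:
        item = inbox.get()
        if item is _SENTINEL:
            outbox.put(_SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(items, stage_funcs):
    """Push `items` through the stages, each stage running in its own thread.

    While stage 2 processes item k, stage 1 can already work on item k + 1,
    which is what distinguishes a pipeline from serial execution.
    """
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=_run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stage_funcs)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is _SENTINEL:
            break
        results.append(out)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Hypothetical stand-ins for alignment sub-steps on batches of reads.
    stages = [lambda batch: batch + 1, lambda batch: batch * 2]
    print(run_pipeline([1, 2, 3], stages))
```

    Because each stage has a single worker and the queues are first-in, first-out, results come back in input order. In CPython, threads mainly help when stages wait on I/O or external processes; CPU-bound stages would more likely be run as separate processes.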

  10. Participation in interdisciplinary meetings on genetic diagnostics (NGS)

    NARCIS (Netherlands)

    Koole, Tom; van Burgsteden, Lotte; Harms, Paulien; van Diemen, Cleo C; van Langen, Irene M

    2017-01-01

    Diagnostics using next generation sequencing (NGS) requires high-quality interdisciplinary collaboration. In order to gain insight into this crucial collaborative process, we made video recordings of a new multidisciplinary team at work in the clinical genetics department of the University Medical

  11. Extensive next-generation sequencing analysis in chronic lymphocytic leukemia at diagnosis: clinical and biological correlations

    Directory of Open Access Journals (Sweden)

    Gian Matteo Rigolin

    2016-09-01

    Background: In chronic lymphocytic leukemia (CLL), next-generation sequencing (NGS) analysis represents a sensitive, reproducible, and resource-efficient technique for routine screening of gene mutations. Methods: We performed an extensive biologic characterization of newly diagnosed CLL, including NGS analysis of 20 genes frequently mutated in CLL and karyotype analysis to assess whether NGS and karyotype results could be of clinical relevance in the refinement of prognosis and assessment of risk of progression. The genomic DNA from peripheral blood samples of 200 consecutive CLL patients was analyzed using Ion Torrent Personal Genome Machine, an NGS platform that uses semiconductor sequencing technology. Karyotype analysis was performed using efficient mitogens. Results: Mutations were detected in 42.0% of cases, with 42.8% of mutated patients presenting 2 or more mutations. The presence of mutations by NGS was associated with unmutated IGHV gene (p = 0.009), CD38 positivity (p = 0.010), risk stratification by fluorescence in situ hybridization (FISH) (p < 0.001), and the complex karyotype (p = 0.003). A high risk as assessed by FISH analysis was associated with mutations affecting TP53 (p = 0.012), BIRC3 (p = 0.003), and FBXW7 (p = 0.003), while the complex karyotype was significantly associated with TP53, ATM, and MYD88 mutations (p = 0.003, 0.018, and 0.001, respectively). By multivariate analysis, the multi-hit profile (≥2 mutations by NGS) was independently associated with a shorter time to first treatment (p = 0.004), along with TP53 disruption (p = 0.040), IGHV unmutated status (p < 0.001), and advanced stage (p < 0.001). Advanced stage (p = 0.010), TP53 disruption (p < 0.001), IGHV unmutated status (p = 0.020), and the complex karyotype (p = 0.007) were independently associated with a shorter overall survival. Conclusions: At diagnosis, an extensive biologic characterization including

  12. Non-invasive prenatal diagnosis of achondroplasia and thanatophoric dysplasia: next-generation sequencing allows for a safer, more accurate, and comprehensive approach.

    Science.gov (United States)

    Chitty, Lyn S; Mason, Sarah; Barrett, Angela N; McKay, Fiona; Lench, Nicholas; Daley, Rebecca; Jenkins, Lucy A

    2015-07-01

    Accurate prenatal diagnosis of genetic conditions can be challenging and usually requires invasive testing. Here, we demonstrate the potential of next-generation sequencing (NGS) for the analysis of cell-free DNA in maternal blood to transform prenatal diagnosis of monogenic disorders. Analysis of cell-free DNA using a PCR and restriction enzyme digest (PCR-RED) was compared with a novel NGS assay in pregnancies at risk of achondroplasia and thanatophoric dysplasia. PCR-RED was performed in 72 cases and was correct in 88.6%, inconclusive in 7% with one false negative. NGS was performed in 47 cases and was accurate in 96.2% with no inconclusives. Both approaches were used in 27 cases, with NGS giving the correct result in the two cases inconclusive with PCR-RED. NGS provides an accurate, flexible approach to non-invasive prenatal diagnosis of de novo and paternally inherited mutations. It is more sensitive than PCR-RED and is ideal when screening a gene with multiple potential pathogenic mutations. These findings highlight the value of NGS in the development of non-invasive prenatal diagnosis for other monogenic disorders. © 2015 John Wiley & Sons, Ltd.

  13. Comprehensive molecular diagnosis of Epstein-Barr virus-associated lymphoproliferative diseases using next-generation sequencing.

    Science.gov (United States)

    Ono, Shintaro; Nakayama, Manabu; Kanegane, Hirokazu; Hoshino, Akihiro; Shimodera, Saeko; Shibata, Hirofumi; Fujino, Hisanori; Fujino, Takahiro; Yunomae, Yuta; Okano, Tsubasa; Yamashita, Motoi; Yasumi, Takahiro; Izawa, Kazushi; Takagi, Masatoshi; Imai, Kohsuke; Zhang, Kejian; Marsh, Rebecca; Picard, Capucine; Latour, Sylvain; Ohara, Osamu; Morio, Tomohiro

    2018-05-18

    Epstein-Barr virus (EBV) is associated with several life-threatening diseases, such as lymphoproliferative disease (LPD), particularly in immunocompromised hosts. Some categories of primary immunodeficiency diseases (PIDs) including X-linked lymphoproliferative syndrome (XLP), are characterized by susceptibility and vulnerability to EBV infection. The number of genetically defined PIDs is rapidly increasing, and clinical genetic testing plays an important role in establishing a definitive diagnosis. Whole-exome sequencing is performed for diagnosing rare genetic diseases, but is both expensive and time-consuming. Low-cost, high-throughput gene analysis systems are thus necessary. We developed a comprehensive molecular diagnostic method using a two-step tailed polymerase chain reaction (PCR) and a next-generation sequencing (NGS) platform to detect mutations in 23 candidate genes responsible for XLP or XLP-like diseases. Samples from 19 patients suspected of having EBV-associated LPD were used in this comprehensive molecular diagnosis. Causative gene mutations (involving PRF1 and SH2D1A) were detected in two of the 19 patients studied. This comprehensive diagnosis method effectively detected mutations in all coding exons of 23 genes with sufficient read numbers for each amplicon. This comprehensive molecular diagnostic method using PCR and NGS provides a rapid, accurate, low-cost diagnosis for patients with XLP or XLP-like diseases.

  14. Robust Sub-nanomolar Library Preparation for High Throughput Next Generation Sequencing.

    Science.gov (United States)

    Wu, Wells W; Phue, Je-Nie; Lee, Chun-Ting; Lin, Changyi; Xu, Lai; Wang, Rong; Zhang, Yaqin; Shen, Rong-Fong

    2018-05-04

    Current library preparation protocols for Illumina HiSeq and MiSeq DNA sequencers require ≥2 nM initial library for subsequent loading of denatured cDNA onto flow cells. Such amounts are not always attainable from samples having a relatively low DNA or RNA input; or those for which a limited number of PCR amplification cycles is preferred (less PCR bias and/or more even coverage). A well-tested sub-nanomolar library preparation protocol for Illumina sequencers has however not been reported. The aim of this study is to provide a much needed working protocol for sub-nanomolar libraries to achieve outcomes as informative as those obtained with the higher library input (≥ 2 nM) recommended by Illumina's protocols. Extensive studies were conducted to validate a robust sub-nanomolar (initial library of 100 pM) protocol using PhiX DNA (as a control), genomic DNA (Bordetella bronchiseptica and microbial mock community B for 16S rRNA gene sequencing), messenger RNA, microRNA, and other small noncoding RNA samples. The utility of our protocol was further explored for PhiX library concentrations as low as 25 pM, which generated only slightly fewer than 50% of the reads achieved under the standard Illumina protocol starting with > 2 nM. A sub-nanomolar library preparation protocol (100 pM) could generate next generation sequencing (NGS) results as robust as the standard Illumina protocol. Following the sub-nanomolar protocol, libraries with initial concentrations as low as 25 pM could also be sequenced to yield satisfactory and reproducible sequencing results.

  15. Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics.

    Directory of Open Access Journals (Sweden)

    Annika Brinkmann

    2017-11-01

    We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1-10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring.

  16. Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics.

    Science.gov (United States)

    Brinkmann, Annika; Ergünay, Koray; Radonić, Aleksandar; Kocak Tufan, Zeliha; Domingo, Cristina; Nitsche, Andreas

    2017-11-01

    We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1-10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring.

  17. Rapid molecular diagnostics of severe primary immunodeficiency determined by using targeted next-generation sequencing.

    Science.gov (United States)

    Yu, Hui; Zhang, Victor Wei; Stray-Pedersen, Asbjørg; Hanson, Imelda Celine; Forbes, Lisa R; de la Morena, M Teresa; Chinn, Ivan K; Gorman, Elizabeth; Mendelsohn, Nancy J; Pozos, Tamara; Wiszniewski, Wojciech; Nicholas, Sarah K; Yates, Anne B; Moore, Lindsey E; Berge, Knut Erik; Sorte, Hanne; Bayer, Diana K; ALZahrani, Daifulah; Geha, Raif S; Feng, Yanming; Wang, Guoli; Orange, Jordan S; Lupski, James R; Wang, Jing; Wong, Lee-Jun

    2016-10-01

    Primary immunodeficiency diseases (PIDDs) are inherited disorders of the immune system. The most severe form, severe combined immunodeficiency (SCID), presents with profound deficiencies of T cells, B cells, or both at birth. If not treated promptly, affected patients usually do not live beyond infancy because of infections. Genetic heterogeneity of SCID frequently delays the diagnosis; a specific diagnosis is crucial for life-saving treatment and optimal management. We developed a next-generation sequencing (NGS)-based multigene-targeted panel for SCID and other severe PIDDs requiring rapid therapeutic actions in a clinical laboratory setting. The target gene capture/NGS assay provides an average read depth of approximately 1000×. The deep coverage facilitates simultaneous detection of single nucleotide variants and exonic copy number variants in one comprehensive assessment. Exons with insufficient coverage (diagnostic yield of severe primary immunodeficiency. Establishing a molecular diagnosis enables early immune reconstitution through prompt therapeutic intervention and guides management for improved long-term quality of life. Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  18. nQuire: a statistical framework for ploidy estimation using next generation sequencing.

    Science.gov (United States)

    Weiß, Clemens L; Pais, Marina; Cano, Liliana M; Kamoun, Sophien; Burbano, Hernán A

    2018-04-04

    Intraspecific variation in ploidy occurs in a wide range of species including pathogenic and nonpathogenic eukaryotes such as yeasts and oomycetes. Ploidy can be inferred indirectly - without measuring DNA content - from experiments using next-generation sequencing (NGS). We present nQuire, a statistical framework that distinguishes between diploids, triploids and tetraploids using NGS. The command-line tool models the distribution of base frequencies at variable sites using a Gaussian Mixture Model, and uses maximum likelihood to select the most plausible ploidy model. nQuire handles large genomes at high coverage efficiently and uses standard input file formats. We demonstrate the utility of nQuire analyzing individual samples of the pathogenic oomycete Phytophthora infestans and the Baker's yeast Saccharomyces cerevisiae. Using these organisms we show the dependence between reliability of the ploidy assignment and sequencing depth. Additionally, we employ normalized maximized log-likelihoods generated by nQuire to ascertain ploidy level in a population of samples with ploidy heterogeneity. Using these normalized values we cluster samples in three dimensions using multivariate Gaussian mixtures. The cluster assignments retrieved from a S. cerevisiae population recovered the true ploidy level in over 96% of samples. Finally, we show that nQuire can be used regionally to identify chromosomal aneuploidies. nQuire provides a statistical framework to study organisms with intraspecific variation in ploidy. nQuire is likely to be useful in epidemiological studies of pathogens, artificial selection experiments, and for historical or ancient samples where intact nuclei are not preserved. It is implemented as a stand-alone Linux command line tool in the C programming language and is available at https://github.com/clwgg/nQuire under the MIT license.
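    The model-selection idea in this abstract, scoring base frequencies at variable sites under a Gaussian mixture for each ploidy level and keeping the most likely one, can be sketched in Python. This is a simplified illustration and not nQuire's implementation: the fixed component means, equal mixture weights, fixed standard deviation of 0.05, and the toy frequencies are all assumptions, and no parameters are fitted.

```python
from math import exp, log, pi, sqrt

# Expected allele frequencies at biallelic variable sites for each ploidy level.
PLOIDY_MEANS = {
    "diploid": [0.5],
    "triploid": [1 / 3, 2 / 3],
    "tetraploid": [0.25, 0.5, 0.75],
}


def log_likelihood(freqs, means, sd=0.05):
    """Log-likelihood of the frequencies under an equal-weight Gaussian mixture."""
    weight = 1 / len(means)
    norm = 1 / (sd * sqrt(2 * pi))
    total = 0.0
    for freq in freqs:
        density = sum(
            weight * norm * exp(-0.5 * ((freq - mean) / sd) ** 2) for mean in means
        )
        total += log(max(density, 1e-300))  # floor guards against log(0)
    return total


def infer_ploidy(freqs, sd=0.05):
    """Return the ploidy label whose mixture gives the highest log-likelihood."""
    return max(
        PLOIDY_MEANS,
        key=lambda name: log_likelihood(freqs, PLOIDY_MEANS[name], sd),
    )


if __name__ == "__main__":
    # Hypothetical base frequencies clustered near 1/3 and 2/3.
    print(infer_ploidy([0.32, 0.34, 0.66, 0.68, 0.35, 0.65]))
```

    With frequencies clustered near one third and two thirds, the triploid mixture scores highest. Lower sequencing depth widens the observed frequency distribution, which is consistent with the dependence on depth that the abstract reports.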

  19. Profiling cancer gene mutations in clinical formalin-fixed, paraffin-embedded colorectal tumor specimens using targeted next-generation sequencing.

    Science.gov (United States)

    Zhang, Liangxuan; Chen, Liangjing; Sah, Sachin; Latham, Gary J; Patel, Rajesh; Song, Qinghua; Koeppen, Hartmut; Tam, Rachel; Schleifman, Erica; Mashhedi, Haider; Chalasani, Sreedevi; Fu, Ling; Sumiyoshi, Teiko; Raja, Rajiv; Forrest, William; Hampton, Garret M; Lackner, Mark R; Hegde, Priti; Jia, Shidong

    2014-04-01

    The success of precision oncology relies on accurate and sensitive molecular profiling. The Ion AmpliSeq Cancer Panel, a targeted enrichment method for next-generation sequencing (NGS) using the Ion Torrent platform, provides a fast, easy, and cost-effective sequencing workflow for detecting genomic "hotspot" regions that are frequently mutated in human cancer genes. Most recently, the U.K. has launched the AmpliSeq sequencing test in its National Health Service. This study aimed to evaluate the clinical application of the AmpliSeq methodology. We used 10 ng of genomic DNA from formalin-fixed, paraffin-embedded human colorectal cancer (CRC) tumor specimens to sequence 46 cancer genes using the AmpliSeq platform. In a validation study, we developed an orthogonal NGS-based resequencing approach (SimpliSeq) to assess the AmpliSeq variant calls. Validated mutational analyses revealed that AmpliSeq was effective in profiling gene mutations, and that the method correctly pinpointed "true-positive" gene mutations with variant frequency >5% and demonstrated high-level molecular heterogeneity in CRC. However, AmpliSeq enrichment and NGS also produced several recurrent "false-positive" calls in clinically druggable oncogenes such as PIK3CA. AmpliSeq provided highly sensitive and quantitative mutation detection for most of the genes on its cancer panel using limited DNA quantities from formalin-fixed, paraffin-embedded samples. For those genes with recurrent "false-positive" variant calls, caution should be used in data interpretation, and orthogonal verification of mutations is recommended for clinical decision making.

  20. Use of next generation sequencing data to develop a qPCR method for specific detection of EU-unauthorized genetically modified Bacillus subtilis overproducing riboflavin.

    Science.gov (United States)

    Barbau-Piednoir, Elodie; De Keersmaecker, Sigrid C J; Delvoye, Maud; Gau, Céline; Philipp, Patrick; Roosens, Nancy H

    2015-11-11

    Recently, the presence of an unauthorized genetically modified (GM) Bacillus subtilis bacterium overproducing vitamin B2 in a feed additive was notified by the Rapid Alert System for Food and Feed (RASFF). This has demonstrated that contamination by a GM micro-organism (GMM) may occur in feed additives and has confronted the enforcement laboratories, for the first time, with this type of RASFF. As no sequence information of this GMM nor any specific detection or identification method was available, Next Generation Sequencing (NGS) was used to generate sequence information. However, NGS data analysis often requires appropriate tools, involving bioinformatics expertise which is not always present in the average enforcement laboratory. This hampers the use of this technology to rapidly obtain critical sequence information in order to be able to develop a specific qPCR detection method. Data generated by NGS were exploited using a simple BLAST approach. A TaqMan® qPCR method was developed and tested on isolated bacterial strains and on the feed additive directly. In this study, a very simple strategy based on the common BLAST tools, which can be used by any enforcement lab without profound bioinformatics expertise, was successfully used to analyse the B. subtilis data generated by NGS. The results were used to design and assess a new TaqMan® qPCR method, specifically detecting this GM vitamin B2 overproducing bacterium. The method complies with EU critical performance parameters for specificity, sensitivity, PCR efficiency and repeatability. The VitB2-UGM method could also detect the B. subtilis strain in genomic DNA extracted from the feed additive, without a prior culturing step. The proposed method provides a crucial tool for specifically and rapidly identifying this unauthorized GM bacterium in food and feed additives by enforcement laboratories. Moreover, this work can be seen as a case study to substantiate how the use of NGS data can offer an added value to easily

  1. Alignment of Short Reads: A Crucial Step for Application of Next-Generation Sequencing Data in Precision Medicine

    Directory of Open Access Journals (Sweden)

    Hao Ye

    2015-11-01

    Full Text Available Precision medicine or personalized medicine has been proposed as a modernized and promising medical strategy. Genetic variants of patients are the key information for implementation of precision medicine. Next-generation sequencing (NGS) is an emerging technology for deciphering genetic variants. Alignment of raw reads to a reference genome is one of the key steps in NGS data analysis. Many algorithms have been developed for alignment of short read sequences since 2008. Users have to make a decision on which alignment algorithm to use in their studies. Selection of the right alignment algorithm determines not only the alignment algorithm but also the set of suitable parameters to be used by the algorithm. Understanding these algorithms helps in selecting the appropriate alignment algorithm for different applications in precision medicine. Here, we review currently available algorithms and their major strategies such as seed-and-extend and q-gram filter. We also discuss the challenges in current alignment algorithms, including alignment in multiple repeated regions, long-read alignment and alignment facilitated with known genetic variants.
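
    As a rough illustration of the seed-and-extend strategy named in this abstract, the Python sketch below indexes every k-mer of a reference, uses exact k-mer matches from the read as seeds, and extends each seed by counting mismatches over the full read length. The function names, the ungapped (Hamming-distance) extension, and the max_mismatches threshold are assumptions chosen for brevity. Production aligners use compressed indexes and gapped extension.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def seed_and_extend(read, reference, index, k, max_mismatches=2):
    """Return (start, mismatches) of the best ungapped placement, or None."""
    best = None
    tried = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for hit in index.get(seed, []):
            start = hit - offset  # implied start of the whole read
            if start < 0 or start + len(read) > len(reference):
                continue
            if start in tried:
                continue
            tried.add(start)
            # Extension step: compare the full read against the reference window.
            window = reference[start:start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_mismatches and (
                best is None or mismatches < best[1]
            ):
                best = (start, mismatches)
    return best
```

    The seed length k trades sensitivity for speed: shorter seeds tolerate more errors per read but produce more candidate positions to extend.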

  2. HeurAA: accurate and fast detection of genetic variations with a novel heuristic amplicon aligner program for next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Lőrinc S Pongor

    Full Text Available Next generation sequencing (NGS) of PCR amplicons is a standard approach to detect genetic variations in personalized medicine such as cancer diagnostics. Computer programs used in the NGS community often miss insertions and deletions (indels) that constitute a large part of known human mutations. We have developed HeurAA, an open source, heuristic amplicon aligner program. We tested the program on simulated datasets as well as experimental data from multiplex sequencing of 40 amplicons in 12 oncogenes collected on a 454 Genome Sequencer from lung cancer cell lines. We found that HeurAA can accurately detect all indels, and is more than an order of magnitude faster than previous programs. HeurAA can compare reads and reference sequences up to several thousand base pairs in length, and it can evaluate data from complex mixtures containing reads of different gene-segments from different samples. HeurAA is written in C and Perl for Linux operating systems; the code and the documentation are available for research applications at http://sourceforge.net/projects/heuraa/

  3. Identification of Five Novel Variants in Chinese Oculocutaneous Albinism by Targeted Next-Generation Sequencing.

    Science.gov (United States)

    Qiu, Biyuan; Ma, Tao; Peng, Chunyan; Zheng, Xiaoqin; Yang, Jiyun

    2018-04-01

    The diagnosis of oculocutaneous albinism (OCA) is established using clinical signs and symptoms. OCA is, however, a highly genetically heterogeneous disease with mutations identified in at least nineteen unique genes, many of which produce overlapping phenotypic traits. Thus, differentiating genetic OCA subtypes for diagnoses and genetic counseling is challenging, based on clinical presentation alone, and would benefit from a comprehensive molecular diagnostic. To develop and validate a more comprehensive, targeted, next-generation-sequencing-based diagnostic for the identification of OCA-causing variants. The genomic DNA samples from 28 OCA probands were analyzed by targeted next-generation sequencing (NGS), and the candidate variants were confirmed through Sanger sequencing. We observed mutations in the TYR, OCA2, and SLC45A2 genes in 25/28 (89%) patients with OCA. We identified 38 pathogenic variants among these three genes, including 5 novel variants: c.1970G>T (p.Gly657Val), c.1669A>C (p.Thr557Pro), c.2339-2A>C, and c.1349C>G (p.Thr450Arg) in OCA2; c.459_470delTTTTGCTGCCGA (p.Ala155_Phe158del) in SLC45A2. Our findings expand the mutational spectrum of OCA in the Chinese population, and the assay we developed should be broadly useful as a molecular diagnostic, and as an aid for genetic counseling for OCA patients.

  4. FASTQSim: platform-independent data characterization and in silico read generation for NGS datasets.

    Science.gov (United States)

    Shcherbina, Anna

    2014-08-15

    High-throughput next generation sequencing technologies have enabled rapid characterization of clinical and environmental samples. Consequently, the largest bottleneck to actionable data has become sample processing and bioinformatics analysis, creating a need for accurate and rapid algorithms to process genetic data. Perfectly characterized in silico datasets are a useful tool for evaluating the performance of such algorithms. Background contaminating organisms are observed in sequenced mixtures of organisms. In silico samples provide exact truth. To create the best value for evaluating algorithms, in silico data should mimic actual sequencer data as closely as possible. FASTQSim is a tool that provides the dual functionality of NGS dataset characterization and metagenomic data generation. FASTQSim is sequencing platform-independent, and computes distributions of read length, quality scores, indel rates, single point mutation rates, indel size, and similar statistics for any sequencing platform. To create training or testing datasets, FASTQSim has the ability to convert target sequences into in silico reads with specific error profiles obtained in the characterization step. FASTQSim enables users to assess the quality of NGS datasets. The tool provides information about read length, read quality, repetitive and non-repetitive indel profiles, and single base pair substitutions. FASTQSim allows the user to simulate individual read datasets that can be used as standardized test scenarios for planning sequencing projects or for benchmarking metagenomic software. In this regard, in silico datasets generated with the FASTQSim tool hold several advantages over natural datasets: they are sequencing platform independent, extremely well characterized, and less expensive to generate. Such datasets are valuable in a number of applications, including the training of assemblers for multiple platforms, benchmarking bioinformatics algorithm performance, and creating challenge
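
    The conversion of a target sequence into in silico reads with a given error profile can be sketched in Python as follows. This is not FASTQSim's implementation or interface. The function name simulate_reads, the single uniform substitution rate, the fixed read length, and the constant quality string are all assumptions of this sketch. A characterization step such as the one the abstract describes would supply per-position distributions instead.

```python
import random


def simulate_reads(target, n_reads, read_length, substitution_rate, seed=0):
    """Sample reads from `target` and return them as FASTQ-formatted records."""
    rng = random.Random(seed)  # seeded so that a dataset can be regenerated
    records = []
    for i in range(n_reads):
        start = rng.randrange(len(target) - read_length + 1)
        bases = list(target[start:start + read_length])
        for pos, base in enumerate(bases):
            # Apply a substitution with the given per-base probability.
            if rng.random() < substitution_rate:
                bases[pos] = rng.choice([b for b in "ACGT" if b != base])
        # Placeholder quality: "I" encodes Phred 40 in the Phred+33 scheme.
        quality = "I" * read_length
        header = f"@sim_read_{i}_pos_{start}"
        records.append(f"{header}\n{''.join(bases)}\n+\n{quality}")
    return records
```

    Because the true origin of every read is recorded in its header, the output can serve as ground truth when benchmarking an aligner or classifier.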

  5. Use of next-generation sequencing methods for reconstructing the phylogeny of polyploid plants [Využití metod next-generation sekvenování pro rekonstrukci fylogeneze polyploidních rostlin]

    OpenAIRE

    Skopalíková, Jana

    2015-01-01

    This bachelor thesis summarizes available information about currently used next-generation sequencing (NGS) methods, in which great progress has been achieved during the last few years. A great advantage of NGS is the ability to obtain a huge amount of data at a much lower cost per base than Sanger sequencing. However, there are various pitfalls in data analysis. Nowadays it is possible to sequence the entire genomes of individuals. Nevertheless, this approach remains challenging when studying many in...

  6. PCR-Free Enrichment of Mitochondrial DNA from Human Blood and Cell Lines for High Quality Next-Generation DNA Sequencing.

    Directory of Open Access Journals (Sweden)

    Meetha P Gould

    Full Text Available Recent advances in sequencing technology allow for accurate detection of mitochondrial sequence variants, even those in low abundance at heteroplasmic sites. Considerable sequencing cost savings can be achieved by enriching samples for mitochondrial (relative to nuclear) DNA. Reduction in nuclear DNA (nDNA) content can also help to avoid false positive variants resulting from nuclear mitochondrial sequences (numts). We isolate intact mitochondrial organelles from both human cell lines and blood components using two separate methods: a magnetic bead binding protocol and differential centrifugation. DNA is extracted and further enriched for mitochondrial DNA (mtDNA) by an enzyme digest. Only 1 ng of the purified DNA is necessary for library preparation and next generation sequence (NGS) analysis. Enrichment methods are assessed and compared using mtDNA (versus nDNA) content as a metric, measured by using real-time quantitative PCR and NGS read analysis. Among the various strategies examined, the optimal is differential centrifugation isolation followed by exonuclease digest. This strategy yields >35% mtDNA reads in blood and cell lines, which corresponds to hundreds-fold enrichment over baseline. The strategy also avoids false variant calls that, as we show, can be induced by the long-range PCR approaches that are the current standard in enrichment procedures. This optimization procedure allows mtDNA enrichment for efficient and accurate massively parallel sequencing, enabling NGS from samples with small amounts of starting material. This will decrease costs by increasing the number of samples that may be multiplexed, ultimately facilitating efforts to better understand mitochondria-related diseases.

  7. Next-Generation DNA Sequencing of VH/VL Repertoires: A Primer and Guide to Applications in Single-Domain Antibody Discovery.

    Science.gov (United States)

    Henry, Kevin A

    2018-01-01

    Immunogenetic analyses of expressed antibody repertoires are becoming increasingly common experimental investigations and are critical to furthering our understanding of autoimmunity, infectious disease, and cancer. Next-generation DNA sequencing (NGS) technologies have now made it possible to interrogate antibody repertoires to unprecedented depths, typically by sequencing of cDNAs encoding immunoglobulin variable domains. In this chapter, we describe simple, fast, and reliable methods for producing and sequencing multiplex PCR amplicons derived from the variable regions (VH, VHH or VL) of rearranged immunoglobulin heavy and light chain genes using the Illumina MiSeq platform. We include complete protocols and primer sets for amplicon sequencing of VH/VHH/VL repertoires directly from human, mouse, and llama lymphocytes as well as from phage-displayed VH/VHH/VL libraries; these can easily be adapted to other types of amplicons with little modification. The resulting amplicons are diverse and representative, even using as few as 10^3 input B cells, and their generation is relatively inexpensive, requiring no special equipment and only a limited set of primers. In the absence of heavy-light chain pairing, single-domain antibodies are uniquely amenable to NGS analyses. We present a number of applications of NGS technology useful in discovery of single-domain antibodies from phage display libraries, including: (i) assessment of library functionality; (ii) confirmation of desired library randomization; (iii) estimation of library diversity; and (iv) monitoring the progress of panning experiments. While the case studies presented here are of phage-displayed single-domain antibody libraries, the principles extend to other types of in vitro display libraries.

  8. Detection of TET2, KRAS and CBL variants by Next Generation ...

    African Journals Online (AJOL)

    Aim: In this study, we aimed to find possible genetic markers for molecular analysis in childhood AML by screening hot-spot exons of TET2, KRAS, and CBL using Next Generation Sequencing (NGS) analysis. In addition, the association between the variants found and mutations of Janus kinase 2 (JAK2) and Fms Related ...

  9. Next Generation Sequencing of Actinobacteria for the Discovery of Novel Natural Products

    Science.gov (United States)

    Gomez-Escribano, Juan Pablo; Alt, Silke; Bibb, Mervyn J.

    2016-01-01

    Like many fields of the biosciences, actinomycete natural products research has been revolutionised by next-generation DNA sequencing (NGS). Hundreds of new genome sequences from actinobacteria are made public every year, many of them as a result of projects aimed at identifying new natural products and their biosynthetic pathways through genome mining. Advances in these technologies in the last five years have meant not only a reduction in the cost of whole genome sequencing, but also a substantial increase in the quality of the data, having moved from obtaining a draft genome sequence comprised of several hundred short contigs, sometimes of doubtful reliability, to the possibility of obtaining an almost complete and accurate chromosome sequence in a single contig, allowing a detailed study of gene clusters and the design of strategies for refactoring and full gene cluster synthesis. The impact that these technologies are having in the discovery and study of natural products from actinobacteria, including those from the marine environment, is only starting to be realised. In this review we provide a historical perspective of the field, analyse the strengths and limitations of the most relevant technologies, and share the insights acquired during our genome mining projects. PMID:27089350

  10. Next-Generation Sequencing for Typing and Detection of ESBL and MBL E. coli causing UTI

    Directory of Open Access Journals (Sweden)

    Nabakishore Nayak

    2017-10-01

    Full Text Available Next-generation sequencing (NGS) has the potential to provide typing results and detect resistance genes in a single assay, thus guiding timely treatment decisions and allowing rapid tracking of transmission of resistant clones. We will evaluate the performance of a new NGS assay during an outbreak of sequence type 131 (ST131) Escherichia coli infections in a teaching hospital. The assay will be performed on 100 extended-spectrum beta-lactamase (ESBL) E. coli isolates collected from UTI cases during the last 5 years. Typing results will be compared to those of amplified fragment length polymorphism (AFLP), whereby we will visually assess the agreement of the BioDetection phylogenetic tree with clusters defined by AFLP. A microarray will be considered the gold standard for detection of resistance genes. AFLP is expected to identify a large cluster of indistinguishable isolates in adjacent departments, indicating clonal spread. The BioDetection phylogenetic tree is expected to show that all isolates of this outbreak cluster are strongly related, while the further arrangement of the tree is also expected to agree largely with other clusters defined by AFLP. With these experiments we will detect the ESBL and MBL strains so that patients can be prescribed antibiotics accordingly.

  11. Next-Generation Genomics Facility at C-CAMP: Accelerating Genomic Research in India

    Science.gov (United States)

    S, Chandana; Russiachand, Heikham; H, Pradeep; S, Shilpa; M, Ashwini; S, Sahana; B, Jayanth; Atla, Goutham; Jain, Smita; Arunkumar, Nandini; Gowda, Malali

    2014-01-01

    Next-Generation Sequencing (NGS; http://www.genome.gov/12513162) is a recent life-sciences technological revolution that allows scientists to decode genomes or transcriptomes at a much faster rate and at a lower cost. Genomics-based studies are progressing at a relatively slow pace in India due to the non-availability of genomics experts, trained personnel and dedicated service providers. NGS offers a lot of potential to study India's national diversity (of all kinds). We at the Centre for Cellular and Molecular Platforms (C-CAMP) have launched the Next Generation Genomics Facility (NGGF) to provide genomics services to scientists, to train researchers and also to work on national and international genomic projects. We have a HiSeq1000 from Illumina and a GS-FLX Plus from Roche 454. The long reads from the GS FLX Plus and the high sequence depth from the HiSeq1000 together make an ideal hybrid approach for de novo sequencing and re-sequencing of genomes and transcriptomes. At our facility, we have sequenced around 70 different organisms comprising more than 388 genomes and 615 transcriptomes – prokaryotes and eukaryotes (fungi, plants and animals). In addition we have optimized other unique applications such as small RNA (miRNA, siRNA etc), long Mate-pair sequencing (2 to 20 Kb), Coding sequences (Exome), Methylome (ChIP-Seq), Restriction Mapping (RAD-Seq), Human Leukocyte Antigen (HLA) typing, mixed genomes (metagenomes) and target amplicons, etc. Translating DNA sequence data from an NGS sequencer into meaningful information is an important exercise. Under NGGF, we have bioinformatics experts and high-end computing resources to dissect NGS data through analyses such as genome assembly and annotation, gene expression, target enrichment, variant calling (SSR or SNP), comparative analysis etc. Our services (sequencing and bioinformatics) have been utilized by more than 45 organizations (academia and industry) both within India and outside, resulting in several publications in peer-reviewed journals and several genomic

  12. Mutation Detection with Next-Generation Resequencing through a Mediator Genome

    Energy Technology Data Exchange (ETDEWEB)

    Wurtzel, Omri; Dori-Bachash, Mally; Pietrokovski, Shmuel; Jurkevitch, Edouard; Sorek, Rotem; Ben-Jacob, Eshel

    2010-12-31

    The affordability of next generation sequencing (NGS) is transforming the field of mutation analysis in bacteria. The genetic basis for phenotype alteration can be identified directly by sequencing the entire genome of the mutant and comparing it to the wild-type (WT) genome, thus identifying acquired mutations. A major limitation for this approach is the need for an a-priori sequenced reference genome for the WT organism, as the short reads of most current NGS approaches usually prohibit de-novo genome assembly. To overcome this limitation we propose a general framework that utilizes the genomes of related organisms as mediators for comparing WT and mutant bacteria. Under this framework, both mutant and WT genomes are sequenced with NGS, and the short sequencing reads are mapped to the mediator genome. Variations between the mutant and the mediator that recur in the WT are ignored, thus pinpointing the differences between the mutant and the WT. To validate this approach we sequenced the genome of Bdellovibrio bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent mutant, and compared both to the mediator species Bdellovibrio bacteriovorus HD100. Although the mutant and the mediator sequences differed in more than 28,000 nucleotide positions, our approach enabled pinpointing the single causative mutation. Experimental validation in 53 additional mutants further established the implicated gene. Our approach extends the applicability of NGS-based mutant analyses beyond the domain of available reference genomes.
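
    The filtering step of the mediator-genome framework amounts to a set difference, which the following Python sketch makes explicit. It is a toy under strong assumptions: the three sequences are taken to be pre-aligned and of equal length so that variants can be read off position by position, whereas the study maps short reads to the mediator. The function names are hypothetical.

```python
def variants_against(mediator, sample):
    """Return the set of (position, mediator_base, sample_base) differences.

    Assumes `sample` is already aligned to `mediator` with equal length.
    """
    return {
        (i, m, s)
        for i, (m, s) in enumerate(zip(mediator, sample))
        if m != s
    }


def mutant_specific_variants(mediator, wild_type, mutant):
    """Keep mutant-vs-mediator variants that do not recur in the wild type."""
    wt_variants = variants_against(mediator, wild_type)
    mutant_variants = variants_against(mediator, mutant)
    return sorted(mutant_variants - wt_variants)
```

    Variants shared by the wild type and the mutant reflect divergence from the mediator strain and drop out of the difference, which leaves only the candidate acquired mutations.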

  13. High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach

    Directory of Open Access Journals (Sweden)

    Allard Marc W

    2012-01-01

    Full Text Available Abstract Background Next-Generation Sequencing (NGS) is increasingly being used as a molecular epidemiologic tool for discerning ancestry and traceback of the most complicated, difficult to resolve bacterial pathogens. Making a linkage between possible food sources and clinical isolates requires distinguishing the suspected pathogen from an environmental background and placing the variation observed into the wider context of variation occurring within a serovar and among other closely related foodborne pathogens. Equally important is the need to validate these high resolution molecular tools for use in molecular epidemiologic traceback. Such efforts include the examination of strain cluster stability as well as the cumulative genetic effects of sub-culturing on these clusters. Numerous isolates of S. Montevideo were shot-gun sequenced, including diverse lineage representatives as well as numerous replicate clones, to determine how much variability is due to bias, sequencing error, and/or the culturing of isolates. All new draft genomes were compared to 34 S. Montevideo isolates previously published during an NGS-based molecular epidemiological case study. Results Intraserovar lineages of S. Montevideo differ by thousands of SNPs, only slightly fewer than the number of SNPs observed between S. Montevideo and other distinct serovars. Much less variability was discovered within an individual S. Montevideo clade implicated in a recent foodborne outbreak as well as among individual NGS replicates. These findings were similar to previous reports documenting homopolymeric and deletion error rates with the Roche 454 GS Titanium technology. In no case, however, did variability associated with sequencing methods or sample preparations create inconsistencies with our current phylogenetic results or the subsequent molecular epidemiological evidence gleaned from these data. Conclusions Implementation of a validated pipeline for NGS data acquisition and

  14. [Application of Next Generation Sequencing for AML/MDS Diagnosis and Treatment].

    Science.gov (United States)

    Cheng, Huan-Chen; Liu, Sheng-Wei; Liu, Yu; Zhao, Xue-Fei; Li, Wei; Qiu, Lin; Ma, Jun

    2017-12-01

    To detect the mutations of AML/MDS-related genes by using next generation sequencing (NGS), to analyze the mutation levels of each gene in AML/MDS and the sensitivity of NGS, and to evaluate the feasibility of using gene mutations for monitoring MRD and predicting disease progression. The specimens were collected from primary AML (68 cases) and MDS (57 cases) patients from August 2015 to June 2016 in the Harbin Institute of Hematology and Oncology. The mutations of 22 related genes were detected by using AML/MDS-NGS chips. The TET2 gene showed the highest mutation rate in AML (55.9%) and MDS (56.1%). The gene mutations were as follows: CEBPA (11.8%), DNMT3A (7.4%), C-KIT (7.4%) and FLT3-ITD (7.4%) in AML, and U2AF1 (10.5%) and SRSF2 (10.5%) in MDS. All the genes had specific mutation sites except TP53 and CEBPA. The mutations of FLT3, C-KIT and CEBPA became negative in the 5 AML patients in remission when compared with those at initial onset, but the mutation rate of the TET2 gene did not change obviously, whereas the mutation rate in the 5 MDS patients did not change significantly. New gene mutations appeared in 3 MDS patients with disease progression, but the mutation rate did not change significantly during disease progression. The gene mutation rate did not change significantly even after remission. Both AML and MDS have their own specific mutated genes and sites. Some gene mutations, such as CEBPA, can be used as an effective indicator for monitoring MRD in AML patients, whereas in MDS patients they can be used only for evaluating disease progression and prognosis.

  15. CNV-RF Is a Random Forest-Based Copy Number Variation Detection Method Using Next-Generation Sequencing.

    Science.gov (United States)

    Onsongo, Getiria; Baughn, Linda B; Bower, Matthew; Henzler, Christine; Schomaker, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat

    2016-11-01

    Simultaneous detection of small copy number variations (CNVs) (<0.5 kb) and single-nucleotide variants in clinically significant genes is of great interest for clinical laboratories. The analytical variability in next-generation sequencing (NGS) and artifacts in coverage data because of issues with mappability along with lack of robust bioinformatics tools for CNV detection have limited the utility of targeted NGS data to identify CNVs. We describe the development and implementation of a bioinformatics algorithm, copy number variation-random forest (CNV-RF), that incorporates a machine learning component to identify CNVs from targeted NGS data. Using CNV-RF, we identified 12 of 13 deletions in samples with known CNVs, two cases with duplications, and identified novel deletions in 22 additional cases. Furthermore, no CNVs were identified among 60 genes in 14 cases with normal copy number and no CNVs were identified in another 104 patients with clinical suspicion of CNVs. All positive deletions and duplications were confirmed using a quantitative PCR method. CNV-RF also detected heterozygous deletions and duplications with a specificity of 50% across 4813 genes. The ability of CNV-RF to detect clinically relevant CNVs with a high degree of sensitivity along with confirmation using a low-cost quantitative PCR method provides a framework for providing comprehensive NGS-based CNV/single-nucleotide variant detection in a clinical molecular diagnostics laboratory. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  16. Use of next generation sequencing to detect biofilm bacteria in a patient with pedicle screw loosening after spine surgery

    DEFF Research Database (Denmark)

    Xu, Yijuan; Thomsen, Trine Rolighed; Lorenzen, Jan

    2016-01-01

    2. Center for Microbial Communities, Department of Biotechnology, Chemistry and Environmental Engineering, Aalborg University, Denmark 3. Otto-von-Guericke University Magdeburg, Department of Orthopedic Surgery, Magdeburg, Germany 4. Eifelklinik St. Brigida, Simmerath, Germany Aim: ”Hidden deep...... implant-related infection is believed to be linked to pedicle screw loosening after spine surgery. Low-grade bacterial infection can be hard to diagnose and may be undetected by conventional culture based methods. Next generation sequencing (NGS) could help to uncover hidden bacterial infections...... as a possible cause for implant loosening. This case report describes the use of NGS in the diagnostic work-up of a patient with pedicle screw loosening after spine surgery.” Method: ”A 60 y/o male had to undergo revision spine surgery for pedicle screw loosening and adjacent segment disease 3 years after...

  17. Not All Next Generation Sequencing Diagnostics are Created Equal: Understanding the Nuances of Solid Tumor Assay Design for Somatic Mutation Detection

    Energy Technology Data Exchange (ETDEWEB)

    Gray, Phillip N., E-mail: pgray@ambrygen.com; Dunlop, Charles L.M.; Elliott, Aaron M. [Ambry Genetics, 15 Argonaut, Aliso Viejo, CA 92656 (United States)

    2015-07-17

    The molecular characterization of tumors using next generation sequencing (NGS) is an emerging diagnostic tool that is quickly becoming an integral part of clinical decision making. Cancer genomic profiling involves significant challenges including DNA quality and quantity, tumor heterogeneity, and the need to detect a wide variety of complex genetic mutations. Most available comprehensive diagnostic tests rely on primer based amplification or probe based capture methods coupled with NGS to detect hotspot mutation sites or whole regions implicated in disease. These tumor panels utilize highly customized bioinformatics pipelines to perform the difficult task of accurately calling cancer relevant alterations such as single nucleotide variations, small indels or large genomic alterations from the NGS data. In this review, we will discuss the challenges of solid tumor assay design/analysis and report a case study that highlights the need to include complementary technologies (i.e., arrays) and germline analysis in tumor testing to reliably identify copy number alterations and actionable variants.

  18. Next-generation sequencing: a challenge to meet the increasing demand for training workshops in Australia.

    Science.gov (United States)

    Watson-Haigh, Nathan S; Shang, Catherine A; Haimel, Matthias; Kostadima, Myrto; Loos, Remco; Deshpande, Nandan; Duesing, Konsta; Li, Xi; McGrath, Annette; McWilliam, Sean; Michnowicz, Simon; Moolhuijzen, Paula; Quenette, Steve; Revote, Jerico Nico De Leon; Tyagi, Sonika; Schneider, Maria V

    2013-09-01

    The widespread adoption of high-throughput next-generation sequencing (NGS) technology among the Australian life science research community is highlighting an urgent need to up-skill biologists in tools required for handling and analysing their NGS data. There is currently a shortage of cutting-edge bioinformatics training courses in Australia as a consequence of a scarcity of skilled trainers with time and funding to develop and deliver training courses. To address this, a consortium of Australian research organizations, including Bioplatforms Australia, the Commonwealth Scientific and Industrial Research Organisation and the Australian Bioinformatics Network, have been collaborating with the EMBL-EBI training team. A group of Australian bioinformaticians attended the train-the-trainer workshop to improve training skills in developing and delivering bioinformatics workshop curriculum. A 2-day NGS workshop was jointly developed to provide hands-on knowledge and understanding of typical NGS data analysis workflows. The road show-style workshop was successfully delivered at five geographically distant venues in Australia using the newly established Australian NeCTAR Research Cloud. We highlight the challenges we had to overcome at different stages from design to delivery, including the establishment of an Australian bioinformatics training network and the computing infrastructure and resource development. A virtual machine image, workshop materials and scripts for configuring a machine with workshop contents have all been made available under a Creative Commons Attribution 3.0 Unported License. This means participants continue to have convenient access to an environment with which they had become familiar, and bioinformatics trainers are able to access and reuse these resources.

  19. A Next-Generation Sequencing Approach Uncovers Viral Transcripts Incorporated in Poxvirus Virions

    Directory of Open Access Journals (Sweden)

    Marica Grossegesse

    2017-10-01

    Full Text Available Transcripts are known to be incorporated in particles of DNA viruses belonging to the families of Herpesviridae and Mimiviridae, but the presence of transcripts in other DNA viruses, such as poxviruses, has not been analyzed yet. Therefore, we first established a next-generation sequencing (NGS)-based protocol, enabling the unbiased identification of transcripts in virus particles. Subsequently, we applied our protocol to analyze RNA in an emerging zoonotic member of the Poxviridae family, namely Cowpox virus. Our results revealed the incorporation of 19 viral transcripts, while host identifications were restricted to ribosomal and mitochondrial RNA. Most viral transcripts had an unknown and immunomodulatory function, suggesting that transcript incorporation may be beneficial for poxvirus immune evasion. Notably, the most abundant transcript originated from the D5L/I1R gene that encodes a viral inhibitor of the host cytoplasmic DNA sensing machinery.

  20. Enhancing Next-Generation Sequencing-Guided Cancer Care Through Cognitive Computing.

    Science.gov (United States)

    Patel, Nirali M; Michelini, Vanessa V; Snell, Jeff M; Balu, Saianand; Hoyle, Alan P; Parker, Joel S; Hayward, Michele C; Eberhard, David A; Salazar, Ashley H; McNeillie, Patrick; Xu, Jia; Huettner, Claudia S; Koyama, Takahiko; Utro, Filippo; Rhrissorrakrai, Kahn; Norel, Raquel; Bilal, Erhan; Royyuru, Ajay; Parida, Laxmi; Earp, H Shelton; Grilley-Olson, Juneko E; Hayes, D Neil; Harvey, Stephen J; Sharpless, Norman E; Kim, William Y

    2018-02-01

    Using next-generation sequencing (NGS) to guide cancer therapy has created challenges in analyzing and reporting large volumes of genomic data to patients and caregivers. Specifically, providing current, accurate information on newly approved therapies and open clinical trials requires considerable manual curation performed mainly by human "molecular tumor boards" (MTBs). The purpose of this study was to determine the utility of cognitive computing as performed by Watson for Genomics (WfG) compared with a human MTB. One thousand eighteen patient cases that previously underwent targeted exon sequencing at the University of North Carolina (UNC) and subsequent analysis by the UNCseq informatics pipeline and the UNC MTB between November 7, 2011, and May 12, 2015, were analyzed with WfG, a cognitive computing technology for genomic analysis. Using a WfG-curated actionable gene list, we identified additional genomic events of potential significance (not discovered by traditional MTB curation) in 323 (32%) patients. The majority of these additional genomic events were considered actionable based upon their ability to qualify patients for biomarker-selected clinical trials. Indeed, the opening of a relevant clinical trial within 1 month prior to WfG analysis provided the rationale for identification of a new actionable event in nearly a quarter of the 323 patients. This automated analysis took ... potentially improve patient care by providing a rapid, comprehensive approach for data analysis and consideration of up-to-date availability of clinical trials. The results of this study demonstrate that the interpretation and actionability of somatic next-generation sequencing results are evolving too rapidly to rely solely on human curation. Molecular tumor boards empowered by cognitive computing can significantly improve patient care by providing a fast, cost-effective, and comprehensive approach for data analysis in the delivery of precision medicine.
Patients and physicians who

  1. Assessing Cat Flea Microbiomes in Northern and Southern California by 16S rRNA Next-Generation Sequencing.

    Science.gov (United States)

    Vasconcelos, Elton J R; Billeter, Sarah A; Jett, Lindsey A; Meinersmann, Richard J; Barr, Margaret C; Diniz, Pedro P V P; Oakley, Brian B

    2018-06-12

    Flea-borne diseases (FBDs) impact both human and animal health worldwide. Because adult fleas are obligately hematophagous and can harbor potential pathogens, fleas act as ectoparasites of vertebrates, as well as zoonotic disease vectors. Cat fleas (Ctenocephalides felis) are important vectors of two zoonotic bacterial genera listed as priority pathogens by the National Institute of Allergy and Infectious Diseases (NIAID-USA): Bartonella spp. and Rickettsia spp., causative agents of bartonelloses and rickettsioses, respectively. In this study, we introduce the first microbiome analysis of C. felis samples from California, determining the presence and abundance of relevant pathogenic genera by characterizing the cat flea microbiome through 16S rRNA next-generation sequencing (16S-NGS). Samples from both northern (NoCal) and southern (SoCal) California were assessed to expand current knowledge regarding FBDs in the state. We identified Rickettsia and Bartonella, as well as the endosymbiont Wolbachia, as the most abundant genera, followed by less abundant taxa. In comparison to our previous study screening Californian cat fleas for rickettsiae using PCR/digestion/sequencing of the ompB gene, the 16S-NGS approach applied herein showed a 95% level of agreement in detecting Rickettsia spp. There was no overall difference in microbiome diversity between NoCal and SoCal samples. Bacterial taxa identified by 16S-NGS in this study may help to improve epidemiological investigations, pathogen surveillance efforts, and clinical diagnostics of FBDs in California and elsewhere.

  2. A comparative study of mutation screening of sarcomeric genes (MYBPC3, MYH7, TNNT2) using single gene approach versus targeted gene panel next generation sequencing in a cohort of HCM patients in Egypt

    Directory of Open Access Journals (Sweden)

    Heba Sh. Kassem

    2017-10-01

    Full Text Available Background: NGS enables simultaneous sequencing of large numbers of associated genes in genetically heterogeneous disorders, in a more rapid and cost-effective manner than traditional technologies. However, there have been limited direct comparisons between NGS and more established technologies to assess the sensitivity and false negative rates of this new approach. The scope of the present manuscript is to compare variants detected in MYBPC3, MYH7 and TNNT2 genes using the stepwise dHPLC/Sanger versus targeted NGS. Methods: In this study, we have analysed a group of 150 samples of patients from the Bibliotheca Alexandrina-Aswan Heart Centre National HCM program. The genetic testing was simultaneously undertaken by high throughput denaturing high-performance liquid chromatography (dHPLC) followed by Sanger based sequencing and targeted next generation deep sequencing using a panel of inherited cardiac genes (ICC). The panel included over 100 genes including the 3 sarcomeric genes. Analysis of the sequencing data of the 3 genes was undertaken in a double blinded strategy. Results: NGS analysis detected all pathogenic and likely pathogenic variants identified by dHPLC (50 in total; some samples had double hits). There was a 0% false negative rate for NGS based analysis. Nineteen variants were missed by dHPLC and detected by NGS, thus increasing the diagnostic yield in this co-analysed cohort from 22.0% (33/150) to 31.3% (47/150). Of note, the mutation spectrum in this Egyptian HCM population revealed a high rate of homozygosity in MYBPC3 and MYH7 genes in comparison to other population studies (6/150, 4%). None of the homozygous samples were detected by dHPLC analysis.
Conclusion: NGS provides a useful and rapid tool to allow panoramic screening of several genes simultaneously with a high sensitivity rate amongst genes of known etiologic role allowing high throughput analysis of HCM patients and relevant control series in a less characterised

  3. Codon-Precise, Synthetic, Antibody Fragment Libraries Built Using Automated Hexamer Codon Additions and Validated through Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Laura Frigotto

    2015-05-01

    Full Text Available We have previously described ProxiMAX, a technology that enables the fabrication of precise, combinatorial gene libraries via codon-by-codon saturation mutagenesis. ProxiMAX was originally performed using manual, enzymatic transfer of codons via blunt-end ligation. Here we present Colibra™: an automated, proprietary version of ProxiMAX used specifically for antibody library generation, in which double-codon hexamers are transferred during the saturation cycling process. The reduction in process complexity, resulting library quality and an unprecedented saturation of up to 24 contiguous codons are described. Utility of the method is demonstrated via fabrication of complementarity determining regions (CDR) in antibody fragment libraries and next generation sequencing (NGS) analysis of their quality and diversity.

  4. FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

    Science.gov (United States)

    Wilson, Carolyn A; Simonyan, Vahan

    2014-01-01

    Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and

  5. Next-generation sequencing indicates false-positive MRD results and better predicts prognosis after SCT in patients with childhood ALL.

    Science.gov (United States)

    Kotrova, M; van der Velden, V H J; van Dongen, J J M; Formankova, R; Sedlacek, P; Brüggemann, M; Zuna, J; Stary, J; Trka, J; Fronkova, E

    2017-07-01

    Minimal residual disease (MRD) monitoring via quantitative PCR (qPCR) detection of Ag receptor gene rearrangements has been the most sensitive method for predicting prognosis and making post-transplant treatment decisions for patients with ALL. Despite the broad clinical usefulness and standardization of this method, we and others have repeatedly reported the possibility of false-positive MRD results caused by massive B-lymphocyte regeneration after stem cell transplantation (SCT). Next-generation sequencing (NGS) enables precise and sensitive detection of multiple Ag receptor rearrangements, thus providing a more specific readout compared to qPCR. We investigated two cohorts of children with ALL who underwent SCT (30 patients and 228 samples). The first cohort consisted of 17 patients who remained in long-term CR after SCT despite having low MRD positivity (... SCT monitoring using qPCR. Only one of 27 qPCR-positive samples was confirmed to be positive by NGS. Conversely, 10 of 15 samples with low qPCR-detected MRD positivity from 13 patients who subsequently relapsed were also confirmed to be positive by NGS (P=0.002). These data show that NGS has a better specificity in post-SCT ALL management and indicate that treatment interventions aimed at reverting impending relapse should not be based on qPCR only.

  6. Development and Validation of a Scalable Next-Generation Sequencing System for Assessing Relevant Somatic Variants in Solid Tumors

    Science.gov (United States)

    Hovelson, Daniel H.; McDaniel, Andrew S.; Cani, Andi K.; Johnson, Bryan; Rhodes, Kate; Williams, Paul D.; Bandla, Santhoshi; Bien, Geoffrey; Choppa, Paul; Hyland, Fiona; Gottimukkala, Rajesh; Liu, Guoying; Manivannan, Manimozhi; Schageman, Jeoffrey; Ballesteros-Villagrana, Efren; Grasso, Catherine S.; Quist, Michael J.; Yadati, Venkata; Amin, Anmol; Siddiqui, Javed; Betz, Bryan L.; Knudsen, Karen E.; Cooney, Kathleen A.; Feng, Felix Y.; Roh, Michael H.; Nelson, Peter S.; Liu, Chia-Jen; Beer, David G.; Wyngaard, Peter; Chinnaiyan, Arul M.; Sadis, Seth; Rhodes, Daniel R.; Tomlins, Scott A.

    2015-01-01

    Next-generation sequencing (NGS) has enabled genome-wide personalized oncology efforts at centers and companies with the specialty expertise and infrastructure required to identify and prioritize actionable variants. Such approaches are not scalable, preventing widespread adoption. Likewise, most targeted NGS approaches fail to assess key relevant genomic alteration classes. To address these challenges, we predefined the catalog of relevant solid tumor somatic genome variants (gain-of-function or loss-of-function mutations, high-level copy number alterations, and gene fusions) through comprehensive bioinformatics analysis of >700,000 samples. To detect these variants, we developed the Oncomine Comprehensive Panel (OCP), an integrative NGS-based assay [compatible with ... 95% accuracy for KRAS, epidermal growth factor receptor, and BRAF mutation detection as well as for ALK and TMPRSS2:ERG gene fusions. Associating positive variants with potential targeted treatments demonstrated that 6% to 42% of profiled samples (depending on cancer type) harbored alterations beyond routine molecular testing that were associated with approved or guideline-referenced therapies. As a translational research tool, OCP identified adaptive CTNNB1 amplifications/mutations in treated prostate cancers. Through predefining somatic variants in solid tumors and compiling associated potential treatment strategies, OCP represents a simplified, broadly applicable targeted NGS system with the potential to advance precision oncology efforts. PMID:25925381

  7. Next Generation Solvent (NGS): Development for Caustic-Side Solvent Extraction of Cesium

    Energy Technology Data Exchange (ETDEWEB)

    Moyer, Bruce A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Birdwell, Jr, Joseph F. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Bonnesen, Peter V. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Bruffey, Stephanie H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Delmau, Laetitia Helene [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Duncan, Nathan C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Ensor, Dale [Tennessee Technological Univ., Cookeville, TN (United States); Hill, Talon G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lee, Denise L. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Rajbanshi, Arbin [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Roach, Benjamin D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Szczygiel, Patricia L. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sloop, Jr., Frederick V. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Stoner, Erica L. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Williams, Neil J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2014-03-01

    This report summarizes the FY 2010 and 2011 accomplishments at Oak Ridge National Laboratory (ORNL) in developing the Next Generation Caustic-Side Solvent Extraction (NG-CSSX) process, referred to commonly as the Next Generation Solvent (NGS), under funding from the U.S. Department of Energy, Office of Environmental Management (DOE-EM), Office of Technology Innovation and Development. The primary product of this effort is a process solvent and preliminary flowsheet capable of meeting a target decontamination factor (DF) of 40,000 for worst-case Savannah River Site (SRS) waste with a concentration factor of 15 or higher in the 18-stage equipment configuration of the SRS Modular Caustic-Side Solvent Extraction Unit (MCU). In addition, the NG-CSSX process may be readily adapted for use in the SRS Salt Waste Processing Facility (SWPF) or in supplemental tank-waste treatment at Hanford upon appropriate solvent or flowsheet modifications. Efforts in FY 2010 focused on developing a solvent composition and process flowsheet for MCU implementation. In FY 2011 accomplishments at ORNL involved a wide array of chemical-development activities and testing up through single-stage hydraulic and mass-transfer tests in 5-cm centrifugal contactors. Under subcontract from ORNL, Argonne National Laboratory (ANL) designed a preliminary flowsheet using ORNL cesium distribution data, and Tennessee Technological University confirmed a chemical model for cesium distribution ratios (DCs) as a function of feed composition. Interlaboratory efforts were coordinated with complementary engineering tests carried out (and reported separately) by personnel at Savannah River National Laboratory (SRNL) and Savannah River Remediation (SRR) with helpful advice by Parsons Engineering and General Atomics on aspects of possible SWPF implementation.

  8. Killer Immunoglobulin-Like Receptor Allele Determination Using Next-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Bercelin Maniangou

    2017-05-01

    Full Text Available The impact of natural killer (NK) cell alloreactivity on hematopoietic stem cell transplantation (HSCT) outcome is still debated due to the complexity of graft parameters, HLA class I environment, the nature of killer cell immunoglobulin-like receptor (KIR)/KIR ligand genetic combinations studied, and KIR+ NK cell repertoire size. KIR genes are known to be polymorphic in terms of gene content, copy number variation, and number of alleles. These allelic polymorphisms may impact both the phenotype and function of KIR+ NK cells. We, therefore, speculate that polymorphisms may alter donor KIR+ NK cell phenotype/function thus modulating post-HSCT KIR+ NK cell alloreactivity. To investigate KIR allele polymorphisms of all KIR genes, we developed a next-generation sequencing (NGS) technology on a MiSeq platform. To ensure the reliability and specificity of our method, genomic DNA from well-characterized cell lines was used; the high-resolution KIR typing results obtained were then compared to those previously reported. Two different bioinformatic pipelines were used, allowing the attribution of sequencing reads to specific KIR genes and the assignment of KIR alleles for each KIR gene. Our results demonstrated successful long-range KIR gene amplifications of all reference samples using intergenic KIR primers. The alignment of reads to the human genome reference (hg19) using the BiRD pipeline or visualization of data using Profiler software demonstrated that all KIR genes were completely sequenced with a sufficient read depth (mean 317× for all loci) and a high percentage of mapping (mean 93% for all loci). Comparison of the high-resolution KIR typing obtained with previously published data generated by exome capture gave a concordance rate of 95% for centromeric and telomeric KIR genes. Overall, our results suggest that NGS can be used to investigate the broad KIR allelic polymorphism.
Hence, these data improve our knowledge, not only on KIR+ NK cell alloreactivity in

  9. Scalable and cost-effective NGS genotyping in the cloud.

    Science.gov (United States)

    Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P

    2015-10-15

    While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the 10's of dollars. We take a step towards addressing this challenge, by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations to produce clinical turn-around of whole genome analysis optimization and workflow management including strategic batching of individual genomes and efficient cluster resource configuration.

  10. DEVELOPMENT OF ANALYTICAL METHODS FOR DETERMINING SUPPRESSOR CONCENTRATION IN THE MCU NEXT GENERATION SOLVENT (NGS)

    Energy Technology Data Exchange (ETDEWEB)

    Taylor-Pashow, K.; Fondeur, F.; White, T.; Diprete, D.; Milliken, C.

    2013-07-31

    Savannah River National Laboratory (SRNL) was tasked with identifying and developing at least one, but preferably two methods for quantifying the suppressor in the Next Generation Solvent (NGS) system. The suppressor is a guanidine derivative, N,N',N"-tris(3,7-dimethyloctyl)guanidine (TiDG). A list of 10 possible methods was generated, and screening experiments were performed for 8 of the 10 methods. After completion of the screening experiments, the non-aqueous acid-base titration was determined to be the most promising, and was selected for further development as the primary method. ¹H NMR also showed promising results from the screening experiments, and this method was selected for further development as the secondary method. Other methods, including ³⁶Cl radiocounting and ion chromatography, also showed promise; however, due to the similarity to the primary method (titration) and the inability to differentiate between TiDG and TOA (tri-n-octylamine) in the blended solvent, ¹H NMR was selected over these methods. Analysis of radioactive samples obtained from real waste ESS (extraction, scrub, strip) testing using the titration method showed good results. Based on these results, the titration method was selected as the method of choice for TiDG measurement. ¹H NMR has been selected as the secondary (back-up) method, and additional work is planned to further develop this method and to verify the method using radioactive samples. Procedures for analyzing radioactive samples of both pure NGS and blended solvent were developed and issued for both methods.

  11. Development of Genome-Wide SSR Markers from Angelica gigas Nakai Using Next Generation Sequencing.

    Science.gov (United States)

    Gil, Jinsu; Um, Yurry; Kim, Serim; Kim, Ok Tae; Koo, Sung Cheol; Reddy, Chinreddy Subramanyam; Kim, Seong-Cheol; Hong, Chang Pyo; Park, Sin-Gi; Kim, Ho Bang; Lee, Dong Hoon; Jeong, Byung-Hoon; Chung, Jong-Wook; Lee, Yi

    2017-09-21

    Angelica gigas Nakai is an important medicinal herb, widely utilized in Asian countries especially in Korea, Japan, and China. Although it is a vital medicinal herb, the lack of sequencing data and efficient molecular markers has limited the application of a genetic approach for horticultural improvements. Simple sequence repeats (SSRs) are universally accepted molecular markers for population structure study. In this study, we found over 130,000 SSRs, ranging from di- to deca-nucleotide motifs, using the genome sequence of Manchu variety (MV) of A. gigas, derived from next generation sequencing (NGS). From the putative SSR regions identified, a total of 16,496 primer sets were successfully designed. Among them, we selected 848 SSR markers that showed polymorphism from in silico analysis and contained tri- to hexa-nucleotide motifs. We tested 36 SSR primer sets for polymorphism in 16 A. gigas accessions. The average polymorphism information content (PIC) was 0.69; the average observed heterozygosity (H_O) and expected heterozygosity (H_E) values were 0.53 and 0.73, respectively. These newly developed SSR markers would be useful tools for molecular genetics, genotype identification, genetic mapping, molecular breeding, and studying species relationships of the Angelica genus.
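
The marker statistics quoted in the abstract above (observed heterozygosity, expected heterozygosity, PIC) can be illustrated with a minimal sketch. The abstract does not say which estimators or software the authors used, so the formulas below are the commonly used textbook definitions without small-sample correction, and the genotypes and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Return allele -> frequency from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals that carry two different alleles at the locus."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """H_E = 1 - sum(p_i^2), assuming Hardy-Weinberg proportions."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    cross = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - sum(x * x for x in p) - cross


# Hypothetical SSR locus: allele sizes (bp) scored in four accessions.
genotypes = [(150, 150), (150, 154), (154, 158), (150, 158)]
freqs = allele_frequencies(genotypes)
print(observed_heterozygosity(genotypes), expected_heterozygosity(freqs), pic(freqs))
```

Averaging these per-locus values over all tested primer sets would give summary figures of the kind reported in the abstract.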

  12. Next-generation sequencing-based user-friendly platforms for drug-resistant tuberculosis diagnosis: A promise for the near future

    Directory of Open Access Journals (Sweden)

    David L Dolinger

    2016-01-01

    Full Text Available Since 2002, there has been a gradual worldwide 1.3% annual decrease in the incidence of tuberculosis (TB). This is an encouraging statistic; however, it will not achieve the World Health Organization's goal of eliminating TB by 2050, and it is being compounded by the persistent global incidence of drug-resistant tuberculosis (DR-TB) acquired by transmission and by treatment pressure. One key to effectively control tuberculosis and the spread of multiresistant strains is accurate information pertaining to drug resistance and susceptibility. Next-generation sequencing (NGS) has the potential to effectively change global health and the management of TB. Industry has focused primarily on using NGS for oncology diagnostics and human genomics, but the area in which NGS can rapidly impact health care is in the area of infectious disease diagnostics in low- and middle-income countries. To date, there has been a failure as a community to capitalize on the potential of NGS, especially at the reference laboratory level where it can provide actionable information pertaining to treatment options for patients. The rapid evolution of knowledge about the genetic foundations of tuberculosis drug resistance makes sequencing a versatile technology platform for providing rapid, accurate, and actionable results for treating this disease. No “plug-and-play” and “end-to-end” NGS solutions exist that provide clinically relevant sequence data from the Mycobacterium tuberculosis complex genome from primary clinical samples (e.g., sputum) in high-burden country reference laboratories, which is where they are most needed. However, such a system-based solution is under development by the Foundation for Innovative Diagnostics (FIND), in collaboration with partners from academia, nongovernmental organizations, and industry. The solution is modular and is designed and developed to perform targeted amplicon sequencing directly from a patient's primary sputum sample. This solution

  13. Less frequently mutated genes in colorectal cancer: evidences from next-generation sequencing of 653 routine cases.

    Science.gov (United States)

    Malapelle, Umberto; Pisapia, Pasquale; Sgariglia, Roberta; Vigliar, Elena; Biglietto, Maria; Carlomagno, Chiara; Giuffrè, Giuseppe; Bellevicine, Claudio; Troncone, Giancarlo

    2016-09-01

    The incidence of RAS/RAF/PI3KA and TP53 gene mutations in colorectal cancer (CRC) is well established. Less information, however, is available on other components of the CRC genomic landscape, which are potential CRC prognostic/predictive markers. Following a previous validation study, ion-semiconductor next-generation sequencing (NGS) was employed to process 653 routine CRC samples by a multiplex PCR targeting 91 hotspot regions in 22 CRC significant genes. A total of 796 somatic mutations in 499 (76.4%) tumours were detected. Besides RAS/RAF/PI3KA and TP53, 12 other genes showed at least one mutation, including FBXW7 (6%), PTEN (2.8%), SMAD4 (2.1%), EGFR (1.2%), CTNNB1 (1.1%), AKT1 (0.9%), STK11 (0.8%), ERBB2 (0.6%), ERBB4 (0.6%), ALK (0.2%), MAP2K1 (0.2%) and NOTCH1 (0.2%). In a routine diagnostic setting, NGS had the potential to generate robust and comprehensive genetic information, also including less frequently mutated genes potentially relevant for prognostic assessments or for actionable treatments. Published by the BMJ Publishing Group Limited.

  14. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Directory of Open Access Journals (Sweden)

    Isabel A S Bonatelli

    Full Text Available Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  15. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Science.gov (United States)

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
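
    The abstract above does not say which tool or thresholds were used to call microsatellites from the reads. As a rough illustration of the idea only, the following sketch scans a sequence for tandem repeats of a fixed motif length using a regular-expression backreference. The function name `find_ssrs` and the defaults (dinucleotide motifs, at least five repeats) are assumptions made for this example, not the authors' settings.

```python
import re


def find_ssrs(seq, motif_len=2, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    motif_len and min_repeats are illustrative thresholds, not values
    taken from the study.
    """
    # ([ACGT]{k}) captures one motif; \1{n,} requires n further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    read = "GG" + "CA" * 6 + "TT"
    print(find_ssrs(read))
```

    Running the scan once per motif length (2, 3, 4, and so on) and tallying the hits would give a per-individual motif-length distribution of the kind the abstract summarizes.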

  16. Live births after simultaneous avoidance of monogenic diseases and chromosome abnormality by next-generation sequencing with linkage analyses.

    Science.gov (United States)

    Yan, Liying; Huang, Lei; Xu, Liya; Huang, Jin; Ma, Fei; Zhu, Xiaohui; Tang, Yaqiong; Liu, Mingshan; Lian, Ying; Liu, Ping; Li, Rong; Lu, Sijia; Tang, Fuchou; Qiao, Jie; Xie, X Sunney

    2015-12-29

    In vitro fertilization (IVF), preimplantation genetic diagnosis (PGD), and preimplantation genetic screening (PGS) help patients to select embryos free of monogenic diseases and aneuploidy (chromosome abnormality). Next-generation sequencing (NGS) methods, while experiencing a rapid cost reduction, have improved the precision of PGD/PGS. However, the precision of PGD has been limited by the false-positive and false-negative single-nucleotide variations (SNVs), which are not acceptable in IVF and can be circumvented by linkage analyses, such as short tandem repeats or karyomapping. It is noteworthy that existing methods of detecting SNV/copy number variation (CNV) and linkage analysis often require separate procedures for the same embryo. Here we report an NGS-based PGD/PGS procedure that can simultaneously detect a single-gene disorder and aneuploidy and is capable of linkage analysis in a cost-effective way. This method, called "mutated allele revealed by sequencing with aneuploidy and linkage analyses" (MARSALA), involves multiple annealing and looping-based amplification cycles (MALBAC) for single-cell whole-genome amplification. Aneuploidy is determined by CNVs, whereas SNVs associated with the monogenic diseases are detected by PCR amplification of the MALBAC product. The false-positive and -negative SNVs are avoided by an NGS-based linkage analysis. Two healthy babies, free of the monogenic diseases of their parents, were born after such embryo selection. The monogenic diseases originated from a single base mutation on the autosome and the X-chromosome of the disease-carrying father and mother, respectively.

  17. Next-Generation Sequencing in the Mycology Lab.

    Science.gov (United States)

    Zoll, Jan; Snelders, Eveline; Verweij, Paul E; Melchers, Willem J G

    New state-of-the-art sequencing techniques offer valuable tools both for detecting mycobiota and for understanding the molecular mechanisms of virulence and of resistance against antifungal compounds. The introduction of new sequencing platforms with enhanced capacity and reduced costs for sequence analysis provides a potentially powerful tool for mycological diagnosis and research. In this review, we summarize the applications of next-generation sequencing techniques in mycology.

  18. Association analysis using next-generation sequence data from publicly available control groups: the robust variance score statistic.

    Science.gov (United States)

    Derkach, Andriy; Chiang, Theodore; Gong, Jiafen; Addis, Laura; Dobbins, Sara; Tomlinson, Ian; Houlston, Richard; Pal, Deb K; Strug, Lisa J

    2014-08-01

    Sufficiently powered case-control studies with next-generation sequence (NGS) data remain prohibitively expensive for many investigators. If feasible, a more efficient strategy would be to include publicly available sequenced controls. However, these studies can be confounded by differences in sequencing platform; alignment, single nucleotide polymorphism and variant calling algorithms; read depth; and selection thresholds. Assuming one can match cases and controls on the basis of ethnicity and other potential confounding factors, and one has access to the aligned reads in both groups, we investigate the effect of systematic differences in read depth and selection threshold when comparing allele frequencies between cases and controls. We propose a novel likelihood-based method, the robust variance score (RVS), that substitutes genotype calls by their expected values given observed sequence data. We show theoretically that the RVS eliminates read depth bias in the estimation of minor allele frequency. We also demonstrate that, using simulated and real NGS data, the RVS method controls Type I error and has comparable power to the 'gold standard' analysis with the true underlying genotypes for both common and rare variants. An RVS R script and instructions can be found at strug.research.sickkids.ca, and at https://github.com/strug-lab/RVS. lisa.strug@utoronto.ca Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
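
    The core substitution described above, replacing a hard genotype call with its expected value given the sequence data, can be sketched as follows. This is not the authors' R script: it assumes per-genotype likelihoods P(data | G = g) for g = 0, 1, 2 are already available and, as a simplifying assumption, uses a Hardy-Weinberg prior built from a supplied minor allele frequency. The name `expected_genotype` is hypothetical.

```python
def expected_genotype(likelihoods, maf):
    """Posterior mean of the minor-allele count G in {0, 1, 2}.

    likelihoods: P(data | G = 0), P(data | G = 1), P(data | G = 2).
    maf: minor allele frequency used for an assumed Hardy-Weinberg prior.
    """
    prior = ((1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2)
    weights = [lik * pr for lik, pr in zip(likelihoods, prior)]
    total = sum(weights)
    if total == 0:
        raise ValueError("likelihoods and prior give zero total probability")
    # E[G | data] = sum_g g * P(G = g | data)
    return sum(g * w for g, w in enumerate(weights)) / total


if __name__ == "__main__":
    # Low read depth: the data barely distinguish the three genotypes.
    print(expected_genotype((0.30, 0.40, 0.30), maf=0.10))
    # High read depth: the data strongly support a heterozygote.
    print(expected_genotype((1e-6, 0.999, 1e-6), maf=0.10))
```

    With uninformative likelihoods the value falls back to the prior mean 2 x maf, whereas a hard call would force a choice of 0, 1 or 2; this is the intuition behind using expected values when read depth differs between cases and controls.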

  19. Non-Aqueous Titration Method for Determining Suppressor Concentration in the MCU Next Generation Solvent (NGS)

    International Nuclear Information System (INIS)

    Taylor-Pashow, Kathryn M. L.; Jones, Daniel H.

    2017-01-01

    A non-aqueous titration method has been used for quantifying the suppressor concentration in the MCU solvent hold tank (SHT) monthly samples since the Next Generation Solvent (NGS) was implemented in 2013. The titration method measures the concentration of the NGS suppressor (TiDG) as well as the residual tri-n-octylamine (TOA) that is a carryover from the previous solvent. As the TOA concentration has decreased over time, it has become difficult to resolve the TiDG equivalence point as the TOA equivalence point has moved closer. In recent samples, the TiDG equivalence point could not be resolved, and therefore, the TiDG concentration was determined by subtracting the TOA concentration as measured by semi-volatile organic analysis (SVOA) from the total base concentration as measured by titration. In order to improve the titration method so that the TiDG concentration can be measured directly, without the need for the SVOA data, a new method has been developed that involves spiking of the sample with additional TOA to further separate the two equivalence points in the titration. This method has been demonstrated on four recent SHT samples and comparison to results obtained using the SVOA TOA subtraction method shows good agreement. Therefore, it is recommended that the titration procedure be revised to include the TOA spike addition, and this to become the primary method for quantifying the TiDG.

  20. Non-Aqueous Titration Method for Determining Suppressor Concentration in the MCU Next Generation Solvent (NGS)

    Energy Technology Data Exchange (ETDEWEB)

    Taylor-Pashow, Kathryn M. L. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Jones, Daniel H. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL)

    2017-10-23

    A non-aqueous titration method has been used for quantifying the suppressor concentration in the MCU solvent hold tank (SHT) monthly samples since the Next Generation Solvent (NGS) was implemented in 2013. The titration method measures the concentration of the NGS suppressor (TiDG) as well as the residual tri-n-octylamine (TOA) that is a carryover from the previous solvent. As the TOA concentration has decreased over time, it has become difficult to resolve the TiDG equivalence point as the TOA equivalence point has moved closer. In recent samples, the TiDG equivalence point could not be resolved, and therefore, the TiDG concentration was determined by subtracting the TOA concentration as measured by semi-volatile organic analysis (SVOA) from the total base concentration as measured by titration. In order to improve the titration method so that the TiDG concentration can be measured directly, without the need for the SVOA data, a new method has been developed that involves spiking of the sample with additional TOA to further separate the two equivalence points in the titration. This method has been demonstrated on four recent SHT samples and comparison to results obtained using the SVOA TOA subtraction method shows good agreement. Therefore, it is recommended that the titration procedure be revised to include the TOA spike addition, and this to become the primary method for quantifying the TiDG.

  1. VarWalker: personalized mutation network analysis of putative cancer genes from next-generation sequencing data.

    Science.gov (United States)

    Jia, Peilin; Zhao, Zhongming

    2014-02-01

    A major challenge in interpreting the large volume of mutation data identified by next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations to facilitate the identification of targetable genes and new drugs. Current approaches are primarily based on mutation frequencies of single-genes, which lack the power to detect infrequently mutated driver genes and ignore functional interconnection and regulation among cancer genes. We propose a novel mutation network method, VarWalker, to prioritize driver genes in large scale cancer mutation data. VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutation genes and their close interactors. These interactors are selected and optimized using the Random Walk with Restart algorithm in a protein-protein interaction network. We applied the method in >300 tumor genomes in two large-scale NGS benchmark datasets: 183 lung adenocarcinoma samples and 121 melanoma samples. In each cancer, we derived a consensus mutation subnetwork containing significantly enriched consensus cancer genes and cancer-related functional pathways. These cancer-specific mutation networks were then validated using independent datasets for each cancer. Importantly, VarWalker prioritizes well-known, infrequently mutated genes, which are shown to interact with highly recurrently mutated genes yet have been ignored by conventional single-gene-based approaches. Utilizing VarWalker, we demonstrated that network-assisted approaches can be effectively adapted to facilitate the detection of cancer driver genes in NGS data.
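
    The Random Walk with Restart step mentioned above can be illustrated on a toy network. The sketch below is a generic textbook formulation, not VarWalker's code: the restart probability of 0.5, the node labels and the function name `random_walk_with_restart` are all assumptions chosen for the example.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at the seeds.

    neighbors: dict mapping each node to a list of adjacent nodes
               (an undirected network lists each edge in both directions).
    seeds: set of start nodes, e.g. mutated genes in one sample.
    """
    nodes = list(neighbors)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            if not neighbors[n]:
                continue  # isolated node: nothing to propagate
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Toy path network A - B - C - D with a single seed at A.
    toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    scores = random_walk_with_restart(toy, {"A"})
    for node in sorted(scores, key=scores.get, reverse=True):
        print(node, round(scores[node], 4))
```

    Nodes closer to the seed receive higher scores, which is how interactors of a mutated gene could be ranked before the thresholding and optimization the abstract describes.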

  2. Graph mining for next generation sequencing: leveraging the assembly graph for biological insights.

    Science.gov (United States)

    Warnke-Sommer, Julia; Ali, Hesham

    2016-05-06

    The assembly of Next Generation Sequencing (NGS) reads remains a challenging task. This is especially true for the assembly of metagenomics data that originate from environmental samples potentially containing hundreds to thousands of unique species. The principal objective of current assembly tools is to assemble NGS reads into contiguous stretches of sequence called contigs while maximizing for both accuracy and contig length. The end goal of this process is to produce longer contigs with the major focus being on assembly only. Sequence read assembly is an aggregative process, during which read overlap relationship information is lost as reads are merged into longer sequences or contigs. The assembly graph is information rich and capable of capturing the genomic architecture of an input read data set. We have developed a novel hybrid graph in which nodes represent sequence regions at different levels of granularity. This model, utilized in the assembly and analysis pipeline Focus, presents a concise yet feature rich view of a given input data set, allowing for the extraction of biologically relevant graph structures for graph mining purposes. Focus was used to create hybrid graphs to model metagenomics data sets obtained from the gut microbiomes of five individuals with Crohn's disease and eight healthy individuals. Repetitive and mobile genetic elements are found to be associated with hybrid graph structure. Using graph mining techniques, a comparative study of the Crohn's disease and healthy data sets was conducted with focus on antibiotics resistance genes associated with transposase genes. Results demonstrated significant differences in the phylogenetic distribution of categories of antibiotics resistance genes in the healthy and diseased patients. Focus was also evaluated as a pure assembly tool and produced excellent results when compared against the Meta-velvet, Omega, and UD-IDBA assemblers.
Mining the hybrid graph can reveal biological phenomena captured

  3. A Window Into Clinical Next-Generation Sequencing-Based Oncology Testing Practices.

    Science.gov (United States)

    Nagarajan, Rakesh; Bartley, Angela N; Bridge, Julia A; Jennings, Lawrence J; Kamel-Reid, Suzanne; Kim, Annette; Lazar, Alexander J; Lindeman, Neal I; Moncur, Joel; Rai, Alex J; Routbort, Mark J; Vasalos, Patricia; Merker, Jason D

    2017-12-01

    - Detection of acquired variants in cancer is a paradigm of precision medicine, yet little has been reported about clinical laboratory practices across a broad range of laboratories. - To use College of American Pathologists proficiency testing survey results to report on the results from surveys on next-generation sequencing-based oncology testing practices. - College of American Pathologists proficiency testing survey results from more than 250 laboratories currently performing molecular oncology testing were used to determine laboratory trends in next-generation sequencing-based oncology testing. - These presented data provide key information about the number of laboratories that currently offer or are planning to offer next-generation sequencing-based oncology testing. Furthermore, we present data from 60 laboratories performing next-generation sequencing-based oncology testing regarding specimen requirements and assay characteristics. The findings indicate that most laboratories are performing tumor-only targeted sequencing to detect single-nucleotide variants and small insertions and deletions, using desktop sequencers and predesigned commercial kits. Despite these trends, a diversity of approaches to testing exists. - This information should be useful to further inform a variety of topics, including national discussions involving clinical laboratory quality systems, regulation and oversight of next-generation sequencing-based oncology testing, and precision oncology efforts in a data-driven manner.

  4. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing.

    Science.gov (United States)

    Zhou, Wei; Hu, Yiyi; Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
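
    The N50 statistic quoted above is the length of the contig at which, going from longest to shortest, the running total first reaches half of the total assembly length. A minimal sketch of that standard definition follows; the function name `n50` and the toy contig lengths are made up for illustration and are not data from the study.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, not taken from the assembly above.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    Because N50 depends only on the length distribution, it says nothing about correctness; it is best read alongside totals such as the number of contigs and the assembled length reported in the abstract.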

  5. Genome Survey Sequencing and Genetic Background Characterization of Gracilariopsis lemaneiformis (Rhodophyta) Based on Next-Generation Sequencing

    Science.gov (United States)

    Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon. PMID:23875008

  6. Novel mutations and their genotype-phenotype correlations in patients with Noonan syndrome, using next-generation sequencing.

    Science.gov (United States)

    Tafazoli, Alireza; Eshraghi, Peyman; Pantaleoni, Francesca; Vakili, Rahim; Moghaddassian, Morteza; Ghahraman, Martha; Muto, Valentina; Paolacci, Stefano; Golyan, Fatemeh Fardi; Abbaszadegan, Mohammad Reza

    2018-03-01

    Noonan Syndrome (NS) is an autosomal dominant disorder with many variable and heterogeneous conditions. The genetic basis for 20-30% of cases is still unknown. This study evaluates Iranian Noonan patients both clinically and genetically for the first time. In phase one, mutational analysis of the PTPN11 gene was performed in 15 Iranian patients using PCR and Sanger sequencing. In phase two, Next Generation Sequencing (NGS) in the form of targeted resequencing was used to analyze exons from other related genes. Homology modelling was also performed for the newly found mutations. Genotype-phenotype correlation was carried out according to the molecular findings and clinical features. A previously reported mutation (p.N308D) in some patients and a novel mutation (p.D155N) in one patient were identified in phase one. After applying NGS methods, known and new variants were found in four patients in other genes, including CBL (p.V904I), KRAS (p.L53W), SOS1 (p.I1302V), and SOS1 (p.R552G). Structural studies of two deduced novel mutations in related genes revealed deficiencies in the mutated proteins. Following genotype-phenotype correlation, a new pattern of intellectual disability was registered in two patients. NS shows strongly variable expressivity along with high genetic heterogeneity, especially in distinct populations and ethnic groups. Other, as yet unknown, causative genes may also exist. More comprehensive and newer technologies such as NGS methods are the best choice for detecting molecular defects in patients for genotype-phenotype correlation and disease management. Copyright © 2017 Medical University of Bialystok. Published by Elsevier B.V. All rights reserved.

  7. Routine use of next-generation sequencing for preimplantation genetic diagnosis of blastomeres obtained from embryos on day 3 in fresh in vitro fertilization cycles.

    Science.gov (United States)

    Łukaszuk, Krzysztof; Pukszta, Sebastian; Wells, Dagan; Cybulska, Celina; Liss, Joanna; Płóciennik, Łukasz; Kuczyński, Waldemar; Zabielska, Judyta

    2015-04-01

    To determine the usefulness of semiconductor-based next-generation sequencing (NGS) for cleavage-stage preimplantation genetic diagnosis (PGD) of aneuploidy. Prospective case-control study. A private center for reproductive medicine. A total of 45 patients underwent day-3 embryo biopsy with PGD and fresh cycle transfer. Additionally, 53 patients, matched according to age, anti-Müllerian hormone levels, antral follicles count, and infertility duration were selected as controls. Choice of embryos for transfer was based on the PGD NGS results. Clinical pregnancy rate (PR) per embryo transfer (ET) was the primary outcome. Secondary outcomes were implantation and miscarriage rates. The PR per transfer was higher in the NGS group (84.4% vs. 41.5%). The implantation rate (61.5% vs. 34.8%) was higher in the NGS group. The miscarriage rate was similar in the 2 groups (2.8% vs. 4.6%). We demonstrate the technical feasibility of NGS-based PGD involving cleavage-stage biopsy and fresh ETs. Encouraging data were obtained from a prospective trial using this approach, arguing that cleavage-stage NGS may represent a valuable addition to current aneuploidy screening methods. These findings require further validation in a well-designed randomized controlled trial. ACTRN12614001035617. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  8. Results of Analyses of the Next Generation Solvent for Parsons

    International Nuclear Information System (INIS)

    Peters, T.; Washington, A.; Fink, S.

    2012-01-01

    Savannah River National Laboratory (SRNL) prepared a nominal 150 gallon batch of Next Generation Solvent (NGS) for Parsons. This material was then analyzed and tested for cesium mass transfer efficiency. The bulk of the results indicate that the solvent is qualified as acceptable for use in the upcoming pilot-scale testing at Parsons Technology Center. This report describes the analysis and testing of a batch of Next Generation Solvent (NGS) prepared in support of pilot-scale testing in the Parsons Technology Center. A total of ∼150 gallons of NGS solvent was prepared in late November of 2011. Details for the work are contained in a controlled laboratory notebook. Analysis of the Parsons NGS solvent indicates that the material is acceptable for use. SRNL is continuing to improve the analytical method for the guanidine.

  9. Genomic Selection in the Era of Next Generation Sequencing for Complex Traits in Plant Breeding.

    Science.gov (United States)

    Bhat, Javaid A; Ali, Sajad; Salgotra, Romesh K; Mir, Zahoor A; Dutta, Sutapa; Jadon, Vasudha; Tyagi, Anshika; Mushtaq, Muntazir; Jain, Neelu; Singh, Pradeep K; Singh, Gyanendra P; Prabhu, K V

    2016-01-01

    Genomic selection (GS) is a promising approach exploiting molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. In plant breeding, it provides opportunities to increase genetic gain of complex traits per unit time and cost. The cost-benefit balance was an important consideration for GS to work in crop plants. The availability of genome-wide, high-throughput, cost-effective and flexible markers with low ascertainment bias, suitable for large population sizes as well as for both model and non-model crop species with or without a reference genome sequence, was the most important factor for its successful and effective implementation in crop species. These factors were the major limitations of earlier marker systems, viz. SSR and array-based, and such implementation was unimaginable before the availability of next-generation sequencing (NGS) technologies, which have provided novel SNP genotyping platforms, especially genotyping by sequencing. These marker technologies have changed the entire scenario of marker applications and made the use of GS routine for crop improvement in both model and non-model crop species. NGS-based genotyping has increased genomic-estimated breeding value prediction accuracies over other established marker platforms in cereals and other crop species, and made the dream of GS come true in crop breeding. But to harness the true benefits of GS, these marker technologies will need to be combined with high-throughput phenotyping to achieve valuable genetic gain for complex traits. Moreover, the continuous decline in sequencing cost will make WGS feasible and cost-effective for GS in the near future. Until then, targeted sequencing seems to be the more cost-effective option for large-scale marker discovery and GS, particularly in the case of large and un-decoded genomes.

  10. ChimericSeq: An open-source, user-friendly interface for analyzing NGS data to identify and characterize viral-host chimeric sequences

    Science.gov (United States)

    Shieh, Fwu-Shan; Jongeneel, Patrick; Steffen, Jamin D.; Lin, Selena; Jain, Surbhi; Song, Wei

    2017-01-01

    Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the impact that viral integration has on the host, such as tumorigenesis. Current computational methods to analyze NGS data of virus-host junction sites have been limited in terms of their accessibility to a broad user base. In this study, we developed a software application (named ChimericSeq), that is the first program of its kind to offer a graphical user interface, compatibility with both Windows and Mac operating systems, and optimized for effectively identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq’s pipeline implements custom filtering to remove artifacts and detect reads with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI interface in both Windows and Mac has potential to expand NGS analytical support to a broader spectrum of the scientific community. PMID:28829778

  11. ChimericSeq: An open-source, user-friendly interface for analyzing NGS data to identify and characterize viral-host chimeric sequences.

    Directory of Open Access Journals (Sweden)

    Fwu-Shan Shieh

    Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the impact that viral integration has on the host, such as tumorigenesis. Current computational methods to analyze NGS data of virus-host junction sites have been limited in terms of their accessibility to a broad user base. In this study, we developed a software application (named ChimericSeq), that is the first program of its kind to offer a graphical user interface, compatibility with both Windows and Mac operating systems, and optimized for effectively identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq's pipeline implements custom filtering to remove artifacts and detect reads with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI interface in both Windows and Mac has potential to expand NGS analytical support to a broader spectrum of the scientific community.

  12. Applications of nanotechnology, next generation sequencing and microarrays in biomedical research.

    Science.gov (United States)

    Elingaramil, Sauli; Li, Xiaolong; He, Nongyue

    2013-07-01

    Next-generation sequencing technologies, microarrays and advances in bionanotechnology have had an enormous impact on research within a short time frame. This impact appears certain to increase further as many biomedical institutions are now acquiring these prevailing new technologies. Beyond conventional sampling of genome content, wide-ranging applications are rapidly evolving for next-generation sequencing, microarrays and nanotechnology. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing and discovery of transcription factor binding sites, noncoding RNA expression profiling and molecular diagnostics. This paper thus discusses current applications of nanotechnology, next-generation sequencing technologies and microarrays in biomedical research and highlights the transforming potential these technologies offer.

  13. Targeted next-generation sequencing at copy-number breakpoints for personalized analysis of rearranged ends in solid tumors.

    Directory of Open Access Journals (Sweden)

    Hyun-Kyoung Kim

    BACKGROUND: The concept of the utilization of rearranged ends for development of personalized biomarkers has attracted much attention owing to its clinical applicability. Although targeted next-generation sequencing (NGS) for recurrent rearrangements has been successful in hematologic malignancies, its application to solid tumors is problematic due to the paucity of recurrent translocations. However, copy-number breakpoints (CNBs), which are abundant in solid tumors, can be utilized for identification of rearranged ends. METHOD: As a proof of concept, we performed targeted next-generation sequencing at copy-number breakpoints (TNGS-CNB) in nine colon cancer cases including seven primary cancers and two cell lines, COLO205 and SW620. For deduction of CNBs, we developed a novel competitive single-nucleotide polymorphism (cSNP) microarray method entailing CNB-region refinement by competitor DNA. RESULT: Using TNGS-CNB, 19 specific rearrangements out of 91 CNBs (20.9%) were identified, and two polymerase chain reaction (PCR)-amplifiable rearrangements were obtained in six cases (66.7%). And significantly, TNGS-CNB, with its high positive identification rate (82.6%) of PCR-amplifiable rearrangements at candidate sites (19/23), just from filtering of aligned sequences, requires little effort for validation. CONCLUSION: Our results indicate that TNGS-CNB, with its utility for identification of rearrangements in solid tumors, can be successfully applied in the clinical laboratory for cancer-relapse and therapy-response monitoring.

  14. Targeted next-generation sequencing at copy-number breakpoints for personalized analysis of rearranged ends in solid tumors.

    Science.gov (United States)

    Kim, Hyun-Kyoung; Park, Won Cheol; Lee, Kwang Man; Hwang, Hai-Li; Park, Seong-Yeol; Sorn, Sungbin; Chandra, Vishal; Kim, Kwang Gi; Yoon, Woong-Bae; Bae, Joon Seol; Shin, Hyoung Doo; Shin, Jong-Yeon; Seoh, Ju-Young; Kim, Jong-Il; Hong, Kyeong-Man

    2014-01-01

    The concept of the utilization of rearranged ends for development of personalized biomarkers has attracted much attention owing to its clinical applicability. Although targeted next-generation sequencing (NGS) for recurrent rearrangements has been successful in hematologic malignancies, its application to solid tumors is problematic due to the paucity of recurrent translocations. However, copy-number breakpoints (CNBs), which are abundant in solid tumors, can be utilized for identification of rearranged ends. As a proof of concept, we performed targeted next-generation sequencing at copy-number breakpoints (TNGS-CNB) in nine colon cancer cases including seven primary cancers and two cell lines, COLO205 and SW620. For deduction of CNBs, we developed a novel competitive single-nucleotide polymorphism (cSNP) microarray method entailing CNB-region refinement by competitor DNA. Using TNGS-CNB, 19 specific rearrangements out of 91 CNBs (20.9%) were identified, and two polymerase chain reaction (PCR)-amplifiable rearrangements were obtained in six cases (66.7%). And significantly, TNGS-CNB, with its high positive identification rate (82.6%) of PCR-amplifiable rearrangements at candidate sites (19/23), just from filtering of aligned sequences, requires little effort for validation. Our results indicate that TNGS-CNB, with its utility for identification of rearrangements in solid tumors, can be successfully applied in the clinical laboratory for cancer-relapse and therapy-response monitoring.
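
    The percentages quoted in this abstract follow directly from the reported counts. The short sketch below recomputes them; the helper name `percent` is introduced here for illustration only and is not part of the published method.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the abstract above.
rearrangements_per_cnb = percent(19, 91)    # specific rearrangements out of 91 CNBs
cases_with_pcr_product = percent(6, 9)      # six of nine colon cancer cases
positive_identification = percent(19, 23)   # rearrangements at 23 candidate sites

print(rearrangements_per_cnb, cases_with_pcr_product, positive_identification)
```

    Each value matches the figure stated in the abstract (20.9%, 66.7% and 82.6%).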

  15. Genome-Based Identification of Active Prophage Regions by Next Generation Sequencing in Bacillus licheniformis DSM13

    Science.gov (United States)

    Hertel, Robert; Rodríguez, David Pintor; Hollensteiner, Jacqueline; Dietrich, Sascha; Leimbach, Andreas; Hoppert, Michael; Liesegang, Heiko; Volland, Sonja

    2015-01-01

    Prophages are viruses, which have integrated their genomes into the genome of a bacterial host. The status of the prophage genome can vary from fully intact with the potential to form infective particles to a remnant state where only a few phage genes persist. Prophages have impact on the properties of their host and are therefore of great interest for genomic research and strain design. Here we present a genome- and next generation sequencing (NGS)-based approach for identification and activity evaluation of prophage regions. Seven prophage or prophage-like regions were identified in the genome of Bacillus licheniformis DSM13. Six of these regions show similarity to members of the Siphoviridae phage family. The remaining region encodes the B. licheniformis orthologue of the PBSX prophage from Bacillus subtilis. Analysis of isolated phage particles (induced by mitomycin C) from the wild-type strain and prophage deletion mutant strains revealed activity of the prophage regions BLi_Pp2 (PBSX-like), BLi_Pp3 and BLi_Pp6. In contrast to BLi_Pp2 and BLi_Pp3, neither phage DNA nor phage particles of BLi_Pp6 could be visualized. However, the ability of prophage BLi_Pp6 to generate particles could be confirmed by sequencing of particle-protected DNA mapping to prophage locus BLi_Pp6. The introduced NGS-based approach allows the investigation of prophage regions and their ability to form particles. Our results show that this approach increases the sensitivity of prophage activity analysis and can complement more conventional approaches such as transmission electron microscopy (TEM). PMID:25811873

  16. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  17. Targeted next-generation sequencing extends the phenotypic and mutational spectrums for EYS mutations.

    Science.gov (United States)

    Gu, Shun; Tian, Yuanyuan; Chen, Xue; Zhao, Chen

    2016-01-01

    We aim to determine genetic lesions with a phenotypic correlation in four Chinese families with autosomal recessive retinitis pigmentosa (RP). Medical histories were carefully reviewed. All patients received comprehensive ophthalmic evaluations. The next-generation sequencing (NGS) approach targeting a panel of 205 retinal disease-relevant genes and 15 candidate genes was selectively performed on probands from the four recruited families for mutation detection. Online predictive software and crystal structure modeling were also applied to test the potential pathogenic effects of identified mutations. Of the four families, two were diagnosed with RP sino pigmento (RPSP). Patients with RPSP claimed to have earlier RP age of onset but slower disease progression. Five mutations in the eyes shut homolog (EYS) gene, involving two novel (c.7228+1G>A and c.9248G>A) and three recurrent mutations (c.4957dupA, c.6416G>A and c.6557G>A), were found as RP causative in the four families. The missense variant c.5093T>C was determined to be a variant of unknown significance (VUS) due to the variant's colocalization in the same allele with the reported pathogenic mutation c.6416G>A. The two novel variants were further confirmed absent in 100 unrelated healthy controls. Online predictive software indicated potential pathogenicity of the three missense mutations. Further, crystal structural modeling suggested generation of two abnormal hydrogen bonds by the missense mutation p.G2186E (c.6557G>A) and elongation of its neighboring β-sheet induced by p.G3083D (c.9248G>A), which could alter the tertiary structure of the EYS protein and thus interrupt its physicochemical properties. Taken together, with the targeted NGS approach, we reveal novel EYS mutations and prove the efficiency of targeted NGS in the genetic diagnoses of RP. We also report, for the first time, the correlation between EYS mutations and RPSP. The genotypic-phenotypic relationship in all Chinese patients carrying mutations in the EYS

  18. Next-Generation Sequencing: From Understanding Biology to Personalized Medicine

    Directory of Open Access Journals (Sweden)

    Benjamin Meder

    2013-03-01

    Full Text Available Within just a few years, the new methods for high-throughput next-generation sequencing have generated completely novel insights into the heritability and pathophysiology of human disease. In this review, we wish to highlight the benefits of the current state-of-the-art sequencing technologies for genetic and epigenetic research. We illustrate how these technologies help to constantly improve our understanding of genetic mechanisms in biological systems and summarize the progress made so far. This can be exemplified by the case of heritable heart muscle diseases, so-called cardiomyopathies. Here, next-generation sequencing is able to identify novel disease genes, and first clinical applications demonstrate the successful translation of this technology into personalized patient care.

  19. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  20. Different Somatic Hypermutation Levels among Antibody Subclasses Disclosed by a New Next-Generation Sequencing-Based Antibody Repertoire Analysis

    Directory of Open Access Journals (Sweden)

    Kazutaka Kitaura

    2017-05-01

    Full Text Available A diverse antibody repertoire is primarily generated by the rearrangement of V, D, and J genes and subsequent somatic hypermutation (SHM). Class-switch recombination (CSR) produces various isotypes and subclasses with different functional properties. Although antibody isotypes and subclasses are considered to be produced by both direct and sequential CSR, it is still not fully understood how SHMs accumulate during the process in which antibody subclasses are generated. Here, we developed a new next-generation sequencing (NGS)-based antibody repertoire analysis capable of identifying all antibody isotype and subclass genes and used it to examine the peripheral blood mononuclear cells of 12 healthy individuals. Using a total of 5,480,040 sequences, we compared percentage frequency of variable (V), junctional (J) sequence, and a combination of V and J, diversity, length, and amino acid compositions of CDR3, SHM, and shared clones in the IgM, IgD, IgG3, IgG1, IgG2, IgG4, IgA1, IgE, and IgA2 genes. The usage and diversity were similar among the immunoglobulin (Ig) subclasses. Clonally related sequences sharing identical V, D, J, and CDR3 amino acid sequences were frequently found within multiple Ig subclasses, especially between IgG1 and IgG2 or IgA1 and IgA2. SHM occurred most frequently in IgG4, while IgG3 genes were the least mutated among all IgG subclasses. The shared clones had almost the same SHM levels among Ig subclasses, while subclass-specific clones had different levels of SHM dependent on the genomic location. Given the sequential CSR, these results suggest that CSR occurs sequentially over multiple subclasses in the order corresponding to the genomic location of IGHCs, but CSR is likely to occur more quickly than SHMs accumulate within Ig genes under physiological conditions. NGS-based antibody repertoire analysis should provide critical information on how various antibodies are generated in the immune system.

  1. A Microfluidic Device for Preparing Next Generation DNA Sequencing Libraries and for Automating Other Laboratory Protocols That Require One or More Column Chromatography Steps

    Science.gov (United States)

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C.; Quake, Stephen R.; Burkholder, William F.

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation. PMID:23894273

  2. A microfluidic device for preparing next generation DNA sequencing libraries and for automating other laboratory protocols that require one or more column chromatography steps.

    Directory of Open Access Journals (Sweden)

    Swee Jin Tan

    Full Text Available Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation.

  3. A probable prehistoric case of meningococcal disease from San Francisco Bay: Next generation sequencing of Neisseria meningitidis from dental calculus and osteological evidence.

    Science.gov (United States)

    Eerkens, Jelmer W; Nichols, Ruth V; Murray, Gemma G R; Perez, Katherine; Murga, Engel; Kaijankoski, Phil; Rosenthal, Jeffrey S; Engbring, Laurel; Shapiro, Beth

    2018-05-25

    Next Generation Sequencing (NGS) of ancient dental calculus samples from a prehistoric site in San Francisco Bay, CA-SCL-919, reveals a wide range of potentially pathogenic bacteria. One older adult woman, in particular, had high levels of Neisseria meningitidis and low levels of Haemophilus influenzae, species that were not observed in the calculus from three other individuals. Combined with the presence of incipient endocranial lesions and pronounced meningeal grooves, we interpret this as an ancient case of meningococcal disease. This disease afflicts millions around the globe today, but little is known about its (pre)history. With additional sampling, we suggest NGS of calculus offers an exciting new window into the evolutionary history of these bacterial species and their interactions with humans.

  4. Rapid identification and recovery of ENU-induced mutations with next-generation sequencing and Paired-End Low-Error analysis.

    Science.gov (United States)

    Pan, Luyuan; Shah, Arish N; Phelps, Ian G; Doherty, Dan; Johnson, Eric A; Moens, Cecilia B

    2015-02-14

    Targeting Induced Local Lesions IN Genomes (TILLING) is a reverse genetics approach to directly identify point mutations in specific genes of interest in genomic DNA from a large chemically mutagenized population. Classical TILLING processes, based on enzymatic detection of mutations in heteroduplex PCR amplicons, are slow and labor intensive. Here we describe a new TILLING strategy in zebrafish using direct next generation sequencing (NGS) of 250 bp amplicons followed by Paired-End Low-Error (PELE) sequence analysis. By pooling a genomic DNA library made from over 9,000 N-ethyl-N-nitrosourea (ENU) mutagenized F1 fish into 32 equal pools of 288 fish, each with a unique Illumina barcode, we reduce the complexity of the template to a level at which we can detect mutations that occur in a single heterozygous fish in the entire library. MiSeq sequencing generates 250 base-pair overlapping paired-end reads, and PELE analysis aligns the overlapping sequences to each other and filters out any imperfect matches, thereby eliminating variants introduced during the sequencing process. We find that this filtering step reduces the number of false positive calls 50-fold without loss of true variant calls. After PELE we were able to validate 61.5% of the mutant calls that occurred at a frequency between 1 mutant call:100 wildtype calls and 1 mutant call:1000 wildtype calls in a pool of 288 fish. We then use high-resolution melt analysis to identify the single heterozygous mutation carrier in the 288-fish pool in which the mutation was identified. Using this NGS-TILLING protocol we validated 28 nonsense or splice site mutations in 20 genes, at a two-fold higher efficiency than using traditional Cel1 screening. We conclude that this approach significantly increases screening efficiency and accuracy at reduced cost and can be applied in a wide range of organisms.
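
    The filtering idea described above (overlapping mates must agree exactly, otherwise the pair is discarded) can be sketched as follows. This is a minimal illustration, not the authors' PELE implementation: it assumes the two reads of a pair cover the same amplicon end to end, so the reverse complement of read 2 should equal read 1, and the function names are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def filter_perfect_pairs(pairs):
    """Keep read 1 of every pair whose mate agrees with it at every base.

    Assumption: both reads span the whole amplicon, so a sequencing error
    in either read shows up as a mismatch between the two.
    """
    return [r1 for r1, r2 in pairs if r1 == reverse_complement(r2)]


# Pooling arithmetic from the abstract: 32 pools of 288 diploid fish.
POOLS, FISH_PER_POOL = 32, 288
total_fish = POOLS * FISH_PER_POOL            # 9216 fish in the library
alleles_per_pool = 2 * FISH_PER_POOL          # 576 alleles per pool
single_carrier_fraction = 1 / alleles_per_pool  # one heterozygous carrier

pairs = [("AACGT", "ACGTT"),   # mates agree: kept
         ("AACGT", "ACGTA")]   # mates disagree: discarded
print(filter_perfect_pairs(pairs), total_fish, round(single_carrier_fraction, 4))
```

    The last lines show why error suppression matters here: a single heterozygous carrier contributes only 1 of the 576 alleles in its pool.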

  5. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    Directory of Open Access Journals (Sweden)

    Quail Michael A

    2012-07-01

    Full Text Available Background: Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results: Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions: All three fast turnaround sequencers evaluated here were able to generate usable sequence. However, there are key differences between the quality of that data and the applications it will support.
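
    GC content, the quantity used above to characterise the test genomes, is simply the share of G and C bases among the called bases. The sketch below computes it for a whole sequence and for fixed-size windows, which is one common way to relate coverage to local base composition; the function names and the choice of non-overlapping windows are assumptions made for illustration, not taken from the study.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0  # no unambiguous bases, so GC content is undefined; report 0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def windowed_gc(seq: str, window: int) -> list:
    """GC percentage of consecutive, non-overlapping windows of a sequence."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


print(gc_percent("ATGC"))            # whole-sequence GC content
print(windowed_gc("ATATGCGC", 4))    # AT-rich window followed by GC-rich window
```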

  6. Rapid evaluation and quality control of next generation sequencing data with FaQCs.

    Science.gov (United States)

    Lo, Chien-Chi; Chain, Patrick S G

    2014-11-19

    Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
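
    Because read quality often declines toward the end of a read, a typical quality-control step trims low-quality bases from the 3' end. The sketch below shows one simple way to do this; it is not the FaQCs algorithm. It assumes Phred+33 encoded FASTQ quality strings and a hypothetical threshold `min_q`, and the function names are introduced here for illustration.

```python
def phred_scores(quality: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(char) - offset for char in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 20, offset: int = 33):
    """Remove trailing bases whose Phred score is below min_q.

    Returns the trimmed sequence and the matching trimmed quality string.
    """
    scores = phred_scores(quality, offset)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
print(trim_3prime("ACGTAC", "IIII##"))
```

    Real tools usually combine such trimming with length and ambiguity filters; this sketch covers only the trimming step.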

  7. A novel ABCD1 mutation detected by next generation sequencing in presumed hereditary spastic paraplegia: A 30-year diagnostic delay caused by misleading biochemical findings.

    Science.gov (United States)

    Koutsis, Georgios; Lynch, David S; Tucci, Arianna; Houlden, Henry; Karadima, Georgia; Panas, Marios

    2015-08-15

    To present a Greek family in which 5 male and 2 female members developed progressive spastic paraplegia. Plasma very long chain fatty acids (VLCFA) were reportedly normal at first testing in an affected male and for over 30 years the presumed diagnosis was hereditary spastic paraplegia (HSP). Targeted next generation sequencing (NGS) was used as a further diagnostic tool. Targeted exome sequencing in the proband, followed by Sanger sequencing confirmation; mutation segregation testing in multiple family members and plasma VLCFA measurement in the proband. NGS of the proband revealed a novel frameshift mutation in ABCD1 (c.1174_1178del, p.Leu392Serfs*7), bringing an end to diagnostic uncertainty by establishing the diagnosis of adrenomyeloneuropathy (AMN), the myelopathic phenotype of X-linked adrenoleukodystrophy (ALD). The mutation segregated in all family members and the diagnosis of AMN/ALD was confirmed by plasma VLCFA measurement. Confounding factors that delayed the diagnosis are presented. This report highlights the diagnostic utility of NGS in patients with undiagnosed spastic paraplegia, establishing a molecular diagnosis of AMN, allowing proper genetic counseling and management, and overcoming the diagnostic delay that can be rarely caused by false negative VLCFA analysis.

  8. Identification of HIV Mutation as Diagnostic Biomarker through Next Generation Sequencing.

    Science.gov (United States)

    Shaw, Wen Hui; Lin, Qianqian; Muhammad, Zikry Zhiwei Bin Roslee; Lee, Jia Jun; Khong, Wei Xin; Ng, Oon Tek; Tan, Eng Lee; Li, Peng

    2016-07-01

    Current clinical detection of Human immunodeficiency virus 1 (HIV-1) is used to target viral genes and proteins. However, the immunoassay, such as viral culture or Polymerase Chain Reaction (PCR), lacks accuracy in the diagnosis, as these conventional assays rely on the stable genome and HIV-1 is a highly-mutated virus. Next generation sequencing (NGS) promises to be transformative for the practice of infectious disease, and the rapidly reducing cost and processing time mean that this will become a feasible technology in diagnostic and research laboratories in the near future. The technology offers superior sensitivity to detect pathogenic viruses, including unknown and unexpected strains. To leverage the NGS technology in order to improve current HIV-1 diagnosis and genotyping methods. Ten blood samples were collected from HIV-1 infected patients who were diagnosed by RT PCR at Singapore Communicable Disease Centre, Tan Tock Seng Hospital from October 2014 to March 2015. Viral RNAs were extracted from blood plasma and reverse-transcribed into cDNA. The HIV-1 cDNA samples were cleaned up using a PCR purification kit and the sequencing library was prepared and identified through MiSeq. Two common mutations were observed in all ten samples. The common mutations were identified at genome locations 1908 and 2104 as missense and silent mutations respectively, conferring S37N and S3S found on aspartic protease and reverse transcriptase subunits. The common mutations identified in this study were not previously reported, therefore suggesting the potential for them to be used for identification of viral infection, disease transmission and drug resistance. This was especially the case for the missense mutation S37N, which could cause an amino acid change in viral proteases thus reducing the binding affinity of some protease inhibitors. Thus, the unique common mutations identified in this study could be used as diagnostic biomarkers to indicate the origin of infection as being

  9. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    Science.gov (United States)

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose: Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods: To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results: Of the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions: We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as a causative mutation for FEVR. These genetic deletion variations exhibit a severe form of FEVR, with tractional retinal detachments compared with other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  10. Molecular diagnostics for congenital hearing loss including 15 deafness genes using a next generation sequencing platform

    Directory of Open Access Journals (Sweden)

    De Keulenaer Sarah

    2012-05-01

    Full Text Available Background: Hereditary hearing loss (HL) can originate from mutations in one of many genes involved in the complex process of hearing. Identification of the genetic defects in patients is currently labor intensive and expensive. While screening with Sanger sequencing for GJB2 mutations is common, this is not the case for the other known deafness genes (> 60). Next generation sequencing technology (NGS) has the potential to be much more cost efficient. Published methods mainly use hybridization based target enrichment procedures that are time saving and efficient, but lead to loss in sensitivity. In this study we used a semi-automated PCR amplification and NGS in order to combine high sensitivity, speed and cost efficiency. Results: In this proof of concept study, we screened 15 autosomal recessive deafness genes in 5 patients with congenital genetic deafness. 646 specific primer pairs for all exons and most of the UTR of the 15 selected genes were designed using primerXL. Using patient specific identifiers, all amplicons were pooled and analyzed using the Roche 454 NGS technology. Three of these patients are members of families in which a region of interest has previously been characterized by linkage studies. In these, we were able to identify two new mutations in CDH23 and OTOF. For another patient, the etiology of deafness was unclear, and no causal mutation was found. In a fifth patient, included as a positive control, we could confirm a known mutation in TMC1. Conclusions: We have developed an assay that holds great promise as a tool for screening patients with familial autosomal recessive nonsyndromal hearing loss (ARNSHL). For the first time, an efficient, reliable and cost effective genetic test, based on PCR enrichment, for newborns with undiagnosed deafness is available.

  11. Development of a Targeted Next-Generation Sequencing Assay to Detect Diagnostically Relevant Mutations of JAK2, CALR, and MPL in Myeloproliferative Neoplasms.

    Science.gov (United States)

    Frawley, Thomas; O'Brien, Cathal P; Conneally, Eibhlin; Vandenberghe, Elisabeth; Percy, Melanie; Langabeer, Stephen E; Haslam, Karl

    2018-02-01

    The classical Philadelphia chromosome-negative myeloproliferative neoplasms (MPNs), consisting of polycythemia vera, essential thrombocythemia, and primary myelofibrosis, are a heterogeneous group of neoplasms that harbor driver mutations in the JAK2, CALR, and MPL genes. The detection of mutations in these genes has been incorporated into the recent World Health Organization (WHO) diagnostic criteria for MPN. Given a pressing clinical need to screen for mutations in these genes in a routine diagnostic setting, a targeted next-generation sequencing (NGS) assay for the detection of MPN-associated mutations located in JAK2 exon 14, JAK2 exon 12, CALR exon 9, and MPL exon 10 was developed to provide a single platform alternative to reflexive, stepwise diagnostic algorithms. Polymerase chain reaction (PCR) primers were designed to target mutation hotspots in JAK2 exon 14, JAK2 exon 12, MPL exon 10, and CALR exon 9. Multiplexed PCR conditions were optimized by using qualitative PCR followed by NGS. Diagnostic genomic DNA from 35 MPN patients, known to harbor driver mutations in one of the target genes, was used to validate the assay. One hundred percent concordance was observed between the previously-identified mutations and those detected by NGS, with no false positives, nor any known mutations missed (specificity = 100%, CI = 0.96, sensitivity = 100%, CI = 0.89). Improved resolution of mutation sequences was also revealed by NGS analysis. Detection of diagnostically relevant driver mutations of MPN is enhanced by employing a targeted multiplex NGS approach. This assay presents a robust solution to classical MPN mutation screening, providing an alternative to time-consuming sequential analyses.

  12. The contribution of next generation sequencing to epilepsy genetics

    DEFF Research Database (Denmark)

    Møller, Rikke S.; Dahl, Hans A.; Helbig, Ingo

    2015-01-01

    During the last decade, next generation sequencing technologies such as targeted gene panels, whole exome sequencing and whole genome sequencing have led to an explosion of gene identifications in monogenic epilepsies including both familial epilepsies and severe epilepsies, often referred to as ...

  13. Introduction of the Python script STRinNGS for analysis of STR regions in FASTQ or BAM files and expansion of the Danish STR sequence database to 11 STRs

    DEFF Research Database (Denmark)

    Friis, Susanne L; Buchard, Anders; Rockenbauer, Eszter

    2016-01-01

    This work introduces the in-house developed Python application STRinNGS for analysis of STR sequence elements in BAM or FASTQ files. STRinNGS identifies sequence reads with STR loci by their flanking sequences, analyses the STR sequence and the flanking regions, and generates a report with the...
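    The idea of locating an STR by its flanking sequences can be sketched as follows. This is a minimal illustration, not the STRinNGS implementation: the function names, the exact-match flank search, and the toy read and flanks are all assumptions.

```python
import re


def extract_str_region(read, left_flank, right_flank):
    """Return the sequence between the two flanks, or None if a flank is missing."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


def count_repeats(region, motif):
    """Number of repeat units in the longest run of consecutive motif copies."""
    runs = re.findall(f"(?:{re.escape(motif)})+", region)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical read: two made-up flanks around five GATA repeats.
read = "ACGTTGCA" + "GATA" * 5 + "TTCAGGAC"
region = extract_str_region(read, "ACGTTGCA", "TTCAGGAC")
repeats = count_repeats(region, "GATA")
print(region, repeats)
```

    A real tool would also need to tolerate sequencing errors in the flanks and handle reads from the reverse strand, which this sketch ignores.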

  14. Use of the LUS in sequence allele designations to facilitate probabilistic genotyping of NGS-based STR typing results.

    Science.gov (United States)

    Just, Rebecca S; Irwin, Jodi A

    2018-05-01

    Some of the expected advantages of next generation sequencing (NGS) for short tandem repeat (STR) typing include enhanced mixture detection and genotype resolution via sequence variation among non-homologous alleles of the same length. However, at the same time that NGS methods for forensic DNA typing have advanced in recent years, many caseworking laboratories have implemented or are transitioning to probabilistic genotyping to assist the interpretation of complex autosomal STR typing results. Current probabilistic software programs are designed for length-based data, and were not intended to accommodate sequence strings as the product input. Yet to leverage the benefits of NGS for enhanced genotyping and mixture deconvolution, the sequence variation among same-length products must be utilized in some form. Here, we propose use of the longest uninterrupted stretch (LUS) in allele designations as a simple method to represent sequence variation within the STR repeat regions and facilitate, in the near term, probabilistic interpretation of NGS-based typing results. An examination of published population data indicated that a reference LUS region is straightforward to define for most autosomal STR loci, and that using repeat unit plus LUS length as the allele designator can represent greater than 80% of the alleles detected by sequencing. A proof of concept study performed using a freely available probabilistic software demonstrated that the LUS length can be used in allele designations when a program does not require alleles to be integers, and that utilizing sequence information improves interpretation of both single-source and mixed contributor STR typing results as compared to using repeat unit information alone. The LUS concept for allele designation maintains the repeat-based allele nomenclature that will permit backward compatibility to extant STR databases, and the LUS lengths themselves will be concordant regardless of the NGS assay or analysis tools
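    As a rough illustration of the LUS idea, the sketch below finds the longest run of consecutive copies of a motif and combines it with the repeat-unit count. The function names, the toy sequence, and the underscore-separated designation format are assumptions made for this example and are not taken from the paper.

```python
def longest_uninterrupted_stretch(sequence, motif):
    """Length, in repeat units, of the longest run of back-to-back motif copies."""
    best = 0
    step = len(motif)
    for i in range(len(sequence)):
        run = 0
        j = i
        # Count consecutive motif copies starting at position i.
        while sequence.startswith(motif, j):
            run += 1
            j += step
        best = max(best, run)
    return best


def designate(repeat_units, lus_length):
    """Hypothetical designation format: repeat-unit count, underscore, LUS length."""
    return f"{repeat_units}_{lus_length}"


# Toy allele: 12 repeat units in total, with one TCTG unit interrupting the TCTA run.
allele_seq = "TCTA" * 2 + "TCTG" + "TCTA" * 9
lus = longest_uninterrupted_stretch(allele_seq, "TCTA")
label = designate(len(allele_seq) // 4, lus)
print(label)
```

    Two alleles with the same length but the interruption in a different position would get different LUS values, which is how the designation carries sequence information that a length-only call loses.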

  15. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper.

    Science.gov (United States)

    Manivannan, Abinaya; Kim, Jin-Hee; Yang, Eun-Young; Ahn, Yul-Kyun; Lee, Eun-Su; Choi, Sena; Kim, Do-Sun

    2018-01-01

    Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has transformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, the single nucleotide polymorphism (SNP) offers various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in the future.

  16. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper

    Directory of Open Access Journals (Sweden)

    Abinaya Manivannan

    2018-01-01

    Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has transformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, the single nucleotide polymorphism (SNP) offers various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in the future.

  17. Reconstruction of mitogenomes by NGS and phylogenetic implications for leaf beetles.

    Science.gov (United States)

    Song, Nan; Yin, Xinming; Zhao, Xincheng; Chen, Junhua; Yin, Jian

    2017-11-30

    Mitochondrial genome (mitogenome) sequences are frequently used to infer phylogenetic relationships of insects at different taxonomic levels. Next-generation sequencing (NGS) techniques are revolutionizing many fields of biology, and allow for acquisition of insect mitogenomes for large number of species simultaneously. In this study, 20 full or partial mitogenomes were sequenced from pooled genomic DNA samples by NGS for leaf beetles (Chrysomelidae). Combined with published mitogenome sequences, a higher level phylogeny of Chrysomelidae was reconstructed under maximum likelihood and Bayesian inference with different models and various data treatments. The results revealed support for a basal position of Bruchinae within Chrysomelidae. In addition, two major subfamily groupings were recovered: one including seven subfamilies, namely Donaciinae, Criocerinae, Spilopyrinae, Cassidinae, Cryptocephalinae, Chlamisinae and Eumolpinae, another containing a non-monophyletic Chrysomelinae and a monophyletic Galerucinae.

  18. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Directory of Open Access Journals (Sweden)

    Kei-ichi Morita

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably due to mutations occurring in other causative genes or outside the analyzed regions of PTCH1, or to copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  19. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Science.gov (United States)

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably due to mutations occurring in other causative genes or outside the analyzed regions of PTCH1, or to copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  20. The Impact of Next-Generation Sequencing on the Diagnosis and Treatment of Epilepsy in Paediatric Patients.

    Science.gov (United States)

    Mei, Davide; Parrini, Elena; Marini, Carla; Guerrini, Renzo

    2017-08-01

    Next-generation sequencing (NGS) has contributed to the identification of many monogenic epilepsy syndromes and is favouring earlier and more accurate diagnosis in a subset of paediatric patients with epilepsy. The cumulative information emerging from NGS studies is rapidly changing our comprehension of the relations between early-onset severe epilepsy and the associated neurological impairment, progressively delineating specific entities previously gathered under the umbrella definition of epileptic encephalopathies, thereby influencing treatment choices and limiting the most aggressive drug regimens only to those conditions that are likely to actually benefit from them. Although ion channel genes represent the gene family most frequently causally related to epilepsy, other genes have gradually been associated with complex developmental epilepsy conditions, revealing the pathogenic role of mutations affecting diverse molecular pathways that regulate membrane excitability, synaptic plasticity, presynaptic neurotransmitter release, postsynaptic receptors, transporters, cell metabolism, and many formative steps in early brain development. Some of these discoveries are being followed by proof-of-concept laboratory studies that might open new pathways towards personalized treatment choices. No specific treatment is available for most of the monogenic disorders that can now be diagnosed early using NGS, and the main benefits of knowing the specific cause include etiological diagnosis, better prognostication and genetic counselling; however, for a limited number of disorders, timely treatment based on their known molecular pathology is already possible and sometimes decisive. Discovery of a causative gene defect associated with a non-progressive course may reduce the need for further diagnostic investigations in the search for a progressive disorder at the biochemical and imaging level. NGS has also improved the turnaround time for molecular diagnosis and allowed more

  1. MicroRNA and piRNA profiles in normal human testis detected by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Qingling Yang

    BACKGROUND: MicroRNAs (miRNAs) are a class of small endogenous RNAs that play an important regulatory role in cells by negatively affecting gene expression at transcriptional and post-transcriptional levels. There have been extensive studies aiming to discover miRNAs and to analyze their functions in the cells from a variety of species. However, there are no published studies of miRNA profiles in human testis using next generation sequencing (NGS) technology. RESULTS: We employed Solexa sequencing technology to profile miRNAs in normal human testis. A total of 770 known and 5 novel human miRNAs, and 20121 piRNAs, were detected, indicating that the human testis has a complex population of small RNAs. The expression of 15 known and 5 novel detected miRNAs was validated by qRT-PCR. We have also predicted the potential target genes of the abundant known and novel miRNAs, and subjected them to GO and pathway analysis, revealing the involvement of miRNAs in many important biological phenomena, including meiosis and p53-related pathways that are implicated in the regulation of spermatogenesis. CONCLUSIONS: This study reports the first genome-wide miRNA profiles in human testis using an NGS approach. The presence of a large number of miRNAs and the nature of their target genes suggest that miRNAs play important roles in spermatogenesis. Here we provide a useful resource for further elucidation of the regulatory role of miRNAs and piRNAs in spermatogenesis. It may also facilitate the development of prophylactic strategies for male infertility.

  2. Targeted next generation sequencing identified a novel mutation in MYO7A causing Usher syndrome type 1 in an Iranian consanguineous pedigree.

    Science.gov (United States)

    Kooshavar, Daniz; Razipour, Masoumeh; Movasat, Morteza; Keramatipour, Mohammad

    2018-01-01

    Usher syndrome (USH) is characterized by congenital hearing loss and retinitis pigmentosa (RP) with a later onset. It is an autosomal recessive trait with clinical and genetic heterogeneity, which makes the molecular diagnosis difficult. In this study, we introduce a pedigree with two affected members with USH type 1 and present a cost- and time-effective approach for genetic diagnosis of USH as a genetically heterogeneous disorder. Target region capture in the genes of interest, followed by next generation sequencing (NGS), was used to determine the causative mutations in one of the probands. Then segregation analysis in the pedigree was conducted using PCR-Sanger sequencing. Targeted NGS detected a novel homozygous nonsense variant c.4513G > T (p.Glu1505Ter) in MYO7A. The variant is segregating in the pedigree with an autosomal recessive pattern. In this study, a novel stop gained variant c.4513G > T (p.Glu1505Ter) in MYO7A was found in an Iranian pedigree with two affected members with USH type 1. Bioinformatic as well as pedigree segregation analyses were in line with the pathogenic nature of this variant. The targeted NGS panel was shown to be an efficient method for mutation detection in hereditary disorders with locus heterogeneity.

  3. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation rates that previously were obtained only from extrapolations of results from in vitro kinetic experiments performed over short timescales. For example, recent next-generation sequencing of ancient DNA reveals purine bases as one of the main targets of postmortem hydrolytic damage, through base elimination and strand breakage. It also shows substantially increased rates of DNA base-loss at guanosine. In this review, we argue that the latter results from an electron resonance structure unique to guanosine rather than adenosine having an extra resonance structure over guanosine as previously suggested.

  4. Next-generation sequencing for molecular diagnosis of lung adenocarcinoma specimens obtained by fine needle aspiration cytology

    Science.gov (United States)

    Qiu, Tian; Guo, Huiqin; Zhao, Huan; Wang, Luhua; Zhang, Zhihui

    2015-06-01

    Identification of multi-gene variations has led to the development of new targeted therapies in lung adenocarcinoma patients, and identification of an appropriate patient population with a reliable screening method is the key to the overall success of tumor targeted therapies. In this study, we used the Ion Torrent next-generation sequencing (NGS) technique to screen for mutations in 89 cases of lung adenocarcinoma metastatic lymph node specimens obtained by fine-needle aspiration cytology (FNAC). Of the 89 specimens, 30 (34%) were found to harbor epidermal growth factor receptor (EGFR) kinase domain mutations. Seven (8%) samples harbored KRAS mutations, and three (3%) samples had BRAF mutations involving exon 11 (G469A) and exon 15 (V600E). Eight (9%) samples harbored PIK3CA mutations. One (1%) sample had an HRAS G12C mutation. Thirty-two (36%) samples harbored TP53 mutations. Mutations in other genes, including APC, ATM, MET, PTPN11, GNAS, HRAS, RB1, SMAD4 and STK11, were each found in one case. Our study has demonstrated that NGS using the Ion Torrent technology is a useful tool for gene mutation screening in lung adenocarcinoma metastatic lymph node specimens obtained by FNAC, and may promote the development of new targeted therapies in lung adenocarcinoma patients.

  5. Identification of a Novel Heterozygous Missense Mutation in the CACNA1F Gene in a Chinese Family with Retinitis Pigmentosa by Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Qi Zhou

    2015-01-01

    Background. Retinitis pigmentosa (RP) is an inherited retinal degenerative disease, which is clinically and genetically heterogeneous, and the inheritance pattern is complex. In this study, we aimed to study the possible association of certain genes with X-linked RP (XLRP) in a Chinese family. Methods. A Chinese family with RP was recruited, and a total of seven individuals were enrolled in this genetic study. Genomic DNA was isolated from peripheral leukocytes and used for next generation sequencing (NGS). Results. The affected individual presented the clinical signs of XLRP. A heterozygous missense mutation (c.1555C>T, p.R519W) was identified by NGS in exon 13 of the CACNA1F gene on the X chromosome, and was confirmed by Sanger sequencing. It showed perfect cosegregation with the disease in the family. The mutation at this position in the CACNA1F gene was found to be novel in RP by database searching. Conclusion. By using NGS, we have found a novel heterozygous missense mutation (c.1555C>T, p.R519W) in the CACNA1F gene, which is probably associated with XLRP. The findings might provide new insights into the cause and diagnosis of RP, and have implications for genetic counseling and clinical management in this family.

  6. A targeted next-generation sequencing assay for the molecular diagnosis of genetic disorders with orodental involvement

    Science.gov (United States)

    Prasad, Megana K; Geoffroy, Véronique; Vicaire, Serge; Jost, Bernard; Dumas, Michael; Le Gras, Stéphanie; Switala, Marzena; Gasse, Barbara; Laugel-Haushalter, Virginie; Paschaki, Marie; Leheup, Bruno; Droz, Dominique; Dalstein, Amelie; Loing, Adeline; Grollemund, Bruno; Muller-Bolla, Michèle; Lopez-Cazaux, Séréna; Minoux, Maryline; Jung, Sophie; Obry, Frédéric; Vogt, Vincent; Davideau, Jean-Luc; Davit-Beal, Tiphaine; Kaiser, Anne-Sophie; Moog, Ute; Richard, Béatrice; Morrier, Jean-Jacques; Duprez, Jean-Pierre; Odent, Sylvie; Bailleul-Forestier, Isabelle; Rousset, Monique Marie; Merametdijan, Laure; Toutain, Annick; Joseph, Clara; Giuliano, Fabienne; Dahlet, Jean-Christophe; Courval, Aymeric; El Alloussi, Mustapha; Laouina, Samir; Soskin, Sylvie; Guffon, Nathalie; Dieux, Anne; Doray, Bérénice; Feierabend, Stephanie; Ginglinger, Emmanuelle; Fournier, Benjamin; de la Dure Molla, Muriel; Alembik, Yves; Tardieu, Corinne; Clauss, François; Berdal, Ariane; Stoetzel, Corinne; Manière, Marie Cécile; Dollfus, Hélène; Bloch-Zupan, Agnès

    2016-01-01

    Background: Orodental diseases include several clinically and genetically heterogeneous disorders that can present in isolation or as part of a genetic syndrome. Due to the vast number of genes implicated in these disorders, establishing a molecular diagnosis can be challenging. We aimed to develop a targeted next-generation sequencing (NGS) assay to diagnose mutations and potentially identify novel genes mutated in this group of disorders. Methods: We designed an NGS gene panel that targets 585 known and candidate genes in orodental disease. We screened a cohort of 101 unrelated patients without a molecular diagnosis referred to the Reference Centre for Oro-Dental Manifestations of Rare Diseases, Strasbourg, France, for a variety of orodental disorders including isolated and syndromic amelogenesis imperfecta (AI), isolated and syndromic selective tooth agenesis (STHAG), isolated and syndromic dentinogenesis imperfecta, isolated dentin dysplasia, otodental dysplasia and primary failure of tooth eruption. Results: We discovered 21 novel pathogenic variants and identified the causative mutation in 39 unrelated patients in known genes (overall diagnostic rate: 39%). Among the largest subcohorts of patients with isolated AI (50 unrelated patients) and isolated STHAG (21 unrelated patients), we had a definitive diagnosis in 14 (27%) and 15 cases (71%), respectively. Surprisingly, COL17A1 mutations accounted for the majority of autosomal-dominant AI cases. Conclusions: We have developed a novel targeted NGS assay for the efficient molecular diagnosis of a wide variety of orodental diseases. Furthermore, our panel will contribute to better understanding the contribution of these genes to orodental disease. Trial registration numbers: NCT01746121 and NCT02397824. PMID:26502894

  7. A Comparison of Morphological Taxonomy and Next Generation DNA Sequencing for the Assessment of Zooplankton Diversity

    Science.gov (United States)

    Harvey, J.; Fisher, J. L.; Johnson, S.; Morgan, S.; Peterson, W. T.; Satterthwaite, E. V.; Vrijenhoek, R. C.

    2016-02-01

    Our ability to accurately characterize the diversity of planktonic organisms is affected by both the methods we use to collect water samples and our approaches to assessing sample contents. Plankton nets collect organisms from high volumes of water, but integrate sample contents along the net's path. In contrast, plankton pumps collect water from discrete depths. Autonomous underwater vehicles (AUVs) can collect water samples with pinpoint accuracy from physical features such as upwelling fronts or biological features such as phytoplankton blooms, but sample volumes are necessarily much smaller than those possible with nets. Characterization of plankton diversity and abundances in water samples may also vary with the assessment method we apply. Morphological taxonomy provides visual identification and enumeration of organisms via microscopy, but is labor intensive. Next generation DNA sequencing (NGS) shows great promise for assessing plankton diversity in water samples, but accurate assessment of relative abundances may not be possible in all cases. Comparison of morphological taxonomy to molecular approaches is necessary to identify areas of overlap and also areas of disagreement between these methods. We have compared morphological taxonomic assessments to mitochondrial COI and nuclear 28S ribosomal RNA NGS results for plankton net samples collected in Monterey Bay, California. We have made a similar comparison for plankton pump samples, and have also applied our NGS methods to targeted, small volume water samples collected by an AUV. Our goal is to communicate current results and lessons learned regarding application of traditional taxonomy and novel molecular approaches to the study of plankton diversity in spatially and temporally variable coastal marine environments.

  8. Successful preimplantation genetic diagnosis by targeted next-generation sequencing on an ion torrent personal genome machine platform.

    Science.gov (United States)

    Hao, Yan; Chen, Dawei; Zhang, Zhiguo; Zhou, Ping; Cao, Yunxia; Wei, Zhaolian; Xu, Xiaofeng; Chen, Beili; Zou, Weiwei; Lv, Mingrong; Ji, Dongmei; He, Xiaojin

    2018-04-01

    Hearing loss may place a heavy burden on the patient and patient's family. Given the high incidence of hearing loss among newborns and the huge cost of treatment and care (including cochlear implantation), prenatal diagnosis is strongly recommended. Termination of the fetus may be considered as an extreme outcome to the discovery of a potential deaf fetus, and therefore preimplantation genetic diagnosis has become an important option for avoiding the birth of affected children without facing the risk of abortion following prenatal diagnosis. In one case, a couple had a 7-year-old daughter affected by non-syndromic sensorineural hearing loss. The affected fetus carried a causative compound heterozygous mutation c.919-2 A>G (IVS7-2 A>G) and c.1707+5 G>A (IVS15+5 G>A) of the solute carrier family 26 member 4 gene inherited from maternal and paternal sides, respectively. The present study applied multiple displacement amplification for whole genome amplification of biopsied trophectoderm cells and next-generation sequencing (NGS)-based single nucleotide polymorphism haplotyping on an Ion Torrent Personal Genome Machine. One unaffected embryo was transferred in a frozen-thawed embryo transfer cycle and the patient was impregnated. To conclude, to the best of our knowledge, this may be the first report of NGS-based preimplantation genetic diagnosis (PGD) for non-syndromic hearing loss caused by a compound heterozygous mutation using an Ion Torrent Personal Genome Machine. NGS provides unprecedented high-throughput, highly parallel and base-pair resolution data for genetic analysis. The method meets the requirements of medium-sized diagnostics laboratories. With decreased costs compared with previous techniques (such as Sanger sequencing), this technique may have potential widespread clinical application in PGD of other types of monogenic disease.

  9. Application of a molecular diagnostic algorithm for haemophilia A and B using next-generation sequencing of entire F8, F9 and VWF genes.

    Science.gov (United States)

    Bastida, Jose Maria; González-Porras, Jose Ramon; Jiménez, Cristina; Benito, Rocio; Ordoñez, Gonzalo R; Álvarez-Román, Maria Teresa; Fontecha, M Elena; Janusz, Kamila; Castillo, David; Fisac, Rosa María; García-Frade, Luis Javier; Aguilar, Carlos; Martínez, María Paz; Bermejo, Nuria; Herrero, Sonia; Balanzategui, Ana; Martin-Antorán, Jose Manuel; Ramos, Rafael; Cebeiro, Maria Jose; Pardal, Emilia; Aguilera, Carmen; Pérez-Gutierrez, Belen; Prieto, Manuel; Riesco, Susana; Mendoza, Maria Carmen; Benito, Ana; Hortal Benito-Sendin, Ana; Jiménez-Yuste, Víctor; Hernández-Rivas, Jesus Maria; García-Sanz, Ramon; González-Díaz, Marcos; Sarasquete, Maria Eugenia

    2017-01-05

    Currently, molecular diagnosis of haemophilia A and B (HA and HB) highlights the excess risk of inhibitor development associated with specific mutations, and enables carrier testing of female relatives and prenatal or preimplantation genetic diagnosis. Molecular testing for HA also helps distinguish it from von Willebrand disease (VWD). Next-generation sequencing (NGS) allows simultaneous investigation of several complete genes, even though they may span very extensive regions. This study aimed to evaluate the usefulness of a molecular algorithm employing an NGS approach for sequencing the complete F8, F9 and VWF genes. The proposed algorithm includes the detection of inversions of introns 1 and 22, an NGS custom panel (the entire F8, F9 and VWF genes), and multiplex ligation-dependent probe amplification (MLPA) analysis. A total of 102 samples (97 FVIII- and FIX-deficient patients, and five female carriers) were studied. IVS-22 screening identified 11 out of 20 severe HA patients and one female carrier. IVS-1 analysis did not reveal any alterations. The NGS approach gave positive results in 88 cases, allowing the differential diagnosis of mild/moderate HA and VWD in eight cases. MLPA confirmed one large exon deletion. Only one case had no pathogenic variants. The proposed algorithm had an overall success rate of 99%. In conclusion, our evaluation demonstrates that this algorithm can reliably identify pathogenic variants and diagnose patients with HA, HB or VWD.

  10. BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

    Science.gov (United States)

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to get the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  11. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Pablo Pareja-Tobes

    Full Text Available BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the stage of preliminary genome assembly. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC outbreak in Germany, in which BG7 was used to obtain the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Moreover, thanks to its plasticity, the system could easily be adapted to work with new technologies in the future.

  12. A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity.

    Directory of Open Access Journals (Sweden)

    Yu-Nong Gong

    Full Text Available Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5-6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conducted for each group of mixed samples, and the proposed data analysis pipeline was used to identify viral pathogens in these mixed samples. Polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA) was individually conducted for each of these 42 isolates depending on the predicted viral types in each group. Two isolates remained unknown after these tests. Iteration mapping was then implemented for each of these 2 isolates, and predicted human parechovirus (HPeV) in both. In summary, our NGS pipeline detected the following viruses among the 42 isolates: 29 human rhinoviruses (HRVs), 10 HPeVs, 1 human adenovirus (HAdV), 1 echovirus and 1 rotavirus. We then focused on the 10 identified Taiwanese HPeVs because of their reported clinical significance over HRVs. Their genomes were assembled and their genetic diversity was explored. One novel 6-bp deletion was found in one HPeV-1 virus. In terms of nucleotide heterogeneity, 64 genetic variants were detected from these HPeVs using the mapped NGS reads. Most importantly, a recombination event was found between our HPeV-3 and a known HPeV-4 strain in the database. A similar event was detected in the other HPeV-3 strains in the same clade of the phylogenetic tree. These findings demonstrated that the proposed NGS data analysis pipeline identified unknown viruses from the mixed clinical samples, revealed their genetic identity and variants, and characterized their genetic features in terms of viral evolution.

  13. Hereditary spastic paraplegia in Greece: characterisation of a previously unexplored population using next-generation sequencing.

    Science.gov (United States)

    Lynch, David S; Koutsis, Georgios; Tucci, Arianna; Panas, Marios; Baklou, Markella; Breza, Marianthi; Karadima, Georgia; Houlden, Henry

    2016-06-01

    Hereditary Spastic Paraplegia (HSP) is a syndrome characterised by lower limb spasticity, occurring alone or in association with other neurological manifestations, such as cognitive impairment, seizures, ataxia or neuropathy. HSP occurs worldwide, with different populations having different frequencies of causative genes. The Greek population has not yet been characterised. The purpose of this study was to describe the clinical presentation and molecular epidemiology of the largest cohort of HSP in Greece, comprising 54 patients from 40 families. We used a targeted next-generation sequencing (NGS) approach to genetically assess a proband from each family. We made a genetic diagnosis in >50% of cases and identified 11 novel variants. Variants in SPAST and KIF5A were the most common causes of autosomal dominant HSP, whereas SPG11 and CYP7B1 were the most common cause of autosomal recessive HSP. We identified a novel variant in SPG11, which led to disease with later onset and may be unique to the Greek population and report the first nonsense mutation in KIF5A. Interestingly, the frequency of HSP mutations in the Greek population, which is relatively isolated, was very similar to other European populations. We confirm that NGS approaches are an efficient diagnostic tool and should be employed early in the assessment of HSP patients.

  14. Standardization and quality management in next-generation sequencing.

    Science.gov (United States)

    Endrullat, Christoph; Glökler, Jörn; Franke, Philipp; Frohme, Marcus

    2016-09-01

    DNA sequencing continues to evolve quickly even after > 30 years. Many new platforms suddenly appeared and former established systems have vanished in almost the same manner. Since establishment of next-generation sequencing devices, this progress gains momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. In consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we listed and summarized current standardization efforts and quality management initiatives from companies, organizations and societies in form of published studies and ongoing projects. These comprise on the one hand quality documentation issues like technical notes, accreditation checklists and guidelines for validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing is crucial. Additionally, these efforts will exert a decisive influence on traceability and reproducibility of sequence data.

  15. Detecting novel genetic mutations in Chinese Usher syndrome families using next-generation sequencing technology.

    Science.gov (United States)

    Qu, Ling-Hui; Jin, Xin; Xu, Hai-Wei; Li, Shi-Ying; Yin, Zheng-Qin

    2015-02-01

    Usher syndrome (USH) is the most common cause of combined blindness and deafness inherited in an autosomal recessive mode. Molecular diagnosis is of great significance in revealing the molecular pathogenesis and aiding the clinical diagnosis of this disease. However, molecular diagnosis remains a challenge due to high phenotypic and genetic heterogeneity in USH. This study explored an approach for detecting disease-causing genetic mutations in candidate genes in five index cases from unrelated USH families based on targeted next-generation sequencing (NGS) technology. Through systematic data analysis using an established bioinformatics pipeline and segregation analysis, 10 pathogenic mutations in the USH disease genes were identified in the five USH families. Six of these mutations were novel: c.4398G > A and EX38-49del in MYO7A, c.988_989delAT in USH1C, c.15104_15105delCA and c.6875_6876insG in USH2A. All novel variations segregated with the disease phenotypes in their respective families and were absent from ethnically matched control individuals. This study expanded the mutation spectrum of USH and revealed the genotype-phenotype relationships of the novel USH mutations in Chinese patients. Moreover, this study proved that targeted NGS is an accurate and effective method for detecting genetic mutations related to USH. The identification of pathogenic mutations is of great significance for elucidating the underlying pathophysiology of USH.

  16. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data

    Science.gov (United States)

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and no existing SNP caller produces p-values for calling SNPs in a frequentist framework. To fill this gap, we develop a new method, MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the parameter involved is very close to the boundary of the parameter space, so standard large-sample theory is not suitable for evaluating the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of a point mass at zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters of the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control the false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201
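
    The abstract above says that multiple-testing corrected p-values can be used to control the FDR, but it does not name the correction. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not necessarily what MAFsnp implements), per-site p-values could be adjusted as follows. The function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as a common FDR-controlling
    procedure; the abstract does not state which correction MAFsnp uses.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-site p-values from a SNP test.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)

# Sites called as SNPs at a pre-specified FDR level of 5%.
called = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, called)
```

    Sites whose adjusted value falls at or below the chosen level are reported as SNPs; the level itself is a user choice.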

  17. Population clustering based on copy number variations detected from next generation sequencing data.

    Science.gov (United States)

    Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2014-08-01

    Copy number variations (CNVs) can be used as significant bio-markers, and next generation sequencing (NGS) provides high-resolution detection of these CNVs. However, how to extract features from CNVs and apply them to genomic studies such as population clustering has become a major challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into a source matrix and a weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulated data and two real data sets from the 1000 Genomes Project. The results on simulated data demonstrate that the proposed method can recover the true common CNVs with high quality. The results of the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results of the second real data analysis show that the proposed method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
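
    The decomposition described above can be illustrated with a small sketch. It assumes a samples-by-regions feature matrix and uses the standard multiplicative-update rules for NMF with a squared-error objective; the abstract does not say which NMF algorithm the authors used, and the toy matrix, the function names and the argmax labelling rule are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius distance between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v (samples x regions) into weights (samples x rank)
    and sources (rank x regions) using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # sources <- sources * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # weights <- weights * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical CNV feature matrix: 6 samples x 6 genomic regions.
features = [
    [2, 2, 2, 0, 0, 0],
    [2, 1, 2, 0, 0, 0],
    [1, 2, 2, 0, 0, 0],
    [0, 0, 0, 2, 2, 1],
    [0, 0, 0, 2, 1, 2],
    [0, 0, 0, 1, 2, 2],
]
weights, sources = nmf(features, rank=2)

# Assumed labelling rule: each sample joins the component with its largest weight.
labels = [row.index(max(row)) for row in weights]
print(labels)
```

    In this sketch each row of `sources` plays the role of a shared CNV pattern and each row of `weights` gives one sample's loading on those patterns. A real analysis would more likely rely on an established NMF implementation than on hand-written updates.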

  18. Application of High-Throughput Next-Generation Sequencing for HLA Typing on Buccal Extracted DNA: Results from over 10,000 Donor Recruitment Samples.

    Science.gov (United States)

    Yin, Yuxin; Lan, James H; Nguyen, David; Valenzuela, Nicole; Takemura, Ping; Bolon, Yung-Tsi; Springer, Brianna; Saito, Katsuyuki; Zheng, Ying; Hague, Tim; Pasztor, Agnes; Horvath, Gyorgy; Rigo, Krisztina; Reed, Elaine F; Zhang, Qiuheng

    2016-01-01

    Unambiguous HLA typing is important in hematopoietic stem cell transplantation (HSCT), HLA disease association studies, and solid organ transplantation. However, current molecular typing methods only interrogate the antigen recognition site (ARS) of HLA genes, resulting in many cis-trans ambiguities that require additional typing methods to resolve. Here we report high-resolution HLA typing of 10,063 National Marrow Donor Program (NMDP) registry donors using a long-range PCR next generation sequencing (NGS) approach on buccal swab DNA. Multiplex long-range PCR primers amplified the full length of HLA class I genes (A, B, C) from promoter to 3' UTR. Class II genes (DRB1, DQB1) were amplified from exon 2 through part of exon 4. PCR amplicons were pooled and sheared using Covaris fragmentation. Library preparation was performed using the Illumina TruSeq Nano kit on the Beckman FX automated platform. Each sample was tagged with a unique barcode, followed by 2×250 bp paired-end sequencing on the Illumina MiSeq. HLA typing was assigned using Omixon Twin software, which combines two independent computational algorithms to ensure high confidence in allele calling. Consensus sequence and typing results were reported in Histoimmunogenetics Markup Language (HML) format. All homozygous alleles were confirmed by Luminex SSO typing and exon novelties were confirmed by Sanger sequencing. Using this automated workflow, over 10,063 NMDP registry donors were successfully typed at high resolution by NGS. Despite known challenges of nucleic acid degradation and low DNA concentration commonly associated with buccal-based specimens, 97.8% of samples were successfully amplified using long-range PCR. Among these, 98.2% were successfully reported by NGS, with an accuracy rate of 99.84% in an independent blind Quality Control audit performed by the NMDP. In this study, NGS-HLA typing identified 23 null alleles (0.023%), 92 rare alleles (0.091%) and 42 exon novelties (0.042%). Long
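
    The percentages quoted above can be checked with a short calculation. The abstract does not state the denominator, so the sketch below assumes it is the total number of typed alleles, taken as 10,063 donors times five loci (A, B, C, DRB1, DQB1) times two alleles per locus; under that assumption the reported figures are reproduced after rounding. The variable names are hypothetical.

```python
donors = 10_063
loci = 5               # A, B, C, DRB1, DQB1, as listed in the abstract
alleles_per_locus = 2  # assumption: two alleles typed per locus per donor

# Assumed denominator: every allele position across all donors.
total_alleles = donors * loci * alleles_per_locus

counts = {"null alleles": 23, "rare alleles": 92, "exon novelties": 42}
percentages = {name: round(100 * n / total_alleles, 3)
               for name, n in counts.items()}

# Combined yield: fraction amplified times fraction of those reported by NGS.
overall_reported = 0.978 * 0.982

print(total_alleles, percentages, round(100 * overall_reported, 1))
```

    Under these assumptions, roughly 96% of all samples were both amplified and reported.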

  19. Application of High-Throughput Next-Generation Sequencing for HLA Typing on Buccal Extracted DNA: Results from over 10,000 Donor Recruitment Samples.

    Directory of Open Access Journals (Sweden)

    Yuxin Yin

    Full Text Available Unambiguous HLA typing is important in hematopoietic stem cell transplantation (HSCT), HLA disease association studies, and solid organ transplantation. However, current molecular typing methods only interrogate the antigen recognition site (ARS) of HLA genes, resulting in many cis-trans ambiguities that require additional typing methods to resolve. Here we report high-resolution HLA typing of 10,063 National Marrow Donor Program (NMDP) registry donors using a long-range PCR next generation sequencing (NGS) approach on buccal swab DNA. Multiplex long-range PCR primers amplified the full length of HLA class I genes (A, B, C) from promoter to 3' UTR. Class II genes (DRB1, DQB1) were amplified from exon 2 through part of exon 4. PCR amplicons were pooled and sheared using Covaris fragmentation. Library preparation was performed using the Illumina TruSeq Nano kit on the Beckman FX automated platform. Each sample was tagged with a unique barcode, followed by 2×250 bp paired-end sequencing on the Illumina MiSeq. HLA typing was assigned using Omixon Twin software, which combines two independent computational algorithms to ensure high confidence in allele calling. Consensus sequence and typing results were reported in Histoimmunogenetics Markup Language (HML) format. All homozygous alleles were confirmed by Luminex SSO typing and exon novelties were confirmed by Sanger sequencing. Using this automated workflow, over 10,063 NMDP registry donors were successfully typed at high resolution by NGS. Despite known challenges of nucleic acid degradation and low DNA concentration commonly associated with buccal-based specimens, 97.8% of samples were successfully amplified using long-range PCR. Among these, 98.2% were successfully reported by NGS, with an accuracy rate of 99.84% in an independent blind Quality Control audit performed by the NMDP. In this study, NGS-HLA typing identified 23 null alleles (0.023%), 92 rare alleles (0.091%) and 42 exon novelties (0.042%). Long

  20. Novel mutations in CRB1 gene identified in a chinese pedigree with retinitis pigmentosa by targeted capture and next generation sequencing

    Science.gov (United States)

    Lo, David; Weng, Jingning; Liu, Xiaohong; Yang, Juhua; He, Fen; Wang, Yun; Liu, Xuyang

    2016-01-01

    PURPOSE To detect the disease-causing gene in a Chinese pedigree with autosomal-recessive retinitis pigmentosa (ARRP). METHODS All subjects in this family underwent a complete ophthalmic examination. Targeted-capture next generation sequencing (NGS) was performed on the proband to detect variants. All variants were verified in the remaining family members by PCR amplification and Sanger sequencing. RESULTS All the affected subjects in this pedigree were diagnosed with retinitis pigmentosa (RP). The compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations in the Crumbs homolog 1 (CRB1) gene were identified in all the affected patients but not in the unaffected individuals in this family. The two mutations were inherited from the two parents, respectively. CONCLUSION Novel compound heterozygous mutations in CRB1 were identified in a Chinese pedigree with ARRP using targeted-capture next generation sequencing. After evaluating heredity and impaired protein function, we conclude that the compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations are causal for early-onset ARRP in this pedigree. To the best of our knowledge, these compound mutations have not been reported previously. PMID:27806333

  1. System for Informatics in the Molecular Pathology Laboratory: An Open-Source End-to-End Solution for Next-Generation Sequencing Clinical Data Management.

    Science.gov (United States)

    Kang, Wenjun; Kadri, Sabah; Puranik, Rutika; Wurst, Michelle N; Patil, Sushant A; Mujacic, Ibro; Benhamed, Sonia; Niu, Nifang; Zhen, Chao Jie; Ameti, Bekim; Long, Bradley C; Galbo, Filipo; Montes, David; Iracheta, Crystal; Gamboa, Venessa L; Lopez, Daisy; Yourshaw, Michael; Lawrence, Carolyn A; Aisner, Dara L; Fitzpatrick, Carrie; McNerney, Megan E; Wang, Y Lynn; Andrade, Jorge; Volchenboum, Samuel L; Furtado, Larissa V; Ritterhouse, Lauren L; Segal, Jeremy P

    2018-04-24

    Next-generation sequencing (NGS) diagnostic assays increasingly are becoming the standard of care in oncology practice. As the scale of an NGS laboratory grows, management of these assays requires organizing large amounts of information, including patient data, laboratory processes, genomic data, as well as variant interpretation and reporting. Although several Laboratory Information Systems and/or Laboratory Information Management Systems are commercially available, they may not meet all of the needs of a given laboratory, in addition to being frequently cost-prohibitive. Herein, we present the System for Informatics in the Molecular Pathology Laboratory, a free and open-source Laboratory Information System/Laboratory Information Management System for academic and nonprofit molecular pathology NGS laboratories, developed at the Genomic and Molecular Pathology Division at the University of Chicago Medicine. The System for Informatics in the Molecular Pathology Laboratory was designed as a modular end-to-end information system to handle all stages of the NGS laboratory workload from test order to reporting. We describe the features of the system, its clinical validation at the Genomic and Molecular Pathology Division at the University of Chicago Medicine, and its installation and testing within a different academic center laboratory (University of Colorado), and we propose a platform for future community co-development and interlaboratory data sharing. Copyright © 2018. Published by Elsevier Inc.

  2. Next-generation sequencing reveals phylogeographic structure and a species tree for recent bird divergences

    DEFF Research Database (Denmark)

    McCormack, John E.; Maley, James M.; Hird, Sarah M.

    2012-01-01

    divergence in four phylogenetically diverse avian systems using a method for quick and cost-effective generation of primary DNA sequence data using pyrosequencing. NGS data were processed using an analytical pipeline that reduces many reads into two called alleles per locus per individual. Using single...... throughout the genome. Using eight loci found in Zonotrichia and Junco lineages, we were also able to generate a species tree of these sparrow sister genera, demonstrating the potential of this method for generating data amenable to coalescent-based analysis. We discuss improvements that should enhance...

  3. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV), from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome, with a total of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity, and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
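
    The 97% similarity threshold mentioned above can be illustrated with a small sketch. It assumes pre-aligned sequences of equal length and uses plain per-position identity as the similarity measure; the abstract does not specify how similarity was computed, and the sequences and helper names below are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def same_phylo_group(seq_a, seq_b, threshold=97.0):
    """Assumed rule: variants at or above the threshold share a phylo-group."""
    return percent_identity(seq_a, seq_b) >= threshold


def substitute(seq, positions):
    """Return a copy of seq with a different base at each given position."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    bases = list(seq)
    for pos in positions:
        bases[pos] = swap[bases[pos]]
    return "".join(bases)


# Hypothetical 100-nt amplicon and two derived variants.
reference_variant = "ACGT" * 25
close_variant = substitute(reference_variant, [10, 57])            # 2 differences
distant_variant = substitute(reference_variant, [3, 20, 41, 66, 90])  # 5 differences

print(percent_identity(reference_variant, close_variant))
print(same_phylo_group(reference_variant, distant_variant))
```

    Real amplicon data would need an alignment step before identity is computed, which this sketch leaves out.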

  4. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Wycliff M Kinoti

    Full Text Available PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV), from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome, with a total of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity, and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.

  5. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

    Science.gov (United States)

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-09-01

    Usher syndrome is an autosomal recessive disorder characterized by both deafness and blindness. For the three clinical subtypes of Usher syndrome, causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, its molecular analysis is well suited to a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing of exons of Usher genes and compare the costs and workload of this approach with those of Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.

  6. Next-generation sequencing-based detection of circulating tumour DNA After allogeneic stem cell transplantation for lymphoma.

    Science.gov (United States)

    Herrera, Alex F; Kim, Haesook T; Kong, Katherine A; Faham, Malek; Sun, Heather; Sohani, Aliyah R; Alyea, Edwin P; Carlton, Victoria E; Chen, Yi-Bin; Cutler, Corey S; Ho, Vincent T; Koreth, John; Kotwaliwale, Chitra; Nikiforow, Sarah; Ritz, Jerome; Rodig, Scott J; Soiffer, Robert J; Antin, Joseph H; Armand, Philippe

    2016-12-01

    Next-generation sequencing (NGS)-based circulating tumour DNA (ctDNA) detection is a promising monitoring tool for lymphoid malignancies. We evaluated whether the presence of ctDNA was associated with outcome after allogeneic haematopoietic stem cell transplantation (HSCT) in lymphoma patients. We studied 88 patients drawn from a phase 3 clinical trial of reduced-intensity conditioning HSCT in lymphoma. Conventional restaging and collection of peripheral blood samples occurred at pre-specified time points before and after HSCT and were assayed for ctDNA by sequencing of the immunoglobulin or T-cell receptor genes. Tumour clonotypes were identified in 87% of patients with adequate tumour samples. Sixteen of 19 (84%) patients with disease progression after HSCT had detectable ctDNA prior to progression at a median of 3·7 months prior to relapse/progression. Patients with detectable ctDNA 3 months after HSCT had inferior progression-free survival (PFS) (2-year PFS 58% vs. 84% in ctDNA-negative patients, P = 0·033). In multivariate models, detectable ctDNA was associated with increased risk of progression/death (Hazard ratio 3·9, P = 0·003) and increased risk of relapse/progression (Hazard ratio 10·8, P = 0·0006). Detectable ctDNA is associated with an increased risk of relapse/progression, but further validation studies are necessary to confirm these findings and determine the clinical utility of NGS-based minimal residual disease monitoring in lymphoma patients after HSCT. © 2016 John Wiley & Sons Ltd.

  7. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    Science.gov (United States)

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  8. Next-generation sequencing of 34 genes in sudden unexplained death victims in forensics and in patients with channelopathic cardiac diseases

    DEFF Research Database (Denmark)

    Hertz, Christin Løth; Christiansen, Sofie Lindgren; Ferrero-Miliani, Laura

    2015-01-01

    Sudden cardiac death (SCD) is responsible for a large proportion of sudden deaths in young individuals. In forensic medicine, many cases remain unexplained after routine postmortem autopsy and conventional investigations. These cases are called sudden unexplained deaths (SUD). Genetic testing has...... been suggested useful in forensic medicine, although in general with a significantly lower success rate compared to the clinical setting. The purpose of the study was to estimate the frequency of pathogenic variants in the genes most frequently associated with SCD in SUD cases and compare the frequency...... to that in patients with inherited cardiac channelopathies. Fifteen forensic SUD cases and 29 patients with channelopathies were investigated. DNA from 34 of the genes most frequently associated with SCD were captured using NimbleGen SeqCap EZ library build and were sequenced with next-generation sequencing (NGS...

  9. Detecting exact breakpoints of deletions with diversity in hepatitis B viral genomic DNA from next-generation sequencing data.

    Science.gov (United States)

    Cheng, Ji-Hong; Liu, Wen-Chun; Chang, Ting-Tsung; Hsieh, Sun-Yuan; Tseng, Vincent S

    2017-10-01

    Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, ultimately even resulting in hepatocellular carcinoma (HCC). Among the methods for detecting deletions from next-generation sequencing (NGS) data, few consider the characteristics of viruses, such as high evolution rates and high divergence among different HBV genomes. Sequencing highly divergent HBV genomes with NGS technology outputs millions of reads, so detecting the exact breakpoints of deletions in these large and complex data incurs a very high computational cost. We propose a novel analytical method named VirDelect (Virus Deletion Detect), which uses split-read alignment to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, such that the computational cost can be reduced without losing accuracy. We used four simulated read datasets and two real paired-end read datasets of HBV genome sequences to verify the accuracy of VirDelect with score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in accuracy score on all simulated datasets, and that VirDelect had only two base errors even on the real datasets. VirDelect is also shown to deliver high accuracy in analyzing single-end as well as paired-end read data. VirDelect can serve as an effective and efficient bioinformatics tool for physiologists and is applicable to further analyses of genomes with characteristics similar to HBV in genome length and high divergence. The software can be downloaded at https://sourceforge.net/projects/virdelect/. Copyright © 2017. Published by Elsevier Inc.

  10. JVM: Java Visual Mapping tool for next generation sequencing read.

    Science.gov (United States)

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed JVM (Java Visual Mapping), a program for mapping next-generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to handle the millions of short reads generated by Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read data sets from 1 MB to 10 GB.

  11. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease.

    Science.gov (United States)

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-05-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. Published by Elsevier Ltd.

  12. Targeted NGS meets expert clinical characterization: Efficient diagnosis of spastic paraplegia type 11

    Directory of Open Access Journals (Sweden)

    Cristina Castro-Fernández

    2015-06-01

    Next generation sequencing (NGS) is transforming the diagnostic approach for neurological disorders, since it allows simultaneous analysis of hundreds of genes, even based on just a broad, syndromic patient categorization. However, such an approach bears a high risk of incidental and uncertain genetic findings. We report a patient with spastic paraplegia whose comprehensive neurological and imaging examination raised a high clinical suspicion of SPG11. Thus, although our NGS pipeline for this group of disorders includes gene panel and exome sequencing, in this sample only the spatacsin gene region was captured and subsequently searched for mutations. Two probably pathogenic variants were quickly and clearly identified, confirming the diagnosis of SPG11. This case illustrates how the combination of expert clinical characterization with highly oriented NGS protocols leads to a fast, cost-efficient diagnosis, minimizing the risk of findings with unclear significance.

  13. High-performance integrated virtual environment (HIVE): a robust infrastructure for next-generation sequence data analysis.

    Science.gov (United States)

    Simonyan, Vahan; Chumakov, Konstantin; Dingerdissen, Hayley; Faison, William; Goldweber, Scott; Golikov, Anton; Gulzar, Naila; Karagiannis, Konstantinos; Vinh Nguyen Lam, Phuc; Maudru, Thomas; Muravitskaja, Olesja; Osipova, Ekaterina; Pan, Yang; Pschenichnov, Alexey; Rostovtsev, Alexandre; Santana-Quintero, Luis; Smith, Krista; Thompson, Elaine E; Tkachenko, Valery; Torcivia-Rodriguez, John; Voskanian, Alin; Wan, Quan; Wang, Jing; Wu, Tsung-Jung; Wilson, Carolyn; Mazumder, Raja

    2016-01-01

    The High-performance Integrated Virtual Environment (HIVE) is a distributed storage and compute environment designed primarily to handle next-generation sequencing (NGS) data. This multicomponent cloud infrastructure provides secure web access for authorized users to deposit, retrieve, annotate and compute on NGS data, and to analyse the outcomes using web interface visual environments appropriately built in collaboration with research and regulatory scientists and other end users. Unlike many massively parallel computing environments, HIVE uses a cloud control server which virtualizes services, not processes. It is both very robust and flexible due to the abstraction layer introduced between computational requests and operating system processes. The novel paradigm of moving computations to the data, instead of moving data to computational nodes, has proven to be significantly less taxing for both hardware and network infrastructure. The honeycomb data model developed for HIVE integrates metadata into an object-oriented model. Its distinction from other object-oriented databases is in the additional implementation of a unified application program interface to search, view and manipulate data of all types. This model simplifies the introduction of new data types, thereby minimizing the need for database restructuring and streamlining the development of new integrated information systems. The honeycomb model employs a highly secure hierarchical access control and permission system, allowing determination of data access privileges in a finely granular manner without flooding the security subsystem with a multiplicity of rules. HIVE infrastructure will allow engineers and scientists to perform NGS analysis in a manner that is both efficient and secure. HIVE is actively supported in public and private domains, and project collaborations are welcomed. Database URL: https://hive.biochemistry.gwu.edu. © The Author(s) 2016. Published by Oxford University Press.

  14. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.
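
    The throughput figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that "25 M single reads and 50 M paired-end reads" means 25 million clusters, each read once (single) or twice (paired-end); that reading, and the function name run_yield_bases, are illustrative assumptions rather than anything taken from Illumina documentation.

```python
def run_yield_bases(clusters: int, read_length: int, paired: bool = True) -> int:
    """Upper-bound number of bases produced by one sequencing run.

    clusters    -- number of clusters passing filter (assumed value)
    read_length -- cycles per read, e.g. 300 for a 2 x 300 run
    paired      -- True if each cluster is read from both ends
    """
    reads_per_cluster = 2 if paired else 1
    return clusters * read_length * reads_per_cluster


# 25 M clusters read as 2 x 300 bp gives 15 billion bases (15 Gb),
# which matches the maximum output stated in the abstract.
max_yield = run_yield_bases(25_000_000, 300, paired=True)
print(f"{max_yield / 1e9:.1f} Gb")

# The shortest configuration mentioned, 1 x 36 bp, yields far less.
min_yield = run_yield_bases(25_000_000, 36, paired=False)
print(f"{min_yield / 1e9:.1f} Gb")
```

    Real runs typically fall below this bound because not every cluster passes quality filtering.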

  15. A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2018-02-01

    To show how to generate a consensus sequence from the massively parallel sequencing data obtained in routine HIV anti-retroviral resistance studies, one that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
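
    The effect of the threshold can be illustrated with a small sketch. The abstract does not describe how Mesquite applies the 10%, 15% and 20% cut-offs, so the code below is only one plausible interpretation: at each aligned position, every base whose frequency reaches the threshold is kept, and positions with more than one kept base are written as an IUPAC ambiguity code. The function name consensus and the toy reads are hypothetical.

```python
from collections import Counter

# IUPAC nucleotide codes, keyed by the alphabetically sorted bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AC": "M", "AG": "R", "AT": "W", "CG": "S", "CT": "Y", "GT": "K",
    "ACG": "V", "ACT": "H", "AGT": "D", "CGT": "B", "ACGT": "N",
}


def consensus(reads, threshold=0.20):
    """Column-wise consensus of equal-length, already aligned reads.

    A base is reported at a position only if its frequency there is at
    least `threshold`; several surviving bases give an ambiguity code.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be aligned to the same length")

    out = []
    for column in zip(*reads):
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        total = sum(counts.values())
        if total == 0:  # only gaps or unknown characters in this column
            out.append("N")
            continue
        kept = sorted(b for b, n in counts.items() if n / total >= threshold)
        out.append(IUPAC.get("".join(kept), "N"))
    return "".join(out)


# Toy alignment: 17 reads carry G at the third position, 3 reads carry A (15%).
reads = ["ACGT"] * 17 + ["ACAT"] * 3
print(consensus(reads, threshold=0.10))  # minority base is kept as ambiguity
print(consensus(reads, threshold=0.20))  # minority base is dropped
```

    Under this interpretation a higher threshold removes low-frequency variants from the consensus, which would make it resemble a Sanger sequence more closely; whether this is the mechanism behind the reported improvement is not stated in the abstract.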

  16. Elucidating the diet of the island flying fox (Pteropus hypomelanus) in Peninsular Malaysia through Illumina Next-Generation Sequencing.

    Science.gov (United States)

    Aziz, Sheema Abdul; Clements, Gopalasamy Reuben; Peng, Lee Yin; Campos-Arceiz, Ahimsa; McConkey, Kim R; Forget, Pierre-Michel; Gan, Han Ming

    2017-01-01

    There is an urgent need to identify and understand the ecosystem services of pollination and seed dispersal provided by threatened mammals such as flying foxes. The first step towards this is to obtain comprehensive data on their diet. However, the volant and nocturnal nature of bats presents a particularly challenging situation, and conventional microhistological approaches to studying their diet can be laborious and time-consuming, and provide incomplete information. We used Illumina Next-Generation Sequencing (NGS) as a novel, non-invasive method for analysing the diet of the island flying fox (Pteropus hypomelanus) on Tioman Island, Peninsular Malaysia. Through DNA metabarcoding of plants in flying fox droppings, using primers targeting the rbcL gene, we identified at least 29 Operational Taxonomic Units (OTUs) comprising the diet of this giant pteropodid. OTU sequences matched at least four genera and 14 plant families from online reference databases based on a conservative Least Common Ancestor approach, and eight species from our site-specific plant reference collection. NGS was just as successful as conventional microhistological analysis in detecting plant taxa from droppings, but also uncovered six additional plant taxa. The island flying fox's diet appeared to be dominated by figs (Ficus sp.), which was the most abundant plant taxon detected in the droppings every single month. Our study has shown that NGS can add value to the conventional microhistological approach in identifying food plant species from flying fox droppings. At this point in time, more accurate genus- and species-level identification of OTUs not only requires support from databases with more representative sequences of relevant plant DNA, but probably necessitates in situ collection of plant specimens to create a reference collection. Although this method cannot be used to quantify true abundance or proportion of plant species, nor plant parts consumed, it ultimately provides a

  17. Elucidating the diet of the island flying fox (Pteropus hypomelanus) in Peninsular Malaysia through Illumina Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Sheema Abdul Aziz

    2017-04-01

    There is an urgent need to identify and understand the ecosystem services of pollination and seed dispersal provided by threatened mammals such as flying foxes. The first step towards this is to obtain comprehensive data on their diet. However, the volant and nocturnal nature of bats presents a particularly challenging situation, and conventional microhistological approaches to studying their diet can be laborious and time-consuming, and provide incomplete information. We used Illumina Next-Generation Sequencing (NGS) as a novel, non-invasive method for analysing the diet of the island flying fox (Pteropus hypomelanus) on Tioman Island, Peninsular Malaysia. Through DNA metabarcoding of plants in flying fox droppings, using primers targeting the rbcL gene, we identified at least 29 Operational Taxonomic Units (OTUs) comprising the diet of this giant pteropodid. OTU sequences matched at least four genera and 14 plant families from online reference databases based on a conservative Least Common Ancestor approach, and eight species from our site-specific plant reference collection. NGS was just as successful as conventional microhistological analysis in detecting plant taxa from droppings, but also uncovered six additional plant taxa. The island flying fox’s diet appeared to be dominated by figs (Ficus sp.), which was the most abundant plant taxon detected in the droppings every single month. Our study has shown that NGS can add value to the conventional microhistological approach in identifying food plant species from flying fox droppings. At this point in time, more accurate genus- and species-level identification of OTUs not only requires support from databases with more representative sequences of relevant plant DNA, but probably necessitates in situ collection of plant specimens to create a reference collection. Although this method cannot be used to quantify true abundance or proportion of plant species, nor plant parts consumed, it ultimately

  18. Next-generation sequencing for endocrine cancers: Recent advances and challenges.

    Science.gov (United States)

    Suresh, Padmanaban S; Venkatesh, Thejaswini; Tsutsumi, Rie; Shetty, Abhishek

    2017-05-01

    Contemporary molecular biology research tools have enriched numerous areas of biomedical research that address challenging diseases, including endocrine cancers (pituitary, thyroid, parathyroid, adrenal, testicular, ovarian, and neuroendocrine cancers). These tools have placed several intriguing clues before the scientific community. Endocrine cancers pose a major challenge in health care and research despite considerable attempts by researchers to understand their etiology. Microarray analyses have provided gene signatures from many cells, tissues, and organs that can differentiate healthy states from diseased ones, and even show patterns that correlate with stages of a disease. Microarray data can also elucidate the responses of endocrine tumors to therapeutic treatments. The rapid progress in next-generation sequencing methods has overcome many of the initial challenges of these technologies, and their advantages over microarray techniques have enabled them to emerge as valuable aids for clinical research applications (prognosis, identification of drug targets, etc.). A comprehensive review describing the recent advances in next-generation sequencing methods and their application in the evaluation of endocrine and endocrine-related cancers is lacking. The main purpose of this review is to illustrate the concepts that collectively constitute our current view of the possibilities offered by next-generation sequencing technological platforms, challenges to relevant applications, and perspectives on the future of clinical genetic testing of patients with endocrine tumors. We focus on recent discoveries in the use of next-generation sequencing methods for clinical diagnosis of endocrine tumors in patients and conclude with a discussion on persisting challenges and future objectives.

  19. HLA typing: Conventional techniques v. next-generation sequencing ...

    African Journals Online (AJOL)

    Background. The large number of population-specific polymorphisms present in the HLA complex in the South African (SA) population reduces the probability of finding an adequate HLA-matched donor for individuals in need of an unrelated haematopoietic stem cell transplantation (HSCT). Next-generation sequencing ...

  20. Whole genome sequence analysis of the arctic-lineage strain responsible for distemper in Italian wolves and dogs through a fast and robust next generation sequencing protocol.

    Science.gov (United States)

    Marcacci, Maurilia; Ancora, Massimo; Mangone, Iolanda; Teodori, Liana; Di Sabatino, Daria; De Massis, Fabrizio; Camma', Cesare; Savini, Giovanni; Lorusso, Alessio

    2014-06-01

    Dynamic surveillance and characterization of circulating canine distemper virus (CDV) strains are essential for guarding against possible vaccine breakthrough events. This study describes the setup of a fast and robust next-generation sequencing (NGS) Ion PGM™ protocol that was used to obtain the complete genome sequence of a CDV isolate (CDV2784/2013). CDV2784/2013 is the prototype of CDV strains responsible for severe clinical distemper in dogs and wolves in Italy during 2013. CDV2784/2013 was isolated on cell culture and total RNA was used for NGS sample preparation. A total of 112.3 Mb of reads were assembled de novo using MIRA version 4.0rc4, which yielded a total number of 403 contigs with 12.1% coverage. The whole genome (15,690 bp) was recovered successfully and compared to those of existing CDV whole genomes. CDV2784/2013 was shown to have 92% nt identity with the Onderstepoort vaccine strain. This study describes for the first time a fast and robust Ion PGM™ platform-based whole genome amplification protocol for non-segmented negative stranded RNA viruses starting from total cell-purified RNA. Additionally, this is the first study reporting the whole genome analysis of an Arctic lineage strain that is known to circulate widely in Europe, Asia and USA. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Laser capture microdissection followed by next-generation sequencing identifies disease-related microRNAs in psoriatic skin that reflect systemic microRNA changes in psoriasis

    DEFF Research Database (Denmark)

    Løvendorf, Marianne B; Mitsui, Hiroshi; Zibert, John R

    2015-01-01

    Psoriasis is a systemic disease with cutaneous manifestations. MicroRNAs (miRNAs) are small non-coding RNA molecules that are differentially expressed in psoriatic skin; however, only few cell- and region-specific miRNAs have been identified in psoriatic lesions. We used laser capture...... microdissection (LCM) and next-generation sequencing (NGS) to study the specific miRNA expression profiles in the epidermis (Epi) and dermal inflammatory infiltrates (RD) of psoriatic skin (N = 6). We identified 24 deregulated miRNAs in the Epi and 37 deregulated miRNAs in the RD of psoriatic plaque compared...... with normal psoriatic skin (FCH > 2, FDR

  2. Clinical Application of Picodroplet Digital PCR Technology for Rapid Detection of EGFR T790M in Next-Generation Sequencing Libraries and DNA from Limited Tumor Samples.

    Science.gov (United States)

    Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E

    2016-11-01

    Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  3. Open source tools to exploit DNA sequence data from livestock species

    Science.gov (United States)

    Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...

  4. A Fast Solution to NGS Library Prep with Low Nanogram DNA Input

    Science.gov (United States)

    Liu, Pingfang; Lohman, Gregory J.S.; Cantor, Eric; Langhorst, Bradley W.; Yigit, Erbay; Apone, Lynne M.; Munafo, Daniela B.; Stewart, Fiona J.; Evans, Thomas C.; Nichols, Nicole; Dimalanta, Eileen T.; Davis, Theodore B.; Sumner, Christine

    2013-01-01

    Next Generation Sequencing (NGS) has significantly impacted human genetics, enabling a comprehensive characterization of the human genome as well as a better understanding of many genomic abnormalities. By delivering massive DNA sequences at unprecedented speed and cost, NGS promises to make personalized medicine a reality in the foreseeable future. To date, library construction with clinical samples has been a challenge, primarily due to the limited quantities of sample DNA available. Our objective here was to overcome this challenge by developing NEBNext® Ultra DNA Library Prep Kit, a fast library preparation method. Specifically, we streamlined the workflow utilizing novel NEBNext reagents and adaptors, including a new DNA polymerase that has been optimized to minimize GC bias. As a result of this work, we have developed a simple method for library construction from an amount of DNA as low as 5 ng, which can be used for both intact and fragmented DNA. Moreover, the workflow is compatible with multiple NGS platforms.

  5. Acceptability of, and Information Needs Regarding, Next-Generation Sequencing in People Tested for Hereditary Cancer: A Qualitative Study.

    Science.gov (United States)

    Meiser, Bettina; Storey, Ben; Quinn, Veronica; Rahman, Belinda; Andrews, Lesley

    2016-04-01

    Next generation sequencing (NGS) for patients at risk of hereditary cancer syndromes can also identify non-cancer related mutations, as well as variants of unknown significance. This study aimed to determine what benefits and shortcomings patients perceive in relation to NGS, as well as their interest and information preferences in regards to such testing. Eligible patients had previously received inconclusive results from clinical mutation testing for cancer susceptibility. Semi-structured telephone interviews were subjected to qualitative analysis guided by the approach developed by Miles and Huberman. The majority of the 19 participants reported they would be interested in panel/genomic testing. Advantages identified included that it would enable better preparation and allow implementation of individualized preventative strategies, with few disadvantages mentioned. Almost all participants said they would want all results, not just those related to their previous diagnosis. Participants felt that a face-to-face discussion supplemented by an information booklet would be the best way to convey information and achieve informed consent. All participants wanted their information stored and reviewed in accordance with new developments. Although the findings indicate strong interest among these individuals, it seems that the consent process, and the interpretation and communication of results will be areas that will require revision to meet the needs of patients.

  6. Next Generation Sequencing approach to molecular diagnosis of Duchenne muscular dystrophy; identification of a novel mutation.

    Science.gov (United States)

    Ebrahimzadeh-Vesal, Reza; Teymoori, Atieh; Azimi-Nezhad, Mohsen; Hosseini, Forough Sadat

    2018-02-20

    Duchenne muscular dystrophy (DMD; MIM 310200) is one of the most common and severe types of hereditary muscular dystrophy. The disease is caused by mutations in the dystrophin gene, which is associated with X-linked recessive Duchenne and Becker muscular dystrophy, and it occurs almost exclusively in males. The clinical symptoms usually begin in childhood. The main symptom is gradually progressive muscle weakness, and affected patients become unable to stand up and walk. Death is usually due to respiratory infection or cardiomyopathy. In this article, we report the discovery of a new nonsense mutation that creates an abnormal stop codon in the dystrophin gene. This mutation was detected using the next-generation sequencing (NGS) technique. The subject was a 17-year-old male with muscular dystrophy who was suspected of having DMD. He was referred to the Hakim medical genetics center of Neyshabur, Iran. Copyright © 2017. Published by Elsevier B.V.

  7. Comparison of two next-generation sequencing kits for diagnosis of epileptic disorders with a user-friendly tool for displaying gene coverage, DeCovA

    Directory of Open Access Journals (Sweden)

    Sarra Dimassi

    2015-12-01

    In recent years, molecular genetics has been playing an increasing role in the diagnostic process of monogenic epilepsies. Knowing the genetic basis of a patient's epilepsy provides accurate genetic counseling and may guide therapeutic options. Genetic diagnosis of epilepsy syndromes has long been based on Sanger sequencing and on the search for large rearrangements using MLPA or DNA arrays (array-CGH or SNP-array). Recently, next-generation sequencing (NGS) was demonstrated to be a powerful approach to overcome the wide clinical and genetic heterogeneity of epileptic disorders. Coverage is critical for assessing the quality and accuracy of results from NGS. However, it is often a difficult parameter to display in practice. The aim of the study was to compare two library-building methods (Haloplex, Agilent and SeqCap EZ, Roche) for a targeted panel of 41 genes causing monogenic epileptic disorders. We included 24 patients, 20 of whom had known disease-causing mutations. For each patient both libraries were built in parallel and sequenced on an Ion Torrent Personal Genome Machine (PGM). To compare coverage and depth, we developed a simple homemade tool, named DeCovA (Depth and Coverage Analysis). DeCovA displays the sequencing depth of each base and the coverage of target genes for each genomic position. The fraction of each gene covered at different thresholds could be easily estimated. Neither of the two analysis methods used, NextGene and Ion Reporter, was able to identify all the known mutations/CNVs carried by the 20 patients. Variant detection rate was globally similar for the two techniques, and DeCovA showed that failure to detect a mutation was mainly related to insufficient coverage.
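
    The idea of "fraction of each gene covered at different thresholds" can be sketched in a few lines. This is not DeCovA's code, which the abstract does not show; the function name fraction_covered, the thresholds and the toy depth values are all hypothetical.

```python
def fraction_covered(depths, thresholds=(1, 10, 20, 30)):
    """Fraction of target positions whose read depth reaches each threshold.

    depths     -- per-base sequencing depth over one target region or gene
    thresholds -- minimum depths of interest (illustrative defaults)
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depths must contain at least one position")
    return {t: sum(d >= t for d in depths) / n for t in thresholds}


# Toy per-base depths for a ten-base target (made-up numbers).
gene_depths = [0, 5, 12, 25, 40, 40, 31, 9, 0, 18]
for threshold, fraction in fraction_covered(gene_depths).items():
    print(f">= {threshold:>2}x: {fraction:.0%} of bases")

# Positions below a chosen threshold are where a variant could be missed.
low = [i for i, d in enumerate(gene_depths) if d < 10]
print("positions under 10x:", low)
```

    Listing the under-covered positions next to the known mutation sites is one simple way to check whether a missed variant coincides with insufficient coverage, as the abstract reports.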

  8. Quantitative miRNA expression analysis: comparing microarrays with next-generation sequencing

    DEFF Research Database (Denmark)

    Willenbrock, Hanni; Salomon, Jesper; Søkilde, Rolf

    2009-01-01

    Recently, next-generation sequencing has been introduced as a promising, new platform for assessing the copy number of transcripts, while the existing microarray technology is considered less reliable for absolute, quantitative expression measurements. Nonetheless, so far, results from the two...... technologies have only been compared based on biological data, leading to the conclusion that, although they are somewhat correlated, expression values differ significantly. Here, we use synthetic RNA samples, resembling human microRNA samples, to find that microarray expression measures actually correlate...... better with sample RNA content than expression measures obtained from sequencing data. In addition, microarrays appear highly sensitive and perform equivalently to next-generation sequencing in terms of reproducibility and relative ratio quantification....

  9. Multi-Locus Next-Generation Sequence Typing of DNA Extracted From Pooled Colonies Detects Multiple Unrelated Candida albicans Strains in a Significant Proportion of Patient Samples

    Directory of Open Access Journals (Sweden)

    Ningxin Zhang

    2018-06-01

    The yeast Candida albicans is an important opportunistic human pathogen. For C. albicans strain typing or drug susceptibility testing, a single colony recovered from a patient sample is normally used. This is insufficient when multiple strains are present at the site sampled. How often this is the case is unclear. Previous studies, confined to oral, vaginal and vulvar samples, have yielded conflicting results and have assessed too small a number of colonies per sample to reliably detect the presence of multiple strains. We developed a next-generation sequencing (NGS) modification of the highly discriminatory C. albicans MLST (multilocus sequence typing) method, 100+1 NGS-MLST, for detection and typing of multiple strains in clinical samples. In 100+1 NGS-MLST, DNA is extracted from a pool of colonies from a patient sample and also from one of the colonies. MLST amplicons from both DNA preparations are analyzed by high-throughput sequencing. Using base call frequencies, our bespoke DALMATIONS software determines the MLST type of the single colony. If base call frequency differences between pool and single colony indicate the presence of an additional strain, the differences are used to computationally infer the second MLST type without the need for MLST of additional individual colonies. In mixes of previously typed pairs of strains, 100+1 NGS-MLST reliably detected a second strain. Inferred MLST types of second strains were always more similar to their real MLST types than to those of any of 59 other isolates (22 of 31 inferred types were identical to the real type). Using 100+1 NGS-MLST we found that 7/60 human samples, including three superficial candidiasis samples, contained two unrelated strains. In addition, at least one sample contained two highly similar variants of the same strain. The probability of samples containing unrelated strains appears to differ considerably between body sites. Our findings indicate the need for wider surveys to

  10. Analysis of immune-related genes during Nora virus infection of Drosophila melanogaster using next generation sequencing.

    Science.gov (United States)

    Lopez, Wilfredo; Page, Alexis M; Carlson, Darby J; Ericson, Brad L; Cserhati, Matyas F; Guda, Chittibabu; Carlson, Kimberly A

    2018-01-01

    Drosophila melanogaster depends upon the innate immune system to regulate and combat viral infection. This is a complex, yet widely conserved process that involves a number of immune pathways and gene interactions. In addition, expression of genes involved in immunity is differentially regulated as the organism ages. This is particularly true for viruses that demonstrate chronic infection, as is seen with Nora virus. Nora virus is a persistent non-pathogenic virus that replicates in a horizontal manner in D. melanogaster. The genes involved in the regulation of the immune response to Nora virus infection are largely unknown. In addition, the temporal response of immune response genes as a result of infection has not been examined. In this study, D. melanogaster either infected with Nora virus or left uninfected were aged for 2, 10, 20 and 30 days. The RNA from these samples was analyzed by next generation sequencing (NGS) and the resulting immune-related genes evaluated by utilizing both the PANTHER and DAVID databases, as well as comparison to lists of immune related genes and FlyBase. The data demonstrate that Nora virus infected D. melanogaster exhibit an increase in immune related gene expression over time. In addition, at day 30, the data demonstrate that a persistent immune response may occur leading to an upregulation of specific immune response genes. These results demonstrate the utility of NGS in determining the potential immune system genes involved in Nora virus replication, chronic infection and involvement of antiviral pathways.

  11. Human papillomavirus genome integration in squamous carcinogenesis: what have next-generation sequencing studies taught us?

    Science.gov (United States)

    Groves, Ian J; Coleman, Nicholas

    2018-05-01

    Human papillomavirus (HPV) infection is associated with ∼5% of all human cancers, including a range of squamous cell carcinomas. Persistent infection by high-risk HPVs (HRHPVs) is associated with the integration of virus genomes (which are usually stably maintained as extrachromosomal episomes) into host chromosomes. Although HRHPV integration rates differ across human sites of infection, this process appears to be an important event in HPV-associated neoplastic progression, leading to deregulation of virus oncogene expression, host gene expression modulation, and further genomic instability. However, the mechanisms by which HRHPV integration occurs and by which the subsequent gene expression changes take place are incompletely understood. The advent of next-generation sequencing (NGS) of both RNA and DNA has allowed powerful interrogation of the association of HRHPVs with human disease, including precise determination of the sites of integration and the genomic rearrangements at integration loci. In turn, these data have indicated that integration occurs through two main mechanisms: looping integration and direct insertion. Improved understanding of integration sites is allowing further investigation of the factors that provide a competitive advantage to some integrants during disease progression. Furthermore, advanced approaches to the generation of genome-wide samples have given novel insights into the three-dimensional interactions within the nucleus, which could act as another layer of epigenetic control of both virus and host transcription. It is hoped that further advances in NGS techniques and analysis will not only allow the examination of further unanswered questions regarding HPV infection, but also direct new approaches to treating HPV-associated human disease. Copyright © 2018 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  12. Next-generation sequencing and phylogenetic signal of complete mitochondrial genomes for resolving the evolutionary history of leaf-nosed bats (Phyllostomidae).

    Science.gov (United States)

    Botero-Castro, Fidel; Tilak, Marie-ka; Justy, Fabienne; Catzeflis, François; Delsuc, Frédéric; Douzery, Emmanuel J P

    2013-12-01

    Leaf-nosed bats (Phyllostomidae) are one of the most studied groups within the order Chiroptera mainly because of their outstanding species richness and diversity in morphological and ecological traits. Rapid diversification and multiple homoplasies have made the phylogeny of the family difficult to solve using morphological characters. Molecular data have helped shed light on the evolutionary history of phyllostomid bats, yet several relationships remain unresolved at the intra-familial level. Complete mitochondrial genomes have proven useful to deal with this kind of situation in other groups of mammals by providing access to a large number of molecular characters. At present, there are only two mitogenomes available for phyllostomid bats hinting at the need for further exploration of the mitogenomic approach in this group. We used both standard Sanger sequencing of PCR products and next-generation sequencing (NGS) of shotgun genomic DNA to obtain new complete mitochondrial genomes from 10 species of phyllostomid bats, including representatives of major subfamilies, plus one outgroup belonging to the closely-related mormoopids. We then evaluated the contribution of mitogenomics to the resolution of the phylogeny of leaf-nosed bats and compared the results to those based on mitochondrial genes and the RAG2 and VWF nuclear markers. Our results demonstrate the advantages of the Illumina NGS approach to efficiently obtain mitogenomes of phyllostomid bats. The phylogenetic signal provided by entire mitogenomes is highly comparable to the one of a concatenation of individual mitochondrial and nuclear markers, and allows increasing both resolution and statistical support for several clades. This enhanced phylogenetic signal is the result of combining markers with heterogeneous evolutionary rates representing a large number of nucleotide sites. Our results illustrate the potential of the NGS mitogenomic approach for resolving the evolutionary history of

  13. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  14. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  15. Practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology

    NARCIS (Netherlands)

    Rossen, J. W. A.; Friedrich, A. W.; Moran-Gilad, J.

    Background: Next generation sequencing (NGS) is increasingly being used in clinical microbiology. Like every new technology adopted in microbiology, the integration of NGS into clinical and routine workflows must be carefully managed. Aim: To review the practical aspects of implementing bacterial

  16. tropiTree: An NGS-Based EST-SSR Resource for 24 Tropical Tree Species

    Science.gov (United States)

    Russell, Joanne R.; Hedley, Peter E.; Cardle, Linda; Dancey, Siobhan; Morris, Jenny; Booth, Allan; Odee, David; Mwaura, Lucy; Omondi, William; Angaine, Peter; Machua, Joseph; Muchugi, Alice; Milne, Iain; Kindt, Roeland; Jamnadass, Ramni; Dawson, Ian K.

    2014-01-01

    The development of genetic tools for non-model organisms has been hampered by cost, but advances in next-generation sequencing (NGS) have created new opportunities. In ecological research, this raises the prospect for developing molecular markers to simultaneously study important genetic processes such as gene flow in multiple non-model plant species within complex natural and anthropogenic landscapes. Here, we report the use of bar-coded multiplexed paired-end Illumina NGS for the de novo development of expressed sequence tag-derived simple sequence repeat (EST-SSR) markers at low cost for a range of 24 tree species. Each chosen tree species is important in complex tropical agroforestry systems where little is currently known about many genetic processes. An average of more than 5,000 EST-SSRs was identified for each of the 24 sequenced species, whereas prior to analysis 20 of the species had fewer than 100 nucleotide sequence citations. To make results available to potential users in a suitable format, we have developed an open-access, interactive online database, tropiTree (http://bioinf.hutton.ac.uk/tropiTree), which has a range of visualisation and search facilities, and which is a model for the efficient presentation and application of NGS data. PMID:25025376

  17. Practical Issues in Implementing Whole-Genome-Sequencing in Routine Diagnostic Microbiology

    NARCIS (Netherlands)

    Rossen, John W A; Friedrich, Alexander W; Moran-Gilad, Jacob

    BACKGROUND: next generation sequencing (NGS) is increasingly being used in clinical microbiology. Like every new technology that is being adopted in microbiology, the integration of NGS into clinical and routine workflows needs to be carefully managed. AIM: to review the practical aspects of

  18. Genome-wide analysis of replication timing by next-generation sequencing with E/L Repli-seq.

    Science.gov (United States)

    Marchal, Claire; Sasaki, Takayo; Vera, Daniel; Wilson, Korey; Sima, Jiao; Rivera-Mulia, Juan Carlos; Trevilla-García, Claudia; Nogues, Coralin; Nafie, Ebtesam; Gilbert, David M

    2018-05-01

    This protocol is an extension to: Nat. Protoc. 6, 870-895 (2014); doi:10.1038/nprot.2011.328; published online 02 June 2011. Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early- and late-replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and subnuclear position. Moreover, RT is regulated during development and is altered in diseases. Here, we describe E/L Repli-seq, an extension of our Repli-chip protocol. E/L Repli-seq is a rapid, robust and relatively inexpensive protocol for analyzing RT by next-generation sequencing (NGS), allowing genome-wide assessment of how cellular processes are linked to RT. Briefly, cells are pulse-labeled with BrdU, and early and late S-phase fractions are sorted by flow cytometry. Labeled nascent DNA is immunoprecipitated from both fractions and sequenced. Data processing leads to a single bedGraph file containing the ratio of nascent DNA from early versus late S-phase fractions. The results are comparable to those of Repli-chip, with the additional benefits of genome-wide sequence information and an increased dynamic range. We also provide computational pipelines for downstream analyses, for parsing phased genomes using single-nucleotide polymorphisms (SNPs) to analyze RT allelic asynchrony, and for direct comparison to Repli-chip data. This protocol can be performed in up to 3 d before sequencing, and requires basic cellular and molecular biology skills, as well as a basic understanding of Unix and R.
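
    The data-processing step described above, which reduces early and late S-phase read counts to a single bedGraph track of their ratio, can be sketched in Python as follows. This is an illustrative assumption about how such a ratio might be computed, not the authors' pipeline: the function name `early_late_bedgraph`, the per-window depth normalization, the pseudocount, the log2 scale, and the window counts are all hypothetical.

```python
import math
from typing import List, Tuple

# One genomic window: (chromosome, start, end, early_read_count, late_read_count)
Window = Tuple[str, int, int, int, int]


def early_late_bedgraph(windows: List[Window], pseudocount: float = 1.0) -> List[str]:
    """Return bedGraph lines holding log2(early / late) per window.

    Counts are scaled by each fraction's total so that the two libraries are
    comparable; the pseudocount avoids division by zero in empty windows.
    """
    n = len(windows)
    early_total = sum(w[3] for w in windows) + pseudocount * n
    late_total = sum(w[4] for w in windows) + pseudocount * n
    lines = []
    for chrom, start, end, early, late in windows:
        early_frac = (early + pseudocount) / early_total
        late_frac = (late + pseudocount) / late_total
        ratio = math.log2(early_frac / late_frac)
        lines.append(f"{chrom}\t{start}\t{end}\t{ratio:.4f}")
    return lines


# Made-up counts: the first window replicates early, the second late.
example_windows = [("chr1", 0, 50000, 30, 10), ("chr1", 50000, 100000, 10, 30)]
bedgraph_lines = early_late_bedgraph(example_windows)
print("\n".join(bedgraph_lines))
```

    Positive values mark windows enriched in the early fraction and negative values mark late-replicating windows, which is what gives a ratio track of this kind its dynamic range.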

  19. Characterization of novel precursor miRNAs using next generation sequencing and prediction of miRNA targets in Atlantic halibut.

    Directory of Open Access Journals (Sweden)

    Teshome Tilahun Bizuayehu

    BACKGROUND: microRNAs (miRNAs) are implicated in regulation of many cellular processes. miRNAs are processed to their mature functional form in a step-wise manner by multiple proteins and cofactors in the nucleus and cytoplasm. Many miRNAs are conserved across vertebrates. Mature miRNAs have recently been characterized in Atlantic halibut (Hippoglossus hippoglossus L.). The aim of this study was to identify and characterize precursor miRNAs (pre-miRNAs) and miRNA targets in this non-model flatfish. Discovery of miRNA precursor forms and targets in non-model organisms is difficult because of limited source information available. Therefore, we have developed a methodology to overcome this limitation. METHODS: Genomic DNA and small transcriptome of Atlantic halibut were sequenced using Roche 454 pyrosequencing and SOLiD next generation sequencing (NGS), respectively. Identified pre-miRNAs were further validated with reverse-transcription PCR. miRNA targets were identified using miRanda and RNAhybrid target prediction tools using sequences from public databases. Some of the miRNA targets were also identified using RACE-PCR. miRNA binding sites were validated with luciferase assay using the RTS34st cell line. RESULTS: We obtained more than 1.3 M and 92 M sequence reads from 454 genomic DNA sequencing and SOLiD small RNA sequencing, respectively. We identified 34 known and 9 novel pre-miRNAs. We predicted a number of miRNA target genes involved in various biological pathways. miR-24 binding to kisspeptin 1 receptor-2 (kiss1-r2) was confirmed using luciferase assay. CONCLUSION: This study demonstrates that identification of conserved and novel pre-miRNAs in a non-model vertebrate lacking substantial genomic resources can be performed by combining different next generation sequencing technologies. Our results indicate a wide conservation of miRNA precursors and involvement of miRNA in multiple regulatory pathways, and provide resources for further research on mi

  20. Immune Repertoire after Immunization As Seen by Next-Generation Sequencing and Proteomics

    Directory of Open Access Journals (Sweden)

    Martijn M. VanDuijn

    2017-10-01

    The immune system produces a diverse repertoire of immunoglobulins in response to foreign antigens. During B-cell development, VDJ recombination and somatic mutations generate diversity, whereas selection processes remove it. Using both proteomic and NGS approaches, we characterized the immune repertoires in groups of rats after immunization with purified antigens. Proteomics and NGS data on the repertoire are in qualitative agreement, but did show quantitative differences that may relate to differences between the biological niches that were sampled for these approaches. Both methods contributed complementary information in the characterization of the immune repertoire. It was found that the immune repertoires resulting from each antigen had many similarities that allowed samples to cluster together, and that mutated immunoglobulin peptides were shared among animals with a response to the same antigen significantly more than for different antigens. However, the number of shared sequences decreased in a log-linear fashion relative to the number of animals that share them, which may affect future applications. A phylogenetic analysis on the NGS reads showed that reads from different individuals immunized with the same antigen populated distinct branches of the phylogram, an indication that the repertoire had converged. Also, similar mutation patterns were found in branches of the phylogenetic tree that were associated with antigen-specific immunoglobulins through proteomics data. Thus, data from different analysis methods and different experimental platforms show that the immunoglobulin repertoires of immunized animals have overlapping and converging features. With additional research, this may enable interesting applications in biotechnology and clinical diagnostics.

  1. Timely diagnosis of sitosterolemia by next generation sequencing in two children with severe hypercholesterolemia.

    Science.gov (United States)

    Buonuomo, Paola Sabrina; Iughetti, Lorenzo; Pisciotta, Livia; Rabacchi, Claudio; Papadia, Francesco; Bruzzi, Patrizia; Tummolo, Albina; Bartuli, Andrea; Cortese, Claudio; Bertolini, Stefano; Calandra, Sebastiano

    2017-07-01

    Severe hypercholesterolemia, with or without xanthomas, in a child may suggest the diagnosis of homozygous autosomal dominant hypercholesterolemia (ADH), autosomal recessive hypercholesterolemia (ARH) or sitosterolemia, depending on the transmission of hypercholesterolemia in the patient's family. Sitosterolemia is a recessive disorder characterized by high plasma levels of cholesterol and plant sterols due to mutations in the ABCG5 or the ABCG8 gene, leading to a loss of function of the ATP-binding cassette (ABC) heterodimer transporter G5-G8. We aimed to perform the molecular characterization of two children with severe primary hypercholesterolemia. Case #1 was a 2-year-old girl with high LDL-cholesterol (690 mg/dl) and tuberous and intertriginous xanthomas. Case #2 was a 7-year-old boy with elevated LDL-C (432 mg/dl) but no xanthomas. In both cases, at least one parent had elevated LDL-cholesterol levels. For the molecular diagnosis, we applied targeted next generation sequencing (NGS), which unexpectedly revealed that both patients were compound heterozygous for nonsense mutations: Case #1 in the ABCG5 gene [p.(Gln251*)/p.(Arg446*)] and Case #2 in the ABCG8 gene [p.(Ser107*)/p.(Trp361*)]. Both children had extremely high serum sitosterol and campesterol levels, thus confirming the diagnosis of sitosterolemia. A low-fat/low-sterol diet was promptly adopted with and without the addition of ezetimibe for Case #1 and Case #2, respectively. In both patients, serum total and LDL-cholesterol decreased dramatically in two months and progressively normalized. Targeted NGS allows the rapid diagnosis of sitosterolemia in children with severe hypercholesterolemia, even though their family history does not unequivocally suggest a recessive transmission of hypercholesterolemia. A timely diagnosis is crucial to avoid delays in treatment. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Clinical utility of a 377 gene custom next-generation sequencing ...

    Indian Academy of Sciences (India)

    JEN BEVILACQUA

    2017-07-26

    Jul 26, 2017 ... Clinical utility of a 377 gene custom next-generation sequencing epilepsy panel ... number of genes, making it a very attractive option for a condition as .... clinical value of various test offerings to guide decision making.

  3. Next-generation sequencing analysis of gene regulation in the rat model of retinopathy of prematurity.

    Science.gov (United States)

    Griffith, Rachel M; Li, Hu; Zhang, Nan; Favazza, Tara L; Fulton, Anne B; Hansen, Ronald M; Akula, James D

    2013-08-01

    The purpose of this study was to identify the genes, biochemical signaling pathways, and biological themes involved in the pathogenesis of retinopathy of prematurity (ROP). Next-generation sequencing (NGS) was performed on the RNA transcriptome of rats with the Penn et al. (Pediatr Res 36:724-731, 1994) oxygen-induced retinopathy model of ROP at the height of vascular abnormality, postnatal day (P) 19, and normalized to age-matched, room-air-reared littermate controls. Eight custom-developed pathways with potential relevance to known ROP sequelae were evaluated for significant regulation in ROP: The three major Wnt signaling pathways, canonical, planar cell polarity (PCP), and Wnt/Ca(2+); two signaling pathways mediated by the Rho GTPases RhoA and Cdc42, which are, respectively, thought to intersect with canonical and non-canonical Wnt signaling; nitric oxide signaling pathways mediated by two nitric oxide synthase (NOS) enzymes, neuronal (nNOS) and endothelial (eNOS); and the retinoic acid (RA) signaling pathway. Regulation of other biological pathways and themes was detected by gene ontology using the Kyoto Encyclopedia of Genes and Genomes and the NIH's Database for Annotation, Visualization, and Integrated Discovery's GO terms databases. Canonical Wnt signaling was found to be regulated, but the non-canonical PCP and Wnt/Ca(2+) pathways were not. Nitric oxide signaling, as measured by the activation of nNOS and eNOS, was also regulated, as was RA signaling. Biological themes related to protein translation (ribosomes), neural signaling, inflammation and immunity, cell cycle, and cell death were (among others) highly regulated in ROP rats. These several genes and pathways identified by NGS might provide novel targets for intervention in ROP.

  4. Next Generation Sequencing Analysis of Gene Regulation in the Rat Model of Retinopathy of Prematurity

    Science.gov (United States)

    Griffith, Rachel M.; Li, Hu; Zhang, Nan; Favazza, Tara L.; Fulton, Anne B.; Hansen, Ronald M.; Akula, James D.

    2013-01-01

    Purpose: To identify the genes, biochemical signaling pathways and biological themes involved in the pathogenesis of retinopathy of prematurity (ROP). Methods: Next-generation sequencing (NGS) was performed on the RNA transcriptome of rats with the Penn et al. (1994) oxygen-induced retinopathy (OIR) model of ROP at the height of vascular abnormality, postnatal day (P) 19, and normalized to age-matched, room-air-reared littermate controls. Eight custom-developed pathways with potential relevance to known ROP sequelae were evaluated for significant regulation in ROP: the three major Wnt signaling pathways, canonical, planar cell polarity (PCP), and Wnt/Ca2+; two signaling pathways mediated by the Rho GTPases RhoA and Cdc42, which are respectively thought to intersect with canonical and noncanonical Wnt signaling; nitric oxide signaling pathways mediated by two nitric oxide synthase (NOS) enzymes, neuronal (nNOS) and endothelial (eNOS); and the retinoic acid (RA) signaling pathway. Regulation of other biological pathways and themes was detected by gene ontology using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the NIH's Database for Annotation, Visualization and Integrated Discovery (DAVID)'s GO terms databases. Results: Canonical Wnt signaling was found to be regulated, but the non-canonical PCP and Wnt/Ca2+ pathways were not. Nitric oxide (NO) signaling, as measured by the activation of nNOS and eNOS, was also regulated, as was RA signaling. Biological themes related to protein translation (ribosomes), neural signaling, inflammation and immunity, cell cycle and cell death were (among others) highly regulated in ROP rats. Conclusions: These several genes and pathways identified by NGS might provide novel targets for intervention in ROP. PMID:23775346

  5. Next-generation sequencing technology a new tool for killer cell immunoglobulin-like receptor allele typing in hematopoietic stem cell transplantation.

    Science.gov (United States)

    Maniangou, B; Retière, C; Gagne, K

    2018-02-01

    Killer cell Immunoglobulin-like Receptor (KIR) genes are a family of genes located together within the leukocyte receptor cluster on human chromosome 19q13.4. To date, 17 KIR genes have been identified including nine inhibitory genes (2DL1/L2/L3/L4/L5A/L5B, 3DL1/L2/L3), six activating genes (2DS1/S2/S3/S4/S5, 3DS1) and two pseudogenes (2DP1, 3DP1) classified into group A (KIR A) and group B (KIR B) haplotypes. The number and the nature of KIR genes vary between individuals. In addition, these KIR genes are known to be polymorphic at the allelic level (907 alleles described in July 2017). KIR genes encode receptors which are predominantly expressed by Natural Killer (NK) cells. KIR receptors recognize HLA class I molecules and are able to kill residual recipient leukemia cells, and thus reduce the likelihood of relapse. The KIR alleles of the Hematopoietic Stem Cell (HSC) donor need to be known (Alicata et al. Eur J Immunol 2016) because the KIR allele polymorphism may affect both the KIR + NK cell phenotype and function (Gagne et al. Eur J Immunol 2013; Bari R, et al. Sci Rep 2016) as well as HSCT outcome (Boudreau et al. JCO 2017). The introduction of Next Generation Sequencing (NGS) has overcome the limitations of conventional DNA sequencing methods, which are known to be time consuming. Recently, a novel NGS KIR allele typing approach of all KIR genes was developed by our team in Nantes from 30 reference DNAs (Maniangou et al. Front in Immunol 2017). This NGS KIR allele typing approach is simple, fast, reliable, specific and showed a concordance rate of 95% for centromeric and telomeric KIR genes in comparison with published high-resolution KIR typing data obtained using exome capture (Norman PJ et al. Am J Hum Genet 2016). This NGS KIR allele typing approach may also be used in reproduction and to better study KIR + NK cell implication in the control of viral infections. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  6. Get your high-quality low-cost genome sequence

    NARCIS (Netherlands)

    Faino, L.; Thomma, B.P.H.J.

    2014-01-01

    The study of whole-genome sequences has become essential for almost all branches of biological research. Next-generation sequencing (NGS) has revolutionized the scalability, speed, and resolution of sequencing and brought genomic science within reach of academic laboratories that study non-model

  7. Improving diagnosis for congenital cataract by introducing NGS genetic testing.

    Science.gov (United States)

    Musleh, Mohammud; Ashworth, Jane; Black, Graeme; Hall, Georgina

    2016-01-01

    Childhood cataract (CC) has an incidence of 3.5 per 10,000 by age 15 years. Diagnosis of any underlying cause is important to ensure effective and prompt management of multisystem complications, to facilitate accurate genetic counselling and to streamline multidisciplinary care. Next generation sequencing (NGS) has been shown to be effective in providing an underlying diagnosis in 70% of patients with CC in a research setting. This project aimed to integrate NGS testing in CC within six months of presentation and increase the rate of diagnosis. A retrospective case note review was undertaken to define the baseline efficacy of current care in providing a precise diagnosis. Quality improvement methods were used to integrate and optimize NGS testing in clinical care and measure the improvements made. The percentage of children receiving an NGS result within six months increased from 26% to 71% during the project period. The mean time to NGS testing and receiving a report decreased and there was a reduction in variation over the study period. Several patients and families had a change in management or genetic counselling as a direct result of the diagnosis given by the NGS test. The current recommended investigation of patients with bilateral CC is ineffective in identifying a diagnosis. Quality Improvement methods have facilitated successful integration of NGS testing into clinical care, improving time to diagnosis and leading to development of a new care pathway.

  8. Discovery of novel transcripts of the human tissue kallikrein (KLK1) and kallikrein-related peptidase 2 (KLK2) in human cancer cells, exploiting Next-Generation Sequencing technology.

    Science.gov (United States)

    Adamopoulos, Panagiotis G; Kontos, Christos K; Scorilas, Andreas

    2018-03-31

    Tissue kallikrein, kallikrein-related peptidases (KLKs), and plasma kallikrein form the largest group of serine proteases in the human genome, sharing many structural and functional properties. Several KLK transcripts have been found aberrantly expressed in numerous human malignancies, confirming their prognostic or/and diagnostic values. However, the process of alternative splicing can now be studied in-depth due to the development of Next-Generation Sequencing (NGS). In the present study, we used NGS to discover novel transcripts of the KLK1 and KLK2 genes, after nested touchdown PCR. Bioinformatics analysis and PCR experiments revealed a total of eleven novel KLK transcripts (two KLK1 and nine KLK2 transcripts). In addition, the expression profiles of each novel transcript were investigated with nested PCR experiments using variant-specific primers. Since KLKs are implicated in human malignancies, qualifying as potential biomarkers, the quantification of the presented novel transcripts in human samples may have clinical applications in different types of cancer. Copyright © 2018. Published by Elsevier Inc.

  9. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate.

    Science.gov (United States)

    Klambauer, Günter; Schwarzbauer, Karin; Mayr, Andreas; Clevert, Djork-Arné; Mitterecker, Andreas; Bodenhofer, Ulrich; Hochreiter, Sepp

    2012-05-01

    Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose 'Copy Number estimation by a Mixture Of PoissonS' (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1-FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
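The core idea described above, explaining the read counts at one genomic position by Poisson components that correspond to integer copy numbers, can be illustrated with a minimal Python sketch. This is not the cn.MOPS implementation: the function names, the fixed expected count `lam` at copy number 2, the uniform prior, and the small rate `eps` used for copy number 0 are all assumptions made for illustration.

```python
import math

def copy_number_posterior(count, lam, max_cn=4, eps=0.05):
    """Posterior over integer copy numbers 0..max_cn for one read count.

    Assumption (illustrative only): count ~ Poisson(lam * cn / 2), where lam
    is the expected count at copy number 2, with a uniform prior over cn.
    """
    log_like = []
    for cn in range(max_cn + 1):
        # eps keeps the Poisson rate positive for copy number 0
        rate = lam * (cn / 2 if cn > 0 else eps)
        log_like.append(count * math.log(rate) - rate - math.lgamma(count + 1))
    top = max(log_like)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_like]
    total = sum(weights)
    return [w / total for w in weights]

def most_likely_copy_number(count, lam):
    """Return the copy number with the highest posterior probability."""
    posterior = copy_number_posterior(count, lam)
    return max(range(len(posterior)), key=lambda cn: posterior[cn])

# Hypothetical read counts for five samples at the same genomic position.
counts = [98, 103, 51, 100, 149]
calls = [most_likely_copy_number(c, lam=100.0) for c in counts]
```

Comparing samples at a single position, rather than neighbouring positions along a chromosome, is what makes such a model insensitive to position-specific coverage bias, which is the property the abstract emphasizes.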

  10. Next generation sequencing provides rapid access to the genome of Puccinia striiformis f. sp. tritici, the causal agent of wheat stripe rust.

    Directory of Open Access Journals (Sweden)

    Dario Cantu

    BACKGROUND: The wheat stripe rust fungus (Puccinia striiformis f. sp. tritici, PST) is responsible for significant yield losses in wheat production worldwide. In spite of its economic importance, the PST genomic sequence is not currently available. Fortunately, Next Generation Sequencing (NGS) has radically improved sequencing speed and efficiency with a great reduction in costs compared to traditional sequencing technologies. We used Illumina sequencing to rapidly access the genomic sequence of the highly virulent PST race 130 (PST-130). METHODOLOGY/PRINCIPAL FINDINGS: We obtained nearly 80 million high quality paired-end reads (>50x coverage) that were assembled into 29,178 contigs (64.8 Mb), which provide an estimated coverage of at least 88% of the PST genes and are available through GenBank. Extensive micro-synteny with the Puccinia graminis f. sp. tritici (PGTG) genome and high sequence similarity with annotated PGTG genes support the quality of the PST-130 contigs. We characterized the transposable elements present in the PST-130 contigs and, using an ab initio gene prediction program, identified and tentatively annotated 22,815 putative coding sequences. We provide examples of the use of comparative approaches to improve gene annotation for both PST and PGTG and to identify candidate effectors. Finally, the assembled contigs provided an inventory of PST repetitive elements, which were annotated and deposited in Repbase. CONCLUSIONS/SIGNIFICANCE: The assembly of the PST-130 genome and the predicted proteins provide useful resources to rapidly identify and clone PST genes and their regulatory regions. Although the automatic gene prediction has limitations, we show that a comparative genomics approach using multiple rust species can greatly improve the quality of gene annotation in these species. The PST-130 sequence will also be useful for comparative studies within PST as more races are sequenced. This study illustrates the power of NGS for

  11. Targeted next generation sequencing identifies functionally deleterious germline mutations in novel genes in early-onset/familial prostate cancer.

    Directory of Open Access Journals (Sweden)

    Paula Paulo

    2018-04-01

    Considering that mutations in known prostate cancer (PrCa) predisposition genes, including those responsible for hereditary breast/ovarian cancer and Lynch syndromes, explain less than 5% of early-onset/familial PrCa, we have sequenced 94 genes associated with cancer predisposition using next generation sequencing (NGS) in a series of 121 PrCa patients. We found monoallelic truncating/functionally deleterious mutations in seven genes, including ATM and CHEK2, which have previously been associated with PrCa predisposition, and five new candidate PrCa associated genes involved in cancer predisposing recessive disorders, namely RAD51C, FANCD2, FANCI, CEP57 and RECQL4. Furthermore, using in silico pathogenicity prediction of missense variants among 18 genes associated with breast/ovarian cancer and/or Lynch syndrome, followed by KASP genotyping in 710 healthy controls, we identified "likely pathogenic" missense variants in ATM, BRIP1, CHEK2 and TP53. In conclusion, this study has identified putative PrCa predisposing germline mutations in 14.9% of early-onset/familial PrCa patients. Further data will be necessary to confirm the genetic heterogeneity of inherited PrCa predisposition hinted at in this study.

  12. Tablet—next generation sequence assembly visualization

    Science.gov (United States)

    Milne, Iain; Bayer, Micha; Cardle, Linda; Shaw, Paul; Stephen, Gordon; Wright, Frank; Marshall, David

    2010-01-01

    Summary: Tablet is a lightweight, high-performance graphical viewer for next-generation sequence assemblies and alignments. Supporting a range of input assembly formats, Tablet provides high-quality visualizations showing data in packed or stacked views, allowing instant access and navigation to any region of interest, and whole contig overviews and data summaries. Tablet is both multi-core aware and memory efficient, allowing it to handle assemblies containing millions of reads, even on a 32-bit desktop machine. Availability: Tablet is freely available for Microsoft Windows, Apple Mac OS X, Linux and Solaris. Fully bundled installers can be downloaded from http://bioinf.scri.ac.uk/tablet in 32- and 64-bit versions. Contact: tablet@scri.ac.uk PMID:19965881

  13. Next-generation sequencing identifies deregulation of microRNAs involved in both innate and adaptive immune response in ALK+ ALCL.

    Directory of Open Access Journals (Sweden)

    Julia Steinhilber

    Anaplastic large cell lymphoma (ALCL) is divided into two systemic diseases according to the expression of the anaplastic lymphoma kinase (ALK). We investigated the differential expression of miRNAs between ALK+ ALCL, ALK- ALCL cells and normal T-cells using next generation sequencing (NGS). In addition, a C/EBPβ-dependent miRNA profile was generated. The data were validated in primary ALCL cases. NGS identified 106 miRNAs significantly differentially expressed between ALK+ and ALK- ALCL and 228 between ALK+ ALCL and normal T-cells. We identified a signature of 56 miRNAs distinguishing ALK+ ALCL, ALK- ALCL and T-cells. The top candidates significantly differentially expressed between ALK+ and ALK- ALCL included 5 upregulated miRNAs: miR-340, miR-203, miR-135b, miR-182, miR-183; and 7 downregulated: miR-196b, miR-155, miR-146a, miR-424, miR-503, miR-424*, miR-542-3p. The miR-17-92 cluster was also upregulated in ALK+ cells. Additionally, we identified a signature of 3 miRNAs significantly regulated by the transcription factor C/EBPβ, which is specifically overexpressed in ALK+ ALCL, including the miR-181 family. Of interest, miR-181a, which regulates T-cell differentiation and modulates TCR signalling strength, was significantly downregulated in ALK+ ALCL cases. In summary, our data reveal a miRNA signature linking ALK+ ALCL to a deregulated immune response and may reflect the abnormal TCR antigen expression known in ALK+ ALCL.

  14. A novel pathogenic variant in an Iranian Ataxia telangiectasia family revealed by next-generation sequencing followed by in silico analysis.

    Science.gov (United States)

    Tabatabaiefar, Mohammad Amin; Alipour, Paria; Pourahmadiyan, Azam; Fattahi, Najmeh; Shariati, Laleh; Golchin, Neda; Mohammadi-Asl, Javad

    2017-08-15

    Ataxia telangiectasia (A-T) is a neurodegenerative autosomal recessive disorder with the main characteristics of progressive cerebellar degeneration, sensitivity to ionizing radiation, immunodeficiency, telangiectasia, premature aging, recurrent sinopulmonary infections, and increased risk of malignancy, especially of lymphoid origin. The Ataxia Telangiectasia Mutated gene, ATM, the causative gene for the A-T disorder, encodes the ATM protein, which plays an important role in the activation of cell-cycle checkpoints and initiation of DNA repair in response to DNA damage. Targeted next-generation sequencing (NGS) was performed on an Iranian 5-year-old boy who presented with truncal and limb ataxia, telangiectasia of the eye, Hodgkin lymphoma, hyperpigmentation, total alopecia, hepatomegaly, and dysarthria. Sanger sequencing was used to confirm the candidate pathogenic variants. Computational docking was done using the HEX software to examine how this change affects the interactions of ATM with the upstream and downstream proteins. Three different variants were identified, comprising two homozygous SNPs and one novel homozygous frameshift variant (c.8046_8047delTA, p.Thr2682ThrfsX5), which creates a stop codon in exon 57, leaving the protein truncated at its C-terminal portion. Therefore, the activation and phosphorylation of target proteins are lost. Moreover, the HEX software confirmed that the mutated protein lost its interaction with upstream and downstream proteins. The variant was classified as pathogenic based on the American College of Medical Genetics and Genomics guideline. This study expands the spectrum of ATM pathogenic variants in Iran and demonstrates the utility of targeted NGS in genetic diagnostics. Copyright © 2017. Published by Elsevier B.V.

  15. Towards a consensus Y-chromosomal phylogeny and Y-SNP set in forensics in the next-generation sequencing era.

    Science.gov (United States)

    Larmuseau, Maarten H D; Van Geystelen, Anneleen; Kayser, Manfred; van Oven, Mannis; Decorte, Ronny

    2015-03-01

    Currently, several different Y-chromosomal phylogenies and haplogroup nomenclatures are presented in scientific literature and at conferences demonstrating the present diversity in Y-chromosomal phylogenetic trees and Y-SNP sets used within forensic and anthropological research. This situation can be ascribed to the exponential growth of the number of Y-SNPs discovered due to mostly next-generation sequencing (NGS) studies. As Y-SNPs and their respective phylogenetic positions are important in forensics, such as for male lineage characterization and paternal bio-geographic ancestry inference, there is a need for forensic geneticists to know how to deal with these newly identified Y-SNPs and phylogenies, especially since these phylogenies are often created with other aims than to carry out forensic genetic research. Therefore, we give here an overview of four categories of currently used Y-chromosomal phylogenies and the associated Y-SNP sets in scientific research in the current NGS era. We compare these categories based on the construction method, their advantages and disadvantages, the disciplines wherein the phylogenetic tree can be used, and their specific relevance for forensic geneticists. Based on this overview, it is clear that an up-to-date reduced tree with a consensus Y-SNP set and a stable nomenclature will be the most appropriate reference resource for forensic research. Initiatives to reach such an international consensus are therefore highly recommended. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  16. The Utility of Next Generation Sequencing in Gene Discovery for Mutation-negative Patients with Rett Syndrome

    Directory of Open Access Journals (Sweden)

    Wendy Anne Gold

    2015-07-01

    Rett syndrome (RTT) is a rare, severe disorder of neuronal plasticity that predominantly affects girls. Girls with RTT usually appear asymptomatic in the first 6-18 months of life, but gradually develop severe motor, cognitive and behavioural abnormalities that persist for life. A predominance of neuronal and synaptic dysfunction, with altered excitatory-inhibitory neuronal synaptic transmission and synaptic plasticity, are overarching features of RTT in children and in mouse models. Approximately 95% of patients with classical RTT have mutations in the X-linked methyl-CpG-binding (MECP2) gene, whilst other genes, including cyclin-dependent kinase-like 5 (CDKL5), Forkhead box protein G1 (FOXG1), Myocyte-specific enhancer factor 2C (MEF2C) and Transcription factor 4 (TCF4), have been associated with phenotypes overlapping with RTT. However, there remains a proportion of patients who carry a clinical diagnosis of RTT, but who are mutation negative. In recent years, next-generation sequencing (NGS) technologies have revolutionized approaches to genetic studies, making whole-exome and even whole-genome sequencing possible strategies for the detection of rare and de novo mutations, aiding the discovery of novel disease genes. Here, we review the recent progress that is emerging in identifying pathogenic variations, specifically from exome sequencing in RTT patients, and emphasize the need for the use of this technology to identify known and new disease genes in RTT patients.

  17. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools, each constructed from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine reads from the two NGS datasets that did not map to the reference pig genome. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18), previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs, providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets not originally derived from metagenomics experiments for viral metagenomics analyses in a livestock species.
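The first step of the approach above, separating reads that did not map to the host genome, can be sketched in Python as follows. The sketch assumes the alignments are available as SAM text, where the second column (FLAG) has bit 0x4 set for an unmapped read; the abstract does not say which aligner or file format the authors used, and the function name and example records are hypothetical.

```python
def unmapped_read_names(sam_lines):
    """Yield names of reads whose SAM FLAG has the 0x4 (unmapped) bit set."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])     # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x4:
            yield fields[0]       # column 1 is the read name (QNAME)

# Hypothetical records: one mapped read and one unmapped read.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
]
names = list(unmapped_read_names(example))
```

The retained reads would then be compared against a viral reference collection, which is the stage where the study reports using the NCBI Viral Genome Resource.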

  18. Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

    Directory of Open Access Journals (Sweden)

    Yang Huaan

    2012-07-01

    Background: In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results: Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on an F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA

  19. Mutational profiling of non-small-cell lung cancer patients resistant to first-generation EGFR tyrosine kinase inhibitors using next generation sequencing

    Science.gov (United States)

    Jin, Ying; Shao, Yang; Shi, Xun; Lou, Guangyuan; Zhang, Yiping; Wu, Xue; Tong, Xiaoling; Yu, Xinmin

    2016-01-01

    Patients with advanced non-small-cell lung cancer (NSCLC) harboring sensitive epidermal growth factor receptor (EGFR) mutations invariably develop acquired resistance to EGFR tyrosine kinase inhibitors (TKIs). Identification of actionable genetic alterations conferring drug resistance can be helpful for guiding the subsequent treatment decision. One of the major resistance mechanisms is the secondary EGFR-T790M mutation. Other mechanisms, such as HER2 and MET amplifications, and PIK3CA mutations, were also reported. However, the mechanisms in the remaining patients are still unknown. In this study, we performed mutational profiling in a cohort of 83 NSCLC patients with TKI-sensitizing EGFR mutations at diagnosis and acquired resistance to three different first-generation EGFR TKIs using targeted next generation sequencing (NGS) of 416 cancer-related genes. In total, we identified 322 genetic alterations with a median of 3 mutations per patient. 61% of patients still exhibited TKI-sensitizing EGFR mutations, and 36% of patients acquired EGFR-T790M. Besides other known resistance mechanisms, we identified TET2 mutations in 12% of patients. Interestingly, we also observed SOX2 amplification in EGFR-T790M negative patients, which was restricted to resistance to Icotinib, a drug widely used in Chinese NSCLC patients. Our study uncovered mutational profiles of NSCLC patients with first-generation EGFR TKI resistance, with potential therapeutic implications. PMID:27528220

  20. Tracking the NGS revolution: managing life science research on shared high-performance computing clusters.

    Science.gov (United States)

    Dahlö, Martin; Scofield, Douglas G; Schaal, Wesley; Spjuth, Ola

    2018-05-01

    Next-generation sequencing (NGS) has transformed the life sciences, and many research groups are newly dependent upon computer clusters to store and analyze large datasets. This creates challenges for e-infrastructures accustomed to hosting computationally mature research in other sciences. Using data gathered from our own clusters at UPPMAX computing center at Uppsala University, Sweden, where core hour usage of ∼800 NGS and ∼200 non-NGS projects is now similar, we compare and contrast the growth, administrative burden, and cluster usage of NGS projects with projects from other sciences. The number of NGS projects has grown rapidly since 2010, with growth driven by entry of new research groups. Storage used by NGS projects has grown more rapidly since 2013 and is now limited by disk capacity. NGS users submit nearly twice as many support tickets per user, and 11 more tools are installed each month for NGS projects than for non-NGS projects. We developed usage and efficiency metrics and show that computing jobs for NGS projects use more RAM than non-NGS projects, are more variable in core usage, and rarely span multiple nodes. NGS jobs use booked resources less efficiently for a variety of reasons. Active monitoring can improve this somewhat. Hosting NGS projects imposes a large administrative burden at UPPMAX due to large numbers of inexperienced users and diverse and rapidly evolving research areas. We provide a set of recommendations for e-infrastructures that host NGS research projects. We provide anonymized versions of our storage, job, and efficiency databases.

  1. Tracking the NGS revolution: managing life science research on shared high-performance computing clusters

    Science.gov (United States)

    2018-01-01

    Background: Next-generation sequencing (NGS) has transformed the life sciences, and many research groups are newly dependent upon computer clusters to store and analyze large datasets. This creates challenges for e-infrastructures accustomed to hosting computationally mature research in other sciences. Using data gathered from our own clusters at UPPMAX computing center at Uppsala University, Sweden, where core hour usage of ∼800 NGS and ∼200 non-NGS projects is now similar, we compare and contrast the growth, administrative burden, and cluster usage of NGS projects with projects from other sciences. Results: The number of NGS projects has grown rapidly since 2010, with growth driven by entry of new research groups. Storage used by NGS projects has grown more rapidly since 2013 and is now limited by disk capacity. NGS users submit nearly twice as many support tickets per user, and 11 more tools are installed each month for NGS projects than for non-NGS projects. We developed usage and efficiency metrics and show that computing jobs for NGS projects use more RAM than non-NGS projects, are more variable in core usage, and rarely span multiple nodes. NGS jobs use booked resources less efficiently for a variety of reasons. Active monitoring can improve this somewhat. Conclusions: Hosting NGS projects imposes a large administrative burden at UPPMAX due to large numbers of inexperienced users and diverse and rapidly evolving research areas. We provide a set of recommendations for e-infrastructures that host NGS research projects. We provide anonymized versions of our storage, job, and efficiency databases. PMID:29659792

  2. Quantifying population genetic differentiation from next-generation sequencing data

    DEFF Research Database (Denmark)

    Fumagalli, Matteo; Garrett Vieira, Filipe Jorge; Korneliussen, Thorfinn Sand

    2013-01-01

    Over the last few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data... method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy to investigate population structure via Principal Components Analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled...

  3. Next-generation phylogenomics

    Directory of Open Access Journals (Sweden)

    Chan Cheong Xin

    2013-01-01

    Thanks to advances in next-generation technologies, genome sequences are now being generated at breadth (e.g. across environments) and depth (thousands of closely related strains, individuals or samples) unimaginable only a few years ago. Phylogenomics – the study of evolutionary relationships based on comparative analysis of genome-scale data – has so far been developed as industrial-scale molecular phylogenetics, proceeding in the two classical steps: multiple alignment of homologous sequences, followed by inference of a tree (or multiple trees). However, the algorithms typically employed for these steps scale poorly with number of sequences, such that for an increasing number of problems, high-quality phylogenomic analysis is (or soon will be) computationally infeasible. Moreover, next-generation data are often incomplete and error-prone, and analysis may be further complicated by genome rearrangement, gene fusion and deletion, lateral genetic transfer, and transcript variation. Here we argue that next-generation data require next-generation phylogenomics, including so-called alignment-free approaches. Reviewers: Reviewed by Mr Alexander Panchin (nominated by Dr Mikhail Gelfand), Dr Eugene Koonin and Prof Peter Gogarten. For the full reviews, please go to the Reviewers’ comments section.
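To make the term "alignment-free" concrete, the following minimal Python sketch compares two sequences by the k-mers they share instead of aligning them. It shows one simple measure among many (a Jaccard distance on k-mer sets) and is not claimed to be a method proposed by the authors; the function names and the choice of k are assumptions for illustration.

```python
def kmer_set(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard_distance(a, b, k=3):
    """Alignment-free distance: 1 - |shared k-mers| / |all k-mers|."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:          # both sequences shorter than k
        return 0.0
    return 1.0 - len(ka & kb) / len(union)

same = jaccard_distance("ACGTACGT", "ACGTACGT")   # identical sequences
disjoint = jaccard_distance("AAAA", "CCCC")       # no k-mers in common
```

Because each sequence is reduced to a k-mer set independently, the cost of a pairwise comparison does not depend on computing a multiple alignment, which is the scaling bottleneck the abstract points to.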

  4. Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data

    Czech Academy of Sciences Publication Activity Database

    Novák, Petr; Neumann, Pavel; Macas, Jiří

    2010-01-01

    Vol. 11, No. 1 (2010), pp. 378-389 ISSN 1471-2105 R&D Projects: GA MŠk(CZ) OC10037; GA MŠk(CZ) LC06004 Institutional research plan: CEZ:AV0Z50510513 Keywords: repetitive DNA * plant genome * next generation sequencing Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.028, year: 2010

  5. The Challenges of Implementing Next Generation Sequencing Across a Large Healthcare System, and the Molecular Epidemiology and Antibiotic Susceptibilities of Carbapenemase-Producing Bacteria in the Healthcare System of the U.S. Department of Defense.

    Directory of Open Access Journals (Sweden)

    Emil Lesho

    We sought to: 1) provide an overview of the genomic epidemiology of an extensive collection of carbapenemase-producing bacteria (CPB) collected in the U.S. Department of Defense health system; 2) increase awareness of the public availability of the sequences, isolates, and customized antimicrobial resistance database of that system; and 3) illustrate challenges and offer mitigations for implementing next generation sequencing (NGS) across large health systems. Prospective surveillance and system-wide implementation of NGS. 288-hospital healthcare network. All phenotypically carbapenem resistant bacteria underwent CarbaNP® testing and PCR, followed by NGS. Commercial (Newbler and Geneious), on-line (ResFinder), and open-source software (Btrim, FLASh, Bowtie2, and Samtools) were used for assembly, SNP detection and clustering. Laboratory capacity, throughput, and response time were assessed. From 2009 through 2015, 27,000 multidrug-resistant Gram-negative isolates were submitted. 225 contained carbapenemase-encoding genes (most commonly blaKPC, blaNDM, and blaOXA23). These were found in 15 species from 146 inpatients in 19 facilities. Genetically related CPB were found in more than one hospital. Other clusters or outbreaks were not clonal and involved genetically related plasmids, while some involved several unrelated plasmids. Relatedness depended on the clustering algorithm used. Transmission patterns of plasmids and other mobile genetic elements could not be determined without ultra-long read, single-molecule real-time sequencing. 80% of carbapenem-resistant phenotypes retained susceptibility to aminoglycosides, and 70% retained susceptibility to fluoroquinolones. However, among the CPB-confirmed genotypes, fewer than 25% retained susceptibility to aminoglycosides or fluoroquinolones. Although NGS is increasingly acclaimed to revolutionize clinical practice, resource-constrained environments, large or geographically dispersed healthcare networks, and

  6. The Challenges of Implementing Next Generation Sequencing Across a Large Healthcare System, and the Molecular Epidemiology and Antibiotic Susceptibilities of Carbapenemase-Producing Bacteria in the Healthcare System of the U.S. Department of Defense.

    Science.gov (United States)

    Lesho, Emil; Clifford, Robert; Onmus-Leone, Fatma; Appalla, Lakshmi; Snesrud, Erik; Kwak, Yoon; Ong, Ana; Maybank, Rosslyn; Waterman, Paige; Rohrbeck, Patricia; Julius, Michael; Roth, Amanda; Martinez, Joshua; Nielsen, Lindsey; Steele, Eric; McGann, Patrick; Hinkle, Mary

    2016-01-01

    We sought to: 1) provide an overview of the genomic epidemiology of an extensive collection of carbapenemase-producing bacteria (CPB) collected in the U.S. Department of Defense health system; 2) increase awareness of the public availability of the sequences, isolates, and customized antimicrobial resistance database of that system; and 3) illustrate challenges and offer mitigations for implementing next generation sequencing (NGS) across large health systems. Prospective surveillance and system-wide implementation of NGS. 288-hospital healthcare network. All phenotypically carbapenem resistant bacteria underwent CarbaNP® testing and PCR, followed by NGS. Commercial (Newbler and Geneious), on-line (ResFinder), and open-source software (Btrim, FLASh, Bowtie2, and Samtools) were used for assembly, SNP detection and clustering. Laboratory capacity, throughput, and response time were assessed. From 2009 through 2015, 27,000 multidrug-resistant Gram-negative isolates were submitted. 225 contained carbapenemase-encoding genes (most commonly blaKPC, blaNDM, and blaOXA23). These were found in 15 species from 146 inpatients in 19 facilities. Genetically related CPB were found in more than one hospital. Other clusters or outbreaks were not clonal and involved genetically related plasmids, while some involved several unrelated plasmids. Relatedness depended on the clustering algorithm used. Transmission patterns of plasmids and other mobile genetic elements could not be determined without ultra-long read, single-molecule real-time sequencing. 80% of carbapenem-resistant phenotypes retained susceptibility to aminoglycosides, and 70% retained susceptibility to fluoroquinolones. However, among the CPB-confirmed genotypes, fewer than 25% retained susceptibility to aminoglycosides or fluoroquinolones. Although NGS is increasingly acclaimed to revolutionize clinical practice, resource-constrained environments, large or geographically dispersed healthcare networks, and military or

  7. Targeted next-generation sequencing identifies a novel nonsense mutation in SPTB for hereditary spherocytosis: A case report of a Korean family.

    Science.gov (United States)

    Shin, Soyoung; Jang, Woori; Kim, Myungshin; Kim, Yonggoo; Park, Suk Young; Park, Joonhong; Yang, Young Jun

    2018-01-01

    Hereditary spherocytosis (HS) is an inherited disorder characterized by the presence of spherical-shaped red blood cells (RBCs) on the peripheral blood (PB) smear. To date, a number of mutations in 5 genes have been identified, and mutations in the SPTB gene account for about 20% of patients. A 65-year-old female had been diagnosed with hemolytic anemia 30 years ago, based on a history of persistent anemia and hyperbilirubinemia for several years. She received RBC transfusion several times and a cholecystectomy roughly 20 years before. Round, densely staining spherical-shaped erythrocytes (spherocytes) were frequently found on the PB smear. Numerous spherocytes were frequently found in the PB smears of symptomatic family members, her 3rd son and his 2 grandchildren. One heterozygous mutation of SPTB was identified by targeted next-generation sequencing (NGS). The nonsense mutation, c.1956G>A (p.Trp652*), in exon 13 was confirmed by Sanger sequencing and thus the proband was diagnosed with HS. The proband underwent a splenectomy due to transfusion-refractory anemia and splenomegaly. After the splenectomy, her hemoglobin level improved to normal range (14.1 g/dL) and her bilirubin levels decreased dramatically (total bilirubin 1.9 mg/dL; direct bilirubin 0.6 mg/dL). We suggest that NGS of causative genes could be a useful diagnostic tool for the genetically heterogeneous RBC membrane disorders, especially in cases with a mild or atypical clinical manifestation. Copyright © 2017 The Authors. Published by Wolters Kluwer Health, Inc. All rights reserved.

  8. The next generation of targeted mutation discovery

    NARCIS (Netherlands)

    Harakalova, M.

    2013-01-01

    Next-generation sequencing (NGS) technologies now allow efficient analysis of the complete protein-coding regions of genomes (exomes) for multiple samples in a single sequencing run. In Chapter 2, we present our results with a genomic DNA pooling strategy for rare variant discovery on an NGS platform. The high

  9. Next-generation sequencing in schizophrenia and other neuropsychiatric disorders.

    Science.gov (United States)

    Schreiber, Matthew; Dorschner, Michael; Tsuang, Debby

    2013-10-01

    Schizophrenia is a debilitating lifelong illness that lacks a cure and poses a worldwide public health burden. The disease is characterized by a heterogeneous clinical and genetic presentation that complicates research efforts to identify causative genetic variations. This review examines the potential of current findings in schizophrenia and in other related neuropsychiatric disorders for application in next-generation technologies, particularly whole-exome sequencing (WES) and whole-genome sequencing (WGS). These approaches may lead to the discovery of underlying genetic factors for schizophrenia and may thereby identify and target novel therapeutic targets for this devastating disorder. © 2013 Wiley Periodicals, Inc.

  10. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data

    Directory of Open Access Journals (Sweden)

    Niko Beerenwinkel

    2012-09-01

    Many viruses, including the clinically relevant RNA viruses HIV and HCV, exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different next-generation sequencing platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of next-generation sequencing to estimate viral diversity.
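
    As a rough illustration of what "computing measures of genetic diversity" can mean in practice, the sketch below computes two common summary statistics from a set of already reconstructed haplotype sequences. The abstract does not say which measures the review covers, so the choice of Shannon entropy and mean pairwise nucleotide diversity, the function names, and the toy sequences are all assumptions made for illustration. The sketch also ignores sequencing error, which the review identifies as a central difficulty.

```python
import math
from collections import Counter
from itertools import combinations


def shannon_entropy(haplotypes):
    """Shannon entropy (natural log) of the haplotype frequency distribution."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def nucleotide_diversity(haplotypes):
    """Mean proportion of differing sites over all pairs of equal-length sequences."""
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        return 0.0
    length = len(haplotypes[0])
    total = 0.0
    for a, b in pairs:
        # Hamming distance between the two sequences, normalised by length.
        total += sum(x != y for x, y in zip(a, b)) / length
    return total / len(pairs)


# Toy, made-up haplotypes (not data from the review).
sample = ["ACGT", "ACGA", "ACGT"]
entropy = shannon_entropy(sample)
pi = nucleotide_diversity(sample)
```

    Both functions treat every input sequence as error-free; in a real analysis the haplotypes would first pass through error correction or haplotype reconstruction.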

  11. Fetal Kidney Anomalies: Next Generation Sequencing

    DEFF Research Database (Denmark)

    Rasmussen, Maria; Sunde, Lone; Nielsen, Marlene Louise

    Aim and Introduction Identification of abnormal kidneys in the fetus may lead to termination of the pregnancy and raises questions about the underlying cause and recurrence risk in future pregnancies. In this study, we investigate the effectiveness of targeted next generation sequencing in fetuses...... with prenatally detected kidney anomalies in order to uncover genetic explanations and assess recurrence risk. Also, we aim to study the relation between genetic findings and post mortem kidney histology. Methods The study comprises fetuses diagnosed prenatally with bilateral kidney anomalies that have undergone...... postmortem examination. The approximately 110 genes included in the targeted panel were chosen on the basis of their potential involvement in embryonic kidney development, cystic kidney disease, or the renin-angiotensin system. DNA was extracted from fetal tissue samples or cultured chorion villus cells...

  12. Comparison of Burrows-Wheeler transform-based mapping algorithms used in high-throughput whole-genome sequencing: application to Illumina data for livestock genomes

    Science.gov (United States)

    Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...

  13. Identifying Corneal Infections in Formalin-Fixed Specimens Using Next Generation Sequencing.

    Science.gov (United States)

    Li, Zhigang; Breitwieser, Florian P; Lu, Jennifer; Jun, Albert S; Asnaghi, Laura; Salzberg, Steven L; Eberhart, Charles G

    2018-01-01

    We test the ability of next-generation sequencing, combined with computational analysis, to identify a range of organisms causing infectious keratitis. This retrospective study evaluated 16 cases of infectious keratitis and four control corneas in formalin-fixed tissues from the pathology laboratory. Infectious cases also were analyzed in the microbiology laboratory using culture, polymerase chain reaction, and direct staining. Classified sequence reads were analyzed with two different metagenomics classification engines, Kraken and Centrifuge, and visualized using the Pavian software tool. Sequencing generated 20 to 46 million reads per sample. On average, 96% of the reads were classified as human, 0.3% corresponded to known vectors or contaminant sequences, 1.7% represented microbial sequences, and 2.4% could not be classified. The two computational strategies successfully identified the fungal, bacterial, and amoebal pathogens in most patients, including all four bacterial and mycobacterial cases, five of six fungal cases, three of three Acanthamoeba cases, and one of three herpetic keratitis cases. In several cases, additional potential pathogens also were identified. In one case with cytomegalovirus identified by Kraken and Centrifuge, the virus was confirmed by direct testing, while two cases in which Staphylococcus aureus or cytomegalovirus were identified by Centrifuge but not Kraken could not be confirmed. Confirmation was not attempted for an additional three potential pathogens identified by Kraken and 11 identified by Centrifuge. Next generation sequencing combined with computational analysis can identify a wide range of pathogens in formalin-fixed corneal specimens, with potential applications in clinical diagnostics and research.

  14. Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

    Science.gov (United States)

    Alkhateeb, Abedalrhman; Rueda, Luis

    2017-08-01

    Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
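
    The two-pass idea described in this abstract (score each read by its number of unique k-mers, then drop reads whose z-score falls below a threshold) can be sketched as follows. This is not the Zseq implementation: the function names, the default k, the handling of ambiguous bases (k-mers containing N are simply not counted), and the omission of the GC-content factor are all simplifying assumptions for illustration.

```python
from statistics import mean, pstdev


def kmer_complexity(read, k=4):
    """Count distinct k-mers in a read, ignoring k-mers with an ambiguous base (N)."""
    kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def zscore_filter(reads, k=4, threshold=-1.0):
    """Keep reads whose complexity z-score is at least `threshold`."""
    # First pass: score every read.
    scores = [kmer_complexity(r, k) for r in reads]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All reads are equally complex, so there is nothing to filter on.
        return list(reads)
    # Second pass: drop reads that score below the z-score cutoff.
    return [r for r, s in zip(reads, scores) if (s - mu) / sigma >= threshold]


# Toy reads: two ordinary reads, one homopolymer, one mostly ambiguous read.
reads = [
    "ACGTTGCATGCCGATAGCTA",
    "AAAAAAAAAAAAAAAAAAAA",
    "ACGTNNNNNNNNNNNNACGT",
    "TGCATCGGATCCTAGGCTAA",
]
kept = zscore_filter(reads, k=4, threshold=-0.5)
```

    With these toy reads the homopolymer and the N-rich read each contribute a single countable 4-mer, so both fall below the cutoff and only the two ordinary reads are kept.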

  15. Next-generation phylogeography: a targeted approach for multilocus sequencing of non-model organisms.

    Directory of Open Access Journals (Sweden)

    Jonathan B Puritz

    The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers.

  16. Validation of an NGS mutation detection panel for melanoma.

    Science.gov (United States)

    Reiman, Anne; Kikuchi, Hugh; Scocchia, Daniela; Smith, Peter; Tsang, Yee Wah; Snead, David; Cree, Ian A

    2017-02-22

    Knowledge of the genotype of melanoma is important to guide patient management. Identification of mutations in BRAF and c-KIT leads directly to targeted treatment, but it is also helpful to know if there are driver oncogene mutations in NRAS, GNAQ or GNA11 as these patients may benefit from alternative strategies such as immunotherapy. While polymerase chain reaction (PCR) methods are often used to detect BRAF mutations, next generation sequencing (NGS) is able to determine all of the necessary information on several genes at once, with potential advantages in turnaround time. We describe here an Ampliseq hotspot panel for melanoma for use with the IonTorrent Personal Genome Machine (PGM) which covers the mutations currently of most clinical interest. We have validated this in 151 cases of skin and uveal melanoma from our files, and correlated the data with PCR based assessment of BRAF status. There was excellent agreement, with few discrepancies, though NGS does have greater coverage and picks up some mutations that would be missed by PCR. However, these are often rare and of unknown significance for treatment. PCR methods are rapid, less time-consuming and less expensive than NGS, and could be used as triage for patients requiring more extensive diagnostic workup. The NGS panel described here is suitable for clinical use with formalin-fixed paraffin-embedded (FFPE) samples.

  17. Next generation sequencing reveals the hidden diversity of zooplankton assemblages.

    Directory of Open Access Journals (Sweden)

    Penelope K Lindeque

    BACKGROUND: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. METHODOLOGY/PRINCIPAL FINDINGS: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. CONCLUSIONS: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may

  18. The past, present and future of immune repertoire biology – the rise of next-generation repertoire analysis

    Directory of Open Access Journals (Sweden)

    Adrien Six

    2013-11-01

    T and B cell repertoires are collections of lymphocytes, each characterised by its antigen-specific receptor. We review here classical technologies and analysis strategies developed to assess Immunoglobulin (IG) and T cell receptor (TR) repertoire diversity, and describe recent advances in the field. First, we describe the broad range of available methodological tools developed in the past decades, each answering different questions and showing complementarity for progressive identification of the level of repertoire alterations: global overview of the diversity by flow cytometry, IG repertoire descriptions at the protein level for the identification of IG reactivities, IG/TR CDR3 spectratyping strategies, and related molecular quantification or dynamics of T/B cell differentiation. Additionally, we introduce the recent technological advances in molecular biology tools allowing deeper analysis of IG/TR diversity by next-generation sequencing (NGS), offering systematic and comprehensive sequencing of IG/TR transcripts in a short amount of time. NGS provides several angles of analysis such as clonotype frequency, CDR3 diversity, CDR3 sequence analysis, V allele identification with a quantitative dimension, therefore requiring high-throughput analysis tools development. In this line, we discuss the recent efforts made for nomenclature standardisation and ontology development. We then present the variety of available statistical analysis and modelling approaches developed with regard to the various levels of diversity analysis, and reveal the increasing sophistication of modelling approaches. To conclude, we provide some examples of recent mathematical modelling strategies and perspectives that illustrate the active rise of a next-generation of repertoire analysis.

  19. CRISPR-DAV: CRISPR NGS data analysis and visualization pipeline.

    Science.gov (United States)

    Wang, Xuning; Tilford, Charles; Neuhaus, Isaac; Mintier, Gabe; Guo, Qi; Feder, John N; Kirov, Stefan

    2017-12-01

    The simplicity and precision of the CRISPR/Cas9 system have brought in a new era of gene editing. Screening for desired clones with CRISPR-mediated genomic edits in a large number of samples is made possible by next generation sequencing (NGS) due to its multiplexing. Here we present the CRISPR-DAV (CRISPR Data Analysis and Visualization) pipeline to analyze CRISPR NGS data in a high throughput manner. In the pipeline, Burrows-Wheeler Aligner and Assembly Based ReAlignment are used for small and large indel detection, and results are presented in a comprehensive set of charts and an interactive alignment view. CRISPR-DAV is available at GitHub and Docker Hub repositories: https://github.com/pinetree1/crispr-dav.git and https://hub.docker.com/r/pinetree1/crispr-dav/. Contact: xuning.wang@bms.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  20. Functional Analyses of a Novel Splice Variant in the CHD7 Gene, Found by Next Generation Sequencing, Confirm Its Pathogenicity in a Spanish Patient and Diagnose Him with CHARGE Syndrome

    Directory of Open Access Journals (Sweden)

    Olatz Villate

    2018-01-01

    Mutations in CHD7 have been shown to be a major cause of CHARGE syndrome, which presents many symptoms and features common to other syndromes making its diagnosis difficult. Next generation sequencing (NGS) of a panel of intellectual disability related genes was performed in an adult patient without molecular diagnosis. A splice donor variant in CHD7 (c.5665+1G>T) was identified. To study its potential pathogenicity, exons and flanking intronic sequences were amplified from patient DNA and cloned into the pSAD® splicing vector. HeLa cells were transfected with this construct and a wild-type minigene and functional analyses were performed. The construct with the c.5665+1G>T variant produced an aberrant transcript with an insert of 63 nucleotides of intron 28 creating a premature termination codon (TAG) 25 nucleotides downstream. This would lead to the insertion of 8 new amino acids and therefore a truncated 1896 amino acid protein. As a result of this, the patient was diagnosed with CHARGE syndrome. Functional analyses underline their usefulness for studying the pathogenicity of variants found by NGS and therefore its application to accurately diagnose patients.

  1. Functional Analyses of a Novel Splice Variant in the CHD7 Gene, Found by Next Generation Sequencing, Confirm Its Pathogenicity in a Spanish Patient and Diagnose Him with CHARGE Syndrome.

    Science.gov (United States)

    Villate, Olatz; Ibarluzea, Nekane; Fraile-Bethencourt, Eugenia; Valenzuela, Alberto; Velasco, Eladio A; Grozeva, Detelina; Raymond, F L; Botella, María P; Tejada, María-Isabel

    2018-01-01

    Mutations in CHD7 have been shown to be a major cause of CHARGE syndrome, which presents many symptoms and features common to other syndromes making its diagnosis difficult. Next generation sequencing (NGS) of a panel of intellectual disability related genes was performed in an adult patient without molecular diagnosis. A splice donor variant in CHD7 (c.5665+1G>T) was identified. To study its potential pathogenicity, exons and flanking intronic sequences were amplified from patient DNA and cloned into the pSAD® splicing vector. HeLa cells were transfected with this construct and a wild-type minigene and functional analyses were performed. The construct with the c.5665+1G>T variant produced an aberrant transcript with an insert of 63 nucleotides of intron 28 creating a premature termination codon (TAG) 25 nucleotides downstream. This would lead to the insertion of 8 new amino acids and therefore a truncated 1896 amino acid protein. As a result of this, the patient was diagnosed with CHARGE syndrome. Functional analyses underline their usefulness for studying the pathogenicity of variants found by NGS and therefore its application to accurately diagnose patients.
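
    The arithmetic behind "8 new amino acids" can be checked with a small sketch: if the retained intronic insert is read in frame from its first nucleotide and the TAG codon starts at nucleotide 25, then nucleotides 1 to 24 encode 24 / 3 = 8 codons before translation stops. The actual intron 28 sequence is not given in the abstract, so the 63-nucleotide string below is a made-up placeholder, and the in-frame start of the insert is an assumption.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(insert: str) -> Optional[int]:
    """Return the number of in-frame codons read before the first stop codon.

    Returns None if the insert contains no in-frame stop codon.
    """
    for i in range(0, len(insert) - 2, 3):
        if insert[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical 63-nt insert: 8 placeholder codons, a TAG starting at
# nucleotide 25 (1-based), then 12 more placeholder codons.
placeholder_insert = "GCT" * 8 + "TAG" + "GCT" * 12
new_amino_acids = codons_before_stop(placeholder_insert)
```

    Under these assumptions `new_amino_acids` is 8, matching the count stated in the abstract.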

  2. Next generation sequencing reveals distinct fecal pollution signatures in aquatic sediments across gradients of anthropogenic influence

    Directory of Open Access Journals (Sweden)

    Gian Marco Luna

    2016-11-01

    Aquatic sediments are the repository of a variety of anthropogenic pollutants, including bacteria of fecal origin, that reach the aquatic environment from a variety of sources. Although fecal bacteria can survive for long periods of time in aquatic sediments, the microbiological quality of sediments is almost entirely neglected when performing quality assessments of aquatic ecosystems. Here we investigated the relative abundance, patterns and diversity of fecal bacterial populations in two coastal areas in the Northern Adriatic Sea (Italy): the Po river prodelta (PRP), an estuarine area receiving significant contaminant discharge from one of the largest European rivers, and the Lagoon of Venice (LV), a transitional environment impacted by a multitude of anthropogenic stressors. From both areas, several indicators of fecal and sewage contamination were determined in the sediments using Next Generation Sequencing (NGS) of 16S rDNA amplicons. At both areas, fecal contamination was high, with fecal bacteria accounting for up to 3.96% and 1.12% of the sediment bacterial assemblages in PRP and LV, respectively. The magnitude of the fecal signature was highest in the PRP site, highlighting the major role of the Po river in spreading microbial contaminants into the adjacent coastal area. In the LV site, fecal pollution was highest in the urban area, and almost disappeared when moving to the open sea. Our analysis revealed a large number of fecal Operational Taxonomic Units (OTUs; 960 and 181 in PRP and LV, respectively) and showed a different fecal signature in the two areas, suggesting a diverse contribution of human and non-human sources of contamination. These results highlight the potential of NGS techniques to gain insights into the origin and fate of different fecal bacteria populations in aquatic sediments.

  3. Epidemiology of transmissible diseases: Array hybridization and next generation sequencing as universal nucleic acid-mediated typing tools.

    Science.gov (United States)

    Michael Dunne, W; Pouseele, Hannes; Monecke, Stefan; Ehricht, Ralf; van Belkum, Alex

    2017-09-21

    The magnitude of interest in the epidemiology of transmissible human diseases is reflected in the vast number of tools and methods developed recently with the expressed purpose to characterize and track evolutionary changes that occur in agents of these diseases over time. Within the past decade a new suite of such tools has become available with the emergence of the so-called "omics" technologies. Among these, two are exponents of the ongoing genomic revolution. Firstly, high-density nucleic acid probe arrays have been proposed and developed using various chemical and physical approaches. Via hybridization-mediated detection of entire genes or genetic polymorphisms in such genes and intergenic regions these so called "DNA chips" have been successfully applied for distinguishing very closely related microbial species and strains. Second and even more phenomenal, next generation sequencing (NGS) has facilitated the assessment of the complete nucleotide sequence of entire microbial genomes. This technology currently provides the most detailed level of bacterial genotyping and hence allows for the resolution of microbial spread and short-term evolution in minute detail. We will here review the very recent history of these two technologies, sketch their usefulness in the elucidation of the spread and epidemiology of mostly hospital-acquired infections and discuss future developments. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Directory of Open Access Journals (Sweden)

    Ram Vinay Pandey

    Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  5. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Science.gov (United States)

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2016-01-01

    Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  6. A fast and robust method for full genome sequencing of Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Type 1 and Type 2

    DEFF Research Database (Denmark)

    Kvisgaard, Lise Kirstine; Hjulsager, Charlotte Kristiane; Fahnøe, Ulrik

    2013-01-01

    . In the present study, fast and robust methods for long range RT-PCR amplification and subsequent next generation sequencing (NGS) were developed and validated on nine Type 1 and nine Type 2 PRRSV viruses. The methods generated robust and reliable sequences both on primary material and cell culture adapted...... viruses and the protocols performed well on all three NGS platforms tested (Roche 454 FLX, Illumina HiSeq2000, and Ion Torrent PGM™ Sequencer). These methods will greatly facilitate the generation of more full genome PRRSV sequences globally....

  7. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  8. Bio-Docklets: virtualization containers for single-step execution of NGS pipelines.

    Science.gov (United States)

    Kim, Baekdoo; Ali, Thahmina; Lijeron, Carlos; Afgan, Enis; Krampis, Konstantinos

    2017-08-01

    Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved using a "meta-script" that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, the Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets. © The Authors 2017. Published by Oxford University Press.

  9. Pitfalls of improperly procured adjacent non-neoplastic tissue for somatic mutation analysis using next-generation sequencing

    Directory of Open Access Journals (Sweden)

    Lei Wei

    2016-10-01

    Full Text Available Background: The rapid adoption of next-generation sequencing provides an efficient system for detecting somatic alterations in neoplasms. The detection of such alterations requires a matched non-neoplastic sample for adequate filtering of non-somatic events such as germline polymorphisms. Non-neoplastic tissue adjacent to the excised neoplasm is often used for this purpose, as it is collected simultaneously and generally contains the same tissue type as the neoplasm. Following NGS analysis, we and others have frequently observed low-level somatic mutations in these non-neoplastic tissues, which may impose additional challenges on somatic mutation detection because they complicate germline variant filtering. Methods: We hypothesized that the low-level somatic mutations observed in non-neoplastic tissues may be entirely or partially caused by inadvertent contamination by neoplastic cells during the surgical pathology gross assessment or tissue procurement process. To test this hypothesis, we applied a systematic protocol designed to collect multiple grossly non-neoplastic tissues using different methods surrounding each single neoplasm. The procedure was applied in two breast cancer lumpectomy specimens. In each case, all samples were first sequenced by whole-exome sequencing to identify somatic mutations in the neoplasm and determine their presence in the adjacent non-neoplastic tissues. We then generated ultra-deep coverage using targeted sequencing to assess the levels of contamination in non-neoplastic tissue samples collected under different conditions. Results: Contamination levels in non-neoplastic tissues ranged up to 3.5% and 20.9%, respectively, in the two cases tested, with a consistent pattern that correlated with the manner of grossing and procurement. By carefully controlling the conditions of various steps during this process, we were able to eliminate any detectable contamination in both patients.
Conclusion: The results demonstrated that the

  10. Targeted assembly of short sequence reads.

    Directory of Open Access Journals (Sweden)

    René L Warren

    Full Text Available As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
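
    The read-recruitment step described in this abstract (keeping only reads that share an exact short word with the target) can be sketched as follows. This is a minimal illustration, not TASR's own code: the function names, the word length `k`, and the decision to also index the reverse complement of the target are assumptions made for the example, and the stringent assembly step is not shown.

```python
# Minimal sketch of target-guided read recruitment by exact k-mer matching.
# All names and the default k are hypothetical; this is not TASR's implementation.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(reads, target, k=15):
    """Keep reads sharing at least one exact k-mer with the target (either strand)."""
    words = kmers(target, k) | kmers(revcomp(target), k)
    return [
        read for read in reads
        if any(read[i:i + k] in words for i in range(len(read) - k + 1))
    ]
```

    Only the recruited reads would then be passed to an assembler, which is what keeps the search space small compared with assembling the whole data set.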

  11. Am I my Family's Keeper? : Disclosure Dilemmas in Next Generation Sequencing

    NARCIS (Netherlands)

    Wouters, Roel H P; Bijlsma, Rhodé M; Ausems, Margreet G E M; van Delden, Johannes J M; Voest, Emile E; Bredenoord, Annelien L

    2016-01-01

    Ever since genetic testing became possible for specific mutations, ethical debate has arisen over whether professionals have a duty to warn not only patients but also their relatives who might be at risk for hereditary diseases. As next generation sequencing swiftly finds its way into

  12. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Science.gov (United States)

    Gimode, Davis; Odeny, Damaris A; de Villiers, Etienne P; Wanyonyi, Solomon; Dida, Mathews M; Mneney, Emmarold E; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional
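
    The abstract reports polymorphism information content (PIC) values without stating how they were computed. A minimal sketch, assuming the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at a marker; the function name and input format are hypothetical.

```python
# Sketch of a PIC calculation from allele frequencies at one marker.
# Assumes the commonly used definition; the study's exact method is not stated.
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    allele_freqs: iterable of allele frequencies that sum to 1.
    """
    freqs = list(allele_freqs)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross
```

    Under this definition a biallelic marker such as a SNP cannot exceed a PIC of 0.375 (reached at equal allele frequencies), whereas multi-allelic SSRs can score higher, which is consistent with the higher mean PIC reported for the SSRs than for the SNPs.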

  13. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Directory of Open Access Journals (Sweden)

    Davis Gimode

    Full Text Available Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet.
Our results also reveal an unexploited finger millet genetic resource that can be included

  14. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies.

    Science.gov (United States)

    de Brevern, Alexandre G; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain

    2015-01-01

    Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

  15. NGSUtils: a software suite for analyzing and manipulating next-generation sequencing datasets

    OpenAIRE

    Breese, Marcus R.; Liu, Yunlong

    2013-01-01

    Summary: NGSUtils is a suite of software tools for manipulating data common to next-generation sequencing experiments, such as FASTQ, BED and BAM format files. These tools provide a stable and modular platform for data management and analysis.

  16. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Full Text Available Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding, and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  17. Estimation of allele frequency and association mapping using next-generation sequencing data

    DEFF Research Database (Denmark)

    Kim, Su Yeon; Lohmueller, Kirk E; Albrechtsen, Anders

    2011-01-01

    Estimation of allele frequency is of fundamental importance in population genetic analyses and in association mapping. In most studies using next-generation sequencing, a cost effective approach is to use medium or low-coverage data (e.g., frequency estimation...

  18. Detection of a divergent variant of grapevine virus F by next-generation sequencing.

    Science.gov (United States)

    Molenaar, Nicholas; Burger, Johan T; Maree, Hans J

    2015-08-01

    The complete genome sequence of a South African isolate of grapevine virus F (GVF) is presented. It was first detected by metagenomic next-generation sequencing of field samples and validated through direct Sanger sequencing. The genome sequence of GVF isolate V5 consists of 7539 nucleotides and contains a poly(A) tail. It has a typical vitivirus genome arrangement that comprises five open reading frames (ORFs), which share only 88.96 % nucleotide sequence identity with the existing complete GVF genome sequence (JX105428).

  19. Mutation Detection in Patients with Retinal Dystrophies Using Targeted Next Generation Sequencing

    DEFF Research Database (Denmark)

    Weisschuh, Nicole; Mayer, Anja K; Strom, Tim M

    2016-01-01

    Retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different nonsyndromic and syndromic forms of RD can be attributed to mutations in more than 200 genes. Consequently, next generation sequencing...

  20. AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework.

    Directory of Open Access Journals (Sweden)

    Qi Zheng

    2016-10-01

    Full Text Available Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost's algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost.

  1. AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework.

    Science.gov (United States)

    Zheng, Qi; Grice, Elizabeth A

    2016-10-01

    Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost's algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost.
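
    The general idea of a Bayesian mapping quality for an ambiguously mapped read can be sketched as follows. This is a generic textbook-style formulation, not AlignerBoost's actual model: it assumes a uniform prior over candidate hits, takes a hypothetical list of log10 likelihoods (one per candidate alignment), and reports the Phred-scaled probability that the best hit is wrong, capped at an arbitrary value of 60.

```python
# Generic sketch of a Bayesian mapping quality for one read with several
# candidate alignments. Not AlignerBoost's model; inputs and cap are assumptions.
import math


def mapping_quality(log10_likelihoods, cap=60):
    """Phred-scaled probability that the best-scoring hit is the wrong one.

    log10_likelihoods: log10 likelihood of the read under each candidate hit.
    Assumes a uniform prior over candidates, so the posterior of the best hit
    is its likelihood divided by the sum of all candidate likelihoods.
    """
    best = max(log10_likelihoods)
    # Rescale by the best hit so its weight is exactly 1 (avoids underflow).
    weights = [10.0 ** (value - best) for value in log10_likelihoods]
    posterior_best = 1.0 / sum(weights)
    error = 1.0 - posterior_best
    if error <= 0.0:
        return cap
    return min(cap, round(-10.0 * math.log10(error)))
```

    With this formulation a uniquely mapped read receives the capped maximum, two equally good hits give a quality of about 3 (a 50% chance of error), and a second hit that is 1000 times less likely than the best gives a quality of about 30.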

  2. Bringing Next-Generation Sequencing into the Classroom through a Comparison of Molecular Biology Techniques

    Science.gov (United States)

    Bowling, Bethany; Zimmer, Erin; Pyatt, Robert E.

    2014-01-01

    Although the development of next-generation (NextGen) sequencing technologies has revolutionized genomic research and medicine, the incorporation of these topics into the classroom is challenging, given an implied high degree of technical complexity. We developed an easy-to-implement, interactive classroom activity investigating the similarities…

  3. Consistency and reproducibility of next-generation sequencing and other multigene mutational assays: A worldwide ring trial study on quantitative cytological molecular reference specimens.

    Science.gov (United States)

    Malapelle, Umberto; Mayo-de-Las-Casas, Clara; Molina-Vila, Miguel A; Rosell, Rafael; Savic, Spasenija; Bihl, Michel; Bubendorf, Lukas; Salto-Tellez, Manuel; de Biase, Dario; Tallini, Giovanni; Hwang, David H; Sholl, Lynette M; Luthra, Rajyalakshmi; Weynand, Birgit; Vander Borght, Sara; Missiaglia, Edoardo; Bongiovanni, Massimo; Stieber, Daniel; Vielh, Philippe; Schmitt, Fernando; Rappa, Alessandra; Barberis, Massimo; Pepe, Francesco; Pisapia, Pasquale; Serra, Nicola; Vigliar, Elena; Bellevicine, Claudio; Fassan, Matteo; Rugge, Massimo; de Andrea, Carlos E; Lozano, Maria D; Basolo, Fulvio; Fontanini, Gabriella; Nikiforov, Yuri E; Kamel-Reid, Suzanne; da Cunha Santos, Gilda; Nikiforova, Marina N; Roy-Chowdhuri, Sinchita; Troncone, Giancarlo

    2017-08-01

    Molecular testing of cytological lung cancer specimens includes, beyond epidermal growth factor receptor (EGFR), emerging predictive/prognostic genomic biomarkers such as Kirsten rat sarcoma viral oncogene homolog (KRAS), neuroblastoma RAS viral [v-ras] oncogene homolog (NRAS), B-Raf proto-oncogene, serine/threonine kinase (BRAF), and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α (PIK3CA). Next-generation sequencing (NGS) and other multigene mutational assays are suitable for cytological specimens, including smears. However, the current literature reflects single-institution studies rather than multicenter experiences. Quantitative cytological molecular reference slides were produced with cell lines designed to harbor concurrent mutations in the EGFR, KRAS, NRAS, BRAF, and PIK3CA genes at various allelic ratios, including low allele frequencies (AFs; 1%). This interlaboratory ring trial study included 14 institutions across the world that performed multigene mutational assays, from tissue extraction to data analysis, on these reference slides, with each laboratory using its own mutation analysis platform and methodology. All laboratories using NGS (n = 11) successfully detected the study's set of mutations with minimal variations in the means and standard errors of variant fractions at dilution points of 10% (P = .171) and 5% (P = .063) despite the use of different sequencing platforms (Illumina, Ion Torrent/Proton, and Roche). However, when mutations at a low AF of 1% were analyzed, the concordance of the NGS results was low, and this reflected the use of different thresholds for variant calling among the institutions. In contrast, laboratories using matrix-assisted laser desorption/ionization-time of flight (n = 2) showed lower concordance in terms of mutation detection and mutant AF quantification. Quantitative molecular reference slides are a useful tool for monitoring the performance of different multigene mutational

  4. Targeted next generation sequencing in Italian patients with Usher syndrome: phenotype-genotype correlations.

    Science.gov (United States)

    Eandi, Chiara M; Dallorto, Laura; Spinetta, Roberta; Micieli, Maria Pia; Vanzetti, Mario; Mariottini, Alessandro; Passerini, Ilaria; Torricelli, Francesca; Alovisi, Camilla; Marchese, Cristiana

    2017-11-15

    We report results of DNA analysis with next generation sequencing (NGS) of 21 consecutive Italian patients from 17 unrelated families with clinical diagnosis of Usher syndrome (4 USH1 and 17 USH2) searching for mutations in 11 genes: MYO7A, CDH23, PCDH15, USH1C, USH1G, USH2A, ADGRV1, DFNB31, CLRN1, PDZD7, HARS. Likely causative mutations were found in all patients: 25 pathogenic variants, 18 previously reported and 7 novel, were identified in three genes (USH2A, MYO7A, ADGRV1). All USH1 patients presented biallelic MYO7A mutations, one USH2 patient exhibited ADGRV1 mutations, whereas 16 USH2 patients displayed USH2A mutations. USH1 patients experienced hearing problems very early in life, followed by visual impairment at 1, 4 and 6 years. Visual symptoms were noticed at age 20 in a patient with homozygous novel MYO7A missense mutation c.849G > A. USH2 patients' auditory symptoms, instead, arose between 11 months and 14 years, while visual impairment occurred later on. A homozygous c.5933_5940del;5950_5960dup in USH2A was detected in one patient with early deafness. One patient with homozygous deletion from exon 23 to 32 in USH2A suffered early visual symptoms. Therefore, the type of mutation in USH2A and MYO7A genes seems to affect the age at which both auditory and visual impairment occur in patients with USH.

  5. Evaluating multiplexed next-generation sequencing as a method in palynology for mixed pollen samples.

    Science.gov (United States)

    Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I

    2015-03-01

    The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  6. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Science.gov (United States)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  7. A case study for cloud based high throughput analysis of NGS data using the globus genomics system

    Directory of Open Access Journals (Sweden)

    Krithika Bhuvaneshwar

    2015-01-01

    Full Text Available Next generation sequencing (NGS) technologies produce massive amounts of data requiring a powerful computational infrastructure, high quality bioinformatics software, and skilled personnel to operate the tools. We present a case study of a practical solution to this data management and analysis challenge that simplifies terabyte scale data handling and provides advanced tools for NGS data analysis. These capabilities are implemented using the “Globus Genomics” system, which is an enhanced Galaxy workflow system made available as a service that offers users the capability to process and transfer data easily, reliably and quickly to address end-to-end NGS analysis requirements. The Globus Genomics system is built on Amazon's cloud computing infrastructure. The system takes advantage of elastic scaling of compute resources to run multiple workflows in parallel and it also helps meet the scale-out analysis needs of modern translational genomics research.

  8. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

    Directory of Open Access Journals (Sweden)

    Alexandre G. de Brevern

    2015-01-01

    Full Text Available Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

  9. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

    Science.gov (United States)

    de Brevern, Alexandre G.; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain

    2015-01-01

    Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries. PMID:26125026

  10. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    International Nuclear Information System (INIS)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki

    2015-01-01

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  11. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki, E-mail: maru@kuhp.kyoto-u.ac.jp [Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507 (Japan)

    2015-06-15

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  12. Low-pass whole-genome sequencing in clinical cytogenetics

    DEFF Research Database (Denmark)

    Dong, Zirui; Zhang, Jun; Hu, Ping

    2016-01-01

    Purpose: Chromosomal microarray analysis is the gold standard for copy-number variant (CNV) detection in prenatal and postnatal diagnosis. We aimed to determine whether next-generation sequencing (NGS) technology could be an alternative method for CNV detection in routine clinical application. Me...

  13. Use of next-generation sequencing in oral cavity cancer

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Kruse, Torben A; Thomassen, Mads

    Background: Oral cavity cancer is a subgroup of head and neck cancer, which is the world’s 6th most common cancer form. Oral squamous cell carcinomas (OSCC) constitute almost all oral cavity cancers, and OSCC are primarily attributed to excessive alcohol consumption and tobacco exposure...... of tumour cells exists. Conclusions: Use of next generation sequencing in oral cavity cancer can give valuable insight into the biology of the disease. By investigating intra tumour heterogeneity we see that the different tumour specimens in each patient are quite homogenous, but evidence of heterogeneous...

  14. Association testing for next-generation sequencing data using score statistics

    DEFF Research Database (Denmark)

    Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

    2012-01-01

    computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...... of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach...... to association testing for next-generation sequencing data. The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes. This strategy remains...

  15. SRComp: short read sequence compression using burstsort and Elias omega coding.

    Directory of Open Access Journals (Sweden)

    Jeremy John Selva

    Next-generation sequencing (NGS) technologies permit the rapid production of vast amounts of data at low cost. Economical data storage and transmission hence becomes an increasingly important challenge for NGS experiments. In this paper, we introduce a new non-reference based read sequence compression tool called SRComp. It works by first employing a fast string-sorting algorithm called burstsort to sort read sequences in lexicographical order and then Elias omega-based integer coding to encode the sorted read sequences. SRComp has been benchmarked on four large NGS datasets, where experimental results show that it can run 5-35 times faster than current state-of-the-art read sequence compression tools such as BEETL and SCALCE, while retaining comparable compression efficiency for large collections of short read sequences. SRComp is a read sequence compression tool that is particularly valuable in certain applications where compression time is of major concern.
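
    The two steps named in the abstract, lexicographic sorting followed by Elias omega integer coding, can be illustrated with a small sketch. This is not SRComp's actual scheme: Python's built-in sorted() stands in for burstsort, and the choice to omega-code the length of the prefix each read shares with its predecessor (plus one, since omega codes cover integers from 1 upward) is an assumption made purely for illustration. The helper names shared_prefix_len and encode_sorted_reads are hypothetical.

```python
def elias_omega_encode(n: int) -> str:
    """Return the Elias omega code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias omega coding is defined for integers >= 1")
    code = "0"                      # every code word ends with a single 0
    while n > 1:
        group = bin(n)[2:]          # binary form of n; its leading bit is always 1
        code = group + code         # prepend the group to the code built so far
        n = len(group) - 1          # next, encode the group's length minus one
    return code


def elias_omega_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one integer starting at bits[pos]; return (value, next position)."""
    n = 1
    while bits[pos] == "1":         # a leading 1 means another group follows
        width = n + 1               # the previous value gives this group's width
        n = int(bits[pos:pos + width], 2)
        pos += width
    return n, pos + 1               # step over the terminating 0


def shared_prefix_len(a: str, b: str) -> int:
    """Hypothetical helper: number of leading characters a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def encode_sorted_reads(reads: list[str]) -> list[tuple[str, str]]:
    """Illustrative only: sort reads, then store each read as
    (omega code of shared-prefix length + 1, remaining suffix)."""
    encoded = []
    previous = ""
    for read in sorted(reads):      # sorted() stands in for burstsort here
        k = shared_prefix_len(previous, read)
        encoded.append((elias_omega_encode(k + 1), read[k:]))
        previous = read
    return encoded


if __name__ == "__main__":
    print(elias_omega_encode(16))   # 10100100000
    print(encode_sorted_reads(["ACGT", "ACGA", "TTGA"]))
```

    Sorting places similar reads next to each other, so shared prefixes tend to be long and the stored suffixes short; small integers receive short omega codes, which is why a universal code of this kind suits such data.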

  16. New tool to assemble repetitive regions using next-generation sequencing data

    Science.gov (United States)

    Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz

    2017-08-01

    Next-generation sequencing techniques produce a large amount of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for existing genome assemblers. We propose a modification of the algorithm for DNA assembly, which uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. A three-layer software architecture was selected for the new implementation, in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, as well as a demo application with a web interface and the additional data, are available at the project web page: http://dnaasm.sourceforge.net.

  17. Targeted next generation sequencing for molecular diagnosis of Usher syndrome.

    Science.gov (United States)

    Aparisi, María J; Aller, Elena; Fuster-García, Carla; García-García, Gema; Rodrigo, Regina; Vázquez-Manrique, Rafael P; Blanco-Kelly, Fiona; Ayuso, Carmen; Roux, Anne-Françoise; Jaijo, Teresa; Millán, José M

    2014-11-18

    Usher syndrome is an autosomal recessive disease that associates sensorineural hearing loss, retinitis pigmentosa and, in some cases, vestibular dysfunction. It is clinically and genetically heterogeneous. To date, 10 genes have been associated with the disease, making molecular diagnosis based on Sanger sequencing expensive and time-consuming. Consequently, the aim of the present study was to develop a molecular diagnostics method for Usher syndrome, based on targeted next generation sequencing. A custom HaloPlex panel for Illumina platforms was designed to capture all exons of the 10 known causative Usher syndrome genes (MYO7A, USH1C, CDH23, PCDH15, USH1G, CIB2, USH2A, GPR98, DFNB31 and CLRN1), the two Usher syndrome-related genes (HARS and PDZD7) and the two candidate genes VEZT and MYO15A. A cohort of 44 patients suffering from Usher syndrome was selected for this study. This cohort was divided into two groups: a test group of 11 patients with known mutations and another group of 33 patients with unknown mutations. Forty USH patients were successfully sequenced: 8 USH patients from the test group and 32 patients from the group composed of USH patients without genetic diagnosis. We were able to detect biallelic mutations in one USH gene in 22 out of 32 USH patients (68.75%) and to identify 79.7% of the expected mutated alleles. Fifty-three different mutations were detected. These mutations included 21 missense, 8 nonsense, 9 frameshifts, 9 intronic mutations and 6 large rearrangements. Targeted next generation sequencing allowed us to detect both point mutations and large rearrangements in a single experiment, minimizing the economic cost of the study, increasing the detection ratio of the genetic cause of the disease and improving the genetic diagnosis of Usher syndrome patients.

  18. Analysis of selected genes associated with cardiomyopathy by next-generation sequencing.

    Science.gov (United States)

    Szabadosova, Viktoria; Boronova, Iveta; Ferenc, Peter; Tothova, Iveta; Bernasovska, Jarmila; Zigova, Michaela; Kmec, Jan; Bernasovsky, Ivan

    2018-02-01

    As the leading cause of congestive heart failure, cardiomyopathy represents a heterogenous group of heart muscle disorders. Despite considerable progress being made in the genetic diagnosis of cardiomyopathy by detection of the mutations in the most prevalent cardiomyopathy genes, the cause remains unsolved in many patients. High-throughput mutation screening in the disease genes for cardiomyopathy is now possible because of using target enrichment followed by next-generation sequencing. The aim of the study was to analyze a panel of genes associated with dilated or hypertrophic cardiomyopathy based on previously published results in order to identify the subjects at risk. The method of next-generation sequencing by IlluminaHiSeq 2500 platform was used to detect sequence variants in 16 individuals diagnosed with dilated or hypertrophic cardiomyopathy. Detected variants were filtered and the functional impact of amino acid changes was predicted by computational programs. DNA samples of the 16 patients were analyzed by whole exome sequencing. We identified six nonsynonymous variants that were shown to be pathogenic in all used prediction softwares: rs3744998 (EPG5), rs11551768 (MGME1), rs148374985 (MURC), rs78461695 (PLEC), rs17158558 (RET) and rs2295190 (SYNE1). Two of the analyzed sequence variants had minor allele frequency (MAF)MURC), rs34580776 (MYBPC3). Our data support the potential role of the detected variants in pathogenesis of dilated or hypertrophic cardiomyopathy; however, the possibility that these variants might not be true disease-causing variants but are susceptibility alleles that require additional mutations or injury to cause the clinical phenotype of disease must be considered. © 2017 Wiley Periodicals, Inc.

  19. Next-generation sequencing and FISH studies reveal the appearance of gene mutations and chromosomal abnormalities in hematopoietic progenitors in chronic lymphocytic leukemia

    Directory of Open Access Journals (Sweden)

    Miguel Quijada-Álamo

    2017-04-01

    Background: Chronic lymphocytic leukemia (CLL) is a highly genetically heterogeneous disease. Although CLL has been traditionally considered as a mature B cell leukemia, few independent studies have shown that the genetic alterations may appear in CD34+ hematopoietic progenitors. However, the presence of both chromosomal aberrations and gene mutations in CD34+ cells from the same patients has not been explored. Methods: Amplicon-based deep next-generation sequencing (NGS) studies were carried out in magnetically activated-cell-sorting separated CD19+ mature B lymphocytes and CD34+ hematopoietic progenitors (n = 56) to study the mutational status of TP53, NOTCH1, SF3B1, FBXW7, MYD88, and XPO1 genes. In addition, ultra-deep NGS was performed in a subset of seven patients to determine the presence of mutations in flow-sorted CD34+CD19− early hematopoietic progenitors. Fluorescence in situ hybridization (FISH) studies were performed in the CD34+ cells from nine patients of the cohort to examine the presence of cytogenetic abnormalities. Results: NGS studies revealed a total of 28 mutations in 24 CLL patients. Interestingly, 15 of them also showed the same mutations in their corresponding whole population of CD34+ progenitors. The majority of NOTCH1 (7/9) and XPO1 (4/4) mutations presented a similar mutational burden in both cell fractions; by contrast, mutations of TP53 (2/2), FBXW7 (2/2), and SF3B1 (3/4) showed lower mutational allele frequencies, or even none, in the CD34+ cells compared with the CD19+ population. Ultra-deep NGS confirmed the presence of FBXW7, MYD88, NOTCH1, and XPO1 mutations in the subpopulation of CD34+CD19− early hematopoietic progenitors (6/7). Furthermore, FISH studies showed the presence of 11q and 13q deletions (2/2 and 3/5, respectively) in CD34+ progenitors but the absence of IGH cytogenetic alterations (0/2) in the CD34+ cells. Combining all the results from NGS and FISH, a model of the appearance and expansion of

  20. Genome sequence analysis with MonetDB - A case study on Ebola virus diversity

    NARCIS (Netherlands)

    Cijvat, R.; Manegold, S.; Kersten, M.; Klau, G.W.; Schönhuth, A.; Marschall, T.; Zhang, Y.

    2015-01-01

    Next-generation sequencing (NGS) technology has led the life sciences into the big data era. Today, sequencing genomes takes little time and cost, but yields terabytes of data to be stored and analyzed. Biologists are often exposed to excessively time consuming and error-prone data management and

  1. Microbial Community Profiling of Human Saliva Using Shotgun Metagenomic Sequencing

    OpenAIRE

    Hasan, Nur A.; Young, Brian A.; Minard-Smith, Angela T.; Saeed, Kelly; Li, Huai; Heizer, Esley M.; McMillan, Nancy J.; Isom, Richard; Abdullah, Abdul Shakur; Bornman, Daniel M.; Faith, Seth A.; Choi, Seon Young; Dickens, Michael L.; Cebula, Thomas A.; Colwell, Rita R.

    2014-01-01

    Human saliva is clinically informative of both oral and general health. Since next generation shotgun sequencing (NGS) is now widely used to identify and quantify bacteria, we investigated the bacterial flora of saliva microbiomes of two healthy volunteers and five datasets from the Human Microbiome Project, along with a control dataset containing short NGS reads from bacterial species representative of the bacterial flora of human saliva. GENIUS, a system designed to identify and quantify ba...

  2. ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data.

    Science.gov (United States)

    Salehi, Sohrab; Steif, Adi; Roth, Andrew; Aparicio, Samuel; Bouchard-Côté, Alexandre; Shah, Sohrab P

    2017-03-01

    Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.

  3. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  4. NGS testing for cardiomyopathy: Utility of adding RASopathy-associated genes.

    Science.gov (United States)

    Ceyhan-Birsoy, Ozge; Miatkowski, Maya M; Hynes, Elizabeth; Funke, Birgit H; Mason-Suares, Heather

    2018-04-25

    RASopathies include a group of syndromes caused by pathogenic germline variants in RAS-MAPK pathway genes and typically present with facial dysmorphology, cardiovascular disease, and musculoskeletal anomalies. Recently, variants in RASopathy-associated genes have been reported in individuals with apparently nonsyndromic cardiomyopathy, suggesting that subtle features may be overlooked. To determine the utility and burden of adding RASopathy-associated genes to cardiomyopathy panels, we tested 11 RASopathy-associated genes by next-generation sequencing (NGS), including NGS-based copy number variant assessment, in 1,111 individuals referred for genetic testing for hypertrophic cardiomyopathy (HCM) or dilated cardiomyopathy (DCM). Disease-causing variants were identified in 0.6% (four of 692) of individuals with HCM, including three missense variants in the PTPN11, SOS1, and BRAF genes. Overall, 36 variants of uncertain significance (VUSs) were identified, averaging ∼3VUSs/100 cases. This study demonstrates that adding a subset of the RASopathy-associated genes to cardiomyopathy panels will increase clinical diagnoses without significantly increasing the number of VUSs/case. © 2018 Wiley Periodicals, Inc.

  5. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit

    Science.gov (United States)

    Daoud, Hussein; Luco, Stephanie M.; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M.; Graham, Gail E.; Richer, Julie; Armour, Christine; Bulman, Dennis E.; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A.; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M.; Dyment, David A.

    2016-01-01

    Background: Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. Methods: We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children’s Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype–phenotype correlations. Results: Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys–Drash syndrome. Interpretation: This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. PMID:27241786

  6. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit.

    Science.gov (United States)

    Daoud, Hussein; Luco, Stephanie M; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M; Graham, Gail E; Richer, Julie; Armour, Christine; Bulman, Dennis E; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M; Dyment, David A

    2016-08-09

    Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children's Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype-phenotype correlations. Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys-Drash syndrome. This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. © 2016 Canadian Medical Association or its licensors.

  7. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE).

    Science.gov (United States)

    Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja

    2014-01-01

    Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. Database URL: BioMuta: http

  8. Prescreening of microbial populations for the assessment of sequencing potential.

    Science.gov (United States)

    Hanning, Irene B; Ricke, Steven C

    2011-01-01

    Next-generation sequencing (NGS) is a powerful tool that can be utilized to profile and compare microbial populations. By amplifying a target gene present in all bacteria and subsequently sequencing amplicons, the bacterial genera present in the populations can be identified and compared. In some scenarios, little to no difference may exist among the microbial populations being compared, in which case a prescreening method would be practical to determine which microbial populations would be suitable for further analysis by NGS. Denaturing gradient gel electrophoresis (DGGE) is cheaper than NGS, and the data comparing microbial populations are ready to be viewed immediately after electrophoresis. DGGE follows essentially the same initial methodology as NGS by targeting and amplifying the 16S rRNA gene. However, as opposed to sequencing amplicons, DGGE amplicons are analyzed by electrophoresis. By prescreening microbial populations with DGGE, more efficient use of NGS methods can be accomplished. In this chapter, we outline the protocol for DGGE targeting the same gene (16S rRNA) that would be targeted for NGS to compare and determine differences in microbial populations from a wide range of ecosystems.

  9. Deep sequencing methods for protein engineering and design.

    Science.gov (United States)

    Wrenbeck, Emily E; Faber, Matthew S; Whitehead, Timothy A

    2017-08-01

    The advent of next-generation sequencing (NGS) has revolutionized protein science, and the development of complementary methods enabling NGS-driven protein engineering has followed. In general, these experiments address the functional consequences of thousands of protein variants in a massively parallel manner using genotype-phenotype linked high-throughput functional screens followed by DNA counting via deep sequencing. We highlight the use of information-rich datasets to engineer protein molecular recognition. Examples include the creation of multiple dual-affinity Fabs targeting structurally dissimilar epitopes and engineering of a broad germline-targeted anti-HIV-1 immunogen. Additionally, we highlight the generation of enzyme fitness landscapes for conducting fundamental studies of protein behavior and evolution. We conclude with discussion of technological advances. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Mapping Breakpoints of Complex Chromosome Rearrangements Involving a Partial Trisomy 15q23.1-q26.2 Revealed by Next Generation Sequencing and Conventional Techniques.

    Directory of Open Access Journals (Sweden)

    Qiong Pan

    Complex chromosome rearrangements (CCRs), which are rather rare in the whole population, may be associated with aberrant phenotypes. Next-generation sequencing (NGS) and conventional techniques could be used to reveal specific CCRs for better genetic counseling. We report the CCRs of a girl and her mother, which were identified using a combination of NGS and conventional techniques including G-banding, fluorescence in situ hybridization (FISH) and PCR. The girl demonstrated CCRs involving chromosomes 3 and 8, while the CCRs of her mother involved chromosomes 3, 5, 8, 11 and 15. HumanCytoSNP-12 Chip analysis identified a 35.4 Mb duplication on chromosome 15q21.3-q26.2 in the proband and a 1.6 Mb microdeletion at chromosome 15q21.3 in her mother. The proband inherited the rearranged chromosomes 3 and 8 from her mother, and the duplicated region on chromosome 15 of the proband was inherited from the mother. Approximately one hundred genes were identified in the 15q21.3-q26.2 duplicated region of the proband. In particular, TPM1, SMAD6, SMAD3, and HCN4 may be associated with her heart defects, and HEXA, KIF7, and IDH2 are responsible for her developmental and mental retardation. In addition, we suggest that a microdeletion on the 15q21.3 region of the mother, which involved TCF2, TCF12, ADMA10 and AQP9, might be associated with mental retardation. We delineate the precise structures of the derivative chromosomes, chromosome duplication origin and possible molecular mechanisms for aberrant phenotypes by combining NGS data with conventional techniques.

  11. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    Science.gov (United States)

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
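
    As a rough illustration of the two statistical tools named in this abstract, the sketch below computes a Bland-Altman bias with 95% limits of agreement and a Deming regression line for paired measurements. It uses the textbook formulas, not the authors' code. The function names, the default error-variance ratio delta = 1 (errors assumed equal in both assays) and the multiplier 1.96 are assumptions chosen for the example.

```python
import math
import statistics


def bland_altman(x, y, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the differences y - x (an estimate of constant error);
    the limits of agreement are bias +/- z times the SD of the differences.
    """
    diffs = [b - a for a, b in zip(x, y)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


def deming_regression(x, y, delta=1.0):
    """Return (slope, intercept) of the Deming regression of y on x.

    delta is the assumed ratio of the error variance in y to that in x.
    A slope away from 1 suggests proportional error; an intercept away
    from 0 suggests constant error. Requires a non-zero covariance.
    """
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_xx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    term = s_yy - delta * s_xx
    slope = (term + math.sqrt(term ** 2 + 4 * delta * s_xy ** 2)) / (2 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    reference = [10.0, 20.0, 30.0, 40.0]   # made-up values from a validated assay
    candidate = [12.0, 21.0, 33.0, 42.0]   # made-up values from the assay under test
    print(bland_altman(reference, candidate))
    print(deming_regression(reference, candidate))
```

    Unlike simple linear regression, Deming regression allows for measurement error in both assays, which is why it suits assay-versus-assay comparisons, while ordinary least squares is adequate when the x values (for example, nominal dilution levels) are taken as known.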

  12. Combining information from linkage and association mapping for next-generation sequencing longitudinal family data.

    Science.gov (United States)

    Balliu, Brunilda; Uh, Hae-Won; Tsonaka, Roula; Boehringer, Stefan; Helmer, Quinta; Houwing-Duistermaat, Jeanine J

    2014-01-01

    In this analysis, we investigate the contributions that linkage-based methods, such as identical-by-descent mapping, can make to association mapping to identify rare variants in next-generation sequencing data. First, we identify regions in which cases share more segments identical-by-descent around a putative causal variant than do controls. Second, we use a two-stage mixed-effect model approach to summarize the single-nucleotide polymorphism data within each region and include them as covariates in the model for the phenotype. We assess the impact of linkage disequilibrium in determining identical-by-descent states between individuals by using markers with and without linkage disequilibrium for the first part and the impact of imputation in testing for association by using imputed genome-wide association studies or raw sequence markers for the second part. We apply the method to next-generation sequencing longitudinal family data from Genetic Association Workshop 18 and identify a significant region at chromosome 3: 40249244-41025167 (p-value = 2.3 × 10⁻³).

  13. Approaches for in silico finishing of microbial genome sequences

    Directory of Open Access Journals (Sweden)

    Frederico Schmitt Kremer

    Full Text Available The introduction of next-generation sequencing (NGS) had a significant effect on the availability of genomic information, leading to an increase in the number of sequenced genomes from a large spectrum of organisms. Unfortunately, due to the limitations implied by the short-read sequencing platforms, most of these newly sequenced genomes remained as “drafts”, incomplete representations of the whole genetic content. The previous genome sequencing studies indicated that finishing a genome sequenced by NGS, even bacteria, may require additional sequencing to fill the gaps, making the entire process very expensive. As such, several in silico approaches have been developed to optimize the genome assemblies and facilitate the finishing process. The present review aims to explore some free (open source, in many cases) tools that are available to facilitate genome finishing.

  14. Approaches for in silico finishing of microbial genome sequences.

    Science.gov (United States)

    Kremer, Frederico Schmitt; McBride, Alan John Alexander; Pinto, Luciano da Silva

    The introduction of next-generation sequencing (NGS) had a significant effect on the availability of genomic information, leading to an increase in the number of sequenced genomes from a large spectrum of organisms. Unfortunately, due to the limitations implied by the short-read sequencing platforms, most of these newly sequenced genomes remained as "drafts", incomplete representations of the whole genetic content. The previous genome sequencing studies indicated that finishing a genome sequenced by NGS, even bacteria, may require additional sequencing to fill the gaps, making the entire process very expensive. As such, several in silico approaches have been developed to optimize the genome assemblies and facilitate the finishing process. The present review aims to explore some free (open source, in many cases) tools that are available to facilitate genome finishing.

  15. Unveiling distribution patterns of freshwater phytoplankton by a next generation sequencing based approach

    Czech Academy of Sciences Publication Activity Database

    Eiler, A.; Drakare, S.; Bertilsson, S.; Pernthaler, J.; Peura, S.; Rofner, C.; Šimek, Karel; Yang, Y.; Znachor, Petr; Lindström, E.S.

    2013-01-01

    Vol. 8, No. 1 (2013), e53516 E-ISSN 1932-6203 R&D Projects: GA ČR(CZ) GA206/08/0015 Institutional support: RVO:60077344 Keywords: phytoplankton * next generation sequencing * diversity Subject RIV: EE - Microbiology, Virology Impact factor: 3.534, year: 2013

  16. ReQON: a Bioconductor package for recalibrating quality scores from next-generation sequencing data

    Directory of Open Access Journals (Sweden)

    Cabanski Christopher R

    2012-09-01

    Full Text Available BACKGROUND: Next-generation sequencing technologies have become important tools for genome-wide studies. However, the quality scores that are assigned to each base have been shown to be inaccurate. If the quality scores are used in downstream analyses, these inaccuracies can have a significant impact on the results. RESULTS: Here we present ReQON, a tool that recalibrates the base quality scores from an input BAM file of aligned sequencing data using logistic regression. ReQON also generates diagnostic plots showing the effectiveness of the recalibration. We show that ReQON produces quality scores that are both more accurate, in the sense that they more closely correspond to the probability of a sequencing error, and do a better job of discriminating between sequencing errors and non-errors than the original quality scores. We also compare ReQON to other available recalibration tools and show that ReQON is less biased and performs favorably in terms of quality score accuracy. CONCLUSION: ReQON is an open source software package, written in R and available through Bioconductor, for recalibrating base quality scores for next-generation sequencing data. ReQON produces a new BAM file with more accurate quality scores, which can improve the results of downstream analysis, and produces several diagnostic plots showing the effectiveness of the recalibration.
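
    The notion of a quality score that "corresponds to the probability of a sequencing error" rests on the standard Phred relation Q = -10 log10(p). The sketch below shows that conversion and a simple calibration check that compares each reported score with the empirically observed error rate. It does not reproduce ReQON's logistic regression, and the function names and the `(reported_q, is_error)` input format are assumptions made for illustration.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score implied by an error probability."""
    return -10 * math.log10(p)


def empirical_quality(observations):
    """Map each reported score to the Phred score of its observed error rate.

    observations: iterable of (reported_q, is_error) pairs, for example from
    comparing aligned bases with a trusted reference. Scores with no observed
    errors are skipped because their empirical Phred value is unbounded.
    """
    counts = {}
    for q, is_error in observations:
        errors, total = counts.get(q, (0, 0))
        counts[q] = (errors + int(is_error), total + 1)
    return {q: error_prob_to_phred(e / t) for q, (e, t) in counts.items() if e > 0}


# Hypothetical case: bases reported as Q30 are wrong 1 time in 100, so the
# empirical quality is about Q20 and the reported scores are over-confident.
obs = [(30, True)] + [(30, False)] * 99
print(empirical_quality(obs))
```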

  17. "Shovel-ready" Sequences as a Stimulus for the Next Generation of Life Scientists.

    Science.gov (United States)

    Boyle, Michael D

    2010-01-01

    Genomics and bioinformatics are dynamic fields well-suited for capturing the imagination of undergraduates in both research laboratories and classrooms. Currently, raw nucleotide sequence is being provided, as part of several genomics research initiatives, for undergraduate research and teaching. These initiatives could be easily extended and much more effective if the source of the sequenced material and the subsequent focus of the data analysis were aligned with the research interests of individual faculty at undergraduate institutions. By judicious use of surplus capacity in existing nucleotide sequencing cores, raw sequence data could be generated to support ongoing research efforts involving undergraduates. This would allow these students to participate actively in discovery research, with a goal of making novel contributions to their field through original research while nurturing the next generation of talented research scientists.

  18. Viral Diagnostics in Plants Using Next Generation Sequencing: Computational Analysis in Practice

    Directory of Open Access Journals (Sweden)

    Susan Jones

    2017-10-01

    Full Text Available Viruses cause significant yield and quality losses in a wide variety of cultivated crops. Hence, the detection and identification of viruses is a crucial facet of successful crop production and of great significance in terms of world food security. Whilst the adoption of molecular techniques such as RT-PCR has increased the speed and accuracy of viral diagnostics, such techniques only allow the detection of known viruses, i.e., each test is specific to one or a small number of related viruses. Therefore, unknown viruses can be missed and testing can be slow and expensive if molecular tests are unavailable. Methods for simultaneous detection of multiple viruses have been developed, and next generation sequencing (NGS) is now a principal focus of this area, as it enables unbiased and hypothesis-free testing of plant samples. The development of NGS protocols capable of detecting multiple known and emergent viruses present in infected material is proving to be a major advance for crops, nuclear stocks or imported plants and germplasm, in which disease symptoms are absent, unspecific or only triggered by multiple viruses. Researchers want to answer the question “how many different viruses are present in this crop plant?” without knowing what they are looking for: RNA-sequencing (RNA-seq) of plant material allows this question to be addressed. As well as needing efficient nucleic acid extraction and enrichment protocols, virus detection using RNA-seq requires fast and robust bioinformatics methods to enable host sequence removal and virus classification. In this review recent studies that use RNA-seq for virus detection in a variety of crop plants are discussed with specific emphasis on the computational methods implemented. The main features of a number of specific bioinformatics workflows developed for virus detection from NGS data are also outlined and possible reasons why these have not yet been widely adopted are discussed. The review concludes by discussing the future

  19. Viral Diagnostics in Plants Using Next Generation Sequencing: Computational Analysis in Practice.

    Science.gov (United States)

    Jones, Susan; Baizan-Edge, Amanda; MacFarlane, Stuart; Torrance, Lesley

    2017-01-01

    Viruses cause significant yield and quality losses in a wide variety of cultivated crops. Hence, the detection and identification of viruses is a crucial facet of successful crop production and of great significance in terms of world food security. Whilst the adoption of molecular techniques such as RT-PCR has increased the speed and accuracy of viral diagnostics, such techniques only allow the detection of known viruses, i.e., each test is specific to one or a small number of related viruses. Therefore, unknown viruses can be missed and testing can be slow and expensive if molecular tests are unavailable. Methods for simultaneous detection of multiple viruses have been developed, and next generation sequencing (NGS) is now a principal focus of this area, as it enables unbiased and hypothesis-free testing of plant samples. The development of NGS protocols capable of detecting multiple known and emergent viruses present in infected material is proving to be a major advance for crops, nuclear stocks or imported plants and germplasm, in which disease symptoms are absent, unspecific or only triggered by multiple viruses. Researchers want to answer the question "how many different viruses are present in this crop plant?" without knowing what they are looking for: RNA-sequencing (RNA-seq) of plant material allows this question to be addressed. As well as needing efficient nucleic acid extraction and enrichment protocols, virus detection using RNA-seq requires fast and robust bioinformatics methods to enable host sequence removal and virus classification. In this review recent studies that use RNA-seq for virus detection in a variety of crop plants are discussed with specific emphasis on the computational methods implemented. The main features of a number of specific bioinformatics workflows developed for virus detection from NGS data are also outlined and possible reasons why these have not yet been widely adopted are discussed. The review concludes by discussing the future directions of this

  20. NeSSM: a Next-generation Sequencing Simulator for Metagenomics.

    Directory of Open Access Journals (Sweden)

    Ben Jia

    Full Text Available BACKGROUND: Metagenomics can reveal the vast majority of microbes that have been missed by traditional cultivation-based methods. Due to its extremely wide range of application areas, fast metagenome sequencing simulation systems with high fidelity are in great demand to facilitate the development and comparison of metagenomics analysis tools. RESULTS: We present here a customizable metagenome simulation system: NeSSM (Next-generation Sequencing Simulator for Metagenomics). Combining complete genomes currently available, a community composition table, and sequencing parameters, it can simulate metagenome sequencing better than existing systems. Sequencing error models based on the explicit distribution of errors at each base and sequencing coverage bias are incorporated in the simulation. In order to improve the fidelity of simulation, tools are provided by NeSSM to estimate the sequencing error models, sequencing coverage bias and the community composition directly from existing metagenome sequencing data. Currently, NeSSM supports single-end and paired-end sequencing for both 454 and Illumina platforms. In addition, a GPU (graphics processing unit) version of NeSSM is also developed to accelerate the simulation. By comparing the simulated sequencing data from NeSSM with experimental metagenome sequencing data, we have demonstrated that NeSSM performs better in many aspects than existing popular metagenome simulators, such as MetaSim, GemSIM and Grinder. The GPU version of NeSSM is more than one order of magnitude faster than MetaSim. CONCLUSIONS: NeSSM is a fast simulation system for high-throughput metagenome sequencing. It can be helpful to develop tools and evaluate strategies for metagenomics analysis and it's freely available for academic users at http://cbb.sjtu.edu.cn/~ccwei/pub/software/NeSSM.php.
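
    The three inputs listed in this abstract (genomes, a community composition table, and sequencing parameters) can be illustrated with a toy single-end read simulator. This is not NeSSM's algorithm or interface: the function name, the uniform substitution error model, and the uniform choice of read start positions are simplifying assumptions, whereas the abstract describes position-specific error distributions and coverage bias.

```python
import random


def simulate_reads(genomes, abundance, n_reads, read_len, error_rate, seed=0):
    """Draw single-end reads from a mock community.

    genomes:   dict mapping genome name to sequence
    abundance: dict mapping genome name to relative abundance
    Each base is substituted with probability error_rate (uniform model).
    Returns a list of (source_genome_name, read_sequence) tuples.
    """
    rng = random.Random(seed)
    names = list(abundance)
    weights = [abundance[name] for name in names]
    reads = []
    for _ in range(n_reads):
        # Pick the source genome in proportion to its abundance.
        name = rng.choices(names, weights=weights)[0]
        genome = genomes[name]
        start = rng.randrange(len(genome) - read_len + 1)
        bases = []
        for base in genome[start:start + read_len]:
            if rng.random() < error_rate:
                base = rng.choice([b for b in "ACGT" if b != base])
            bases.append(base)
        reads.append((name, "".join(bases)))
    return reads


# Hypothetical two-member community with a 3:1 abundance ratio.
community = {"genome_a": "ACGT" * 25, "genome_b": "GGCCTTAA" * 12}
reads = simulate_reads(community, {"genome_a": 0.75, "genome_b": 0.25},
                       n_reads=5, read_len=20, error_rate=0.01)
for source, seq in reads:
    print(source, seq)
```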

  1. Masking as an effective quality control method for next-generation sequencing data analysis.

    Science.gov (United States)

    Yun, Sajung; Yun, Sijung

    2014-12-13

    Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method affected the false-negative rate in SNP calling with statistical significance compared to the data analysis without preprocessing. False-positive rate and false-negative rate for small insertions and deletions did not show differences between masking and trimming. We recommend masking over trimming as a more effective preprocessing method for next generation sequencing data analysis, since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
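
    The difference between the two preprocessing methods can be shown in a few lines. This is a minimal sketch, not the authors' Perl script: the function names, the threshold of Q20, and the choice to trim only from the 3' end are assumptions made for illustration.

```python
def mask_read(seq, quals, min_q=20):
    """Replace every base whose quality is below min_q with 'N'.

    The read keeps its original length, so base positions are preserved.
    """
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


def trim_read(seq, quals, min_q=20):
    """Remove the run of low-quality bases at the 3' end of the read.

    The read becomes shorter; low-quality bases inside the read are kept.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]


seq = "ACGTACGT"
quals = [30, 30, 10, 30, 30, 30, 5, 8]
print(mask_read(seq, quals))  # ACNTACNN
print(trim_read(seq, quals))  # ACGTAC
```

    In this example masking also hides the low-quality internal base at position 3, which 3'-end trimming leaves in place.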

  2. Experience of targeted Usher exome sequencing as a clinical test

    Science.gov (United States)

    Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2014-01-01

    We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes, which is laborious to achieve with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered a pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627

  3. Exome-wide DNA capture and next generation sequencing in domestic and wild species

    Directory of Open Access Journals (Sweden)

    Ng Sarah B

    2011-07-01

    Full Text Available BACKGROUND: Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. RESULTS: We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. CONCLUSIONS: This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.

  4. Exome-wide DNA capture and next generation sequencing in domestic and wild species.

    Science.gov (United States)

    Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

    2011-07-05

    Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.

  5. Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data

    OpenAIRE

    Hu, Bo; Ji, Yuan; Xu, Yaomin; Ting, Angela H

    2013-01-01

    Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multip...

  6. Next-generation sequencing using a pre-designed gene panel for the molecular diagnosis of congenital disorders in pediatric patients.

    Science.gov (United States)

    Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo

    2015-12-14

    Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this Haloplex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome

  7. Charcot-Marie-Tooth disease: The development of a diagnostic platform using next generation sequencing

    DEFF Research Database (Denmark)

    Christensen, Rikke; Væth, Signe; Thorsen, Kasper

    , Sanger sequencing of 4 genes have led to a diagnosis in approximately 30% of the patients. Aims: 1) Development of a targeted NGS platform containing 63 genes that currently are found to be associated with CMT. 2) Analysis of the increased diagnostic yield using this platform to analyze 200 CMT samples...... previously analyzed using Sanger sequencing without identification of a disease causing mutation. Materials and Methods: Libraries for 200 patient samples obtained for CMT diagnostics were prepared using Illumina Truseq and target enrichment using SeqCap EZ Choice Library (Nimblegen). The libraries were...

  8. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the

  9. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus

    Directory of Open Access Journals (Sweden)

    Wycliff M. Kinoti

    2017-06-01

    Full Text Available The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples

  10. Using next generation sequencing to tackle non-typhoidal Salmonella infections

    DEFF Research Database (Denmark)

    Wain, John; Keddy, Karen H.; Hendriksen, Rene S.

    2013-01-01

    The publication of studies using next generation sequencing to analyse large numbers of bacterial isolates from global epidemics is transforming microbiology, epidemiology and public health. The emergence of multidrug resistant Salmonella Typhimurium ST313 is one example. While the epidemiology...... in Africa appears to be human-to-human spread and the association with invasive disease almost absolute, more needs to be done to exclude the possibility of animal reservoirs and to transfer the ability to track all Salmonella infections to the laboratories in the front line. In this mini-review we...

  11. Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.

    Science.gov (United States)

    Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto

    2015-10-01

    CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  12. Single Stage Contactor Testing Of The Next Generation Solvent Blend

    Energy Technology Data Exchange (ETDEWEB)

    Herman, D. T.; Peters, T. B.; Duignan, M. R.; Williams, M. R.; Poirier, M. R.; Brass, E. A.; Garrison, A. G.; Ketusky, E. T.

    2014-01-06

    The Modular Caustic Side Solvent Extraction (CSSX) Unit (MCU) facility at the Savannah River Site (SRS) is actively pursuing the transition from the current BOBCalixC6 based solvent to the Next Generation Solvent (NGS)-MCU solvent to increase the cesium decontamination factor (DF). To support this integration of NGS into the MCU facility, the Savannah River National Laboratory (SRNL) performed testing of a blend of the NGS (MaxCalix based solvent) with the current solvent (BOBCalixC6 based solvent) for the removal of cesium (Cs) from the liquid salt waste stream. This testing utilized a blend of BOBCalixC6 based solvent and the NGS with the new extractant, MaxCalix, as well as a new suppressor, tris(3,7-dimethyloctyl) guanidine. Single stage tests were conducted using the full size V-05 and V-10 liquid-to-liquid centrifugal contactors installed at SRNL. These tests were designed to determine the mass transfer and hydraulic characteristics with the NGS solvent blended with the projected heel of the BOBCalixC6 based solvent that will exist in MCU at time of transition. The test program evaluated the amount of organic carryover and the droplet size of the organic carryover phases using several analytical methods. The results indicate that the NGS solvent performed hydraulically similarly to the current solvent, which was expected. For the organic carryover, 93% of the solvent is predicted to be recovered from the stripping operation and 96% from the extraction operation. As for the mass transfer, the NGS solvent significantly improved the cesium DF by at least an order of magnitude when extrapolating the one-stage results to actual seven-stage extraction operation with a stage efficiency of 95%.

  13. Streaming support for data intensive cloud-based sequence analysis.

    Science.gov (United States)

    Issa, Shadi A; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of "resources-on-demand" and "pay-as-you-go", scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.
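    The streaming idea in this abstract (processing reads while they are still being transferred, which works because each read can be handled independently) can be illustrated with a minimal Python sketch. This is not the elastream package or its API; the function names `iter_chunks`, `stream_fastq`, and `gc_fraction` are hypothetical, the input is assumed to be simple four-line FASTQ records, and the network transfer is simulated by slicing an in-memory byte string.

```python
def iter_chunks(data: bytes, chunk_size: int):
    """Simulate a network transfer by yielding fixed-size byte chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def stream_fastq(chunks):
    """Yield (name, sequence, quality) as soon as a full 4-line record has arrived.

    Assumes plain 4-line FASTQ records; chunk boundaries may fall anywhere.
    """
    buffer = b""
    lines = []
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            lines.append(line.decode("ascii"))
            if len(lines) == 4:
                yield lines[0][1:], lines[1], lines[3]
                lines = []
    # Handle a final record that is not terminated by a newline.
    if buffer:
        lines.append(buffer.decode("ascii"))
    if len(lines) == 4:
        yield lines[0][1:], lines[1], lines[3]


def gc_fraction(sequence: str) -> float:
    """Per-read analysis that needs no other read: fraction of G and C bases."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Hypothetical example data: two reads, transferred five bytes at a time.
example = b"@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n"
results = [(name, gc_fraction(seq))
           for name, seq, _ in stream_fastq(iter_chunks(example, 5))]
```

    Because each record is emitted as soon as its last line arrives, analysis can overlap with transfer instead of waiting for the whole file, which is the latency effect the abstract describes.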

  14. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shadi A. Issa

    2013-01-01

    Full Text Available Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  15. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Science.gov (United States)

    Issa, Shadi A.; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J.; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation. PMID:23710461

  16. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique.

    Directory of Open Access Journals (Sweden)

    Chaozheng Li

    Full Text Available BACKGROUND: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimp in the world, has attracted extensive study, which requires more and more genome background knowledge. The currently available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. METHODOLOGY/PRINCIPAL FINDINGS: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good-quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in the NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned to 51 Gene Ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the assembly and annotation results, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. CONCLUSIONS/SIGNIFICANCE: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information on L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans and promote studies on L. vannamei.

  17. Targeted next-generation sequencing reveals novel USH2A mutations associated with diverse disease phenotypes: implications for clinical and molecular diagnosis.

    Science.gov (United States)

    Chen, Xue; Sheng, Xunlun; Liu, Xiaoxing; Li, Huiping; Liu, Yani; Rong, Weining; Ha, Shaoping; Liu, Wenzhou; Kang, Xiaoli; Zhao, Kanxing; Zhao, Chen

    2014-01-01

    USH2A mutations have been implicated in the disease etiology of several inherited diseases, including Usher syndrome type 2 (USH2), nonsyndromic retinitis pigmentosa (RP), and nonsyndromic deafness. The complex genetic and phenotypic spectrums relevant to USH2A defects make it difficult to manage patients with such mutations. In the present study, we aim to determine the genetic etiology and to characterize the correlated clinical phenotypes for three Chinese pedigrees with nonsyndromic RP, one with RP sine pigmento (RPSP), and one with USH2. Family histories and clinical details for all included patients were reviewed. Ophthalmic examinations included best corrected visual acuities, visual field measurements, funduscopy, and electroretinography. Targeted next-generation sequencing (NGS) was applied using two sequence capture arrays to reveal the disease causative mutations for each family. Genotype-phenotype correlations were also annotated. Seven USH2A mutations, including four missense substitutions (p.P2762A, p.G3320C, p.R3719H, and p.G4763R), two splice site variants (c.8223+1G>A and c.8559-2T>C), and a nonsense mutation (p.Y3745*), were identified as disease causative in the five investigated families, three of which reported consanguineous marriage. Among all seven mutations, six were novel, and one was recurrent. Two homozygous missense mutations (p.P2762A and p.G3320C) were found in one individual family, suggesting a potential double hit effect. Significant phenotypic divergences were revealed among the five families. Three of the five families were affected with early, moderate, or late onset RP, one with RPSP, and the other one with USH2. Our study expands the genotypic and phenotypic variability relevant to USH2A mutations, which would help provide a clear insight into the complex genetic and phenotypic spectrums relevant to USH2A defects, and contributes to better management of patients with such mutations. We have also

  18. Harnessing NGS and Big Data Optimally: Comparison of miRNA Prediction from Assembled versus Non-assembled Sequencing Data--The Case of the Grass Aegilops tauschii Complex Genome.

    Science.gov (United States)

    Budak, Hikmet; Kantar, Melda

    2015-07-01

    MicroRNAs (miRNAs) are small, endogenous, non-coding RNA molecules that regulate gene expression at the post-transcriptional level. As high-throughput next generation sequencing (NGS) and Big Data rapidly accumulate for various species, efforts for in silico identification of miRNAs intensify. Surprisingly, the effect of the input genomics sequence on the robustness of miRNA prediction was not evaluated in detail to date. In the present study, we performed a homology-based miRNA and isomiRNA prediction of the 5D chromosome of bread wheat progenitor, Aegilops tauschii, using two distinct sequence data sets as input: (1) raw sequence reads obtained from 454-GS FLX Titanium sequencing platform and (2) an assembly constructed from these reads. We also compared this method with a number of available plant sequence datasets. We report here the identification of 62 and 22 miRNAs from raw reads and the assembly, respectively, of which 16 were predicted with high confidence from both datasets. While raw reads promoted sensitivity with the high number of miRNAs predicted, 55% (12 out of 22) of the assembly-based predictions were supported by previous observations, bringing specificity forward compared to the read-based predictions, of which only 37% were supported. Importantly, raw reads could identify several repeat-related miRNAs that could not be detected with the assembly. However, raw reads could not capture 6 miRNAs, for which the stem-loops could only be covered by the relatively longer sequences from the assembly. In summary, the comparison of miRNA datasets obtained by these two strategies revealed that utilization of raw reads, as well as assemblies for in silico prediction, have distinct advantages and disadvantages. Consideration of these important nuances can benefit future miRNA identification efforts in the current age of NGS and Big Data driven life sciences innovation.
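    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above (62 read-based predictions, 22 assembly-based predictions, 16 shared, 12 of 22 supported); the variable names are illustrative, and the implied number of supported read-based predictions is an approximation derived from the reported 37%, not a value given in the abstract.

```python
# Figures stated in the abstract.
raw_read_predictions = 62
assembly_predictions = 22
shared_predictions = 16
assembly_supported = 12

# miRNAs found only in the assembly: the abstract says raw reads missed 6.
assembly_only = assembly_predictions - shared_predictions

# miRNAs found only in raw reads, and the total number of distinct predictions.
raw_reads_only = raw_read_predictions - shared_predictions
distinct_total = raw_read_predictions + assembly_predictions - shared_predictions

# Share of assembly-based predictions supported by previous observations.
assembly_support_pct = round(100 * assembly_supported / assembly_predictions)

# Approximate count implied by the reported 37% support for read-based
# predictions (an inference, not a number reported in the abstract).
implied_raw_supported = round(0.37 * raw_read_predictions)

print(assembly_only, raw_reads_only, distinct_total,
      assembly_support_pct, implied_raw_supported)
```

    The assembly-only count of 6 agrees with the six miRNAs that the abstract says raw reads could not capture, and 12 of 22 rounds to the reported 55%.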

  19. New Challenges in Targeting Signaling Pathways in Acute Lymphoblastic Leukemia by NGS Approaches: An Update

    Science.gov (United States)

    Hernández-Rivas, Jesús María

    2018-01-01

    The identification and study of genetic alterations involved in various signaling pathways associated with the pathogenesis of acute lymphoblastic leukemia (ALL) and the application of recent next-generation sequencing (NGS) in the identification of these lesions not only broaden our understanding of the involvement of various genetic alterations in the pathogenesis of the disease but also identify new therapeutic targets for future clinical trials. The present review describes the main deletions, amplifications, sequence mutations, epigenetic lesions, and new structural DNA rearrangements detected by NGS in B-ALL and T-ALL and their clinical importance for therapeutic procedures. We reviewed the molecular basis of pathways including transcriptional regulation, lymphoid differentiation and development, TP53 and the cell cycle, RAS signaling, JAK/STAT, NOTCH, PI3K/AKT/mTOR, Wnt/β-catenin signaling, chromatin structure modifiers, and epigenetic regulators. The implementation of NGS strategies has enabled the identification of important mutated genes in each pathway, their associations with the genetic subtypes of ALL, and their outcomes, which will be described further. We also discuss classic and new cryptic DNA rearrangements in ALL identified by mRNA-seq strategies. Novel cooperative abnormalities in ALL could be key prognostic and/or predictive biomarkers for selecting the best frontline treatment and for developing therapies after the first relapse or refractory disease. PMID:29642462

  20. Genome wide SNP discovery in flax through next generation sequencing of reduced representation libraries

    Directory of Open Access Journals (Sweden)

    Kumar Santosh

    2012-12-01

    Full Text Available Abstract. Background: Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs, are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results: Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform, resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered, with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions: Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from

  1. Next Generation Sequencing As an Aid to Diagnosis and Treatment of an Unusual Pediatric Brain Cancer

    Directory of Open Access Journals (Sweden)

    John Glod

    2014-07-01

    Full Text Available Classification of pediatric brain tumors with unusual histologic and clinical features may be a diagnostic challenge to the pathologist. We present a case of a 12-year-old girl with a primary intracranial tumor. The tumor classification was not certain initially, and the site of origin and clinical behavior were unusual. Genomic characterization of the tumor using a Clinical Laboratory Improvement Amendment (CLIA)-certified next-generation sequencing assay assisted in the diagnosis and translated into patient benefit, albeit transient. Our case argues that next generation sequencing may play a role in the pathological classification of pediatric brain cancers and guiding targeted therapy, supporting additional studies of genetically targeted therapeutics.

  2. The role of next generation sequencing for the development and testing of veterinary biologics

    Science.gov (United States)

    Next generation sequencing technology has become widely available and it offers many new opportunities in vaccine technology. Both human and veterinary medicine has numerous examples of adventitious agents being found in live vaccines. In veterinary medicine a continuing trend is the use of viral ...

  3. Complete sequence and diversity of a maize-associated Polerovirus in East Africa

    Science.gov (United States)

    Since 2011-2012, Maize lethal necrosis (MLN) has emerged in East Africa, causing massive yield loss and propelling research to identify viruses and virus populations present in maize. As expected, next generation sequencing (NGS) has revealed diverse and abundant viruses from the family Potyviridae,...

  4. Next-generation sequencing technology for genetics and genomics of sorghum

    DEFF Research Database (Denmark)

    Luo, Hong; Mocoeur, Anne Raymonde Joelle; Jing, Hai-Chun

    2014-01-01

    and grain sorghum. NGS has also been used to examine the transcriptomes of sorghum under various stress conditions. Besides identifying interesting transcriptional adaptation to stress conditions, these studies show that sugar could potentially act as an osmotic adjusting factor via transcriptional regulation....... Furthermore, miRNAs are found to be important for adaptation to both biotic and abiotic stresses in sorghum. We discuss the use of NGS for further genetic improvement and breeding in sorghum....

  5. “Shovel-ready” Sequences as a Stimulus for the Next Generation of Life Scientists

    Science.gov (United States)

    Boyle, Michael D.

    2010-01-01

    Genomics and bioinformatics are dynamic fields well-suited for capturing the imagination of undergraduates in both research laboratories and classrooms. Currently, raw nucleotide sequence is being provided, as part of several genomics research initiatives, for undergraduate research and teaching. These initiatives could be easily extended and much more effective if the source of the sequenced material and the subsequent focus of the data analysis were aligned with the research interests of individual faculty at undergraduate institutions. By judicious use of surplus capacity in existing nucleotide sequencing cores, raw sequence data could be generated to support ongoing research efforts involving undergraduates. This would allow these students to participate actively in discovery research, with a goal of making novel contributions to their field through original research while nurturing the next generation of talented research scientists. PMID:23653696

  6. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes deep sequence coverage possible at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions. 2014 by the authors; licensee MDPI, Basel, Switzerland.

  7. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes deep sequence coverage possible at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions.

  8. Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

    Science.gov (United States)

    Aigrain, Louise; Gu, Yong; Quail, Michael A

    2016-06-13

    The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications, the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with those combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.

  9. Using multi-locus allelic sequence data to estimate genetic divergence among four Lilium (Liliaceae) cultivars

    NARCIS (Netherlands)

    Shahin, A.; Smulders, M.J.M.; Tuyl, van J.M.; Arens, P.F.P.; Bakker, F.T.

    2014-01-01

    Next Generation Sequencing (NGS) may enable estimating relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from

  10. DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing.

    Science.gov (United States)

    Vidaki, Athina; Ballard, David; Aliferi, Anastasia; Miller, Thomas H; Barron, Leon P; Syndercombe Court, Denise

    2017-05-01

    generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels and the age prediction accuracy has been initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower compared to the original model (MAE = 7.5 years), it is expected that future optimization of our strategy to account for technical variation as well as increasing the sample size will improve both the prediction accuracy and reproducibility. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  11. TumorNext-Lynch-MMR: a comprehensive next generation sequencing assay for the detection of germline and somatic mutations in genes associated with mismatch repair deficiency and Lynch syndrome.

    Science.gov (United States)

    Gray, Phillip N; Tsai, Pei; Chen, Daniel; Wu, Sitao; Hoo, Jayne; Mu, Wenbo; Li, Bing; Vuong, Huy; Lu, Hsiao-Mei; Batth, Navanjot; Willett, Sara; Uyeda, Lisa; Shah, Swati; Gau, Chia-Ling; Umali, Monalyn; Espenschied, Carin; Janicek, Mike; Brown, Sandra; Margileth, David; Dobrea, Lavinia; Wagman, Lawrence; Rana, Huma; Hall, Michael J; Ross, Theodora; Terdiman, Jonathan; Cullinane, Carey; Ries, Savita; Totten, Ellen; Elliott, Aaron M

    2018-04-17

    The current algorithm for Lynch syndrome diagnosis is highly complex, with multiple steps that can result in an extended time to diagnosis while depleting precious tumor specimens. Here we describe the analytical validation of a custom probe-based NGS tumor panel, TumorNext-Lynch-MMR, which generates a comprehensive genetic profile of both germline and somatic mutations that can accelerate and streamline the time to diagnosis and preserve specimen. TumorNext-Lynch-MMR can detect single nucleotide variants and small insertions and deletions in 39 genes that are frequently mutated in Lynch syndrome and colorectal cancer. Moreover, the panel provides microsatellite instability status and detects loss of heterozygosity in the five Lynch genes: MSH2, MSH6, MLH1, PMS2 and EPCAM. Clinical cases are described that highlight the assay's ability to differentiate between somatic and germline mutations, precisely classify variants and resolve discordant cases.

  12. Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads

    Directory of Open Access Journals (Sweden)

    Chengxi Ye

    2016-06-01

    Full Text Available Motivation. The third generation sequencing (3GS) technology generates long sequences of thousands of bases. However, its current error rates are estimated in the range of 15–40%, significantly higher than those of the prevalent next generation sequencing (NGS) technologies (less than 1%). Fundamental bioinformatics tasks such as de novo genome assembly and variant calling require high-quality sequences that need to be extracted from these long but erroneous 3GS sequences. Results. We describe a versatile and efficient linear complexity consensus algorithm Sparc to facilitate de novo genome assembly. Sparc builds a sparse k-mer graph using a collection of sequences from a targeted genomic region. The heaviest path which approximates the most likely genome sequence is searched through a sparsity-induced reweighted graph as the consensus sequence. Sparc supports using NGS and 3GS data together, which leads to significant improvements in both cost efficiency and computational efficiency. Experiments with Sparc show that our algorithm can efficiently provide high-quality consensus sequences using both PacBio and Oxford Nanopore sequencing technologies. With only 30× PacBio data, Sparc can reach a consensus with error rate <0.5%. With the more challenging Oxford Nanopore data, Sparc can also achieve similar error rate when combined with NGS data. Compared with the existing approaches, Sparc calculates the consensus with higher accuracy, and uses approximately 80% less memory and time. Availability. The source code is available for download at https://github.com/yechengxi/Sparc.
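    The general idea of taking a heaviest path through a k-mer graph as the consensus can be sketched in Python as follows. This is a simplified illustration, not Sparc's implementation: it assumes the reads are already aligned to a common start and contain substitution errors only, it uses every k-mer rather than a sparse subset, and it omits the reweighting step mentioned in the abstract. The function names are hypothetical.

```python
from collections import defaultdict


def build_kmer_graph(reads, k):
    """Build a position-aware k-mer graph.

    Nodes are (offset, kmer) pairs; incoming[node][pred] counts the reads
    that step from pred to node. Offsets make the graph acyclic.
    """
    nodes = set()
    incoming = defaultdict(lambda: defaultdict(int))
    for read in reads:
        previous = None
        for offset in range(len(read) - k + 1):
            node = (offset, read[offset:offset + k])
            nodes.add(node)
            if previous is not None:
                incoming[node][previous] += 1
            previous = node
    return nodes, incoming


def heaviest_path_consensus(reads, k=3):
    """Return the sequence spelled by the path with the largest total edge weight."""
    nodes, incoming = build_kmer_graph(reads, k)
    if not nodes:
        return ""
    score, back = {}, {}
    # Sorting by offset guarantees predecessors are scored before successors.
    for node in sorted(nodes):
        best_score, best_pred = 0, None
        for pred, weight in incoming.get(node, {}).items():
            candidate = score[pred] + weight
            if candidate > best_score:
                best_score, best_pred = candidate, pred
        score[node], back[node] = best_score, best_pred
    # Backtrack from the highest-scoring node.
    node = max(sorted(nodes), key=lambda n: score[n])
    path = []
    while node is not None:
        path.append(node)
        node = back[node]
    path.reverse()
    # First k-mer in full, then the last base of each following k-mer.
    return path[0][1] + "".join(kmer[-1] for _, kmer in path[1:])


# Hypothetical reads: two error-free copies and two with one substitution each.
reads = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGTACCT"]
consensus = heaviest_path_consensus(reads, k=3)
```

    Edges supported by several reads outweigh edges created by isolated errors, so the dynamic-programming pass over the acyclic graph recovers the majority sequence in time linear in the number of edges.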

  13. “Shovel-ready” Sequences as a Stimulus for the Next Generation of Life Scientists

    Directory of Open Access Journals (Sweden)

    Michael D. Boyle

    2010-04-01

    Full Text Available Genomics and bioinformatics are dynamic fields well-suited for capturing the imagination of undergraduates in both research laboratories and classrooms. Currently, raw nucleotide sequence is being provided, as part of several genomics research initiatives, for undergraduate research and teaching. These initiatives could be easily extended and much more effective if the source of the sequenced material and the subsequent focus of the data analysis were aligned with the research interests of individual faculty at undergraduate institutions. By judicious use of surplus capacity in existing nucleotide sequencing cores, raw sequence data could be generated to support ongoing research efforts involving undergraduates. This would allow these students to participate actively in discovery research, with a goal of making novel contributions to their field through original research while nurturing the next generation of talented research scientists.

  14. Transcriptional profiling of endocrine cerebro-osteodysplasia using microarray and next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Piya Lahiry

    Full Text Available BACKGROUND: Transcriptome profiling of patterns of RNA expression is a powerful approach to identify networks of genes that play a role in disease. To date, most mRNA profiling of tissues has been accomplished using microarrays, but next-generation sequencing can offer a richer and more comprehensive picture. METHODOLOGY/PRINCIPAL FINDINGS: ECO is a rare multi-system developmental disorder caused by a homozygous mutation in ICK encoding intestinal cell kinase. We performed gene expression profiling using both cDNA microarrays and next-generation mRNA sequencing (mRNA-seq) of skin fibroblasts from ECO-affected subjects. We then validated a subset of differentially expressed transcripts identified by each method using quantitative reverse transcription-polymerase chain reaction (qRT-PCR). Finally, we used gene ontology (GO) to identify critical pathways and processes that were abnormal according to each technical platform. Methodologically, mRNA-seq identifies a much larger number of differentially expressed genes with much better correlation to qRT-PCR results than the microarray (r² = 0.794 and 0.137, respectively). Biologically, cDNA microarray identified functional pathways focused on anatomical structure and development, while the mRNA-seq platform identified a higher proportion of genes involved in cell division and DNA replication pathways. CONCLUSIONS/SIGNIFICANCE: Transcriptome profiling with mRNA-seq had greater sensitivity, range and accuracy than the microarray. The two platforms generated different but complementary hypotheses for further evaluation.

  15. Next-Generation Sequencing to Detect Deletion of RB1 and ERBB4 Genes in Chromophobe Renal Cell Carcinoma: A Potential Role in Distinguishing Chromophobe Renal Cell Carcinoma from Renal Oncocytoma.

    Science.gov (United States)

    Liu, Qingqing; Cornejo, Kristine M; Cheng, Liang; Hutchinson, Lloyd; Wang, Mingsheng; Zhang, Shaobo; Tomaszewicz, Keith; Cosar, Ediz F; Woda, Bruce A; Jiang, Zhong

    2018-04-01

    Overlapping morphologic, immunohistochemical, and ultrastructural features make it difficult to diagnose chromophobe renal cell carcinoma (ChRCC) and renal oncocytoma (RO). Because ChRCC is a malignant tumor, whereas RO is a tumor with benign behavior, it is important to distinguish these two entities. We aimed to identify genetic markers that distinguish ChRCC from RO by using next-generation sequencing (NGS). NGS for hotspot mutations or gene copy number changes was performed on 12 renal neoplasms, including seven ChRCC and five RO cases. Matched normal tissues from the same patients were used to exclude germline variants. Rare hotspot mutations were found in cancer-critical genes (TP53 and PIK3CA) in ChRCC but not RO. The NGS gene copy number analysis revealed multiple abnormalities. The two most common deletions were tumor-suppressor genes RB1 and ERBB4 in ChRCC but not RO. Fluorescence in situ hybridization was performed on 65 cases (ChRCC, n = 33; RO, n = 32) to verify hemizygous deletion of RB1 (17/33, 52%) or ERBB4 (11/33, 33%) in ChRCC, but not in RO (0/32, 0%). In total, ChRCCs (23/33, 70%) carry either a hemizygous deletion of RB1 or ERBB4. The combined use of RB1 and ERBB4 fluorescence in situ hybridization to detect deletion of these genes may offer a highly sensitive and specific assay to distinguish ChRCC from RO. Copyright © 2018 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
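
    The FISH counts reported above can be turned into the usual diagnostic accuracy figures. The following is a minimal sketch, assuming that ChRCC is treated as the positive class and that a hemizygous deletion of either RB1 or ERBB4 counts as a positive test; the function name is hypothetical and the resulting percentages are derived here from the reported counts, not quoted from the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)  # positives correctly detected
    specificity = tn / (tn + fp)  # negatives correctly excluded
    return sensitivity, specificity


# Counts taken from the abstract (assumed mapping to the 2x2 table):
#   ChRCC with RB1 or ERBB4 deletion: 23 of 33 -> tp = 23, fn = 10
#   RO with either deletion:           0 of 32 -> fp = 0,  tn = 32
sens, spec = sensitivity_specificity(tp=23, fn=10, tn=32, fp=0)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 69.7%, specificity = 100.0%"
```

    Under these assumptions the combined assay misses the ChRCC cases that carry neither deletion, while no RO case tests positive.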

  16. An optimised protocol for isolation of RNA from small sections of laser-capture microdissected FFPE tissue amenable for next-generation sequencing.

    Science.gov (United States)

    Amini, Parisa; Ettlin, Julia; Opitz, Lennart; Clementi, Elena; Malbon, Alexandra; Markkanen, Enni

    2017-08-23

    Formalin-fixed paraffin embedded (FFPE) tissue constitutes a vast treasury of samples for biomedical research. Thus far, however, extraction of RNA from FFPE tissue has proved challenging due to chemical RNA-protein crosslinking and RNA fragmentation, both of which heavily affect RNA quantity and quality for downstream analysis. With very small sample sizes, e.g. when performing laser-capture microdissection (LCM) to isolate specific subpopulations of cells, recovery of sufficient RNA for analysis with reverse-transcription quantitative PCR (RT-qPCR) or next-generation sequencing (NGS) becomes very cumbersome and difficult. We excised matched cancer-associated stroma (CAS) and normal stroma from clinical specimens of FFPE canine mammary tumours using LCM, and compared the commonly used protease-based RNA isolation procedure with an adapted novel technique that additionally incorporates a focused ultrasonication step. We successfully adapted a protocol that uses focused ultrasonication to isolate RNA from small amounts of deparaffinised, stained, clinical LCM samples. Using this approach, we found that total RNA yields could be increased by 8- to 12-fold compared to a commonly used protease-based extraction technique. Surprisingly, RNA extracted using this new approach was qualitatively at least equal if not superior compared to the old approach, as Cq values in RT-qPCR were on average 2.3-fold lower using the new method. Finally, we demonstrate that RNA extracted using the new method performs comparably in NGS as well. We present a successful isolation protocol for extraction of RNA from difficult and limiting FFPE tissue samples that enables successful analysis of small sections of clinically relevant specimens. The possibility to study gene expression signatures in specific small sections of archival FFPE tissue, which often entail large amounts of highly relevant clinical follow-up data, unlocks a new dimension of hitherto difficult-to-analyse samples which now

  17. Poly(A)-tag deep sequencing data processing to extract poly(A) sites.

    Science.gov (United States)

    Wu, Xiaohui; Ji, Guoli; Li, Qingshun Quinn

    2015-01-01

    Polyadenylation [poly(A)] is an essential posttranscriptional processing step in the maturation of eukaryotic mRNA. The advent of next-generation sequencing (NGS) technology has offered a feasible means to generate large-scale data and new opportunities for intensive study of polyadenylation, particularly deep sequencing of the transcriptome targeting the junction of 3'-UTR and the poly(A) tail of the transcript. To take advantage of this unprecedented amount of data, we present an automated workflow to identify polyadenylation sites by integrating NGS data cleaning, processing, mapping, normalizing, and clustering. In this pipeline, a series of Perl scripts are seamlessly integrated to iteratively map the single- or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same genome coordinate are grouped into one cleavage site, and the internal priming artifacts are removed. Then the ambiguous region is introduced to parse the genome annotation for cleavage site clustering. Finally, cleavage sites within a close range of 24 nucleotides and from different samples can be clustered into poly(A) clusters. This procedure could be used to identify thousands of reliable poly(A) clusters from millions of NGS sequences in different tissues or treatments.
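
    The last two steps of the workflow (grouping PATs into cleavage sites, then clustering nearby sites) can be sketched as follows. This is an illustration only, not the authors' Perl pipeline: it is written in Python, the function names and the tuple layout are hypothetical, and "within a close range of 24 nucleotides" is assumed to mean that adjacent sites on the same chromosome and strand are chained together while the gap between them is at most 24 nt. Internal priming removal and annotation parsing are not covered.

```python
from collections import Counter, defaultdict


def group_tags_into_sites(tags):
    """Collapse PATs at the same coordinate into cleavage sites.

    tags: iterable of (chrom, strand, position) tuples, pooled over samples.
    Returns {(chrom, strand): Counter({position: tag_count})}.
    """
    sites = defaultdict(Counter)
    for chrom, strand, pos in tags:
        sites[(chrom, strand)][pos] += 1
    return sites


def _summarise(chrom, strand, positions, counts):
    """Describe one cluster; the peak is the best-supported site."""
    peak = max(positions, key=lambda p: (counts[p], -p))
    return {
        "chrom": chrom,
        "strand": strand,
        "start": positions[0],
        "end": positions[-1],
        "tags": sum(counts[p] for p in positions),
        "peak": peak,
    }


def cluster_sites(sites, max_gap=24):
    """Chain sorted cleavage sites whose neighbours lie within max_gap nt."""
    clusters = []
    for (chrom, strand), counts in sorted(sites.items()):
        current = []
        for pos in sorted(counts):
            # Start a new cluster when the gap to the previous site is too big.
            if current and pos - current[-1] > max_gap:
                clusters.append(_summarise(chrom, strand, current, counts))
                current = []
            current.append(pos)
        if current:
            clusters.append(_summarise(chrom, strand, current, counts))
    return clusters


# Made-up example tags, for illustration only.
example_tags = [
    ("chr1", "+", 100), ("chr1", "+", 100), ("chr1", "+", 110),
    ("chr1", "+", 200), ("chr1", "-", 105),
]
for cluster in cluster_sites(group_tags_into_sites(example_tags)):
    print(cluster)
```

    Chaining by adjacent gaps lets a cluster grow wider than 24 nt in total; a fixed-width window around the best-supported site would be a stricter alternative reading of the same sentence.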

  18. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  19. Inference with viral quasispecies diversity indices: clonal and NGS approaches.

    Science.gov (United States)

    Gregori, Josep; Salicrú, Miquel; Domingo, Esteban; Sanchez, Alex; Esteban, Juan I; Rodríguez-Frías, Francisco; Quer, Josep

    2014-04-15

    Given the inherent dynamics of a viral quasispecies, we are often interested in the comparison of diversity indices of sequential samples of a patient, or in the comparison of diversity indices of virus in groups of patients in a treated versus control design. It is then important to make sure that the diversity measures from each sample may be compared with no bias and within a consistent statistical framework. In the present report, we review some indices often used as measures for viral quasispecies complexity and provide means for statistical inference, applying procedures taken from the ecology field. In particular, we examine the Shannon entropy and the mutation frequency, and we discuss the appropriateness of different normalization methods of the Shannon entropy found in the literature. By taking amplicon ultra-deep pyrosequencing (UDPS) raw data as a surrogate of a real hepatitis C virus viral population, we study through in-silico sampling the statistical properties of these indices under two methods of viral quasispecies sampling, classical cloning followed by Sanger sequencing (CCSS) and next-generation sequencing (NGS) such as UDPS. We propose solutions specific to each of the two sampling methods, CCSS and NGS, to guarantee statistically conforming conclusions as free of bias as possible. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved.
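
    The two indices examined above can be computed from haplotype read counts as in the sketch below. It assumes the standard definitions: Shannon entropy S = -Σ p_i ln p_i over haplotype frequencies p_i, normalized either by ln N (number of reads or clones) or by ln H (number of haplotypes), and mutation frequency as the fraction of sequenced nucleotides that differ from a reference such as the dominant haplotype. The function names are hypothetical, and the bias corrections proposed by the authors are not reproduced here.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of haplotype read counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def normalized_entropy(counts, by="reads"):
    """Entropy divided by ln N (by='reads') or ln H (by='haplotypes')."""
    n = sum(counts)
    h = sum(1 for c in counts if c > 0)
    denom = math.log(n) if by == "reads" else math.log(h)
    if denom == 0:  # a single read or a single haplotype: no diversity
        return 0.0
    return shannon_entropy(counts) / denom


def mutation_frequency(counts, mutations, length):
    """Fraction of sequenced nucleotides that differ from the reference.

    mutations[i] is the number of differences of haplotype i from the
    reference; length is the amplicon length in nucleotides.
    """
    n = sum(counts)
    return sum(c * m for c, m in zip(counts, mutations)) / (n * length)


# Made-up counts, for illustration only: two haplotypes, 100 reads.
counts = [50, 50]
print(shannon_entropy(counts))                    # ln 2 for two equal haplotypes
print(normalized_entropy(counts, "reads"))        # divided by ln 100
print(normalized_entropy(counts, "haplotypes"))   # divided by ln 2
print(mutation_frequency([90, 10], [0, 2], 100))
```

    The example shows why the choice of normalization matters: the same sample scores 1.0 when normalized by ln H but only about 0.15 when normalized by ln N, and the latter depends on sequencing depth.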

  20. Hungarian Marfan family with large FBN1 deletion calls attention to copy number variation detection in the current NGS era

    Science.gov (United States)

    Ágg, Bence; Meienberg, Janine; Kopps, Anna M.; Fattorini, Nathalie; Stengl, Roland; Daradics, Noémi; Pólos, Miklós; Bors, András; Radovits, Tamás; Merkely, Béla; De Backer, Julie; Szabolcs, Zoltán; Mátyás, Gábor

    2018-01-01

    Copy number variations (CNVs) comprise about 10% of reported disease-causing mutations in Mendelian disorders. Nevertheless, pathogenic CNVs may have been under-detected due to the lack of, or insufficient use of, appropriate detection methods. In this report, using the example of the diagnostic odyssey of a patient with Marfan syndrome (MFS) harboring a hitherto unreported 32-kb FBN1 deletion, we highlight the need for and the feasibility of testing for CNVs (>1 kb) in Mendelian disorders in the current next-generation sequencing (NGS) era. PMID:29850152