WorldWideScience

Sample records for sanger sequencing technology

  1. Pyrosequencing-An Alternative to Traditional Sanger Sequencing

    OpenAIRE

    2012-01-01

    Problem statement: Pyrosequencing has the potential to sequence DNA rapidly and reliably, offering advantages over the traditional Sanger dideoxy sequencing approach. Approach: A comprehensive review of the literature on the principles, applications, challenges and prospects of pyrosequencing was performed. Results: Pyrosequencing is a DNA sequencing technology based on the sequencing-by-synthesis principle. It employs a series of four enzymes to accurately detect nucleic acid sequences during the...

  2. Pyrosequencing-An Alternative to Traditional Sanger Sequencing

    Directory of Open Access Journals (Sweden)

    Fakruddin

    2012-01-01

    Problem statement: Pyrosequencing has the potential to sequence DNA rapidly and reliably, offering advantages over the traditional Sanger dideoxy sequencing approach. Approach: A comprehensive review of the literature on the principles, applications, challenges and prospects of pyrosequencing was performed. Results: Pyrosequencing is a DNA sequencing technology based on the sequencing-by-synthesis principle. It employs a series of four enzymes to accurately detect nucleic acid sequences during the synthesis. Pyrosequencing has the potential advantages of accuracy, flexibility and parallel processing, and can be easily automated. The technique dispenses with the need for labeled primers, labeled nucleotides and gel electrophoresis. Pyrosequencing has opened up new possibilities for performing sequence-based DNA analysis. The method has proven highly suitable for single nucleotide polymorphism analysis and sequencing of short stretches of DNA. Pyrosequencing has been successful for both confirmatory sequencing and de novo sequencing. With longer read lengths and a shorter reaction time per base call, pyrosequencing may take over many broad areas of DNA sequencing applications, as the trend is toward analysis of smaller amounts of specimen and large-scale settings, with higher throughput and lower cost. Conclusion/Recommendations: The competitiveness of pyrosequencing with other sequencing methods can be improved in the future.

  3. Targeted next-generation sequencing can replace Sanger sequencing in clinical diagnostics

    NARCIS (Netherlands)

    Sikkema-Raddatz, B.; Johansson, L.F.; de Boer, E.N.; Almomani, R.; Boven, L.G.; van den Berg, M.P.; van Spaendonck-Zwarts, K.Y.; van Tintelen, J.P.; Sijmons, R.H.; Jongbloed, J.D.H.; Sinke, R.J.

    2013-01-01

    Mutation detection through exome sequencing allows simultaneous analysis of all coding sequences of genes. However, it cannot yet replace Sanger sequencing (SS) in diagnostics because of incomplete representation and coverage of exons leading to missing clinically relevant mutations. Targeted next-g

  4. Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference

    Science.gov (United States)

    Singh, Aditya; Bhatia, Prateek

    2016-01-01

    Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, a given region is sequenced with both forward and reverse primers, so there are 2 sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At full capability, the Automated Sanger Analysis Pipeline (ASAP) can read "*.ab1" chromatogram files through a command-line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28. PMID:27790076
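
    Three of the steps listed in this abstract (reverse-complementing the reverse read, building a consensus, and translating in all frames) can be illustrated with a minimal sketch. This is not ASAP's code: the function names are hypothetical, and the consensus step assumes both reads are already trimmed, ungapped, and of equal length, and simply keeps the higher-quality base at each position.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def consensus(fwd, fwd_q, rev, rev_q):
    """Per-position consensus of two equal-length, pre-aligned reads.

    Simplifying assumption: no gaps; the base with the higher Phred
    score wins, and ties go to the forward read.
    """
    if not (len(fwd) == len(fwd_q) == len(rev) == len(rev_q)):
        raise ValueError("reads and quality lists must have equal length")
    return "".join(
        f if fq >= rq else r
        for f, fq, r, rq in zip(fwd, fwd_q, rev, rev_q)
    )


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Translations of the three forward and three reverse frames."""
    rc = reverse_complement(seq)
    return [translate(s[offset:]) for s in (seq, rc) for offset in range(3)]


# Toy data (invented for illustration): the reverse read is given as sequenced.
fwd, fwd_q = "ATGGCCTAA", [40, 40, 40, 40, 40, 12, 40, 40, 40]
rev_raw, rev_raw_q = "TTATGCCAT", [38] * 9
rev, rev_q = reverse_complement(rev_raw), rev_raw_q[::-1]
cons = consensus(fwd, fwd_q, rev, rev_q)
frames = six_frames(cons)
```

    A real pipeline would align the two reads before merging them; the equal-length shortcut only keeps the sketch short.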

  5. iAssembler: a package for de novo assembly of Roche-454/Sanger transcriptome sequences

    Directory of Open Access Journals (Sweden)

    Zheng Yi

    2011-11-01

    Background: Expressed Sequence Tags (ESTs) have played significant roles in gene discovery and gene functional analysis, especially for non-model organisms. For organisms with no full genome sequences available, ESTs are normally assembled into longer consensus sequences for further downstream analysis. However, current de novo EST assembly programs often generate a large number of assembly errors that negatively affect the downstream analysis. In order to generate more accurate consensus sequences from ESTs, tools are needed to reduce or eliminate errors from de novo assemblies. Results: We present iAssembler, a pipeline that can assemble large-scale ESTs into consensus sequences with significantly higher accuracy than existing assemblers. iAssembler employs the MIRA and CAP3 assemblers to generate initial assemblies, followed by identifying and correcting two common types of transcriptome assembly errors: (1) ESTs from different transcripts (mainly alternatively spliced transcripts or paralogs) are incorrectly assembled into the same contigs; and (2) ESTs from the same transcripts fail to be assembled together. iAssembler can be used to assemble ESTs generated using the traditional Sanger method and/or the Roche-454 massively parallel pyrosequencing technology. Conclusion: We compared the performance of iAssembler and several other de novo EST assembly programs using both Roche-454 and Sanger EST datasets. The comparison demonstrated that iAssembler generated significantly more accurate consensus sequences than other assembly programs.

  6. Hidden mutations in Cornelia de Lange syndrome limitations of sanger sequencing in molecular diagnostics.

    Science.gov (United States)

    Braunholz, Diana; Obieglo, Carolin; Parenti, Ilaria; Pozojevic, Jelena; Eckhold, Juliane; Reiz, Benedikt; Braenne, Ingrid; Wendt, Kerstin S; Watrin, Erwan; Vodopiutz, Julia; Rieder, Harald; Gillessen-Kaesbach, Gabriele; Kaiser, Frank J

    2015-01-01

    Cornelia de Lange syndrome (CdLS) is a well-characterized developmental disorder. The genetic cause of CdLS is a mutation in one of five associated genes (NIPBL, SMC1A, SMC3, RAD21, and HDAC8) accounting for about 70% of cases. To improve our current molecular diagnostic and to analyze some of CdLS candidate genes, we developed and established a gene panel approach. Because recent data indicate a high frequency of mosaic NIPBL mutations that were not detected by conventional sequencing approaches of blood DNA, we started to collect buccal mucosa (BM) samples of our patients that were negative for mutations in the known CdLS genes. Here, we report the identification of three mosaic NIPBL mutations by our high-coverage gene panel sequencing approach that were undetected by classical Sanger sequencing analysis of BM DNA. All mutations were confirmed by the use of highly sensitive SNaPshot fragment analysis using DNA from BM, urine, and fibroblast samples. In blood samples, we could not detect the respective mutation. Finally, in fibroblast samples from all three patients, Sanger sequencing could identify all the mutations. Thus, our study highlights the need for highly sensitive technologies in molecular diagnostic of CdLS to improve genetic diagnosis and counseling of patients and their families. © 2014 WILEY PERIODICALS, INC.

  7. Simplified large-scale Sanger genome sequencing for influenza A/H3N2 virus.

    Directory of Open Access Journals (Sweden)

    Hong Kai Lee

    BACKGROUND: The advent of next-generation sequencing technologies and the resultant lower costs of sequencing have enabled production of massive amounts of data, including the generation of full genome sequences of pathogens. However, the small genome size of the influenza virus arguably justifies the use of the more conventional Sanger sequencing technology, which is still more readily available in most diagnostic laboratories. RESULTS: We present a simplified Sanger-based genome sequencing method for sequencing the influenza A/H3N2 virus in a large-scale format. The entire genome sequencing was completed with 19 reverse transcription-polymerase chain reactions (RT-PCRs) and 39 sequencing reactions. This method was tested on 15 native clinical samples and 15 culture isolates collected between 2009 and 2011. The 15 native clinical samples registered quantification cycle values ranging from 21.0 to 30.56, equivalent to 2.4×10^3 to 1.4×10^6 viral copies/µL of RNA extract. All the PCR-amplified products were sequenced directly without PCR product purification. Notably, high-quality sequencing data up to 700 bp were generated for all the samples tested. The completed sequence covered 408,810 nucleotides in total, with 13,627 nucleotides per genome, attaining 100% coding completeness. Of all the bases produced, an average of 89.49% were Phred quality value 40 (QV40) bases or higher (representing an accuracy of about one miscall for every 10,000 bases), and an average of 93.46% were QV30 bases or higher (one miscall every 1,000 bases). CONCLUSIONS: This sequencing protocol has been shown to be cost-effective and less labor-intensive in obtaining full influenza genomes. The consistently high quality of the sequences generated imparts confidence in extending the application of this non-purified amplicon sequencing approach to other gene sequencing assays, with appropriate use of suitably designed primers.
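
    The quality values quoted in this abstract follow the standard Phred definition, Q = -10 log10(P), where P is the probability of a miscalled base. The short sketch below (function names are illustrative) reproduces the "one miscall per 10,000 bases" and "one per 1,000 bases" figures, and checks that the reported total of 408,810 nucleotides corresponds to the 30 genomes tested at 13,627 nucleotides each.

```python
import math


def phred_to_error_prob(q):
    """Probability of a miscalled base for Phred quality value q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality value for a miscall probability p."""
    return -10 * math.log10(p)


qv40_error = phred_to_error_prob(40)  # one miscall per 10,000 bases
qv30_error = phred_to_error_prob(30)  # one miscall per 1,000 bases

# Figures from the abstract: 15 clinical samples + 15 culture isolates.
total_nt, nt_per_genome = 408_810, 13_627
genomes = total_nt // nt_per_genome
```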

  8. The utility of direct specimen detection by Sanger sequencing in hospitalized pediatric patients.

    Science.gov (United States)

    Mongkolrattanothai, Kanokporn; Dien Bard, Jennifer

    2017-02-01

    Direct microbial DNA detection from clinical specimens by polymerase chain reaction and Sanger sequencing has been developed to address the innate limitations of traditional culture-based work-up. We report our institution's experience with direct specimen sequencing, its clinical utility, and barriers to effective clinical implementation.

  9. Noncontinuously binding loop-out primers for avoiding problematic DNA sequences in PCR and sanger sequencing.

    Science.gov (United States)

    Sumner, Kelli; Swensen, Jeffrey J; Procter, Melinda; Jama, Mohamed; Wooderchak-Donahue, Whitney; Lewis, Tracey; Fong, Michael; Hubley, Lindsey; Schwarz, Monica; Ha, Youna; Paul, Eleri; Brulotte, Benjamin; Lyon, Elaine; Bayrak-Toydemir, Pinar; Mao, Rong; Pont-Kingdon, Genevieve; Best, D Hunter

    2014-09-01

    We present a method in which noncontinuously binding (loop-out) primers are used to exclude regions of DNA that typically interfere with PCR amplification and/or analysis by Sanger sequencing. Several scenarios were tested using this design principle, including M13-tagged PCR primers, non-M13-tagged PCR primers, and sequencing primers. With this technique, a single oligonucleotide is designed in two segments that flank, but do not include, a short region of problematic DNA sequence. During PCR amplification or sequencing, the problematic region is looped-out from the primer binding site, where it does not interfere with the reaction. Using this method, we successfully excluded regions of up to 46 nucleotides. Loop-out primers were longer than traditional primers (27 to 40 nucleotides) and had higher melting temperatures. This method allows the use of a standardized PCR protocol throughout an assay, keeps the number of PCRs to a minimum, reduces the chance for laboratory error, and, above all, does not interrupt the clinical laboratory workflow.
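
    The design principle described here, a single oligonucleotide built from two segments that flank but skip a problematic stretch, can be sketched as simple string slicing. This is a hypothetical illustration only: the function name, coordinates, and segment lengths are assumptions, and real primer design would also need melting-temperature and specificity checks that are not shown.

```python
def loop_out_primer(template, skip_start, skip_end, left_len, right_len):
    """Join the segments flanking template[skip_start:skip_end].

    `template` is read 5'->3'. The returned oligo has the same sense as
    `template`, so it would anneal to the complementary strand while the
    skipped region loops out. Coordinates are 0-based and end-exclusive.
    """
    if not 0 <= skip_start < skip_end <= len(template):
        raise ValueError("skip region must lie inside the template")
    if skip_start - left_len < 0 or skip_end + right_len > len(template):
        raise ValueError("flanking segments extend past the template")
    left = template[skip_start - left_len:skip_start]
    right = template[skip_end:skip_end + right_len]
    return left + right


# Toy template (invented): the G-run stands in for a problematic region.
template = "AAAACCCC" + "GGGGGG" + "TTTTAAAA"
primer = loop_out_primer(template, skip_start=8, skip_end=14,
                         left_len=4, right_len=4)
```

    In the abstract the excluded regions were up to 46 nucleotides and the primers 27 to 40 nucleotides long; the toy lengths above are far shorter and serve only to show the slicing.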

  10. Online Diagnosis System: a webserver for analysis of Sanger sequencing-based genetic testing data.

    Science.gov (United States)

    Sun, Kun; Yuen, Yuet-Ping; Wang, Huating; Sun, Hao

    2014-10-01

    Sanger sequencing is a well-established molecular technique for diagnosis of genetic diseases. In these tests, DNA sequencers produce vast amounts of data that need to be examined and annotated within a short period of time. To achieve this goal, an online bioinformatics platform that can automate the process is essential. However, to date, there is no such integrated bioinformatics platform available. To fill this gap, we developed the Online Diagnosis System (ODS), which is a freely available webserver and supports the commonly used file format of Sanger sequencing data. ODS seamlessly integrates base calling, single nucleotide variation (SNV) identification, and SNV annotation into a single platform. It also allows laboratorians to manually inspect the quality of the identified SNVs in the final report. ODS can significantly reduce the data analysis time and therefore allows Sanger sequencing-based genetic testing to be finished in a timely manner. ODS is freely available at http://sunlab.lihs.cuhk.edu.hk/ODS/. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Screening PCR Versus Sanger Sequencing: Detection of CALR Mutations in Patients With Thrombocytosis.

    Science.gov (United States)

    Jeong, Ji Hun; Lee, Hwan Tae; Seo, Ja Young; Seo, Yiel Hea; Kim, Kyung Hee; Kim, Moon Jin; Lee, Jae Hoon; Park, Jinny; Hong, Jun Shik; Park, Pil Whan; Ahn, Jeong Yeal

    2016-07-01

    Mutations in calreticulin (CALR) have been reported to be key markers in the molecular diagnosis of myeloid proliferative neoplasms. In most previous reports, CALR mutations were analyzed by using Sanger sequencing. Here, we report a new, rapid, and convenient system for screening CALR mutations without sequencing. Eighty-three bone marrow samples were obtained from 81 patients with thrombocytosis. PCR primers were designed to detect wild-type CALR (product: 357 bp) and CALR with type 1 (product: 302 bp) and type 2 mutations (product: 272 bp) in one reaction. The results were confirmed by Sanger sequencing and compared with results from fragment analysis. The minimum detection limit of the screening PCR was 10 ng for type 1, 1 ng for type 2, and 0.1 ng for cases with both mutations. CALR type 1 and type 2 mutants were detected with screening PCR with a maximal analytical sensitivity of 3.2% and <0.8%, respectively. The screening PCR detected 94.1% (16/17) of mutation cases and showed concordant results with sequencing in the cases of type 1 and type 2 mutations. Sanger sequencing identified one novel mutation (c.1123_1132delinsTGC). Compared with sequencing, the screening PCR showed 94.1% sensitivity, 100.0% specificity, 100.0% positive predictive value, and 98.5% negative predictive value. Compared with fragment analysis, the screening PCR presented 88.9% sensitivity and 100.0% specificity. This screening PCR is a rapid, sensitive, and cost-effective method for the detection of major CALR mutations.
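
    The performance figures against sequencing can be reproduced from a 2×2 confusion table. The abstract gives 16 of 17 mutation cases detected and 83 samples in total; the count of 66 true negatives used below is an inference from those numbers (83 − 17) and from the reported 100% specificity, not a value stated in the abstract.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# tp and fn come from the abstract (16/17); fp=0 and tn=66 are inferred.
metrics = diagnostic_metrics(tp=16, fn=1, fp=0, tn=66)
percent = {name: round(value * 100, 1) for name, value in metrics.items()}
```

    Under these assumed counts the rounded percentages match the reported 94.1% sensitivity, 100.0% specificity, 100.0% positive predictive value, and 98.5% negative predictive value.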

  12. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    Science.gov (United States)

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.

  13. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most...

  14. 454 Pyrosequencing and Sanger sequencing of tropical mycorrhizal fungi provide similar results but reveal substantial methodological biases.

    Science.gov (United States)

    Tedersoo, Leho; Nilsson, R Henrik; Abarenkov, Kessy; Jairus, Teele; Sadam, Ave; Saar, Irja; Bahram, Mohammad; Bechem, Eneke; Chuyong, George; Kõljalg, Urmas

    2010-10-01

    • Compared with Sanger sequencing-based methods, pyrosequencing provides orders of magnitude more data on the diversity of organisms in their natural habitat, but its technological biases and relative accuracy remain poorly understood. • This study compares the performance of pyrosequencing and traditional sequencing for species' recovery of ectomycorrhizal fungi on root tips in a Cameroonian rain forest and addresses biases related to multi-template PCR and pyrosequencing analyses. • Pyrosequencing and the traditional method yielded qualitatively similar results, but there were slight, but significant, differences that affected the taxonomic view of the fungal community. We found that most pyrosequencing singletons were artifactual and contained a strongly elevated proportion of insertions compared with natural intra- and interspecific variation. The alternative primers, DNA extraction methods and PCR replicates strongly influenced the richness and community composition as recovered by pyrosequencing. • Pyrosequencing offers a powerful alternative for the identification of ectomycorrhizal fungi in pooled root samples, but requires careful selection of molecular tools. A well-populated backbone database facilitates the detection of biological and technical artifacts. The pyrosequencing pipeline is available at http://unite.ut.ee/454pipeline.tgz.

  15. The empirical power of rare variant association methods: results from sanger sequencing in 1,998 individuals.

    Directory of Open Access Journals (Sweden)

    Martin Ladouceur

    2012-02-01

    The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.

  16. Homozygosity mapping and targeted sanger sequencing reveal genetic defects underlying inherited retinal disease in families from pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost- and time-effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD). We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA), retinitis pigmentosa (RP), congenital stationary night blindness (CSNB), or cone dystrophy (CD). We employed genome-wide single nucleotide polymorphism (SNP) array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population-specific mutation data, we performed targeted Sanger sequencing (TSS) of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1 in probands from 28 LCA families. Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families, four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1. Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  17. 454 next generation-sequencing outperforms allele-specific PCR, Sanger sequencing, and pyrosequencing for routine KRAS mutation analysis of formalin-fixed, paraffin-embedded samples.

    Science.gov (United States)

    Altimari, Annalisa; de Biase, Dario; De Maglio, Giovanna; Gruppioni, Elisa; Capizzi, Elisa; Degiovanni, Alessio; D'Errico, Antonia; Pession, Annalisa; Pizzolitto, Stefano; Fiorentino, Michelangelo; Tallini, Giovanni

    2013-01-01

    Detection of KRAS mutations in archival pathology samples is critical for therapeutic appropriateness of anti-EGFR monoclonal antibodies in colorectal cancer. We compared the sensitivity, specificity, and accuracy of Sanger sequencing, ARMS-Scorpion (TheraScreen®) real-time polymerase chain reaction (PCR), pyrosequencing, chip array hybridization, and 454 next-generation sequencing to assess KRAS codon 12 and 13 mutations in 60 nonconsecutive selected cases of colorectal cancer. Twenty of the 60 cases were detected as wild-type KRAS by all methods with 100% specificity. Among the 40 mutated cases, 13 were discrepant with at least one method. The sensitivity was 85%, 90%, 93%, and 92%, and the accuracy was 90%, 93%, 95%, and 95% for Sanger sequencing, TheraScreen real-time PCR, pyrosequencing, and chip array hybridization, respectively. The main limitation of Sanger sequencing was its low analytical sensitivity, whereas TheraScreen real-time PCR, pyrosequencing, and chip array hybridization showed higher sensitivity but suffered from the limitations of predesigned assays. Concordance between the methods was k = 0.79 for Sanger sequencing and k > 0.85 for the other techniques. Tumor cell enrichment correlated significantly with the abundance of KRAS-mutated deoxyribonucleic acid (DNA), evaluated as ΔCt for TheraScreen real-time PCR (P = 0.03), percentage of mutation for pyrosequencing (P = 0.001), ratio for chip array hybridization (P = 0.003), and percentage of mutation for 454 next-generation sequencing (P = 0.004). Also, 454 next-generation sequencing showed the best cross correlation for quantification of mutation abundance compared with all the other methods (P < 0.001). Our comparison showed the superiority of next-generation sequencing over the other techniques in terms of sensitivity and specificity. Next-generation sequencing will replace Sanger sequencing as the reference technique for diagnostic detection of KRAS mutation in archival tumor tissues.

  18. Margaret Sanger

    Institute of Scientific and Technical Information of China (English)

    吴伟华

    2005-01-01

    Many women today have the freedom to decide when they will have children, if they want them. Until about fifty years ago, women spent most of their adult lives having children, year after year. This changed because of efforts by activists like Margaret Sanger. She believed that a safe and sure method of preventing pregnancy was a necessary condition for women's freedom. She also believed birth control was necessary for human progress.

  19. A comparison of parallel pyrosequencing and sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Directory of Open Access Journals (Sweden)

    Binhua Liang

    BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: The HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary-based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These variations, undetected by SCB sequencing, were located in the HIV-1 Gag region, which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.

  20. IROme, a new high-throughput molecular tool for the diagnosis of inherited retinal dystrophies-a price comparison with Sanger sequencing.

    Science.gov (United States)

    Schorderet, Daniel F; Bernasconi, Maude; Tiab, Leila; Favez, Tatiana; Escher, Pascal

    2014-01-01

    The molecular diagnosis of retinal dystrophies (RD) is difficult because of genetic and clinical heterogeneity. Previously, the molecular screening of genes was done one by one, sometimes in a scheme based on the frequency of sequence variants and the number of exons/length of the candidate genes. Payment for these procedures was complicated and the sequential billing of several genes created endless paperwork. We therefore evaluated the costs of generating and sequencing a hybridization-based DNA library enriched for the 64 most frequently mutated genes in RD, called IROme, and compared them to the costs of amplifying and sequencing these genes by the Sanger method. The production cost generated by the high-throughput (HT) sequencing of IROme was established at CHF 2,875.75 per case. Sanger sequencing of the same exons cost CHF 69,399.02. Turnaround time of the analysis was 3 days for IROme. For Sanger sequencing, it could only be estimated, as we never sequenced all 64 genes in one single patient. Sale cost for IROme calculated on the basis of the sale cost of one exon by Sanger sequencing is CHF 8,445.88, which corresponds to the sale price of 40 exons. In conclusion, IROme is cheaper and faster than Sanger sequencing and therefore represents a sound approach for the diagnosis of RD, both scientifically and economically. As a drop in the costs of HT sequencing is anticipated, target resequencing might become the new gold standard in the molecular diagnosis of RD.
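
    The cost figures in this abstract can be put side by side with a few lines of arithmetic. The sketch below uses only the amounts quoted above; the variable names are illustrative, and the per-exon sale price is derived from the statement that CHF 8,445.88 corresponds to 40 exons.

```python
# Production costs per case, in CHF, as quoted in the abstract.
irome_cost = 2875.75
sanger_cost = 69399.02

# How many times more expensive Sanger sequencing of the same exons is.
cost_ratio = sanger_cost / irome_cost

# Sale price of IROme, stated to equal the sale price of 40 Sanger exons.
irome_sale_price = 8445.88
sale_price_per_exon = irome_sale_price / 40

print(f"Sanger / IROme production cost ratio: {cost_ratio:.1f}x")
print(f"Implied Sanger sale price per exon: CHF {sale_price_per_exon:.2f}")
```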

  1. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  2. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  3. Comparison of pyrosequencing, Sanger sequencing, and melting curve analysis for detection of low-frequency macrolide-resistant Mycoplasma pneumoniae quasispecies in respiratory specimens.

    Science.gov (United States)

    Chan, Kwok-Hung; To, Kelvin K W; Chan, Betsy W K; Li, Clara P Y; Chiu, Susan S; Yuen, Kwok-Yung; Ho, Pak-Leung

    2013-08-01

    Macrolide-resistant Mycoplasma pneumoniae (MRMP) is emerging worldwide and has been associated with treatment failure. In this study, we used pyrosequencing to detect low-frequency MRMP quasispecies in respiratory specimens, and we compared the findings with those obtained by Sanger sequencing and SimpleProbe PCR coupled with a melting curve analysis (SimpleProbe PCR). Sanger sequencing, SimpleProbe PCR, and pyrosequencing were successfully performed for 96.7% (88/91), 96.7% (88/91), and 93.4% (85/91) of the M. pneumoniae-positive specimens, respectively. The A-to-G transition at position 2063 was the only mutation identified. Pyrosequencing identified A2063G MRMP quasispecies populations in 78.8% (67/88) of the specimens. Only 38.8% (26/67) of these specimens with the A2063G quasispecies detected by pyrosequencing were found to be A2063G quasispecies by Sanger sequencing or SimpleProbe PCR. The specimens that could be detected by SimpleProbe PCR and Sanger sequencing had higher frequencies of MRMP quasispecies (51% to 100%) than those that could not be detected by those two methods (1% to 44%). SimpleProbe PCR correctly categorized all specimens that were identified as wild type or mutant by Sanger sequencing. The clinical characteristics of the patients were not significantly different when they were grouped by the presence or absence of MRMP quasispecies, while patients with MRMP identified by Sanger sequencing more often required a switch from macrolides to an alternative M. pneumoniae-targeted therapy. The clinical significance of mutant quasispecies should be investigated further with larger patient populations and with specimens obtained before and after macrolide therapy.

  4. A Comprehensive Transcriptome Assembly of Pigeonpea (Cajanus cajan L.) using Sanger and Second-Generation Sequencing Platforms

    Science.gov (United States)

    Kudapa, Himabindu; Bharti, Arvind K.; Cannon, Steven B.; Farmer, Andrew D.; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T.; Crow, John A.; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K.; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R.; May, Gregory D.; Singh, Nagendra K.; Varshney, Rajeev K.

    2012-01-01

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequenced tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ∼8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea. PMID:22241453
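
    The N50 quoted above (1510 bp) is a length-weighted summary of contig sizes: it is the contig length at which, going from longest to shortest, the running total first reaches half of the assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not data from the CcTA v2 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, chosen only to illustrate the calculation.
toy_contigs = [2000, 1800, 1510, 1200, 900, 400, 300]
print(n50(toy_contigs))  # prints 1510
```

    Because N50 is weighted by length, a few long contigs can raise it substantially, so it is usually read alongside the contig count and the maximum contig size.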

  5. A Comprehensive Transcriptome Assembly of Pigeonpea (Cajanus cajan L.) using Sanger and Second-Generation Sequencing Platforms

    Institute of Scientific and Technical Information of China (English)

    Himabindu Kudapa; Reetu Tuteja; Trushar Shah; Sutapa Dutta; Deepak K. Gupta; Archana Singh; Kishor Gaikwad; Tilak R. Sharma; Gregory D. May; Nagendra K. Singh; Rajeev K. Varshney; Arvind K. Bharti; Steven B. Cannon; Andrew D. Farmer; Benjamin Mulaosmanovic; Robin Kramer; Abhishek Bohra; Nathan T. Weeks; John A. Crow

    2012-01-01

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18353 Sanger expressed sequenced tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21434 TACs, 16622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea.
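
    The polymorphism information content (PIC) reported for the ISR markers is computed from allele frequencies at a marker. The abstract does not say which definition the authors used; the sketch below assumes the widely used Botstein et al. form, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. The function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one marker.

    `freqs` holds the allele frequencies at the marker and must sum to 1.
    Assumes the Botstein et al. definition (an assumption, not stated in the abstract).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies.
    squared = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - squared - cross


# Hypothetical frequencies: a balanced two-allele marker and a monomorphic one.
print(pic([0.5, 0.5]))  # prints 0.375
print(pic([1.0]))       # prints 0.0
```

    Under this definition a marker with two alleles cannot exceed 0.375, which gives some context for average values such as 0.16 when most markers show one to three alleles.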

  6. Identification of novel BRCA founder mutations in Middle Eastern breast cancer patients using capture and Sanger sequencing analysis.

    Science.gov (United States)

    Bu, Rong; Siraj, Abdul K; Al-Obaisi, Khadija A S; Beg, Shaham; Al Hazmi, Mohsen; Ajarim, Dahish; Tulbah, Asma; Al-Dayel, Fouad; Al-Kuraya, Khawla S

    2016-09-01

    Ethnic differences in breast cancer genomics have prompted us to investigate the spectra of BRCA1 and BRCA2 mutations in different populations. The prevalence and effect of BRCA1 and BRCA2 mutations in the Middle Eastern population are not fully explored. To characterize the prevalence of BRCA mutations in Middle Eastern breast cancer patients, BRCA mutation screening was performed in 818 unselected breast cancer patients using Capture and/or Sanger sequencing. Nineteen short tandem repeat (STR) markers were used for founder mutation analysis. In our study, nine different types of deleterious mutation were identified in 28 (3.4%) cases: 25 (89.3%) cases in BRCA1 and 3 (10.7%) cases in BRCA2. Seven recurrent mutations accounted for 92.9% (26/28) of all the mutant cases. Haplotype analysis was performed to confirm the c.1140 dupG and c.4136_4137delCT mutations as novel putative founder mutations, accounting for 46.4% (13/28) of all BRCA mutant cases and 1.6% (13/818) of all the breast cancer cases. Moreover, BRCA1 mutation was significantly associated with loss of BRCA1 protein expression (p = 0.0005). Our findings reveal that a substantial number of BRCA mutations were identified in clinically high-risk breast cancer from the Middle East region. Identification of the mutation spectrum, prevalence and founder effect in the Middle Eastern population facilitates genetic counseling, risk assessment and development of cost-effective screening strategies.

  7. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    Science.gov (United States)

    Kudapa, Himabindu; Azam, Sarwar; Sharpe, Andrew G; Taran, Bunyamin; Li, Rong; Deonovic, Benjamin; Cameron, Connor; Farmer, Andrew D; Cannon, Steven B; Varshney, Rajeev K

    2014-01-01

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in

  8. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding

  9. Expanding the mutation spectrum in 130 probands with ARPKD: identification of 62 novel PKHD1 mutations by sanger sequencing and MLPA analysis.

    Science.gov (United States)

    Melchionda, Salvatore; Palladino, Teresa; Castellana, Stefano; Giordano, Mario; Benetti, Elisa; De Bonis, Patrizia; Zelante, Leopoldo; Bisceglia, Luigi

    2016-09-01

    Autosomal recessive polycystic kidney disease (ARPKD) is a rare severe genetic disorder arising in the perinatal period, although a late-onset presentation of the disease has been described. Pulmonary hypoplasia is the major cause of morbidity and mortality in the newborn period. ARPKD is caused by mutations in the PKHD1 (polycystic kidney and hepatic disease 1) gene that is among the largest human genes. To achieve a molecular diagnosis of the disease, a large series of Italian affected subjects were recruited. Exhaustive mutation analysis of PKHD1 gene was carried out by Sanger sequencing and multiple ligation probe amplification (MLPA) technique in 110 individuals. A total of 173 mutations resulting in a detection rate of 78.6% were identified. Additional 20 unrelated patients, in whom it was not possible to analyze the whole coding sequence, have been included in this study. Taking into account the total number (n=130) of this cohort of patients, 107 different types of mutations have been detected in 193 mutated alleles. Out of 107 mutations, 62 were novel: 11 nonsense, 6 frameshift, 7 splice site mutations, 2 in-frame deletions and 2 multiexon deletion detected by MLPA. Thirty-four were missense variants. In conclusion, our report expands the spectrum of PKHD1 mutations and confirms the heterogeneity of this disorder. The population under study represents the largest Italian ARPKD cohort reported to date. The estimated costs and the time invested for molecular screening of genes with large size and allelic heterogeneity such as PKHD1 demand the use of next-generation sequencing (NGS) technologies for a faster and cheaper screening of the affected subjects.

  10. Genetic Testing Requires NGS and Sanger Methodologies.

    Science.gov (United States)

    Jennings, Lawrence J; Kirschmann, Dawn

    2016-09-01

    Investigators from the EuroEPINOMICS rare epilepsy syndromes Dravet working group performed whole-exome sequencing on 31 trios that had been reported negative for SCN1A mutations by Sanger sequencing.

  11. Disagreement in genotyping results of drug resistance alleles of the Plasmodium falciparum dihydrofolate reductase (Pfdhfr) gene by allele-specific PCR (ASPCR) assays and Sanger sequencing.

    Science.gov (United States)

    Sharma, Divya; Lather, Manila; Dykes, Cherry L; Dang, Amita S; Adak, Tridibes; Singh, Om P

    2016-01-01

    The rapid spread of antimalarial drug resistance in Plasmodium falciparum over the past few decades has necessitated intensive monitoring of such resistance for an effective malaria control strategy. P. falciparum dihydropteroate synthase (Pfdhps) and P. falciparum dihydrofolate reductase (Pfdhfr) genes act as molecular markers for resistance against the antimalarial drugs sulphadoxine and pyrimethamine, respectively. Resistance to pyrimethamine which is used as a partner drug in artemisinin combination therapy (ACT) is associated with several mutations in the Pfdhfr gene, namely A16V, N51I, C59R, S108N/T and I164L. Therefore, routine monitoring of Pfdhfr-drug-resistant alleles in a population may help in effective drug resistance management. Allele-specific PCR (ASPCR) is one of the commonly used methods for molecular genotyping of these alleles. In this study, we genotyped 55 samples of P. falciparum for allele discrimination at four codons of Pfdhfr (N51, C59, S108 and I164) by ASPCR using published methods and by Sanger's DNA sequencing method. We found that the ASPCR identified a significantly higher number of mutant alleles as compared to the DNA sequencing method. Such discrepancies arise due to the non-specificity of some of the allele-specific primer sets and due to the lack of sensitivity of Sanger's DNA sequencing method to detect minor alleles present in multiple clone infections. This study reveals the need of a highly specific and sensitive method for genotyping and detecting minor drug-resistant alleles present in multiple clonal infections.

  12. Sanger Sequencing for BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del Mutation Screen on Pap Smear Cytology Samples.

    Science.gov (United States)

    Lee, Sin Hang; Zhou, Shaoxia; Zhou, Tianjun; Hong, Guofan

    2016-02-08

    Three sets of polymerase chain reaction (PCR) primers were designed for heminested PCR amplification of the target DNA fragments in the human genome which include the site of BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del respectively, to prepare the templates for direct Sanger sequencing screen of these three founder mutations. With a robust PCR mixture, crude proteinase K digestate of the fixed cervicovaginal cells in the liquid-based Papanicolaou (Pap) cytology specimens can be used as the sample for target DNA amplification without pre-PCR DNA extraction, purification and quantitation. The post-PCR products can be used directly as the sequencing templates without further purification or quantitation. By simplifying the frontend procedures for template preparation, the cost for screening these three founder mutations can be reduced to about US $200 per test when performed in conjunction with human papillomavirus (HPV) assays now routinely ordered for cervical cancer prevention. With this projected price structure, selective patients in a high-risk population can be tested and each provided with a set of DNA sequencing electropherograms to document the absence or presence of these founder mutations in her genome to help assess inherited susceptibility to breast and ovarian cancer in this era of precision molecular personalized medicine.

  13. Sanger Sequencing for BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del Mutation Screen on Pap Smear Cytology Samples

    Directory of Open Access Journals (Sweden)

    Sin Hang Lee

    2016-02-01

    Three sets of polymerase chain reaction (PCR) primers were designed for heminested PCR amplification of the target DNA fragments in the human genome which include the site of BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del respectively, to prepare the templates for direct Sanger sequencing screen of these three founder mutations. With a robust PCR mixture, crude proteinase K digestate of the fixed cervicovaginal cells in the liquid-based Papanicolaou (Pap) cytology specimens can be used as the sample for target DNA amplification without pre-PCR DNA extraction, purification and quantitation. The post-PCR products can be used directly as the sequencing templates without further purification or quantitation. By simplifying the frontend procedures for template preparation, the cost for screening these three founder mutations can be reduced to about US $200 per test when performed in conjunction with human papillomavirus (HPV) assays now routinely ordered for cervical cancer prevention. With this projected price structure, selective patients in a high-risk population can be tested and each provided with a set of DNA sequencing electropherograms to document the absence or presence of these founder mutations in her genome to help assess inherited susceptibility to breast and ovarian cancer in this era of precision molecular personalized medicine.

  14. Highly sensitive KRAS mutation detection from formalin-fixed paraffin-embedded biopsies and circulating tumour cells using wild-type blocking polymerase chain reaction and Sanger sequencing.

    Science.gov (United States)

    Huang, Meggie Mo Chao; Leong, Sai Mun; Chua, Hui Wen; Tucker, Steven; Cheong, Wai Chye; Chiu, Lily; Li, Mo-Huang; Koay, Evelyn Siew-Chuan

    2014-08-01

    Among patients with colorectal cancer (CRC), KRAS mutations were reported to occur in 30-51% of all cases. CRC patients with KRAS mutations were reported to be non-responsive to anti-epidermal growth factor receptor (EGFR) monoclonal antibody (MoAb) treatment in many clinical trials. Hence, accurate detection of KRAS mutations would be critical in guiding the use of anti-EGFR MoAb therapies in CRC. In this study, we carried out a detailed investigation of the efficacy of a wild-type (WT) blocking real-time polymerase chain reaction (PCR), employing WT KRAS locked nucleic acid blockers, and Sanger sequencing, for KRAS mutation detection in rare cells. Analyses were first conducted on cell lines to optimize the assay protocol which was subsequently applied to peripheral blood and tissue samples from patients with CRC. The optimized assay provided a superior sensitivity enabling detection of as little as two cells with mutated KRAS in the background of 10^4 WT cells (0.02%). The feasibility of this assay was further investigated to assess the KRAS status of 45 colorectal tissue samples, which had been tested previously, using a conventional PCR sequencing approach. The analysis showed a mutational discordance between these two methods in 4 of 18 WT cases. Our results present a simple, effective, and robust method for KRAS mutation detection in both paraffin embedded tissues and circulating tumour cells, at single-cell level. The method greatly enhances the detection sensitivity and alleviates the need of exhaustively removing co-enriched contaminating lymphocytes.

  15. Screening for EGFR mutations in lung cancer by a novel real-time PCR with double-loop probe and Sanger DNA sequencing

    Institute of Scientific and Technical Information of China (English)

    张海萍; 阮力; 郑立谟; 白冬雨; 张海芳; 廖永强; 丁毅

    2013-01-01

    Objective: To map the frequency and types of EGFR gene mutations present in lung cancer tissues, to evaluate the clinical applicability of a novel real-time double-loop probe PCR on which the ADx-EGFR kit is based, and to compare its performance with traditional Sanger DNA sequencing in the detection of somatic mutations of tumor genes. Methods: A total of 208 formalin-fixed paraffin-embedded (FFPE) tumor samples were tested. Genomic DNA of the tissue samples was extracted and purified, and subjected to both traditional PCR amplification with Sanger sequencing of EGFR exons 18, 19, 20 and 21, and ADx's EGFR mutation detection kit. The mutation rates for EGFR exons 18, 19, 20 and 21, as well as the frequency of each mutation detected by the two methods, were analyzed. Results: The traditional Sanger DNA sequencing technique was successfully performed in 196 out of 208 (94.2%) lung cancer samples, and 22 samples (11.2%) showed EGFR gene mutations. The ADx-EGFR kit was successfully used in all 208 cases (100.0%), and 40 samples (19.2%) showed mutations. In the lung cancer samples analyzed, mutations were mainly detected in exon 19 and as the exon 21 L858R point mutation, i.e. 4.8% (10/208) and 11.6% (23/208) of total mutations, respectively, and the remaining mutations were rare. Conclusions: The success rate of ADx-EGFR real-time PCR for formalin-fixed and paraffin-embedded tissue samples is significantly higher than that of Sanger sequencing (P < 0.01). There are significant differences between the two methods. ADx-EGFR real-time PCR shows a much higher successful detection rate and mutation rate in lung cancer tissues than Sanger sequencing. As a result, real-time PCR with the ADx-EGFR kit is shown to have good clinical applicability and a strong advantage over traditional Sanger DNA sequencing. It is an effective and reliable tool for clinical screening of somatic gene mutations in tumors.

  16. Was Margaret Sanger a racist?

    Science.gov (United States)

    Valenza, C

    1985-01-01

    Margaret Sanger, as a young public health nurse, witnessed the sickness, disease and poverty caused by unwanted pregnancies. She spent the rest of her life trying to alleviate these conditions by bringing birth control to America. During the early 20th century, the idea of making contraceptives generally available was revolutionary. Contraceptive usage was considered a distinguishing feature of the 'haves.' In recent years, some revisionist biographers have portrayed Sanger as a eugenicist and a racist. This view has been widely publicized by critics of reproductive rights who have attempted to discredit Sanger's work by discrediting her personally. The basic concept of the eugenics movement in the 1920s and 1930s was that a better breed of humans would be created if the 'fit' had more children and the 'unfit' had fewer. This concept influenced a broad spectrum of thought, but there was little consensus on the definitions of fit and unfit. In theory, the movement was not racist--its message intended to cross race barriers for the overall advancement of mankind. Most eugenicists agreed that birth control would be a detriment to the human race and were opposed to it. Charges that Sanger's motives for promoting birth control were eugenic are not supported. In part of her most important work, "Pivot of Civilization," Sanger's dissent from eugenics was made clear. By examining extracts from her books, the author refutes the notion that Sanger was a eugenicist. Another unsupported argument raised by the anti-Sanger group was that Sanger, in her position as editor of "Birth Control Review," published eugenicists' views. It would be more accurate to say that the review covered a wide range of opinions and research; the eugenicists' views were included because they conferred respectability. David Kennedy, author of "Birth Control in America," does Sanger a grave injustice by falsely attributing to her the quotation: 'More children from the fit, less from the unfit--that is

  17. A genotypic test for HIV-1 tropism combining Sanger sequencing with ultradeep sequencing predicts virologic response in treatment-experienced patients.

    Directory of Open Access Journals (Sweden)

    Ron M Kagan

    A tropism test is required prior to initiation of CCR5 antagonist therapy in HIV-1 infected individuals, as these agents are not effective in patients harboring CXCR4 (X4) coreceptor-using viral variants. We developed a clinical laboratory-based genotypic tropism test for detection of CCR5-using (R5) or X4 variants that utilizes triplicate population sequencing (TPS) followed by ultradeep sequencing (UDS) for samples classified as R5. Tropism was inferred using the bioinformatic algorithms geno2pheno([coreceptor]) and PSSM(x4r5). Virologic response as a function of tropism readout was retrospectively assessed using blinded samples from treatment-experienced subjects who received maraviroc (N = 327) in the MOTIVATE and A4001029 clinical trials. MOTIVATE patients were classified as R5 and A4001029 patients were classified as non-R5 by the original Trofile test. Virologic response was compared between the R5 and non-R5 groups determined by TPS, UDS alone, the reflex strategy and the Trofile Enhanced Sensitivity (TF-ES) test. UDS had greater sensitivity than TPS to detect minority non-R5 variants. The median log(10) viral load change at week 8 was -2.4 for R5 subjects, regardless of the method used for classification; for subjects with non-R5 virus, median changes were -1.2 for TF-ES or the Reflex Test and -1.0 for UDS. The differences between R5 and non-R5 groups were highly significant in all 3 cases (p<0.0001). At week 8, the positive predictive value was 66% for TF-ES and 65% for both the Reflex test and UDS. Negative predictive values were 59% for TF-ES, 58% for the Reflex Test and 61% for UDS. In conclusion, genotypic tropism testing using UDS alone or a reflex strategy separated maraviroc responders and non-responders as well as a sensitive phenotypic test, and both assays showed improved performance compared to TPS alone. Genotypic tropism tests may provide an alternative to phenotypic testing with similar discriminating ability.

  18. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  19. Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

    Science.gov (United States)

    Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

    2015-03-01

    The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
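
    The sensitivity, specificity and accuracy figures above are all ratios taken from a two-by-two comparison of each method against the reference method. A minimal sketch of those ratios is shown below; the `Confusion` class and the counts are hypothetical and chosen only for illustration, they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of calls by a test method relative to a reference method."""
    tp: int  # mutation called by both test and reference
    fp: int  # mutation called by test only
    tn: int  # wild type called by both
    fn: int  # mutation missed by the test


def sensitivity(c: Confusion) -> float:
    # Fraction of reference-positive cases that the test detects.
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Fraction of reference-negative cases that the test calls negative.
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    # Fraction of all cases on which test and reference agree.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts for illustration only.
toy = Confusion(tp=45, fp=2, tn=48, fn=5)
print(sensitivity(toy), specificity(toy), accuracy(toy))
```

    With these toy counts the three values are 0.90, 0.96 and 0.93. All of them are defined relative to the reference method (454-NGS in the abstract), so they measure agreement with that method.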

  20. Sequencing of Ebola Virus Genomes Using Nanopore Technology

    Science.gov (United States)

    Hoenen, Thomas

    2017-01-01

    Sequencing of virus genomes during disease outbreaks can provide valuable information for diagnostics, epidemiology, and evaluation of potential countermeasures. However, particularly in remote areas logistical and technical challenges can be significant. Nanopore sequencing provides an alternative to classical Sanger and next-generation sequencing methods, and was successfully used under outbreak conditions (Hoenen et al., 2016; Quick et al., 2016). Here we describe a protocol used for sequencing of Ebola virus under outbreak conditions using Nanopore technology, which we successfully implemented at the CDC/NIH diagnostic laboratory (de Wit et al., 2016) located at the ELWA-3 Ebola virus Treatment Unit in Monrovia, Liberia, during the recent Ebola virus outbreak in West Africa.

  1. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  2. Analysis of the Pythium ultimum transcriptome using Sanger and Pyrosequencing approaches

    Directory of Open Access Journals (Sweden)

    André Lévesque C

    2008-11-01

    similarity to oomycete RXLR and Crinkler effectors, Kazal-like and cystatin-like protease inhibitors, and elicitins were identified. Sequences with similarity to thiamine biosynthesis enzymes that are lacking in the genome sequences of three Phytophthora species and one downy mildew were identified and could serve as useful phylogenetic markers. Furthermore, we identified 179 candidate simple sequence repeats that can be used for genotyping strains of P. ultimum. Conclusion Through these two technologies, we were able to generate a robust set (~10 Mb) of transcribed sequences for P. ultimum. We were able to identify known sequences present in oomycetes as well as identify novel sequences. An ample number of candidate polymorphic markers were identified in the dataset providing resources for phylogenetic and diagnostic marker development for this species. On a technical level, in spite of the depth possible with 454 FLX platform, the Sanger and pyro-based sequencing methodologies were complementary as each method generated sequences unique to each platform.

  3. Electrostatic Potential Maps and Natural Bond Orbital Analysis: Visualization and Conceptualization of Reactivity in Sanger's Reagent

    Science.gov (United States)

    Mottishaw, Jeffery D.; Erck, Adam R.; Kramer, Jordan H.; Sun, Haoran; Koppang, Miles

    2015-01-01

    Frederick Sanger's early work on protein sequencing through the use of colorimetric labeling combined with liquid chromatography involves an important nucleophilic aromatic substitution (S[subscript N]Ar) reaction in which the N-terminus of a protein is tagged with Sanger's reagent. Understanding the inherent differences between this S[subscript…

  4. Advanced sequencing technologies and their wider impact in microbiology.

    Science.gov (United States)

    Hall, Neil

    2007-05-01

    In the past 10 years, microbiology has undergone a revolution that has been driven by access to cheap high-throughput DNA sequencing. It was not long ago that the cloning and sequencing of a target gene could take months or years, whereas now this entire process has been replaced by a 10 min Internet search of a public genome database. There has been no single innovation that has initiated this rapid technological change; in fact, the core chemistry of DNA sequencing is the same as it was 30 years ago. Instead, progress has been driven by large sequencing centers that have incrementally industrialized the Sanger sequencing method. A side effect of this industrialization is that large-scale sequencing has moved out of small research labs, and the vast majority of sequence data is now generated by large genome centers. Recently, there have been advances in technology that will enable high-throughput genome sequencing to be established in research labs using bench-top instrumentation. These new technologies are already being used to explore the vast microbial diversity in the natural environment and the untapped genetic variation that can occur in bacterial species. It is expected that these powerful new methods will open up new questions to genomic investigation and will also allow high-throughput sequencing to be more than just a discovery exercise but also a routine assay for hypothesis testing. While this review will concentrate on microorganisms, many of the important arguments about the need to measure and understand variation at the species, population and ecosystem level will hold true for many other biological systems.

  5. Next-generation sequencing technology for genetics and genomics of sorghum

    DEFF Research Database (Denmark)

    Luo, Hong; Mocoeur, Anne Raymonde Joelle; Jing, Hai-Chun

    2014-01-01

    The invention and application of Next-Generation Sequencing (NGS) technologies have revolutionized the study of genetics and genomics. Much research that would not even have been considered is nowadays being executed in many laboratories as routine. In this chapter, we introduce the currently available NGS platforms, comparing their working theories and reviewing their advantages and disadvantages. We also discuss the future of NGS development and point out that single molecular sequencing would push the technology to the next level for biological sciences. Much of the chapter focuses on the use of NGS technologies in sorghum. Although the acquisition of the first whole-genome sequence in sorghum was carried out primarily using Sanger sequencing, the use of NGS for examining the genome-wide variation was almost synchronized with other work. Interesting genomic variation was found between sweet...

  6. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides, respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  7. Sequencing technologies for animal cell culture research.

    Science.gov (United States)

    Kremkow, Benjamin G; Lee, Kelvin H

    2015-01-01

    Over the last 10 years, 2nd and 3rd generation sequencing technologies have made the use of genomic sequencing within the animal cell culture community increasingly commonplace. Each technology's defining characteristics are unique, including the cost, time, sequence read length, daily throughput, and occurrence of sequence errors. Given each sequencing technology's intrinsic advantages and disadvantages, the optimal technology for a given experiment depends on the particular experiment's objective. This review discusses the current characteristics of six next-generation sequencing technologies, compares the differences between them, and characterizes their relevance to the animal cell culture community. These technologies are continually improving, as evidenced by the recent achievement of the field's benchmark goal: sequencing a human genome for less than $1,000.

  8. The Fast Changing Landscape of Sequencing Technologies and Their Impact on Microbial Genome Assemblies and Annotation

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Brettin, Thomas S [ORNL; Quest, Daniel J [ORNL; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Cottingham, Robert W [ORNL; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2012-01-01

    Background: The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. Methodology/Principal Findings: In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and sources of error in them should be considered prior to analysis. Conclusion: These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

  9. Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency

    Directory of Open Access Journals (Sweden)

    Hsu Ming-Tsung

    2010-11-01

    Full Text Available Abstract Background Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies. Results Among the discarded data, we found that 23.7 ± 3.9% of singletons and 14.1 ± 1.0% of contigs were assigned to taxa. The recovery rates for singletons were higher than those for contigs. The Pearson correlation coefficient revealed a high degree of similarity (0.94 ± 0.03 at the phylum rank and 0.80 ± 0.11 at the family rank) between the proposed taxonomic binning approach and those reported in original studies. In addition, an evaluation using simulated data demonstrated the reliability of the proposed approach. Conclusions Our findings suggest that taking account of conserved neighboring gene adjacency improves taxonomic assignment when analyzing metagenomes using Sanger sequencing. In other words, utilizing the conserved gene order as a criterion will reduce the amount of data discarded when analyzing metagenomes.
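    The similarity measure reported in this record is the Pearson correlation coefficient between two taxonomic binning results. A minimal sketch of that calculation follows, assuming each binning is summarized as per-taxon fractions listed in the same taxon order; the function name and the example fractions are hypothetical and not taken from the study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical phylum-level fractions from two binning approaches.
    original_binning = [0.50, 0.30, 0.15, 0.05]
    proposed_binning = [0.45, 0.33, 0.14, 0.08]
    print(f"r = {pearson(original_binning, proposed_binning):.3f}")
```

    A value near 1 indicates that the two approaches distribute reads across taxa in similar proportions; it does not by itself show that individual reads were assigned to the same taxon.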

  10. 454 sequencing put to the test using the complex genome of barley

    Science.gov (United States)

    Wicker, Thomas; Schlagenhauf, Edith; Graner, Andreas; Close, Timothy J; Keller, Beat; Stein, Nils

    2006-01-01

    Background During the past decade, Sanger sequencing has been used to completely sequence hundreds of microbial and a few higher eukaryote genomes. In recent years, a number of alternative technologies became available, among them adaptations of the pyrosequencing procedure (i.e. "454 sequencing"), promising a ~100-fold increase in throughput over Sanger technology – an advancement which is needed to make large and complex genomes more amenable to full genome sequencing at affordable costs. Although several studies have demonstrated its potential usefulness for sequencing small and compact microbial genomes, it was unclear how the new technology would perform in large and highly repetitive genomes such as those of wheat or barley. Results To study its performance in complex genomes, we used 454 technology to sequence four barley Bacterial Artificial Chromosome (BAC) clones and compared the results to those from ABI-Sanger sequencing. All gene containing regions were covered efficiently and at high quality with 454 sequencing whereas repetitive sequences were more problematic with 454 sequencing than with ABI-Sanger sequencing. 454 sequencing provided a much more even coverage of the BAC clones than ABI-Sanger sequencing, resulting in almost complete assembly of all genic sequences even at only 9 to 10-fold coverage. To obtain highly advanced working draft sequences for the BACs, we developed a strategy to assemble large parts of the BAC sequences by combining comparative genomics, detailed repeat analysis and use of low-quality reads from 454 sequencing. Additionally, we describe an approach of including small numbers of ABI-Sanger sequences to produce hybrid assemblies to partly compensate the short read length of 454 sequences. Conclusion Our data indicate that 454 pyrosequencing allows rapid and cost-effective sequencing of the gene-containing portions of large and complex genomes and that its combination with ABI-Sanger sequencing and targeted sequence

  11. 454 sequencing put to the test using the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Keller Beat

    2006-10-01

    Full Text Available Abstract Background During the past decade, Sanger sequencing has been used to completely sequence hundreds of microbial and a few higher eukaryote genomes. In recent years, a number of alternative technologies became available, among them adaptations of the pyrosequencing procedure (i.e. "454 sequencing"), promising a ~100-fold increase in throughput over Sanger technology – an advancement which is needed to make large and complex genomes more amenable to full genome sequencing at affordable costs. Although several studies have demonstrated its potential usefulness for sequencing small and compact microbial genomes, it was unclear how the new technology would perform in large and highly repetitive genomes such as those of wheat or barley. Results To study its performance in complex genomes, we used 454 technology to sequence four barley Bacterial Artificial Chromosome (BAC) clones and compared the results to those from ABI-Sanger sequencing. All gene containing regions were covered efficiently and at high quality with 454 sequencing whereas repetitive sequences were more problematic with 454 sequencing than with ABI-Sanger sequencing. 454 sequencing provided a much more even coverage of the BAC clones than ABI-Sanger sequencing, resulting in almost complete assembly of all genic sequences even at only 9 to 10-fold coverage. To obtain highly advanced working draft sequences for the BACs, we developed a strategy to assemble large parts of the BAC sequences by combining comparative genomics, detailed repeat analysis and use of low-quality reads from 454 sequencing. Additionally, we describe an approach of including small numbers of ABI-Sanger sequences to produce hybrid assemblies to partly compensate the short read length of 454 sequences. Conclusion Our data indicate that 454 pyrosequencing allows rapid and cost-effective sequencing of the gene-containing portions of large and complex genomes and that its combination with ABI-Sanger sequencing

  12. Next-generation sequencing technology for genetics and genomics of sorghum

    DEFF Research Database (Denmark)

    Luo, Hong; Mocoeur, Anne Raymonde Joelle; Jing, Hai-Chun

    2014-01-01

    of NGS technologies in sorghum. Although the acquisition of the first whole-genome sequence in sorghum was carried out primarily using Sanger sequencing, the use of NGS for examining the genome-wide variation was almost synchronized with other work. Interesting genomic variation was found between sweet and grain sorghum. NGS has also been used to examine the transcriptomes of sorghum under various stress conditions. Besides identifying interesting transcriptional adaptation to stress conditions, these studies show that sugar could potentially act as an osmotic adjusting factor via transcriptional regulation. Furthermore, miRNAs are found to be important in adaptation to both biotic and abiotic stresses in sorghum. We discuss the use of NGS for further genetic improvement and breeding in sorghum.

  13. A comparative study of pyrosequencing method and Sanger sequencing method for detecting the drug resistance mutation loci in hepatitis B virus

    Institute of Scientific and Technical Information of China (English)

    叶佩燕; 夏前林; 张建良

    2016-01-01

    Objective: To compare the pyrosequencing method and the Sanger sequencing method for detection of HBV drug resistance mutation loci, providing a reference for clinical diagnosis and individualized treatment. Methods: A total of 415 serum samples from patients with chronic hepatitis B were collected, and 30 serum samples served as controls. HBV drug resistance mutation loci in the serum samples were detected in parallel with the pyrosequencing method and the Sanger sequencing method. Using the Sanger sequencing method as reference, the specificity, sensitivity and total coincidence rate of the pyrosequencing method were calculated. The Kappa value was calculated for agreement analysis. Results: Taking the Sanger sequencing method as reference, the specificity, sensitivity and total coincidence rate of the pyrosequencing method were 100%, 99.82% and 99.86%, respectively. The area under the receiver operating characteristic curve was 0.9994, and a high degree of agreement was observed (Kappa value, 0.997). Conclusions: The pyrosequencing method is a rapid, sensitive and specific method for the detection of HBV drug resistance mutation loci, and has good prospects for application in the clinical laboratory.
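    The Kappa value reported in this record measures agreement between two methods beyond what chance alone would produce. A minimal sketch of Cohen's kappa for a 2x2 agreement table follows; the function name, argument names, and counts are hypothetical, and the abstract does not state how the authors tabulated their data.

```python
def cohen_kappa(both_pos: int, only_a_pos: int, only_b_pos: int, both_neg: int) -> float:
    """Cohen's kappa for two methods (A and B) giving positive/negative calls.

    both_pos:    samples called positive by A and by B
    only_a_pos:  samples called positive by A only
    only_b_pos:  samples called positive by B only
    both_neg:    samples called negative by A and by B
    """
    n = both_pos + only_a_pos + only_b_pos + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")
    # Observed agreement: fraction of samples on which the methods agree.
    p_observed = (both_pos + both_neg) / n
    # Chance agreement, from each method's marginal positive/negative rates.
    a_pos = both_pos + only_a_pos
    b_pos = both_pos + only_b_pos
    a_neg = n - a_pos
    b_neg = n - b_pos
    p_expected = (a_pos * b_pos + a_neg * b_neg) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(f"kappa = {cohen_kappa(40, 10, 10, 40):.2f}")
```

    Unlike the total coincidence rate, kappa discounts agreement that would be expected when both methods simply call most samples the same way by chance.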

  14. [DNA sequencing technology and automatization of it].

    Science.gov (United States)

    Kraev, A S

    1991-01-01

    Precise manipulations with genetic material, typical for modern experiments in molecular biology and in new biotechnology, require a capability to determine DNA base sequence. This capability makes it possible today to exploit specific genetic knowledge for the dissection of complex cell processes and for modulation of cell metabolism in transgenic organisms. The review focuses on DNA sequencing technologies that are widespread in general laboratory practice. With the availability of commercial reagents, they can safely be called industrial techniques. Modern DNA sequencing requires recurrent breakdown of large genomic DNA into smaller pieces, which are then amplified and sequenced, and the initial long stretch is reconstructed via overlap of the small pieces. The DNA sequencing process has several steps: a DNA fragment is obtained in sufficient quantity and purity; it is converted to a form suitable for a particular sequencing method; a sequencing reaction is performed and its products fractionated; and finally the resultant data are interpreted (i.e. an autoradiograph is read into a computer memory) and a long sequence is reconstructed via overlap of short stretches. These steps are considered in separate parts; emphasis is placed on sequencing strategies with respect to their biological task. In the last part, possibilities for automation of the sequencing experiment are considered, followed by a discussion of domestic problems in DNA sequencing.
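    The reconstruction of a long sequence via overlap of short stretches, as described in this record, can be illustrated with a simple greedy merge. This is a minimal sketch under strong assumptions (error-free reads, a single strand, exact suffix-to-prefix overlaps, no fragment contained inside another); it is not the procedure of any particular assembler, and the function names and example fragments are hypothetical.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of fragments with the longest exact overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: fragments stay as separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    # Hypothetical short reads covering one longer stretch.
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTTG"]))
```

    Greedy merging can misassemble repetitive sequence, which is one reason repeats are singled out as problematic elsewhere in this listing.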

  15. Application of next generation sequencing technology in Mendelian movement disorders.

    Science.gov (United States)

    Wang, Yumin; Pan, Xuya; Xue, Dan; Li, Yuwei; Zhang, Xueying; Kuang, Biao; Zheng, Jiabo; Deng, Hao; Li, Xiaoling; Xiong, Wei; Zeng, Zhaoyang; Li, Guiyuan

    2016-02-01

    Next generation sequencing (NGS) has developed very rapidly in the last decade. Compared with Sanger sequencing, NGS has the advantages of high sensitivity and high throughput. Movement disorders are a common type of neurological disease. Although traditional linkage analysis has become a standard method to identify the pathogenic genes in diseases, it is getting difficult to find new pathogenic genes in rare Mendelian disorders, such as movement disorders, due to a lack of appropriate families with high penetrance or enough affected individuals. Thus, NGS is an ideal approach to identify the causal alleles for inherited disorders. NGS is used to identify genes in several diseases and new mutant sites in Mendelian movement disorders. This article reviewed the recent progress in NGS and the use of NGS in Mendelian movement disorders from genome sequencing and transcriptome sequencing. A perspective on how NGS could be employed in rare Mendelian disorders is also provided.

  16. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  17. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Science.gov (United States)

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2016-01-01

    Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  18. DEVELOPMENT OF NEW SEQUENCING TECHNOLOGIES AND THEIR APPLICATION IN GENOME ANALYSIS OF DOMESTIC ANIMALS

    Directory of Open Access Journals (Sweden)

    Kristina Gvozdanović

    2015-12-01

    Full Text Available Sequencing and detailed study of the genome of domestic animals began in the middle of the last century. It was primarily associated with the development of first-generation sequencing methods, i.e. the Sanger sequencing method. Next-generation sequencing methods are currently the most common methods in the analysis of the domestic animal genome. The application of these methods yields up to 100 times more data than the Sanger method. Analyses including RNA sequencing, whole-genome genotyping, immunoprecipitation associated with DNA microarrays, detection of mutations and inherited diseases, sequencing of the mitochondrial genome and many others have been conducted with the development and application of new sequencing methods from 2005 until today. Application of new sequencing methods in the analysis of the domestic animal genome provides a better understanding of the genetic basis of important production traits, which could help in improving livestock production.

  19. Comparative analysis of real-time quantitative PCR-Sanger sequencing method and TaqMan probe method for detection of KRAS/BRAF mutation in colorectal carcinomas

    Institute of Scientific and Technical Information of China (English)

    张汛; 王跃华; 高宁; 王晋芬

    2014-01-01

    Objective To compare the application values of real-time quantitative PCR-Sanger sequencing and the TaqMan probe method in the detection of KRAS and BRAF mutations, and to correlate KRAS/BRAF mutations with the clinicopathological characteristics of colorectal carcinomas. Methods Genomic DNA of the tumor cells was extracted from formalin-fixed paraffin-embedded (FFPE) tissue samples of 344 colorectal carcinomas by microdissection. Real-time quantitative PCR-Sanger sequencing and the TaqMan probe method were performed to detect the KRAS/BRAF mutations. The frequency and types of KRAS/BRAF mutations, clinicopathological characteristics and survival time were analyzed. Results KRAS mutations were detected in 39.8% (137/344) and 38.7% (133/344) of 344 colorectal carcinomas by real-time quantitative PCR-Sanger sequencing and the TaqMan probe method, respectively. BRAF mutation was detected in 4.7% (16/344) and 4.1% (14/344), respectively. There was no significant difference between the two methods. The frequency of the KRAS mutation in females was higher than that in males (P < 0.05). The frequency of the BRAF mutation in colon was higher than that in rectum. The frequency of the BRAF mutation in stage Ⅲ-Ⅳ cases was higher than that in stage Ⅰ-Ⅱ cases. The frequency of the BRAF mutation in signet ring cell carcinoma was higher than that in mucinous carcinoma, and nonspecific adenocarcinoma had the lowest mutation rate. The frequency of the BRAF mutation in grade Ⅲ cases was higher than that in grade Ⅱ cases (P < 0.05). The overall concordance of the two methods for KRAS/BRAF mutation detection was 98.8% (kappa = 0.976). Survival time differed significantly between colorectal carcinomas with BRAF and KRAS mutations (P = 0.039). There was no statistically significant difference between BRAF-mutant and BRAF/KRAS wild-type cases (P = 0.058). Conclusions (1) Compared with real-time quantitative PCR-Sanger sequencing, the TaqMan probe method is better with regard to handling time

  20. Targeted high-throughput sequencing of tagged nucleic acid samples

    OpenAIRE

    Meyer, M.; Stenzel, U.; Myles, S.; Prüfer, K.; Hofreiter, M.

    2007-01-01

    High-throughput 454 DNA sequencing technology allows much faster and more cost-effective sequencing than traditional Sanger sequencing. However, the technology imposes inherent limitations on the number of samples that can be processed in parallel. Here we introduce parallel tagged sequencing (PTS), a simple, inexpensive and flexible barcoding technique that can be used for parallel sequencing of any number and type of double-stranded nucleic acid samples. We demonstrate that PTS is particularly...
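    Barcoding approaches such as the one described in this record rely on sorting pooled reads back to their samples by tag after sequencing. A minimal sketch of that sorting step follows, assuming each read begins with an exact copy of its sample tag; the tag sequences, sample names, and function name are hypothetical, and the truncated abstract does not describe the authors' actual tag design or matching rules.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str], tags: Dict[str, str]) -> Dict[str, List[str]]:
    """Assign each read to a sample by its leading tag and strip the tag.

    tags maps a tag sequence to a sample name. Reads whose start matches
    no tag exactly are collected under the key "unassigned", unmodified.
    """
    # Try longer tags first so a short tag cannot shadow a longer one.
    tag_lengths = sorted({len(tag) for tag in tags}, reverse=True)
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for length in tag_lengths:
            sample = tags.get(read[:length])
            if sample is not None:
                bins[sample].append(read[length:])
                break
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    # Hypothetical tags and pooled reads.
    sample_tags = {"ACGT": "sample_1", "TGCA": "sample_2"}
    pooled = ["ACGTGGGA", "TGCATTT", "GGGGAAAA"]
    for sample, sample_reads in demultiplex(pooled, sample_tags).items():
        print(sample, sample_reads)
```

    Exact matching keeps the sketch short; a practical implementation would likely also need to tolerate sequencing errors within the tag.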

  1. Frederick Sanger, Erwin Chargaff, and the metamorphosis of specificity.

    Science.gov (United States)

    Judson, H F

    1993-12-15

    That a transformation of ruling ideas in genetics and biochemistry took place at the dawn of molecular biology, in the late 1940s, is a commonplace; but the nature and components of that transformation are widely misunderstood. The change is often identified with the importation into biology of new styles of thought and new rigor by the many scientists trained in physics or chemistry who came into the nascent field--notably, Max Delbrück, Max Perutz, Francis Crick, John Kendrew, Maurice Wilkins, Rosalind Franklin. Most generally, the change is supposed to be the realization that genes are made not of protein but of nucleic acid--and this change was initiated, of course, by the work of Oswald Avery and his colleagues. These changes are not mutually exclusive, and both were surely important to the genesis of molecular biology. But logically prior to them, more fundamental, was another transformation in ruling preconceptions, one that has been neglected: the revolution in understanding of the chemical structures--the sequences of subunits--of proteins and of nucleic acids which was wrought by the work of Frederick Sanger and of Erwin Chargaff. This was a metamorphosis in the understanding of biochemical specificity, and while it astonished many biochemists it set free the small groups of those who were beginning to call themselves molecular biologists, enabling them to think of the relationship between genes and proteins in entirely new ways.

  2. Coverage bias and sensitivity of variant calling for four whole-genome sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Nora Rieber

    Full Text Available The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics' technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies' platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. 
It indicates application areas that call for a specific sequencing platform and disallow other
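
    The two comparison criteria named in this abstract, SNV calling sensitivity and evenness of coverage, can be illustrated with a minimal sketch. The function names and the exact definitions below (in particular, measuring false positives as the fraction of calls absent from a reference set) are assumptions made for illustration, not the authors' pipeline.

```python
def snv_call_metrics(called, truth):
    """Compare called SNV positions against a reference (truth) set.

    Both arguments are iterables of hashable positions, e.g. (chrom, pos).
    The definitions here are illustrative assumptions, not the study's own.
    """
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)    # calls confirmed by the truth set
    false_pos = len(called - truth)   # calls absent from the truth set
    false_neg = len(truth - called)   # true variants that were missed
    sensitivity = true_pos / (true_pos + false_neg) if truth else float("nan")
    false_call_fraction = false_pos / len(called) if called else float("nan")
    return {"sensitivity": sensitivity, "false_call_fraction": false_call_fraction}


def fraction_uncovered(depths):
    """Fraction of positions whose read depth is zero."""
    depths = list(depths)
    return sum(1 for d in depths if d == 0) / len(depths)
```

    Treating variants as a set of positions keeps the comparison independent of platform-specific file formats; a real evaluation would also compare alleles and genotypes.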

  3. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  4. Simulation-Based Evaluation of Learning Sequences for Instructional Technologies

    Science.gov (United States)

    McEneaney, John E.

    2016-01-01

    Instructional technologies critically depend on systematic design, and learning hierarchies are a commonly advocated tool for designing instructional sequences. But hierarchies routinely allow numerous sequences and choosing an optimal sequence remains an unsolved problem. This study explores a simulation-based approach to modeling learning…

  5. Identification of Genetic Alterations, as Causative Genetic Defects in Long QT Syndrome, Using Next Generation Sequencing Technology.

    Directory of Open Access Journals (Sweden)

    Oscar Campuzano

    Full Text Available Long QT Syndrome is an inherited channelopathy leading to sudden cardiac death due to ventricular arrhythmias. Although several genes have been associated with the disease, nearly 20% of cases remain without an identified genetic cause. Other genetic alterations such as copy number variations have recently been related to Long QT Syndrome. Our aim was to take advantage of current genetic technologies in a family affected by Long QT Syndrome in order to identify the cause of the disease. Complete clinical evaluation was performed in all family members. In the index case, a Next Generation Sequencing custom-built panel, including 55 sudden cardiac death-related genes, was used for detection of both sequence and copy number variants. Next Generation Sequencing variants were confirmed by the Sanger method. Copy number variants were confirmed by the Multiplex Ligation-dependent Probe Amplification method and at the mRNA level. Confirmed variants and copy number variations identified in the index case were also analyzed in relatives. In the index case, Next Generation Sequencing revealed a novel variant in TTN and a large deletion in KCNQ1, involving exons 7 and 8. Both variants were confirmed by alternative techniques. The mother and the brother of the index case were also affected by Long QT Syndrome, and family cosegregation was observed for the KCNQ1 deletion, but not for the TTN variant. Next Generation Sequencing technology allows a comprehensive genetic analysis of arrhythmogenic diseases. We report a copy number variation identified using Next Generation Sequencing analysis in Long QT Syndrome. Clinical and familial correlation is crucial to elucidate the role of the genetic variants identified and to distinguish the pathogenic ones from genetic noise.

  6. Nanopore-based Fourth-generation DNA Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  7. BRCA1-2 diagnostic workflow from next-generation sequencing technologies to variant identification and final report.

    Science.gov (United States)

    Pilato, Brunella; Pinto, Rosamaria; De Summa, Simona; Petriella, Daniela; Lacalamita, Rosanna; Danza, Katia; Paradiso, Angelo; Tommasi, Stefania

    2016-10-01

    The BRCA1-BRCA2 genes predispose to hereditary breast and ovarian cancer, and the germline and mutational status of these genes defines a target population that can benefit from PARP inhibitor treatments. To respond to the increasing number of BRCA1-BRCA2 tests, it is necessary to shift to high-throughput technologies that are reliable and less time consuming. Different methodological platforms are dedicated to this purpose with different approaches and algorithms for analysis. Our aim was to set up a cost-effective and low time-consuming BRCA1-BRCA2 mutation detection workflow using the Ion Torrent PGM technology. A retrospective cohort of 40 patients with familial breast/ovarian cancer previously tested by Sanger sequencing and a prospective cohort of 72 patients (validation set) were analyzed. The validation set included 64 patients affected by familial breast/ovarian cancer and eight sporadic ovarian cancer cases, who are potential candidates for PARPi treatments. A complete and standardized workflow easily usable and suitable in a certified laboratory has been proved and validated. This includes all steps from library preparation to the final report. The use of next-generation sequencing will be of benefit for patients enrolled in the genetic counseling process and, moreover, will enhance the process of selecting patients eligible for personalized treatments. © 2016 Wiley Periodicals, Inc.

  8. Next generation sequencing (NGS) technologies and applications

    Energy Technology Data Exchange (ETDEWEB)

    Vuyisich, Momchilo [Los Alamos National Laboratory

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  9. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Full Text Available Abstract Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  10. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  11. Combining two technologies for full genome sequencing of human.

    Science.gov (United States)

    Skryabin, K G; Prokhortchouk, E B; Mazur, A M; Boulygina, E S; Tsygankova, S V; Nedoluzhko, A V; Rastorguev, S M; Matveev, V B; Chekanov, N N; D A, Goranskaya; Teslyuk, A B; Gruzdeva, N M; Velikhov, V E; Zaridze, D G; Kovalchuk, M V

    2009-10-01

    At present, new DNA sequencing technologies are developing rapidly, allowing quick and efficient characterisation of organisms at the level of genome structure. In this study, the whole genome sequencing of a human (Russian man) was performed using two technologies currently present on the market - Sequencing by Oligonucleotide Ligation and Detection (SOLiD™) (Applied Biosystems) and sequencing technologies of molecular clusters using fluorescently labeled precursors (Illumina). In total, 108.3 billion base pairs of data were generated (60.2 billion from Illumina technology and 48.1 billion from SOLiD technology). Statistics performed on reads generated by GAII and SOLiD showed that they covered 75% and 96% of the genome respectively. Short polymorphic regions were detected with comparable accuracy; however, the absolute number revealed by SOLiD was several times lower than that revealed by GAII. An optimal algorithm for using the latest methods of sequencing was established for the analysis of individual human genomes. The study is the first Russian effort towards whole human genome sequencing.

  12. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  13. Exploring the switchgrass transcriptome using second-generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    Yixing Wang

    Full Text Available BACKGROUND: Switchgrass (Panicum virgatum L.) is a C4 perennial grass and widely popular as an important bioenergy crop. To accelerate the pace of developing high yielding switchgrass cultivars adapted to diverse environmental niches, the generation of genomic resources for this plant is necessary. The large genome size and polyploid nature of switchgrass make whole genome sequencing a daunting task even with current technologies. Exploring the transcriptional landscape using next generation sequencing technologies provides a viable alternative to whole genome sequencing in switchgrass. PRINCIPAL FINDINGS: Switchgrass cDNA libraries from germinating seedlings, emerging tillers, flowers, and dormant seeds were sequenced using Roche 454 GS-FLX Titanium technology, generating 980,000 reads with an average read length of 367 bp. De novo assembly generated 243,600 contigs with an average length of 535 bp. Using the foxtail millet genome as a reference greatly improved the assembly and annotation of switchgrass ESTs. Comparative analysis of the 454-derived switchgrass EST reads with other sequenced monocots including Brachypodium, sorghum, rice and maize indicated a 70-80% overlap. RPKM analysis demonstrated unique transcriptional signatures of the four tissues analyzed in this study. More than 24,000 ESTs were identified in the dormant seed library. In silico analysis indicated that there are more than 2000 EST-SSRs in this collection. Expression of several orphan ESTs was confirmed by RT-PCR. SIGNIFICANCE: We estimate that about 90% of the switchgrass gene space has been covered in this analysis. This study nearly doubles the amount of EST information for switchgrass currently in the public domain. The celerity and economical nature of second-generation sequencing technologies provide an in-depth view of the gene space of complex genomes like switchgrass. Sequence analysis of closely related members of the NAD(+)-malic enzyme type C4 grasses such as
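
    The RPKM analysis mentioned in this abstract normalizes read counts by transcript length and sequencing depth. The abstract does not give the formula, so the sketch below assumes the conventional definition (reads per kilobase of transcript per million mapped reads); the function and parameter names are illustrative.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Conventional definition, assumed here:
        RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
```

    For example, 500 reads on a 2,000 bp transcript in a library of 10 million mapped reads gives an RPKM of 25.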

  14. Performance Characteristics and Validation of Next-Generation Sequencing for Human Leucocyte Antigen Typing.

    Science.gov (United States)

    Weimer, Eric T; Montgomery, Maureen; Petraroia, Rosanne; Crawford, John; Schmitz, John L

    2016-09-01

    High-resolution human leukocyte antigen (HLA) matching reduces graft-versus-host disease and improves overall patient survival after hematopoietic stem cell transplant. Sanger sequencing has been the gold standard for HLA typing since 1996. However, given the increasing number of new HLA alleles identified and the complexity of the HLA genes, clinical HLA typing by Sanger sequencing requires several rounds of additional testing to provide allele-level resolution. Although next-generation sequencing (NGS) is routinely used in molecular genetics, few clinical HLA laboratories use the technology. The performance characteristics of NGS HLA typing using TruSight HLA were determined using Sanger sequencing as the reference method. In total, 211 samples were analyzed with an overall accuracy of 99.8% (2954/2961) and 46 samples were analyzed for precision with 100% (368/368) reproducibility. Most discordant alleles were because of technical error rather than assay performance. More important, the ambiguity rate was 3.5% (103/2961). Seventy-four percent of the ambiguities were within the DRB1 and DRB4 loci. HLA typing by NGS saves approximately $6000 per run when compared to Sanger sequencing. Thus, TruSight HLA assay enables high-throughput HLA typing with an accuracy, precision, ambiguity rate, and cost savings that should facilitate adoption of NGS technology in clinical HLA laboratories.
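
    The percentages reported in this abstract follow directly from the counts it gives. A short check, with a hypothetical helper name, is sketched below.

```python
def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100.0 * numerator / denominator, digits)


# Counts taken from the abstract above.
accuracy = percent(2954, 2961)        # concordant allele calls / all calls
reproducibility = percent(368, 368)   # precision study
ambiguity_rate = percent(103, 2961)   # ambiguous calls / all calls
```

    The three values round to 99.8, 100.0 and 3.5, matching the figures quoted in the abstract.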

  15. Inferring short-range linkage information from sequencing chromatograms.

    Directory of Open Access Journals (Sweden)

    Bastian Beggel

    Full Text Available Direct Sanger sequencing of viral genome populations yields multiple ambiguous sequence positions. It is not straightforward to derive linkage information from sequencing chromatograms, which in turn hampers the correct interpretation of the sequence data. We present a method for determining the variants existing in a viral quasispecies in the case of two nearby ambiguous sequence positions by exploiting the effect of sequence context-dependent incorporation of dideoxynucleotides. The computational model was trained on data from sequencing chromatograms of clonal variants and was evaluated on two test sets of in vitro mixtures. The approach achieved high accuracies in identifying the mixture components of 97.4% on a test set in which the positions to be analyzed are only one base apart from each other, and of 84.5% on a test set in which the ambiguous positions are separated by three bases. In silico experiments suggest two major limitations of our approach in terms of accuracy. First, due to a basic limitation of Sanger sequencing, it is not possible to reliably detect minor variants with a relative frequency of no more than 10%. Second, the model cannot distinguish between mixtures of two or four clonal variants, if one of two sets of linear constraints is fulfilled. Furthermore, the approach requires repetitive sequencing of all variants that might be present in the mixture to be analyzed. Nevertheless, the effectiveness of our method on the two in vitro test sets shows that short-range linkage information of two ambiguous sequence positions can be inferred from Sanger sequencing chromatograms without any further assumptions on the mixture composition. Additionally, our model provides new insights into the established and widely used Sanger sequencing technology. The source code of our method is made available at http://bioinf.mpi-inf.mpg.de/publications/beggel/linkageinformation.zip.

  16. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Full Text Available Solid-state nanopore-based DNA sequencing technology is becoming increasingly attractive for its promise in the field of gene detection. The challenges that need to be addressed are diverse: effective methods to detect base-specific signatures, control of the nanopore’s size and surface properties, and modulation of the translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing have been reviewed.

  17. DNA Sequencing Sensors: An Overview

    Directory of Open Access Journals (Sweden)

    Jose Antonio Garrido-Cardenas

    2017-03-01

    Full Text Available The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that could be sequenced in each trial and in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years.

  18. Application of next-generation sequencing technology in forensic science.

    Science.gov (United States)

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-10-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

  19. Application of Next-generation Sequencing Technology in Forensic Science

    Directory of Open Access Journals (Sweden)

    Yaran Yang

    2014-10-01

    Full Text Available Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

  20. Application of Next-generation Sequencing Technology in Forensic Science

    Institute of Scientific and Technical Information of China (English)

    Yaran Yang; Bingbing Xie; Jiangwei Yan

    2014-01-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

  1. Implementing a genomic data management system using iRODS in the Wellcome Trust Sanger Institute

    Directory of Open Access Journals (Sweden)

    Sale Kevin

    2011-09-01

    Full Text Available Abstract Background: Increasingly large amounts of DNA sequencing data are being generated within the Wellcome Trust Sanger Institute (WTSI). The traditional file system struggles to handle these increasing amounts of sequence data. A good data management system therefore needs to be implemented and integrated into the current WTSI infrastructure. Such a system enables good management of the IT infrastructure of the sequencing pipeline and allows biologists to track their data. Results: We have chosen a data grid system, iRODS (Rule-Oriented Data management systems), to act as the data management system for the WTSI. iRODS provides a rule-based system management approach which makes data replication much easier and provides extra data protection. Unlike the metadata provided by traditional file systems, the metadata system of iRODS is comprehensive and allows users to customize their own application level metadata. Users and IT experts in the WTSI can then query the metadata to find and track data. The aim of this paper is to describe how we designed and used (from both system and user viewpoints) iRODS as a data management system. Details are given about the problems faced and the solutions found when iRODS was implemented. A simple use case describing how users within the WTSI use iRODS is also introduced. Conclusions: iRODS has been implemented and works as the production system for the sequencing pipeline of the WTSI. Both biologists and IT experts can now track and manage data, which could not previously be achieved. This novel approach allows biologists to define their own metadata and query the genomic data using those metadata.
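
    The idea of user-defined metadata that can later be queried to find data, as described in this abstract, can be illustrated with a small in-memory model. This sketch does not use the iRODS API; the class `MetadataCatalog`, its methods, and the example paths and attribute names are all hypothetical.

```python
from collections import defaultdict


class MetadataCatalog:
    """Toy model of attribute/value metadata attached to data objects.

    Hypothetical illustration only; it does not call iRODS.
    """

    def __init__(self):
        self._index = defaultdict(set)     # (attribute, value) -> set of paths
        self._by_path = defaultdict(dict)  # path -> {attribute: value}

    def annotate(self, path, **attributes):
        """Attach or overwrite user-defined attributes on a data object."""
        for attribute, value in attributes.items():
            old = self._by_path[path].get(attribute)
            if old is not None:
                # Drop the stale index entry before recording the new value.
                self._index[(attribute, old)].discard(path)
            self._by_path[path][attribute] = value
            self._index[(attribute, value)].add(path)

    def query(self, **criteria):
        """Return sorted paths matching all criteria (empty list if none given)."""
        matches = None
        for attribute, value in criteria.items():
            paths = self._index.get((attribute, value), set())
            matches = set(paths) if matches is None else matches & paths
        return sorted(matches or [])
```

    A call such as `catalog.query(study="S1", sample="B")` then returns only the objects tagged with both values, which is the kind of tracking by metadata that the abstract describes.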

  2. Advances of DNA Sequencing Technology and Its Applications

    Institute of Scientific and Technical Information of China (English)

    刘朋虎; 林冬梅; 林占熺; 李晶

    2012-01-01

    This paper introduces the principles and characteristics of first-, second- and third-generation DNA sequencing technologies, and then describes the applications of second-generation sequencing in genome sequencing, resequencing, RNA sequencing, metagenomics and DNA methylation analysis. First-generation DNA sequencing technology, represented by the Sanger sequencing method, is complicated to operate and costly, and cannot meet the needs of large-scale sequencing. Second-generation DNA sequencing technology, characterized by high throughput and low cost, mainly includes the Solexa sequencing technology of Illumina, the 454 sequencing technology of Roche and the SOLiD sequencing technology of Applied Biosystems, and is now widely used in many fields of life science research. Third-generation sequencing technology, characterized by the sequencing of single DNA molecules, has begun to emerge but is not yet widely used in life science research.

  3. Evolution of DNA sequencing

    National Research Council Canada - National Science Library

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-01-01

    Sanger and coworkers introduced DNA sequencing for the first time in the 1970s. It principally relied on termination of growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted...

  4. DNA sequencing technology, walking with modular primers. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Ulanovsky, L.

    1996-12-31

    The success of the Human Genome Project depends on the development of adequate technology for rapid and inexpensive DNA sequencing, which will also benefit biomedical research in general. The authors are working on DNA technologies that eliminate primer synthesis, the main bottleneck in sequencing by primer walking. They have developed modular primers that are assembled from three 5-mer, 6-mer or 7-mer modules selected from a presynthesized library of as few as 1,000 oligonucleotides ({double_bond}4, {double_bond}5, {double_bond}7). The three modules anneal contiguously at the selected template site and prime there uniquely, even though each is not unique for the most part when used alone. This technique is expected to speed up primer walking 30 to 50 fold, and reduce the sequencing cost by a factor of 5 to 15. Time and expense will be saved on primer synthesis itself and even more so due to closed-loop automation of primer walking, made possible by the instant availability of primers. Apart from saving time and cost, closed-loop automation would also minimize the errors and complications associated with human intervention between the walks. The author has also developed two additional approaches to primer-library based sequencing. One involves a branched structure of modular primers which has a distinctly different mechanism of achieving priming specificity. The other introduces the concept of "Differential Extension with Nucleotide Subsets" as an approach that increases priming specificity and priming strength and allows cycle sequencing. These approaches are expected to be more robust than the original version of the modular primer technique.
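
    The library sizes implied by this abstract can be checked with simple arithmetic: a complete set of all k-mers over four bases has 4^k members, so 1,024 for 5-mers, in line with the "as few as 1,000 oligonucleotides" quoted above. Whether the authors' library was a complete k-mer set is an assumption here, and the helper names below are hypothetical. The sketch also shows how a priming site splits into three contiguous modules.

```python
def library_size(module_length):
    """Number of distinct DNA oligonucleotides of a given length (4 bases)."""
    return 4 ** module_length


def split_into_modules(primer_sequence, module_length):
    """Split a primer into three contiguous modules of equal length.

    Assumes (for illustration) that the primer is exactly three modules long.
    """
    expected = 3 * module_length
    if len(primer_sequence) != expected:
        raise ValueError(f"expected a {expected}-base primer")
    return [primer_sequence[i:i + module_length]
            for i in range(0, expected, module_length)]


# Sizes of complete 5-mer, 6-mer and 7-mer libraries.
sizes = {k: library_size(k) for k in (5, 6, 7)}
```

    Under that assumption `sizes` is `{5: 1024, 6: 4096, 7: 16384}`, and an 18-base primer splits into three 6-mers that can each be drawn from the presynthesized set.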

  5. Hughes, Twain, Child, and Sanger: Four Who Locked Horns with the Censors

    Science.gov (United States)

    Meltzer, Milton

    1969-01-01

    A look at the lives and conflicts of four writers--Langston Hughes, Mark Twain, Lydia Maria Child, and Margaret Sanger--who faced public criticism and censorship because of their views on controversial issues. (RM)

  6. Simultaneous detection of human mitochondrial DNA and nuclear-inserted mitochondrial-origin sequences (NumtS) using forensic mtDNA amplification strategies and pyrosequencing technology.

    Science.gov (United States)

    Bintz, Brittania J; Dixon, Groves B; Wilson, Mark R

    2014-07-01

    Next-generation sequencing technologies enable the identification of minor mitochondrial DNA variants with higher sensitivity than Sanger methods, allowing for enhanced identification of minor variants. In this study, mixtures of human mtDNA control region amplicons were subjected to pyrosequencing to determine the detection threshold of the Roche GS Junior® instrument (Roche Applied Science, Indianapolis, IN). In addition to expected variants, a set of reproducible variants was consistently found in reads from one particular amplicon. A BLASTn search of the variant sequence revealed identity to a segment of a 611-bp nuclear insertion of the mitochondrial control region (NumtS) spanning the primer-binding sites of this amplicon (Nature 1995;378:489). Primers (Hum Genet 2012;131:757; Hum Biol 1996;68:847) flanking the insertion were used to confirm the presence or absence of the NumtS in buccal DNA extracts from twenty donors. These results further our understanding of human mtDNA variation and are expected to have a positive impact on the interpretation of mtDNA profiles using deep-sequencing methods in casework.

  7. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  8. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.
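
    The tagging step described in this abstract implies a demultiplexing step after sequencing, in which each read is assigned back to its specimen by the 10-mer at its start. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes exact tag matches at the 5' end of each read, and the tag sequences, read sequences, and specimen names are hypothetical.

```python
from collections import defaultdict

TAG_LENGTH = 10  # the abstract describes unique 10-mer tags on the PCR primers


def demultiplex(reads, tag_to_specimen, tag_length=TAG_LENGTH):
    """Group reads by the tag at their 5' end.

    Returns (bins, unassigned): bins maps specimen name to the list of
    reads with the tag trimmed off; unassigned holds reads whose first
    tag_length bases match no known tag exactly.
    """
    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)
        else:
            bins[specimen].append(insert)
    return dict(bins), unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"ACGTACGTAC": "specimen_A", "TTGCATTGCA": "specimen_B"}
reads = [
    "ACGTACGTACGGTCAACAAATCATAAAG",
    "TTGCATTGCAGGTCAACAAATCATAAAG",
    "ACGTACGTACGGTCAACAAATCATAAAGAT",
    "GGGGGGGGGGGGTCAACAAATCATAAAG",
]
bins, unassigned = demultiplex(reads, tags)
```

    A real workflow would probably also tolerate sequencing errors within the tag and check the primer sequence that follows it; exact matching is used here only to keep the sketch short.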

  9. Discovery of posttranscriptional regulatory RNAs using next generation sequencing technologies.

    Science.gov (United States)

    Gelderman, Grant; Contreras, Lydia M

    2013-01-01

    Next generation sequencing (NGS) has revolutionized the way by which we engineer metabolism by radically altering the path to genome-wide inquiries. This is due to the fact that NGS approaches offer several powerful advantages over traditional methods that include the ability to fully sequence hundreds to thousands of genes in a single experiment and simultaneously detect homozygous and heterozygous deletions, alterations in gene copy number, insertions, translocations, and exome-wide substitutions that include "hot-spot mutations." This chapter describes the use of these technologies as a sequencing technique for transcriptome analysis and discovery of regulatory RNA elements in the context of three main platforms: Illumina HiSeq, 454 pyrosequencing, and SOLiD sequencing. Specifically, this chapter focuses on the use of Illumina HiSeq, since it is the most widely used platform for RNA discovery and transcriptome analysis. Regulatory RNAs have now been found in all branches of life. In bacteria, noncoding small RNAs (sRNAs) are involved in highly sophisticated regulatory circuits that include quorum sensing, carbon metabolism, stress responses, and virulence (Gorke and Vogel, Gene Dev 22:2914-2925, 2008; Gottesman, Trends Genet 21:399-404, 2005; Romby et al., Curr Opin Microbiol 9:229-236, 2006). Further characterization of the underlying regulation of gene expression remains poorly understood given that it is estimated that over 60% of all predicted genes remain hypothetical and the 5' and 3' untranslated regions are unknown for more than 90% of the genes (Siegel et al., Trends Parasitol 27:434-441, 2011). Importantly, manipulation of the posttranscriptional regulation that occurs at the level of RNA stability and export, trans-splicing, polyadenylation, protein translation, and protein stability via untranslated regions (Clayton, EMBO J 21:1881-1888, 2002; Haile and Papadopoulou, Curr Opin Microbiol 10:569-577, 2007) could be highly beneficial to metabolic

  10. Next-generation sequencing technology in clinical virology.

    Science.gov (United States)

    Capobianchi, M R; Giombini, E; Rozera, G

    2013-01-01

    Recent advances in nucleic acid sequencing technologies, referred to as 'next-generation' sequencing (NGS), have produced a true revolution and opened new perspectives for research and diagnostic applications, owing to the high speed and throughput of data generation. So far, NGS has been applied to metagenomics-based strategies for the discovery of novel viruses and the characterization of viral communities. Additional applications include whole viral genome sequencing, detection of viral genome variability, and the study of viral dynamics. These applications are particularly suitable for viruses such as human immunodeficiency virus, hepatitis B virus, and hepatitis C virus, whose error-prone replication machinery, combined with the high replication rate, results, in each infected individual, in the formation of many genetically related viral variants referred to as quasi-species. The viral quasi-species, in turn, represents the substrate for the selective pressure exerted by the immune system or by antiviral drugs. With traditional approaches, it is difficult to detect and quantify minority genomes present in viral quasi-species that, in fact, may have biological and clinical relevance. NGS provides, for each patient, a dataset of clonal sequences that is some orders of magnitude larger than those obtained with conventional approaches. Hence, NGS is an extremely powerful tool with which to investigate previously inaccessible aspects of viral dynamics, such as the contribution of different viral reservoirs to replicating virus in the course of the natural history of the infection, co-receptor usage in minority viral populations harboured by different cell lineages, the dynamics of development of drug resistance, and the re-emergence of hidden genomes after treatment interruptions. The diagnostic application of NGS is just around the corner. © 2012 The Authors Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious

  11. Application of next-generation sequencing technologies in virology.

    Science.gov (United States)

    Radford, Alan D; Chapman, David; Dixon, Linda; Chantrey, Julian; Darby, Alistair C; Hall, Neil

    2012-09-01

    The progress of science is punctuated by the advent of revolutionary technologies that provide new ways and scales to formulate scientific questions and advance knowledge. Following on from electron microscopy, cell culture and PCR, next-generation sequencing is one of these methodologies that is now changing the way that we understand viruses, particularly in the areas of genome sequencing, evolution, ecology, discovery and transcriptomics. Possibilities for these methodologies are only limited by our scientific imagination and, to some extent, by their cost, which has restricted their use to relatively small numbers of samples. Challenges remain, including the storage and analysis of the large amounts of data generated. As the chemistries employed mature, costs will decrease. In addition, improved methods for analysis will become available, opening yet further applications in virology including routine diagnostic work on individuals, and new understanding of the interaction between viral and host transcriptomes. An exciting era of viral exploration has begun, and will set us new challenges to understand the role of newly discovered viral diversity in both disease and health.

  12. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Science.gov (United States)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  13. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
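
    The two consensus error rates quoted in this abstract (about 1 error per 2,000 bp for a draft and about 1 per 10,000 bp for a finished genome) can be put on the Phred quality scale, Q = -10 log10(p), and turned into an expected error count. The sketch below does that arithmetic; the 5 Mb genome length is a hypothetical value chosen for illustration and does not come from the abstract.

```python
import math


def phred_quality(error_rate):
    """Phred-scaled quality score: Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate)


def expected_errors(genome_length_bp, error_rate):
    """Expected number of consensus errors at a uniform per-base error rate."""
    return genome_length_bp * error_rate


draft_rate = 1 / 2000       # draft consensus, as quoted in the abstract
finished_rate = 1 / 10000   # finished genome, as quoted in the abstract
genome_length = 5_000_000   # hypothetical 5 Mb microbial genome (assumption)

for label, rate in (("draft", draft_rate), ("finished", finished_rate)):
    q = phred_quality(rate)
    n = expected_errors(genome_length, rate)
    print(f"{label}: Q{q:.1f}, about {n:.0f} expected errors")
```

    Under these assumptions the draft rate corresponds to roughly Q33 and the finished rate to Q40, so finishing would reduce the expected number of consensus errors in such a genome about fivefold.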

  14. Sporadic hereditary motor and sensory neuropathies: Advances in the diagnosis using next generation sequencing technology.

    Science.gov (United States)

    Fallerini, Chiara; Carignani, Giulia; Capoccitti, Giorgio; Federico, Antonio; Rufa, Alessandra; Pinto, Anna Maria; Rizzo, Caterina Lo; Rossi, Alessandro; Mari, Francesca; Mencarelli, Maria Antonietta; Giannini, Fabio; Renieri, Alessandra

    2015-12-15

    Hereditary motor and sensory neuropathies (HMSN) are genetically heterogeneous disorders affecting peripheral motor and sensory functions. Many different pathogenic variants in several genes involved in the demyelinating, the axonal and the intermediate HMSN forms have been identified, for which all inheritance patterns have been described. The mutation screening currently available is based on Sanger sequencing and is time-consuming and relatively expensive due to the high number of genes involved and to the absence of mutational hot spots. To overcome these limitations, we have designed a custom panel for simultaneous sequencing of 28 HMSN-related genes. We have applied this panel to three representative patients with variable HMSN phenotype and uncertain diagnostic classifications. Using our NGS platform we rapidly identified three already described pathogenic heterozygous variants in MFN2, MPZ and DNM2 genes. Here we show that our pre-custom platform allows a fast, specific and low-cost diagnosis in sporadic HMSN cases. This prompt diagnosis is useful for providing a well-timed treatment, establishing a recurrence risk and preventing further investigations poorly tolerated by patients and expensive for the health system. Importantly, our study illustrates the utility and successful application of NGS to mutation screening of a Mendelian disorder with extreme locus heterogeneity.

  15. Biomolecule Sequencer: Nanopore Sequencing Technology for In-Situ Environmental Monitoring and Astrobiology

    Science.gov (United States)

    John, K. K.; Botkin, D. J.; Burton, A. S.; Castro-Wallace, S. L.; Chaput, J. D.; Dworkin, J. P.; Lupisella, M. L.; Mason, C. E.; Rubins, K. H.; Smith, D. J.; Stahl, S.; Switzer, C.

    2016-10-01

    Biomolecule Sequencer will demonstrate, for the first time, that DNA sequencing is feasible as a tool for in-situ environmental monitoring and astrobiology. A space-based sequencer could identify microbes and diseases and help detect DNA-based life.

  16. Maximum length sequence and Bessel diffusers using active technologies

    Science.gov (United States)

    Cox, Trevor J.; Avis, Mark R.; Xiao, Lejun

    2006-02-01

    Active technologies can enable room acoustic diffusers to operate over a wider bandwidth than passive devices, by extending the bass response. Active impedance control can be used to generate surface impedance distributions which cause wavefront dispersion, as opposed to the more normal absorptive or pressure-cancelling target functions. This paper details the development of two new types of active diffusers which are difficult, if not impossible, to make as passive wide-band structures. The first type is a maximum length sequence diffuser where the well depths are designed to be frequency dependent to avoid the critical frequencies present in the passive device, and so achieve performance over a finite-bandwidth. The second is a Bessel diffuser, which exploits concepts developed for transducer arrays to form a hybrid absorber-diffuser. Details of the designs are given, and measurements of scattering and impedance used to show that the active diffusers are operating correctly over a bandwidth of about 100 Hz to 1.1 kHz. Boundary element method simulation is used to show how more application-realistic arrays of these devices would behave.

  17. Next-generation sequencing technology: A technology review and future perspective

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    As one of the most powerful tools in biomedical research, DNA sequencing not only has been improving its productivity at an exponential growth rate but has also been evolving into a new layout of technological territories toward engineering and physical disciplines over the past three decades. In this technical review, we look into technical characteristics of the next-generation sequencers and provide insights into their future development and applications. We envisage that some of the emerging platforms are capable of supporting the USD 1000 genome and USD 100 genome goals if given a few years for technical maturation. We also suggest that scientists from China should play an active role in this campaign that will have a profound impact on both scientific research and societal healthcare systems.

  18. Transcriptome sequencing for SNP discovery across Cucumis melo

    OpenAIRE

    2012-01-01

    Background: Melon (Cucumis melo L.) is a highly diverse species that is cultivated worldwide. Recent advances in massively parallel sequencing have begun to allow the study of nucleotide diversity in this species. The Sanger method combined with medium-throughput 454 technology was used in a previous study to analyze the genetic diversity of germplasm representing 3 botanical varieties, yielding a collection of about 40,000 SNPs distributed in 14,000 unigenes. However, the usefulness of this...

  19. Clinical Application of Picodroplet Digital PCR Technology for Rapid Detection of EGFR T790M in Next-Generation Sequencing Libraries and DNA from Limited Tumor Samples.

    Science.gov (United States)

    Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E

    2016-11-01

    Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA.

  20. Evaluation of GS Junior and MiSeq next-generation sequencing technologies as an alternative to Trugene population sequencing in the clinical HIV laboratory.

    Science.gov (United States)

    Ram, Daniela; Leshkowitz, Dena; Gonzalez, Dimitri; Forer, Relly; Levy, Itzchak; Chowers, Michal; Lorber, Margalit; Hindiyeh, Musa; Mendelson, Ella; Mor, Orna

    2015-02-01

    Population HIV-1 sequencing is currently the method of choice for the identification and follow-up of HIV-1 antiretroviral drug resistance. It has limited sensitivity and results in a consensus sequence showing the most prevalent nucleotide per position. Moreover, concomitant sequencing and interpretation of the results for several samples together is laborious and time consuming. In this study, the practical use of GS Junior and MiSeq bench-top next generation sequencing (NGS) platforms as an alternative to Trugene Sanger-based population sequencing in the clinical HIV laboratory was assessed. DeepChek®-HIV TherapyEdge software was used for processing all the protease and reverse transcriptase sequences and for resistance interpretation. Plasma samples from nine HIV-1 carriers, representing the major HIV-1 subtypes in Israel, were compared. The total number of amino acid substitutions identified in the nine samples by GS Junior (232 substitutions) and MiSeq (243 substitutions) was similar and higher than Trugene (181 substitutions), emphasizing the advantage of deep sequencing over population sequencing. More than 80% of the identified substitutions were identical between the GS Junior and MiSeq platforms, most of them (184 of 199) at similar frequency. Low abundance substitutions accounted for 20.9% of the MiSeq and 21.9% of the GS Junior output, the majority of which were not detected by Trugene. More drug resistance mutations were identified by both the NGS platforms, primarily, but not only, at low abundance. In conclusion, in combination with DeepChek, both GS Junior and MiSeq were found to be more sensitive than Trugene and adequate for HIV-1 resistance analysis in the clinical HIV laboratory.

  1. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    Directory of Open Access Journals (Sweden)

    Giorgio Palù

    2011-11-01

    Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.

  2. Recent Progress Using High-throughput Sequencing Technologies in Plant Molecular Breeding

    Institute of Scientific and Technical Information of China (English)

    Qiang Gao; Guidong Yue; Wenqi Li; Junyi Wang; Jiaohui Xu; Ye Yin

    2012-01-01

    High-throughput sequencing is a revolutionary technological innovation in DNA sequencing. This technology has an ultra-low cost per base of sequencing and an overwhelmingly high data output. High-throughput sequencing has brought novel research methods and solutions to the research fields of genomics and post-genomics. Furthermore, this technology is leading to a new molecular breeding revolution that has landmark significance for scientific research and enables us to launch multi-level, multifaceted, and multi-extent studies in the fields of crop genetics, genomics, and crop breeding. In this paper, we review progress in the application of high-throughput sequencing technologies to plant molecular breeding studies.

  3. Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak

    Directory of Open Access Journals (Sweden)

    Léger Patrick

    2010-11-01

    Background: The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the genus Quercus. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economical importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. Results: We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. BLAST search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoid biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these

  4. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    TENG XiaoKun; XIAO HuaSheng

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  5. Multilocus sequence typing of Staphylococcus aureus with DNA array technology

    NARCIS (Netherlands)

    W.B. van Leeuwen (Willem); C. Jay (Corinne); S.V. Snijders (Susan); N. Durin (Nathalia); B. Lacroix (Bruno); H.A. Verbrugh (Henri); M.C. Enright (Mark); A. Troesch (Alain); A.F. van Belkum (Alex)

    2003-01-01

    A newly developed oligonucleotide array suited for multilocus sequence typing (MLST) of Staphylococcus aureus strains was analyzed with two strain collections in a two-center study. MLST allele identification for the first strain collection fully agreed with conventiona

  6. Applications and case studies of the next-generation sequencing technologies in food, nutrition and agriculture.

    Science.gov (United States)

    Liu, George E

    2009-01-01

    The next-generation sequencing technologies are able to produce millions of short sequence reads in a high-throughput, cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also started to change the landscape of life sciences. Here, I survey their major applications ranging from whole-genome sequencing and resequencing, single nucleotide polymorphism (SNP) and structural variation discovery, to mRNA and noncoding RNA profiling and protein-nucleic acid interaction assay. These case studies in structural, functional and comparative genomics, metagenomics, and epigenomics are providing a more complete picture of the genome structures and functions. In the near future, we will witness broad impacts of these next-generation sequencing technologies for solving the complex biological problems in food, nutrition and agriculture. In this article, recent patents based information is also included.

  7. Multilocus sequence typing of Staphylococcus aureus with DNA array technology

    OpenAIRE

    2003-01-01

    A newly developed oligonucleotide array suited for multilocus sequence typing (MLST) of Staphylococcus aureus strains was analyzed with two strain collections in a two-center study. MLST allele identification for the first strain collection fully agreed with conventional strain typing. Analysis of strains from the second collection revealed that chip-defined MLST was concordant with conventional MLST. Array-mediated MLST data were reproducible, exchangeable, and epidemiologically ...

  8. High-throughput sequencing in veterinary infection biology and diagnostics.

    Science.gov (United States)

    Belák, S; Karlsson, O E; Leijon, M; Granberg, F

    2013-12-01

    Sequencing methods have improved rapidly since the first versions of the Sanger techniques, facilitating the development of very powerful tools for detecting and identifying various pathogens, such as viruses, bacteria and other microbes. The ongoing development of high-throughput sequencing (HTS; also known as next-generation sequencing) technologies has resulted in a dramatic reduction in DNA sequencing costs, making the technology more accessible to the average laboratory. In this White Paper of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine (Uppsala, Sweden), several approaches and examples of HTS are summarised, and their diagnostic applicability is briefly discussed. Selected future aspects of HTS are outlined, including the need for bioinformatic resources, with a focus on improving the diagnosis and control of infectious diseases in veterinary medicine.

  9. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  10. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  11. A fast Boyer-Moore type pattern matching algorithm for highly similar sequences.

    Science.gov (United States)

    Ben Nsira, Nadia; Lecroq, Thierry; Elloumi, Mourad

    2015-01-01

    In the last decade, biology and medicine have undergone a fundamental change: next generation sequencing (NGS) technologies have made it possible to obtain genomic sequences very quickly and at low cost compared to the traditional Sanger method. These NGS technologies have thus made it possible to collect genomic sequences (genes, exomes or even full genomes) of individuals of the same species. These sequences are more than 99% identical. There is thus a strong need for efficient algorithms for indexing and performing fast pattern matching in such specific sets of sequences. In this paper we propose a very efficient algorithm that solves the exact pattern matching problem in a set of highly similar DNA sequences where only the pattern can be pre-processed. This new algorithm extends variants of the Boyer-Moore exact string matching algorithm. Experimental results show that it exhibits the best performance in practice.
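    As a rough illustration of the setting described in this abstract (only the pattern is pre-processed, and it is then searched in several near-identical sequences), here is a minimal sketch of the Horspool simplification of Boyer-Moore. It is not the algorithm proposed by the authors; the function names and sample sequences are assumptions made for the example.

```python
from typing import Dict, List


def build_shift_table(pattern: str) -> Dict[str, int]:
    """Bad-character shifts (Horspool variant): for every character in the
    pattern except its last position, the distance from its rightmost
    occurrence to the end of the pattern."""
    m = len(pattern)
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text: str, pattern: str, shift: Dict[str, int]) -> List[int]:
    """Return every start index at which pattern occurs exactly in text."""
    m, n = len(pattern), len(text)
    hits: List[int] = []
    if m == 0 or m > n:
        return hits
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry of the text character aligned with the
        # last pattern position; unseen characters allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def search_collection(sequences: Dict[str, str], pattern: str) -> Dict[str, List[int]]:
    """Pre-process the pattern once, then scan each sequence in turn."""
    shift = build_shift_table(pattern)
    return {name: horspool_search(seq, pattern, shift) for name, seq in sequences.items()}


if __name__ == "__main__":
    # Hypothetical, highly similar sequences differing by one substitution.
    genomes = {
        "ref": "ACGATTACAGATTACA",
        "alt": "ACGATTACAGCTTACA",
    }
    print(search_collection(genomes, "GATTACA"))
```

    This sketch rescans every sequence independently; the point of algorithms designed for highly similar sequences is to avoid repeating work on the shared portions.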

  12. Transverse Electronic Signature of DNA for Electronic Sequencing

    Science.gov (United States)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  13. Treatment of Wastewater with Modified Sequencing Batch Biofilm Reactor Technology

    Institute of Scientific and Technical Information of China (English)

    胡龙兴; 刘宇陆

    2002-01-01

    This paper describes the removal of COD and nitrogen from wastewater with a modified sequencing batch biofilm reactor (SBBR). The strategy of simultaneous feeding and draining was explored. The results show that introduction of a new batch of wastewater and withdrawal of the purified water can be conducted simultaneously, with a maximum volumetric exchange rate of about 70%. Application of this feeding and draining mode reduces the cycle time, increases the utilization of the reactor volume and simplifies the reactor structure. The treatment of a synthetic wastewater containing COD and nitrogen was investigated. The operation mode F(D)-O (i.e., simultaneous feeding and draining followed by the aerobic condition) was adopted. It was found that COD was degraded very fast in the initial reaction period and then declined slowly, that the ammonia nitrogen concentration decreased and the nitrate nitrogen concentration increased with time, and that the nitrite nitrogen level first increased and then declined. The relationship between the COD or ammonia nitrogen loading and its removal rate was examined, and the removal of COD, ammonia nitrogen and total nitrogen could exceed 95%, 90% and 80%, respectively. The fact that nitrogen could be removed more completely under constant aeration (aerobic condition) in the SBBR operation mode is very interesting and could be explained in several respects.

  14. [Recent progress in gene mapping through high-throughput sequencing technology and forward genetic approaches].

    Science.gov (United States)

    Lu, Cairui; Zou, Changsong; Song, Guoli

    2015-08-01

    Traditional gene mapping using forward genetic approaches is conducted primarily through construction of a genetic linkage map, a process that is tedious and time-consuming and often results in low mapping accuracy and large mapping intervals. With the rapid development of high-throughput sequencing technology and the decreasing cost of sequencing, a variety of simple and quick methods of gene mapping through sequencing have been developed, including direct sequencing of the mutant genome, sequencing of selectively pooled mutant DNA, genetic map construction through sequencing of individuals in a population, and sequencing of the transcriptome and partial genome. These methods can be used to identify mutations at the nucleotide level and have been applied in complex genetic backgrounds. Recent reports have shown that mapping by sequencing can even be done without a reference genome sequence, hybridization, or genetic linkage information, which makes it possible to perform forward genetic studies in many non-model species. In this review, we summarize these new technologies and their application in gene mapping.

  15. Advanced Applications of Next-Generation Sequencing Technologies to Orchid Biology.

    Science.gov (United States)

    Yeh, Chuan-Ming; Liu, Zhong-Jian; Tsai, Wen-Chieh

    2017-09-08

    Next-generation sequencing technologies are revolutionizing biology by permitting transcriptome sequencing, whole-genome sequencing and resequencing, and genome-wide single nucleotide polymorphism profiling. Orchid research has benefited from this breakthrough, and a few orchid genomes are now available; new biological questions can be approached and new breeding strategies can be designed. The first part of this review describes the unique features of orchid biology. The second part provides an overview of the current next-generation sequencing platforms, many of which are already used in plant laboratories. The third part summarizes the state of orchid transcriptome and genome sequencing and illustrates current achievements. The genetic sequences currently obtained will not only provide a broad scope for the study of orchid biology, but also serve as a starting point for uncovering the mystery of orchid evolution.

  16. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    Directory of Open Access Journals (Sweden)

    Rama R Gullapalli

    2012-01-01

    The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. [7] We also review the issue of physician education, which is also an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future.

  17. The complete mitochondrial genome sequence of Xingkai topmouth culter (Culter alburnus).

    Science.gov (United States)

    Liu, Yu; Yang, Jun

    2014-12-01

    The complete sequence of the mitochondrial genome of Culter alburnus was determined to be 16,622 bp in length by Sanger sequencing technology, and to contain 13 protein-coding genes (PCGs), 22 tRNA genes and 2 ribosomal genes. Its total A + T content is 55.99%. Six CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F) and 1 TAS were identified in the control region; the control region also included a 2 bp tandem repeat occurring 8 times.
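    The A + T content reported in this abstract is a simple base-composition statistic. The sketch below shows one way such a percentage could be computed from a sequence string; the function name and the toy sequences are illustrative assumptions, not data from the record.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored, which is one common convention
    and an assumption of this sketch.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(1 for b in bases if b in "AT")
    return 100.0 * at / len(bases)


if __name__ == "__main__":
    # Toy sequence, not the Culter alburnus mitogenome.
    print(f"{at_content('AATTGC'):.2f}%")
```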

  18. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    Science.gov (United States)

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden.

  19. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  20. ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence

    Directory of Open Access Journals (Sweden)

    Cañizares Joaquin

    2011-06-01

    Background: The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model species. However, this abundance of sequences and polymorphisms creates new software needs. To fulfill these needs, we have developed a powerful, yet easy-to-use application. Results: The ngs_backbone software is a parallel pipeline capable of analyzing Sanger, 454, Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequence reads. Its main supported analyses are: read cleaning, transcriptome assembly and annotation, read mapping and single nucleotide polymorphism (SNP) calling and selection. In order to build a truly useful tool, the software development was paired with a laboratory experiment. All public tomato Sanger EST reads plus 14.2 million Illumina reads were employed to test the tool and predict polymorphism in tomato. The cleaned reads were mapped to the SGN tomato transcriptome obtaining a coverage of 4.2 for Sanger and 8.5 for Illumina. 23,360 single nucleotide variations (SNVs) were predicted. A total of 76 SNVs were experimentally validated, and 85% were found to be real. Conclusions: ngs_backbone is a new software package capable of analyzing sequences produced by NGS technologies and predicting SNVs with great accuracy. In our tomato example, we created a highly polymorphic collection of SNVs that will be a useful resource for tomato researchers and breeders. The software developed along with its documentation is freely available under the AGPL license and can be downloaded from http://bioinf.comav.upv.es/ngs_backbone/ or http://github.com/JoseBlanca/franklin.

  1. BAC-pool sequencing and analysis of large segments of A12 and D12 homoeologous chromosomes in upland cotton.

    Directory of Open Access Journals (Sweden)

    Ramesh Buyyarapu

    Although new and emerging next-generation sequencing (NGS) technologies have reduced sequencing costs significantly, much work remains to implement them for de novo sequencing of complex and highly repetitive genomes such as the tetraploid genome of Upland cotton (Gossypium hirsutum L.). Herein we report the results from implementing a novel, hybrid Sanger/454-based BAC-pool sequencing strategy using minimum tiling path (MTP) BACs from Ctg-3301 and Ctg-465, two large genomic segments in A12 and D12 homoeologous chromosomes (Ctg). To enable generation of longer contig sequences in assembly, we implemented a hybrid assembly method to process ~35x data from 454 technology and 2.8-3x data from the Sanger method. Hybrid assemblies offered higher sequence coverage and better sequence assemblies. Homology studies revealed the presence of retrotransposon regions like Copia and Gypsy elements in these contigs and also helped in identifying new genomic SSRs. Unigenes were anchored to the sequences in Ctg-3301 and Ctg-465 to support the physical map. Gene density, gene structure and protein sequence information derived from protein prediction programs were used to obtain the functional annotation of these genes. Comparative analysis of both contigs with the Arabidopsis genome exhibited synteny and microcollinearity with a conserved gene order in both genomes. This study provides insight into the use of an MTP-based BAC-pool sequencing approach for sequencing complex polyploid genomes with limited constraints in generating better sequence assemblies to build reference scaffold sequences. Combining the utilities of MTP-based BAC-pool sequencing with current longer and short read NGS technologies in multiplexed format would provide a new direction to cost-effectively and precisely sequence complex plant genomes.

  2. Next-generation sequencing technologies: breaking the sound barrier of human genetics.

    Science.gov (United States)

    Bahassi, El Mustapha; Stambrook, Peter J

    2014-09-01

    Demand for new technologies that deliver fast, inexpensive and accurate genome information has never been greater. This challenge has catalysed the rapid development of advances in next-generation sequencing (NGS). The generation of large volumes of sequence data and the speed of data acquisition are the primary advantages over previous, more standard methods. In 2013, the Food and Drug Administration granted marketing authorisation for the first high-throughput NG sequencer, Illumina's MiSeqDx, which allowed the development and use of a large number of new genome-based tests. Here, we present a review of template preparation, nucleic acid sequencing and imaging, genome assembly and alignment approaches as well as recent advances in current and near-term commercially available NGS instruments. We also outline the broad range of applications for NGS technologies and provide guidelines for platform selection to best address biological questions of interest. DNA sequencing has revolutionised biological and medical research, and is poised to have a similar impact on the practice of medicine. This tool is but one of an increasing arsenal of developing tools that enhance our capabilities to identify, quantify and functionally characterise the components of biological networks that keep us healthy or make us sick. Despite advances in other 'omic' technologies, DNA sequencing and analysis, in many respects, have played the leading role to date. The new technologies provide a bridge between genotype and phenotype, both in man and model organisms, and have revolutionised how risk of developing a complex human disease may be assessed. The generation of large DNA sequence data sets is producing a wealth of medically relevant information on a large number of individuals and populations that will potentially form the basis of truly individualised medical care in the future.

  3. Stepwise threshold clustering: a new method for genotyping MHC loci using next-generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    William E Stutz

    Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: (1) a "gray zone" where low frequency alleles and high frequency artifacts can be difficult to disentangle and (2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci, Stepwise Threshold Clustering (STC), that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.
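    The general idea of clustering reads at increasing similarity thresholds and applying a frequency criterion to clusters rather than to individual sequences can be sketched as follows. This is a simplified greedy stand-in, not the quasi-Dirichlet algorithm used by STC; the thresholds, the minimum read fraction, and the assumption of equal-length pre-aligned reads are all illustrative choices.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; assumes equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must have equal length in this sketch")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_at(counts: Dict[str, int], threshold: float) -> Dict[str, int]:
    """Greedy clustering: sequences are visited from most to least abundant;
    each joins the first existing seed it matches at >= threshold, otherwise
    it becomes a new seed. Returns read totals per seed."""
    clusters: Dict[str, int] = {}
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for seed in clusters:
            if identity(seq, seed) >= threshold:
                clusters[seed] += n
                break
        else:
            clusters[seq] = n
    return clusters


def stepwise_calls(
    reads: Iterable[str],
    thresholds: Sequence[float] = (0.8, 0.9, 1.0),
    min_fraction: float = 0.1,
) -> Dict[float, List[str]]:
    """For each threshold, report seeds whose cluster holds at least
    min_fraction of all reads (the frequency criterion applied to clusters)."""
    counts = Counter(reads)
    total = sum(counts.values())
    calls: Dict[float, List[str]] = {}
    for t in thresholds:
        clusters = cluster_at(counts, t)
        calls[t] = sorted(s for s, n in clusters.items() if n / total >= min_fraction)
    return calls


if __name__ == "__main__":
    # Hypothetical amplicon reads: two similar alleles plus rare variants.
    reads = (["AAAAAAAAAA"] * 50 + ["AAAAAAAAAT"] * 40
             + ["AAAAAAAACT"] * 2 + ["CCCCCCCCCC"] * 8)
    for threshold, seeds in stepwise_calls(reads).items():
        print(threshold, seeds)
```

    In this toy data the two abundant, near-identical sequences fall into one cluster at the looser thresholds and separate only at the strictest one, while the rare sequences never pass the frequency criterion.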

  4. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    Directory of Open Access Journals (Sweden)

    Scoté-Blachon Céline

    2008-09-01

    Background: "Open" transcriptome analysis methods make it possible to study gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method.

  5. Analysis of plant microbe interactions in the era of next generation sequencing technologies

    Directory of Open Access Journals (Sweden)

    Claudia Knief

    2014-05-01

    Next generation sequencing (NGS) technologies have impressively accelerated research in biological science in recent years by enabling the production of large volumes of sequence data at a drastically lower price per base than traditional sequencing methods. The recent and ongoing developments in the field make it possible to address research questions in plant-microbe biology that were not conceivable just a few years ago. The present review provides an overview of NGS technologies and their usefulness for the analysis of microorganisms that live in association with plants. Possible limitations of the different sequencing systems, in particular sources of error and bias, are critically discussed, and methods that help to overcome these shortcomings are described. A focus is on the application of NGS methods in metagenomic studies, including the analysis of microbial communities by amplicon sequencing, which can be considered a targeted metagenomic approach. Different applications of NGS technologies are exemplified by selected research articles that address the biology of the plant-associated microbiota to demonstrate the worth of the new methods.

  6. Students' Guided Reinvention of Definition of Limit of a Sequence with Interactive Technology

    Science.gov (United States)

    Flores, Alfinio; Park, Jungeun

    2016-01-01

    In a course emphasizing interactive technology, 19 students, including 18 mathematics education majors, mostly in their first year, reinvented the definition of limit of a sequence while working in small cooperative groups. The class spent four sessions of 75 minutes each on a cyclical process of guided reinvention of the definition of limit of a…

  7. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    NARCIS (Netherlands)

    Botton, A.; Galla, G.; Conesa, A.; Bachem, C.W.B.; Ramina, A.; Barcaccia, G.

    2008-01-01

    Background: After 10-year-use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and

  9. Next-generation sequencing (NGS) in the microbiological world: How to make the most of your money.

    Science.gov (United States)

    Vincent, Antony T; Derome, Nicolas; Boyle, Brian; Culley, Alexander I; Charette, Steve J

    2017-07-01

    The Sanger sequencing method produces relatively long DNA sequences of unmatched quality and has long been considered the gold standard for sequencing DNA. Many improvements of the Sanger method that culminated with fluorescent dyes coupled with automated capillary electrophoresis enabled the sequencing of the first genomes. Nevertheless, using this technology to sequence whole genomes was costly, laborious and time consuming even for genomes that are relatively small in size. A major technological advance was the introduction of next-generation sequencing (NGS) pioneered by 454 Life Sciences in the early part of the 21st century. NGS allowed scientists to sequence thousands to millions of DNA molecules in a single machine run. Since then, new NGS technologies have emerged and existing NGS platforms have been improved, enabling the production of genome sequences at an unprecedented rate as well as broadening the spectrum of NGS applications. The current affordability of generating genomic information, especially with microbial samples, has resulted in a false sense of simplicity that belies the fact that many researchers still consider these technologies a black box. In this review, our objective is to identify and discuss four steps that we consider crucial to the success of any NGS-related project. These steps are: (1) the definition of the research objectives beyond sequencing and appropriate experimental planning, (2) library preparation, (3) sequencing and (4) data analysis. The goal of this review is to give an overview of the process, from sample to analysis, and discuss how to optimize your resources to achieve the most from your NGS-based research. Regardless of the evolution and improvement of the sequencing technologies, these four steps will remain relevant.

  10. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates

    Directory of Open Access Journals (Sweden)

    Cepko Connie L

    2007-06-01

    Background: High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). Results: The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Conclusion: Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.
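    Cross-platform agreement of the kind discussed in this abstract is often summarised with a correlation coefficient over genes measured on both platforms. The sketch below computes a Pearson correlation on log-transformed values; the choice of Pearson, the log2 transform with a pseudocount, and the sample numbers are assumptions for illustration and are not taken from the study.

```python
import math
from typing import List, Sequence


def log2_expression(values: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """Log-transform expression values; the pseudocount avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Hypothetical measurements for the same five genes on two platforms.
    microarray_intensity = [120.0, 850.0, 40.0, 2300.0, 15.0]
    mpss_tag_counts = [35.0, 410.0, 0.0, 900.0, 12.0]
    r = pearson(log2_expression(microarray_intensity),
                log2_expression(mpss_tag_counts))
    print(f"Pearson r on log2 scale: {r:.3f}")
```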

  11. Computational methods for the analysis of tag sequences in metagenomics studies.

    Science.gov (United States)

    Chang, Qin; Luan, Yihui; Chen, Ting; Fuhrman, Jed A; Sun, Fengzhu

    2012-06-01

    Metagenomics commonly refers to the study of genetic materials directly derived from environments without culturing. Several ongoing large-scale metagenomics projects related to human and marine life, as well as pedology studies, have generated enormous amounts of data, posing a key challenge for efficient analysis, as we try to 1) understand microbial organism assemblage under different conditions, 2) compare different communities, and 3) understand how microbial organisms associate with each other and the environment. To address such questions, investigators are using new sequencing technologies, including Sanger, Illumina Solexa, and Roche 454, to sequence either particular genes, called tag sequences, mostly 16S or 18S ribosomal RNA sequences or other conserved genes, or whole metagenome shotgun sequences of all the genetic materials in a given community. In this paper, we review computational methods used for the analysis of tag sequences.

  12. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Tanase Koji

    2012-07-01

    Background: Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results: We constructed a normalized cDNA library and a 3’-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions: We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  13. The new sequencer on the block: comparison of Life Technology's Proton sequencer to an Illumina HiSeq for whole-exome sequencing.

    Science.gov (United States)

    Boland, Joseph F; Chung, Charles C; Roberson, David; Mitchell, Jason; Zhang, Xijun; Im, Kate M; He, Ji; Chanock, Stephen J; Yeager, Meredith; Dean, Michael

    2013-10-01

    We assessed the performance of the new Life Technologies Proton sequencer by comparing whole-exome sequence data in a Centre d'Etude du Polymorphisme Humain trio (family 1463) to the Illumina HiSeq instrument. To simulate a typical user's results, we utilized the standard capture, alignment and variant calling methods specific to each platform. We restricted data analysis to include the capture region common to both methods. The Proton produced high quality data at a comparable average depth and read length, and the Ion Reporter variant caller identified 96 % of single nucleotide polymorphisms (SNPs) detected by the HiSeq and GATK pipeline. However, only 40 % of small insertion and deletion variants (indels) were identified by both methods. Usage of the trio structure and segregation of platform-specific alleles supported this result. Further comparison of the trio data with Complete Genomics sequence data and Illumina SNP microarray genotypes documented high concordance and accurate SNP genotyping of both Proton and Illumina platforms. However, our study underscored the problem of accurate detection of indels for both the Proton and HiSeq platforms.

  14. Research Progress on the Sequencing Technologies

    Institute of Scientific and Technical Information of China (English)

    郝甜甜; 李强飞; 李国治; 陈禹翰; 邓卫东

    2014-01-01

    Genome sequencing is among the most commonly used techniques in molecular biology and related disciplines. Since first-generation sequencing technology appeared in the mid-1970s, genome sequencing has made great progress and changed research in many fields of the life sciences. A sharp decline in sequencing cost and an exponential increase in sequencing throughput have greatly improved sequencing speed. DNA and RNA are the two basic genetic materials of living organisms, and changes in their composition and sequence create the diversity of life. Obtaining the genetic information of organisms quickly and accurately is of great significance for life science research. Sequencing technology can faithfully reflect the genome and the transcription of genetic information and comprehensively reveal the complexity and diversity of genomes, so it plays an important role in life science research. This review covers the achievements, principles, advantages and disadvantages, and domestic research status of each generation of sequencing technology, and looks ahead to the future directions of sequencing technology.

  15. On Performance Analysis, Evaluation, and Enhancement of Reading Brain Function Using Sanger's Rule

    Directory of Open Access Journals (Sweden)

    Hassan M. H. Mustafa

    2016-08-01

    Full Text Available This research adopts an interdisciplinary conceptual approach that combines artificial neural networks (ANNs) with the learning and cognitive sciences. Specifically, it models associative memorization in order to analyze the development of reading brain performance. This performance is simulated realistically using a self-organized ANN modeling paradigm, namely the Generalized Hebbian Algorithm (GHA), also known in the literature as Sanger's rule, a linear feedforward neural network model for unsupervised learning with applications primarily in principal components analysis. The model is inspired by the functioning of highly specialized biological neurons in the reading brain, based on the organization of the brain's structures and substructures, in accordance with the prevailing concept that highly specialized neurons have individual intrinsic properties. The presented models correspond closely to the performance of the set of neurons involved in developing the reading brain. More specifically, the introduced model is concerned with the important role these neurons play in producing the outcomes of the cognitive reading function. The cognitive goal of the reading brain is to translate a seen word (orthographic word-form) into a spoken word (phonological word-form). In this context, ANN simulation results illustrate how ensembles of highly specialized neurons could be dynamically involved in performing the associative memorization function that underlies the developing reading brain.
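
    The update behind Sanger's rule can be sketched in a few lines. The following is a minimal illustration on toy two-dimensional data, not the model used in the record above; the function names, learning rate, initial weights and synthetic data are all assumptions made for the example. Each output unit learns from the input minus the part already reconstructed by itself and the units before it, so the first row of weights tends toward the leading principal direction.

```python
import math
import random


def gha_step(W, x, lr):
    """Apply one Sanger's-rule update to W (one row of weights per output unit)."""
    y = [sum(w * xj for w, xj in zip(row, x)) for row in W]
    new_W = []
    for i, row in enumerate(W):
        new_row = []
        for j, w_ij in enumerate(row):
            # Reconstruction of x[j] from output units 0..i (lower-triangular term).
            back = sum(y[k] * W[k][j] for k in range(i + 1))
            new_row.append(w_ij + lr * y[i] * (x[j] - back))
        new_W.append(new_row)
    return new_W


def train_gha(samples, W, lr=0.005):
    """Run one pass of Sanger's rule over the samples and return the weights."""
    for x in samples:
        W = gha_step(W, x, lr)
    return W


def make_samples(n, seed=0):
    """Toy zero-mean 2-D data: large variance along (1, 1), small along (1, -1)."""
    rng = random.Random(seed)
    s = 1 / math.sqrt(2)
    samples = []
    for _ in range(n):
        a = rng.gauss(0, 2.0)  # strong component (hypothetical scale)
        b = rng.gauss(0, 0.5)  # weak component (hypothetical scale)
        samples.append([s * (a + b), s * (a - b)])
    return samples


W = train_gha(make_samples(5000), [[0.1, 0.0], [0.0, 0.1]])
print(W[0])  # first row should point roughly along +/-(1, 1)/sqrt(2)
```

    The learning rate is kept small here so that single large samples cannot destabilize the weights; a different data scale would likely need a different value.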

  16. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically verified and exceptionally well-preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  17. Coverage recommendation for genotyping analysis of highly heterologous species using next-generation sequencing technology

    Science.gov (United States)

    Song, Kai; Li, Li; Zhang, Guofan

    2016-01-01

    Next-generation sequencing (NGS) technology is being applied to an increasing number of non-model species and has been used as the primary approach for accurate genotyping in genetic and evolutionary studies. However, inferring genotypes from sequencing data is challenging, particularly for organisms with a high degree of heterozygosity. This is because genotype calls from sequencing data are often inaccurate due to low sequencing coverage, and if this is not accounted for, genotype uncertainty can lead to serious bias in downstream analyses, such as quantitative trait locus mapping and genome-wide association studies. Here, we used high-coverage reference data sets from Crassostrea gigas to simulate sequencing data with different coverage, and we evaluate the influence of genotype calling rate and accuracy as a function of coverage. Having initially identified the appropriate parameter settings for filtering to ensure genotype accuracy, we used two different single-nucleotide polymorphism (SNP) calling pipelines, single-sample and multi-sample. We found that a coverage of 15× was suitable for obtaining sufficient numbers of SNPs with high accuracy. Our work provides guidelines for the selection of sequence coverage when using NGS to investigate species with a high degree of heterozygosity and rapid decay of linkage disequilibrium. PMID:27760996
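
    Why low coverage makes heterozygous genotypes hard to call can be illustrated with a simple binomial calculation. This is a deliberately simplified sketch, not the simulation used in the record above: it assumes a fixed read depth, equal sampling of both alleles, and no sequencing errors, and the `min_reads` threshold is a hypothetical calling criterion.

```python
from math import comb


def p_both_alleles(coverage, min_reads=2):
    """Probability that each allele of a heterozygote is seen >= min_reads times.

    Assumes every read samples either allele with probability 0.5.
    """
    if coverage < 2 * min_reads:
        return 0.0
    # Count the ways the first allele is seen k times while the second
    # allele still receives at least min_reads of the remaining reads.
    favourable = sum(
        comb(coverage, k) for k in range(min_reads, coverage - min_reads + 1)
    )
    return favourable / 2 ** coverage


for depth in (4, 8, 15, 30):
    print(depth, round(p_both_alleles(depth), 4))
```

    Under these assumptions the probability rises quickly with depth, which is consistent in spirit with recommending a moderate coverage rather than a very high one; real data add depth variation and errors, so actual requirements may differ.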

  18. Small RNA transcriptome investigation based on next-generation sequencing technology

    Institute of Scientific and Technical Information of China (English)

    Linglin Zhou; Xueying Li; Qi Liu; Fangqing Zhao; Jinyu Wu

    2011-01-01

    Over the past decade, there has been a growing realization that studying the small RNA transcriptome is essential for understanding the complexity of transcriptional regulation. With an increased throughput and a reduced cost, next-generation sequencing technology has provided an unprecedented opportunity to measure the extent and complexity of the small RNA transcriptome. Meanwhile, the large amount of obtained data and varied technology platforms have also posed multiple challenges for effective data analysis and mining. To provide some insight into small RNA transcriptome investigation, this review describes the major small RNA classes, experimental methods to identify small RNAs, and available bioinformatics tools and databases.

  19. The Application of Next Generation Sequencing Technology on Noninvasive Prenatal Test

    DEFF Research Database (Denmark)

    Jiang, Hui

    of effective treatment. The rapid development of next generation sequencing technology boosts the discovery of new causative genes for these rare diseases, as well as genetic diagnosis in clinical practice. Carrier screening, prenatal diagnosis and newborn screening are widely used in the world to prevent...... an invasive process, which might lead to maternal anxiety, or even miscarriage. Therefore, developing an effective approach to perform noninvasive prenatal test (NIPT) for rare diseases is the key challenge to prevent birth defects in the future. The discovery of cell-free fetal DNA, coupling with next......, and maternal plasma. In order to obtain accurate results, we combined the haplotype information from the parents with maternal plasma deep sequencing data to recover the fetal genotype. Our study demonstrated that the sequencing-based new approach could be used to detect rare diseases, including chromosomal...

  20. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.
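
    The intuition that reads must span a repeat to resolve it can be sketched by counting sequences of read length that occur more than once in a genome. This is only an illustrative stand-in and not the model developed in the record above; it looks at one strand, uses exact matches, and the function name is hypothetical.

```python
from collections import Counter


def repeated_kmers(genome, read_length):
    """Return the substrings of length read_length that occur more than once.

    A read of this length drawn from such a substring cannot be placed
    uniquely, so each one hints at a potential assembly gap.
    """
    counts = Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )
    return {kmer for kmer, n in counts.items() if n > 1}


toy_genome = "ACGTACGTTT"
for length in (4, 5):
    print(length, sorted(repeated_kmers(toy_genome, length)))
```

    In the toy sequence the 4-base repeat disappears once the read length reaches 5, mirroring the idea that longer reads leave fewer unresolvable repeats.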

  1. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.

  2. Recent Advances in Autism Spectrum Disorders: Applications of Whole Exome Sequencing Technology.

    Science.gov (United States)

    Sener, Elif Funda; Canatan, Halit; Ozkul, Yusuf

    2016-05-01

    Autism spectrum disorder (ASD) is characterized by three core symptoms appearing in early childhood: impaired reciprocal social interaction, impaired communication, and a pattern of repetitive behavior and/or restricted interests. The prevalence is higher in male children than in female children. As a complex neurodevelopmental disorder, autism is extremely heterogeneous in phenotype and severity, with differences from one patient to another. Genetics has a key role in the etiology of autism. Environmental factors also interact with the genetic profile and cause abnormal changes in neuronal development, brain growth, and functional connectivity. The exome represents less than 1% of the human genome but contains 85% of known disease-causing variants. Whole-exome sequencing (WES) is an application of next generation sequencing technology to determine the variations of all coding regions, or exons, of known genes. For this reason, WES has been used extensively in clinical studies in recent years and has achieved great success in identifying Mendelian disease genes. This review evaluates the potential of current findings in ASD for application in next generation sequencing technology, particularly WES. WES and whole-genome sequencing (WGS) approaches may lead to the discovery of underlying genetic factors for ASD and may thereby identify novel therapeutic targets for this disorder.

  3. Improved Efficiency and Reliability of NGS Amplicon Sequencing Data Analysis for Genetic Diagnostic Procedures Using AGSA Software.

    Science.gov (United States)

    Poulet, Axel; Privat, Maud; Ponelle, Flora; Viala, Sandrine; Decousus, Stephanie; Perin, Axel; Lafarge, Laurence; Ollier, Marie; El Saghir, Nagi S; Uhrhammer, Nancy; Bignon, Yves-Jean; Bidet, Yannick

    Screening for BRCA mutations in women with familial risk of breast or ovarian cancer is an ideal situation for high-throughput sequencing, providing large amounts of low cost data. However, 454 (Roche) and Ion Torrent (Thermo Fisher) technologies produce homopolymer-associated indel errors, complicating their use in routine diagnostics. We developed software, named AGSA, which helps to detect false positive mutations in homopolymeric sequences. Seventy-two familial breast cancer cases were analysed in parallel by amplicon 454 pyrosequencing and Sanger dideoxy sequencing for genetic variations of the BRCA genes. All 565 variants detected by dideoxy sequencing were also detected by pyrosequencing. Furthermore, pyrosequencing detected 42 variants that were missed by the Sanger technique. Six amplicons contained homopolymer tracts in the coding sequence that were systematically misread by the software supplied by Roche. Read data plotted as histograms by AGSA software aided the analysis considerably and allowed validation of the majority of homopolymers. As an optimisation, an additional 250 patients were analysed using microfluidic amplification of regions of interest (Access Array Fluidigm) of the BRCA genes, followed by 454 sequencing and AGSA analysis. AGSA complements a complete line of high-throughput diagnostic sequence analysis, reducing time and costs while increasing reliability, notably for homopolymer tracts.
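
    Because the indel errors described above cluster in homopolymer tracts, a first step in reviewing such calls is simply locating those tracts in the reference amplicon. The sketch below does only that; it is not how AGSA works, and the function name and the default minimum length of 4 are assumptions chosen for illustration.

```python
import re


def homopolymer_tracts(seq, min_len=4):
    """Return (start, base, length) for every run of one base >= min_len long."""
    # ([ACGT]) captures a base; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


print(homopolymer_tracts("ACGTTTTTAGGGGC"))
```

    Indel calls that fall inside or next to a reported tract would then be candidates for closer inspection, for example by looking at the distribution of observed tract lengths across reads.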

  4. Improved Efficiency and Reliability of NGS Amplicon Sequencing Data Analysis for Genetic Diagnostic Procedures Using AGSA Software

    Directory of Open Access Journals (Sweden)

    Axel Poulet

    2016-01-01

    Full Text Available Screening for BRCA mutations in women with familial risk of breast or ovarian cancer is an ideal situation for high-throughput sequencing, providing large amounts of low cost data. However, 454 (Roche) and Ion Torrent (Thermo Fisher) technologies produce homopolymer-associated indel errors, complicating their use in routine diagnostics. We developed software, named AGSA, which helps to detect false positive mutations in homopolymeric sequences. Seventy-two familial breast cancer cases were analysed in parallel by amplicon 454 pyrosequencing and Sanger dideoxy sequencing for genetic variations of the BRCA genes. All 565 variants detected by dideoxy sequencing were also detected by pyrosequencing. Furthermore, pyrosequencing detected 42 variants that were missed by the Sanger technique. Six amplicons contained homopolymer tracts in the coding sequence that were systematically misread by the software supplied by Roche. Read data plotted as histograms by AGSA software aided the analysis considerably and allowed validation of the majority of homopolymers. As an optimisation, an additional 250 patients were analysed using microfluidic amplification of regions of interest (Access Array Fluidigm) of the BRCA genes, followed by 454 sequencing and AGSA analysis. AGSA complements a complete line of high-throughput diagnostic sequence analysis, reducing time and costs while increasing reliability, notably for homopolymer tracts.

  5. Next-generation sequencing as a powerful motor for advances in the biological and environmental sciences.

    Science.gov (United States)

    Faure, Denis; Joly, Dominique

    2015-04-01

    Next-generation sequencing (NGS) provides unprecedented insight into (meta)genomes, (meta)transcriptomes (cDNA) and (meta)barcodes of individuals, populations and communities of Archaea, Bacteria and Eukarya, as well as viruses. This special issue combines reviews and original papers reporting technical and scientific advances in genomics and transcriptomics of non-model species, as well as quantification and functional analyses of biodiversity using NGS technologies of the second and third generations. In addition, certain papers also exemplify the transition from Sanger to NGS barcodes in molecular taxonomy.

  6. Research of FPGA Timing Sequence Analysis Technology Based on Timing Sequence Path

    Institute of Scientific and Technical Information of China (English)

    周珊; 王金波; 王晓丹

    2016-01-01

    Given the importance of timing analysis in testing high-reliability FPGAs for spaceflight, this paper draws on years of FPGA design and test experience to examine timing analysis in depth and distill a practical timing analysis method. It clarifies the objects of timing analysis and the main methods of the technique, and gives rules for calculating interface signal timing as well as criteria for analyzing timing test results. The method has been applied successfully to the testing of several high-reliability aerospace software products, where it uncovered many major functional failures caused by timing problems; the common timing problems among them are classified and summarized.

  7. Performance evaluation of the next-generation sequencing approach for molecular diagnosis of hereditary hearing loss.

    Science.gov (United States)

    Sivakumaran, Theru A; Husami, Ammar; Kissell, Diane; Zhang, Wenying; Keddache, Mehdi; Black, Angela P; Tinkle, Brad T; Greinwald, John H; Zhang, Kejian

    2013-06-01

    To evaluate the performance of a next-generation sequencing (NGS)-based targeted resequencing genetic test, OtoSeq, to identify the sequence variants in the genes causing sensorineural hearing loss (SNHL). Retrospective study. Tertiary children's hospital. A total of 8 individuals presenting with prelingual hearing loss were used in this study. The coding and flanking intronic regions of 24 well-studied SNHL genes were enriched using microdroplet polymerase chain reaction and sequenced on an Illumina HiSeq 2000 sequencer. The filtered high-quality sequence reads were mapped to reference sequence, and variants were detected using NextGENe software. A total of 1148 sequence variants were detected in 8 samples in 24 genes. Using in-house developed NGS data analysis criteria, we classified 810 (~71%) of these variants as potential true variants that include previously detected pathogenic mutations in 5 patients. To validate our strategy, we Sanger sequenced the target regions of 5 of the 24 genes, accounting for about 29.2% of all target sequence. Our results showed >99.99% concordance between NGS and Sanger sequencing in these 5 genes, resulting in an analytical sensitivity and specificity of 100% and 99.997%, respectively. We were able to successfully detect single base substitutions, small deletions, and insertions of up to 22 nucleotides. This study demonstrated that our NGS-based mutation screening strategy is highly sensitive and specific in detecting sequence variants in the SNHL genes. Therefore, we propose that this NGS-based targeted sequencing method would be an alternative to current technologies for identifying the multiple genetic causes of SNHL.
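
    Analytical sensitivity and specificity of the kind reported above are computed from a comparison of NGS calls against Sanger results at each position. The sketch below shows the arithmetic only; the counts passed in are hypothetical placeholders and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: variants found by both methods      fn: Sanger variants missed by NGS
    tn: positions called reference by both  fp: NGS variants absent in Sanger
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the calculation.
sens, spec = sensitivity_specificity(tp=50, fn=0, tn=99_997, fp=3)
print(f"sensitivity={sens:.3%} specificity={spec:.3%}")
```

    Since most sequenced positions match the reference, specificity stays very close to 100% even with a handful of false positives, which is why it is usually quoted to several decimal places.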

  8. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    Directory of Open Access Journals (Sweden)

    Ramina Angelo

    2008-07-01

    Full Text Available Abstract Background After 10 years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting of three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved in the NCBI databases was carried out by using GO terminology. Results Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. Conclusion Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization

  9. A FRET Biosensor for ROCK Based on a Consensus Substrate Sequence Identified by KISS Technology.

    Science.gov (United States)

    Li, Chunjie; Imanishi, Ayako; Komatsu, Naoki; Terai, Kenta; Amano, Mutsuki; Kaibuchi, Kozo; Matsuda, Michiyuki

    2017-01-11

    Genetically-encoded biosensors based on Förster/fluorescence resonance energy transfer (FRET) are versatile tools for studying the spatio-temporal regulation of signaling molecules within not only the cells but also tissues. Perhaps the hardest task in the development of a FRET biosensor for protein kinases is to identify the kinase-specific substrate peptide to be used in the FRET biosensor. To solve this problem, we took advantage of kinase-interacting substrate screening (KISS) technology, which deduces a consensus substrate sequence for the protein kinase of interest. Here, we show that a consensus substrate sequence for ROCK identified by KISS yielded a FRET biosensor for ROCK, named Eevee-ROCK, with high sensitivity and specificity. By treating HeLa cells with inhibitors or siRNAs against ROCK, we show that a substantial part of the basal FRET signal of Eevee-ROCK was derived from the activities of ROCK1 and ROCK2. Eevee-ROCK readily detected ROCK activation by epidermal growth factor, lysophosphatidic acid, and serum. When cells stably-expressing Eevee-ROCK were time-lapse imaged for three days, ROCK activity was found to increase after the completion of cytokinesis, concomitant with the spreading of cells. Eevee-ROCK also revealed a gradual increase in ROCK activity during apoptosis. Thus, Eevee-ROCK, which was developed from a substrate sequence predicted by the KISS technology, will pave the way to a better understanding of the function of ROCK in a physiological context.

  10. Next-generation sequencing for high-throughput molecular ecology: a step-by-step protocol for targeted multilocus genotyping by pyrosequencing.

    Science.gov (United States)

    Puritz, Jonathan B; Toonen, Robert J

    2013-01-01

    Next-generation sequencing technology can now provide population biologists and phylogeographers with information at the genomic scale; however, many pertinent questions in population genetics and phylogeography can be answered effectively with modest levels of genomic information. For the past two decades, most population-level studies have lacked nuclear DNA (nDNA) sequence data due to the complications and cost of amplifying and sequencing diploid loci. However, pyrosequencing of emulsion PCR reactions, amplifying from only one molecule at a time, can generate megabases of clonally amplified loci at high coverage, thereby greatly simplifying allelic sequence determination. Here, we present a step-by-step methodology for utilizing the 454 GS FLX Titanium pyrosequencing platform to simultaneously sequence 16 populations (at 20 individuals per population) at 10 different nDNA loci (3,200 loci in total) in one plate of sequencing for less than the cost of traditional Sanger sequencing.

  11. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Directory of Open Access Journals (Sweden)

    Kathy N Lam

    Full Text Available High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  12. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Science.gov (United States)

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  13. Influence of sequence mismatches on the specificity of recombinase polymerase amplification technology.

    Science.gov (United States)

    Daher, Rana K; Stewart, Gale; Boissinot, Maurice; Boudreau, Dominique K; Bergeron, Michel G

    2015-04-01

    Recombinase polymerase amplification (RPA) technology relies on three major proteins, recombinase proteins, single-strand binding proteins, and polymerases, to specifically amplify nucleic acid sequences in an isothermal format. The performance of RPA with respect to sequence mismatches of closely-related non-target molecules is not well documented and the influence of the number and distribution of mismatches in DNA sequences on the RPA amplification reaction is not well understood. We investigated the specificity of RPA by testing closely-related species bearing naturally occurring mismatches for the tuf gene sequence of Pseudomonas aeruginosa and/or Mycobacterium tuberculosis and for the cfb gene sequence of Streptococcus agalactiae. In addition, the impact of the number and distribution of mismatches on RPA efficiency was assessed by synthetically generating 14 types of mismatched forward primers for detecting five bacterial species of high diagnostic relevance such as Clostridium difficile, Staphylococcus aureus, S. agalactiae, P. aeruginosa, and M. tuberculosis as well as Bacillus atrophaeus subsp. globigii, whose spores we use as an internal control in diagnostic assays. A total of 87 mismatched primers were tested in this study. We observed that target-specific RPA primers with mismatches (n > 1) at their 3' extremity hampered the RPA reaction. In addition, 3 mismatches covering both extremities and the center of the primer sequence negatively affected RPA yield. We demonstrated that the specificity of RPA was multifactorial. Therefore its application in clinical settings must be selected and validated a priori. We recommend that the selection of a target gene consider the presence of closely-related non-target genes. It is advisable to choose target regions with a high number of mismatches (≥36%, relative to the size of amplicon) with respect to closely-related species; the best-case scenario would be choosing a unique target gene.
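
    To make the idea of "number and distribution of mismatches" concrete, the following minimal sketch tallies mismatches between a primer and an equally long target site and counts how many fall near the 3' end. The function name, the 3-base window and the example sequences are assumptions for illustration, not values from the study. Note that the percentage here is relative to primer length, whereas the abstract's ≥36% figure is relative to amplicon size.

```python
def mismatch_profile(primer: str, target: str, three_prime_window: int = 3) -> dict:
    """Compare a primer with an equally long target site, base by base.

    Both strings are read 5'->3', so the last `three_prime_window` positions
    are treated as the 3' extremity (window size is an arbitrary assumption).
    """
    if len(primer) != len(target):
        raise ValueError("primer and target site must have the same length")
    positions = [i for i, (p, t) in enumerate(zip(primer.upper(), target.upper()))
                 if p != t]
    window_start = len(primer) - three_prime_window
    return {
        "mismatches": len(positions),
        "positions": positions,  # 0-based, counted from the 5' end
        "percent": 100.0 * len(positions) / len(primer),
        "three_prime_mismatches": sum(1 for i in positions if i >= window_start),
    }


# Hypothetical 10-base primer with two mismatches at its 3' end.
profile = mismatch_profile("ACGTACGTAC", "ACGTACGTTT")
```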

  14. Genetic diagnosis of Duchenne/Becker muscular dystrophy using next-generation sequencing: validation analysis of DMD mutations

    Science.gov (United States)

    Okubo, Mariko; Minami, Narihiro; Goto, Kanako; Goto, Yuichi; Noguchi, Satoru; Mitsuhashi, Satomi; Nishino, Ichizo

    2016-01-01

    Duchenne and Becker muscular dystrophies (DMD/BMD) are the most common inherited neuromuscular diseases. The genetic diagnosis is not easily made because of the large size of the dystrophin gene, the complex mutational spectrum and the high number of tests patients undergo for diagnosis. Multiplex ligation-dependent probe amplification (MLPA) has been used as the initial diagnostic test of choice. Although MLPA can diagnose 70% of DMD/BMD patients having deletions/duplications, the remaining 30% of patients with small mutations require further analysis, such as Sanger sequencing. We applied a high-throughput method using Ion Torrent next-generation sequencing technology and diagnosed 92% of patients with DMD/BMD in a single analysis. We designed a multiplex primer pool for DMD and sequenced 67 cases having different mutations: 37 with deletions/duplications and 30 with small mutations or short insertions/deletions in DMD, using an Ion PGM sequencer. The results were compared with those from MLPA or Sanger sequencing. All deletions were detected. In contrast, 50% of duplications were correctly identified compared with the MLPA method. Small insertions in consecutive bases could not be detected. We estimated that Ion Torrent sequencing could diagnose ~92% of DMD/BMD patients according to the mutational spectrum of our cohort. Our results clearly indicate that this method is suitable for routine clinical practice, providing novel insights into comprehensive genetic information for future molecular therapy. PMID:26911353

  15. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  16. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.
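
    The core idea of matching many queries against one indexed reference can be sketched on the CPU as follows. This is not MUMmerGPU's code: it swaps the suffix tree for a naively built suffix array with binary search, runs serially, and uses hypothetical function names and toy sequences. Each query lookup is independent of the others, which is the property that makes the problem amenable to parallel execution.

```python
def build_suffix_array(reference: str) -> list[int]:
    """Start positions of all suffixes, in lexicographic order of the suffixes.

    Naive construction (sorts full suffixes); fine for a toy example only.
    """
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def find_exact(reference: str, suffix_array: list[int], query: str) -> list[int]:
    """Return all 0-based positions where `query` occurs exactly in `reference`."""
    m = len(query)
    lo, hi = 0, len(suffix_array)
    # Binary search for the first suffix whose m-character prefix is >= query.
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if reference[start:start + m] < query:
            lo = mid + 1
        else:
            hi = mid
    # Suffixes that begin with the query are contiguous from that point on.
    hits = []
    while lo < len(suffix_array):
        start = suffix_array[lo]
        if reference[start:start + m] != query:
            break
        hits.append(start)
        lo += 1
    return sorted(hits)


reference = "GATTACAGATTACA"
index = build_suffix_array(reference)
# Every query is looked up independently of the others.
results = {q: find_exact(reference, index, q) for q in ["ATTA", "TACA", "CCC"]}
```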

  17. The JCVI standard operating procedure for annotating prokaryotic metagenomic shotgun sequencing data.

    Science.gov (United States)

    Tanenbaum, David M; Goll, Johannes; Murphy, Sean; Kumar, Prateek; Zafar, Nikhat; Thiagarajan, Mathangi; Madupu, Ramana; Davidsen, Tanja; Kagan, Leonid; Kravitz, Saul; Rusch, Douglas B; Yooseph, Shibu

    2010-03-30

    The JCVI metagenomics analysis pipeline provides for the efficient and consistent annotation of shotgun metagenomics sequencing data for sampling communities of prokaryotic organisms. The process can be equally applied to individual sequence reads from traditional Sanger capillary electrophoresis sequences, newer technologies such as 454 pyrosequencing, or sequence assemblies derived from one or more of these data types. It includes the analysis of both coding and non-coding genes, whether full-length or, as is often the case for shotgun metagenomics, fragmentary. The system is designed to provide the best-supported conservative functional annotation based on a combination of trusted homology-based scientific evidence and computational assertions and an annotation value hierarchy established through extensive manual curation. The functional annotation attributes assigned by this system include gene name, gene symbol, GO terms, EC numbers, and JCVI functional role categories.

  18. Hepatitis C virus whole genome sequencing: Current methods/issues and future challenges.

    Science.gov (United States)

    Trémeaux, Pauline; Caporossi, Alban; Thélu, Marie-Ange; Blum, Michael; Leroy, Vincent; Morand, Patrice; Larrat, Sylvie

    2016-10-01

    Therapy for hepatitis C is currently undergoing a revolution. The arrival of new antiviral agents targeting viral proteins reinforces the need for better knowledge of the viral strains infecting each patient. Hepatitis C virus (HCV) whole genome sequencing provides essential information for precise typing, study of the viral natural history or identification of resistance-associated variants. HCV whole genome sequencing was first performed with Sanger sequencing; the arrival of next-generation sequencing (NGS) has simplified the technical process and provided more detailed data on the nature and evolution of viral quasi-species. We will review the different techniques used for HCV complete genome sequencing and their applications, both before and after the advent of NGS. The progress brought by new and future technologies will also be discussed, as well as the remaining difficulties, largely due to genomic variability.

  19. Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities

    Science.gov (United States)

    Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan

    2016-01-01

    Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To determine the composition of each sample, we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of operational taxonomic units (OTUs) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaete-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802

  1. Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities.

    Directory of Open Access Journals (Sweden)

    Régis Vivien

    Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To determine the composition of each sample, we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of operational taxonomic units (OTUs) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaete-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities.

  2. Next generation sequencing (NGS): a golden tool in forensic toolkit.

    Science.gov (United States)

    Aly, S M; Sabri, D M

    2015-01-01

    DNA analysis is a cornerstone of contemporary forensic science. DNA sequencing technologies are powerful tools that enriched the molecular sciences in the past through Sanger sequencing and continue to advance them through next generation sequencing (NGS). NGS has excellent potential to expand molecular applications in forensic science by overcoming the pitfalls of the conventional sequencing method. Its main advantage over the conventional method is that it analyzes a large number of genetic markers simultaneously and yields high-resolution genetic data. These advantages will help in solving several challenges, such as mixture analysis and dealing with minute, degraded samples. With these new technologies, many markers could be examined to obtain important biological data such as age, geographical origin, tissue type, externally visible traits and the identification of monozygotic twins. NGS could also provide data on microbes, insects, plants and soil, which are of great medico-legal importance. Despite the dozens of forensic studies involving NGS, requirements remain to be met before this technology can be used routinely in forensic cases. Thus, there is a great need for more studies that address the robustness of these techniques. This work therefore highlights forensic applications in the era of massively parallel sequencing.

  3. Next generation sequencing technology: a powerful tool for the genome characterization of sugarcane mosaic virus from Sorghum almum

    Science.gov (United States)

    Next generation sequencing (NGS) technology was used to analyze the occurrence of viruses in Sorghum almum plants in Florida exhibiting mosaic symptoms. Total RNA was extracted from symptomatic leaves and used as a template for cDNA library preparation. The resulting library was sequenced on an Illu...

  4. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log2-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
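
    Steps (i) and (ii) above can be sketched roughly as below. This is not DSAP's implementation: the adaptor sequence, function names and trimming rule (cut at the first adaptor occurrence, discard tags made of a single repeated base) are simplifying assumptions chosen for illustration. The input follows the tab-delimited "sequence, copy number" layout the abstract describes.

```python
from collections import Counter

ADAPTOR = "TCGTATGC"  # hypothetical 3' adaptor, for illustration only


def clean_tag(seq: str, adaptor: str = ADAPTOR):
    """Trim everything from the first adaptor occurrence onward.

    Returns None for tags that end up empty or consist of one repeated
    base (poly-A/T/C/G/N), so the caller can discard them.
    """
    seq = seq.upper()
    cut = seq.find(adaptor)
    if cut != -1:
        seq = seq[:cut]
    if not seq or len(set(seq)) == 1:
        return None
    return seq


def cluster_tags(lines) -> Counter:
    """Group cleaned tags into unique sequences and sum their copy numbers."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, copies = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            clusters[cleaned] += int(copies)
    return clusters


example = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGC\t120",  # tag followed by adaptor
    "TGAGGTAGTAGGTTGTATAGTT\t30",           # same tag, already clean
    "AAAAAAAAAAAAAAAAAAAA\t15",             # poly-A, discarded
    "NNNNNNNNNN\t4",                        # poly-N, discarded
]
clusters = cluster_tags(example)
```

    In this toy input the first two lines collapse into one cluster whose count is the sum of their copy numbers; homology matching against Rfam and miRBase (steps iii and iv) is outside the scope of the sketch.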

  5. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  6. Transcriptome profiling of Chironomus kiinensis under phenol stress using Solexa sequencing technology.

    Directory of Open Access Journals (Sweden)

    Chuanwang Cao

    Phenol is a major pollutant in aquatic ecosystems due to its chemical stability, water solubility and environmental mobility. To date, little is known about the molecular modifications of invertebrates under phenol stress. In the present study, we used Solexa sequencing technology to investigate the transcriptome and differentially expressed genes (DEGs) of midges (Chironomus kiinensis) in response to phenol stress. A total of 51,518,972 and 51,150,832 clean reads in the phenol-treated and control libraries, respectively, were obtained and assembled into 51,014 non-redundant (Nr) consensus sequences. A total of 6,032 unigenes were classified by Gene Ontology (GO), and 18,366 unigenes were categorized into 238 Kyoto Encyclopedia of Genes and Genomes (KEGG) categories. These genes included representatives from almost all functional categories. A total of 10,724 differentially expressed genes (P value <0.05) were detected in a comparative analysis of the expression profiles between phenol-treated and control C. kiinensis, including 8,390 upregulated and 2,334 downregulated genes. The expression levels of 20 differentially expressed genes were confirmed by real-time RT-PCR, and the trends in gene expression that were observed matched the Solexa expression profiles, although the magnitude of the variations was different. Through pathway enrichment analysis, significantly enriched pathways were identified for the DEGs, including metabolic pathways, aryl hydrocarbon receptor (AhR), pancreatic secretion and neuroactive ligand-receptor interaction pathways, which may be associated with the phenol responses of C. kiinensis. Using Solexa sequencing technology, we identified several groups of key candidate genes as well as important biological pathways involved in the molecular modifications of chironomids under phenol stress.

  7. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing.

    Science.gov (United States)

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.
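
    For readers unfamiliar with the diversity statistic π quoted above, the sketch below computes nucleotide diversity as the average number of pairwise differences per site across a set of aligned sequences. It is a simplified illustration, assuming equal-length sequences with no gaps or ambiguity codes; the function name and the toy alignment are hypothetical and unrelated to the whale data.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average proportion of differing sites over all pairs of sequences.

    Assumes an alignment of equal-length sequences without gaps or
    ambiguous bases (a simplification for this sketch).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    total_differences = sum(
        sum(1 for a, b in zip(x, y) if a != b) for x, y in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment: 4 sequences, 10 sites, 6 pairwise differences over 6 pairs.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
```

    Multiplying the result by 100 gives a percentage comparable in form to the values reported in the abstract.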

  8. Sequence assembly using next generation sequencing data--challenges and solutions.

    Science.gov (United States)

    Chin, Francis Y L; Leung, Henry C M; Yiu, S M

    2014-11-01

    Sequence assembly is an important step in bioinformatics studies. With next generation sequencing (NGS) technology, DNA fragments (reads) can be randomly sampled at high throughput from a DNA or RNA sequence. However, because the positions from which reads are sampled are unknown, an assembly process is required to combine overlapping reads and reconstruct the original DNA or RNA sequence. Compared with traditional Sanger sequencing methods, NGS offers higher throughput, but its reads are shorter and its error rate is higher, which introduces several problems in assembly. Moreover, paired-end reads, which contain more information than single-end reads, can be sampled. Existing assemblers cannot fully utilize this information and fail to assemble longer contigs. In this article, we revisit the major problems of assembling NGS reads from genomic, transcriptomic, metagenomic and metatranscriptomic data. We also describe our IDBA package for solving these problems. The IDBA package adopts several novel ideas in assembly, including the use of multiple k, local assembly and progressive depth removal. Compared with existing assemblers, IDBA performs better on many simulated and real sequencing datasets.
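
    Why the choice of k matters can be seen in a small de Bruijn graph sketch. This is not IDBA's algorithm; it only builds the graph for one k at a time and lists the nodes with more than one successor, which is where an assembler faces an ambiguous path. The function names and the two toy reads are assumptions for illustration.

```python
from collections import defaultdict


def de_bruijn(reads: list[str], k: int) -> dict:
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph: dict) -> list[str]:
    """Nodes with more than one outgoing edge, i.e. ambiguous extensions."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Two overlapping toy reads from a sequence in which "CGT" occurs twice.
reads = ["ATGCGTACGT", "CGTACGTTA"]
branches_by_k = {k: branching_nodes(de_bruijn(reads, k)) for k in (3, 4, 5)}
```

    In this toy case the repeat causes a branch for k = 3 and k = 4 but not for k = 5. Larger k resolves short repeats, while in real data it also demands more coverage, which is one motivation for combining several values of k.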

  9. Next Generation Sequencing and Health Technology Assessment in Autism Spectrum Disorder

    Science.gov (United States)

    Ungar, Wendy J.

    2015-01-01

    Next generation sequencing (NGS) is a new genome-based technology showing great promise in delineating the genetic basis of autism thus facilitating diagnosis and in the future, the selection of treatment. NGS can have a targeted use as well as provide clinically important findings from medically actionable variants regarding the risk of other disorders. As more is learned about the genomic basis of autism, the clinical utility of the risk information will increase. But at what cost? As the medical management that ensues from primary and secondary (incidental) findings grows, there will be increased pressure on sub-specialists with a longer and more circuitous pathway to care. This will result in higher costs to health care systems and to families. Health technology assessment is needed to measure the additional costs associated with NGS compared to standard care and to weigh these costs against additional health benefits. Well-designed data collection systems should be implemented early in clinical translation of this technology to enable assessment of clinical utility and cost-effectiveness and to generate high quality evidence to inform clinical and budget allocation decision-making. PMID:26379724

  10. The principle and application of the single-molecule real-time sequencing technology

    Institute of Scientific and Technical Information of China (English)

    柳延虎; 王璐; 于黎

    2015-01-01

    The last decade witnessed the explosive development of third-generation sequencing strategies, including single-molecule real-time sequencing (SMRT), true single-molecule sequencing (tSMS) and single-molecule nanopore DNA sequencing. In this review, we summarize the principle, performance and application of the SMRT sequencing technology. Compared with the traditional Sanger method and next-generation sequencing (NGS) technologies, the SMRT approach has several advantages, including long read length, high speed, no need for template amplification (PCR-free) and the capability of direct detection of epigenetic modifications. However, its low accuracy (about 85%) is a notable disadvantage; about 93% of the errors are insertions and deletions. The raw sequence data therefore need to be corrected before assembly. To date, SMRT sequencing has been a good fit for de novo genomic sequencing and high-quality assembly of small genomes. In the future, it is expected to play an important role in epigenetics, transcriptomic sequencing, and the assembly of large genomes.

  11. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads, perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
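
    The demultiplexing step implied by the 10-bp multiplexing identifiers can be sketched as below. This is not the BarcodeCrucher program: the function names, identifier-to-sample table and reads are illustrative assumptions, identifiers must match exactly, and sorting by gene and quality control are left out.

```python
from collections import defaultdict

MID_LENGTH = 10  # identifier length stated in the abstract


def demultiplex(reads: list[str], mid_to_sample: dict):
    """Assign each read to a sample by its leading identifier (exact match).

    Returns (inserts grouped by sample, number of unassigned reads).
    """
    by_sample = defaultdict(list)
    unassigned = 0
    for read in reads:
        mid, insert = read[:MID_LENGTH], read[MID_LENGTH:]
        sample = mid_to_sample.get(mid)
        if sample is None or not insert:
            unassigned += 1
            continue
        by_sample[sample].append(insert)
    return by_sample, unassigned


def to_fasta(by_sample: dict) -> str:
    """Render the grouped inserts as FASTA text, ordered by sample name."""
    lines = []
    for sample in sorted(by_sample):
        for n, seq in enumerate(by_sample[sample], start=1):
            lines.append(f">{sample}_read{n}")
            lines.append(seq)
    return "\n".join(lines)


# Illustrative identifier table and reads (not data from the study).
mids = {"ACGAGTGCGT": "taxon_A", "ACGCTCGACA": "taxon_B"}
reads = [
    "ACGAGTGCGTTTGACCA",
    "ACGCTCGACAGGCATTA",
    "ACGAGTGCGTCCATGGA",
    "TTTTTTTTTTACGT",  # unknown identifier, left unassigned
]
by_sample, unassigned = demultiplex(reads, mids)
fasta = to_fasta(by_sample)
```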

  12. The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics

    Science.gov (United States)

    Escobar-Zepeda, Alejandra; Vera-Ponce de León, Arturo; Sanchez-Flores, Alejandro

    2015-01-01

    The study of microorganisms that pervade each and every part of this planet has encountered many challenges through time such as the discovery of unknown organisms and the understanding of how they interact with their environment. The aim of this review is to take the reader along the timeline and major milestones that led us to modern metagenomics. This new and thriving area is likely to be an important contributor to solve different problems. The transition from classical microbiology to modern metagenomics studies has required the development of new branches of knowledge and specialization. Here, we will review how the availability of high-throughput sequencing technologies has transformed microbiology and bioinformatics and how to tackle the inherent computational challenges that arise from the DNA sequencing revolution. New computational methods are constantly developed to collect, process, and extract useful biological information from a variety of samples and complex datasets, but metagenomics needs the integration of several of these computational methods. Despite the level of specialization needed in bioinformatics, it is important that life-scientists have a good understanding of it for a correct experimental design, which allows them to reveal the information in a metagenome. PMID:26734060

  13. Using 454 technology for long-PCR based sequencing of the complete mitochondrial genome from single Haemonchus contortus (Nematoda

    Directory of Open Access Journals (Sweden)

    Waeschenbach Andrea

    2008-01-01

    Full Text Available Abstract Background: Mitochondrial (mt) genomes represent a rich source of molecular markers for a range of applications, including population genetics, systematics, epidemiology and ecology. In the present study, we used 454 technology (or the GS20, massively parallel picolitre reactor platform) to determine the complete mt genome of Haemonchus contortus (Nematoda: Trichostrongylidae), a parasite of substantial agricultural, veterinary and economic significance. We validate this approach by comparison with mt sequences from publicly available expressed sequence tag (EST) and genomic survey sequence (GSS) data sets. Results: The complete mt genome of Haemonchus contortus was sequenced directly from long-PCR amplified template utilizing genomic DNA (~20–40 ng) from a single adult male using 454 technology. A single contig was assembled and compared against mt sequences mined from publicly available EST (NemBLAST) and GSS datasets. The comparison demonstrated that the 454 technology platform is reliable for the sequencing of AT-rich mt genomes from nematodes. The mt genome sequenced for Haemonchus contortus was 14,055 bp in length and was highly AT-rich (78.1%). In accordance with other chromadorean nematodes studied to date, the mt genome of H. contortus contained 36 genes (12 protein coding, 22 tRNAs, rrnL and rrnS) and was similar in structure, size and gene arrangement to those characterized previously for members of the Strongylida. Conclusion: The present study demonstrates the utility of 454 technology for the rapid determination of mt genome sequences from tiny amounts of DNA and reveals a wealth of mt genomic data in current databases available for mining. This approach provides a novel platform for high-throughput sequencing of mt genomes from nematodes and other organisms.
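    AT-richness of the kind reported here is a simple base-composition statistic. As a minimal sketch, assuming the sequence is available as a plain string and that ambiguous bases such as N should be excluded from the denominator (an assumption, since the abstract does not say how the figure was computed), it could be calculated as follows. The function name and the toy sequences are hypothetical.

```python
def at_content(sequence):
    """Return the percentage of A and T among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)


# Toy sequence for illustration only: 8 of 10 bases are A or T.
toy_percentage = at_content("AATTAATTGC")
```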

  14. Analysis of metagenomics next generation sequence data for fungal ITS barcoding: Do you need advance bioinformatics experience?

    Directory of Open Access Journals (Sweden)

    Abdalla Osman Abdalla Ahmed

    2016-07-01

    Full Text Available During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a major obstacle, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories, such resources are generally lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi.

  15. Inhibition of cell division induced by external guide sequences (EGS Technology) targeting ftsZ.

    Directory of Open Access Journals (Sweden)

    Carol Davies Sala

    Full Text Available EGS (external guide sequence) technology is a promising approach to designing new antibiotics. EGSs are short antisense oligoribonucleotides that induce RNase P-mediated cleavage of a target RNA by forming a precursor tRNA-like complex. The ftsZ mRNA secondary structure was modeled and EGSs complementary to two regions with high probability of being suitable targets were designed. In vitro reactions showed that EGSs targeting these regions bound ftsZ mRNA and elicited RNase P-mediated cleavage of ftsZ mRNA. A recombinant plasmid, pEGSb1, coding for an EGS that targets region "b" under the control of the T7 promoter was generated. Upon introduction of this plasmid into Escherichia coli BL21(DE3)(pLysS), the transformant strain formed filaments when expression of the EGS was induced. Concomitantly, E. coli harboring pEGSb1 showed a modest but significant inhibition of growth when synthesis of the EGSb1 was induced. Our results indicate that EGS technology could be a viable strategy to generate new antimicrobials targeting ftsZ.

  16. Robust global microRNA expression profiling using next-generation sequencing technologies.

    Science.gov (United States)

    Tam, Shirley; de Borja, Richard; Tsao, Ming-Sound; McPherson, John D

    2014-03-01

    miRNAs are a class of regulatory molecules involved in a wide range of cellular functions, including growth, development and apoptosis. Given their widespread roles in biological processes, understanding their patterns of expression in normal and diseased states will provide insights into the consequences of aberrant expression. As such, global miRNA expression profiling of human malignancies is gaining popularity in both basic and clinically driven research. However, to date, the majority of such analyses have used microarrays and quantitative real-time PCR. With the introduction of digital count technologies, such as next-generation sequencing (NGS) and the NanoString nCounter System, we have at our disposal many more options. To make effective use of these different platforms, the strengths and pitfalls of several miRNA profiling technologies were assessed, including a microarray platform, NGS technologies and the NanoString nCounter System. Overall, NGS had the greatest detection sensitivity, largest dynamic range of detection and highest accuracy in differential expression analysis when compared with gold-standard quantitative real-time PCR. Its technical reproducibility was high, with intrasample correlations of at least 0.95 in all cases. Furthermore, miRNA analysis of formalin-fixed, paraffin-embedded (FFPE) tissue was also evaluated. Expression profiles between paired frozen and FFPE samples were similar, with Spearman's ρ>0.93. These results show the superior sensitivity, accuracy and robustness of NGS for the comprehensive profiling of miRNAs in both frozen and FFPE tissues.
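    The agreement between paired frozen and FFPE profiles is summarized above with Spearman's ρ, which is the Pearson correlation computed on ranks. A minimal, dependency-free Python sketch of that statistic follows; the expression values are hypothetical and are not data from the study, and the helper names are illustrative only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical miRNA read counts for one paired frozen/FFPE sample.
frozen = [120, 45, 300, 8, 77]
ffpe = [100, 50, 280, 5, 90]
rho = spearman_rho(frozen, ffpe)
```

    Because only the ordering of values matters, the statistic is insensitive to the systematic differences in absolute counts that fixation might introduce, which is presumably why a rank-based measure suits this comparison.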

  17. Remediation of pharmaceuticals and personal care products using an aerobic granular sludge sequencing bioreactor and microbial community profiling using Solexa sequencing technology analysis.

    Science.gov (United States)

    Zhao, Xia; Chen, Zhonglin; Wang, Xiaochun; Li, Jinchunzi; Shen, Jimin; Xu, Hao

    2015-03-01

    Recently, a new type of organic pollution derived from pharmaceuticals and personal care products (PPCPs) has been gradually on the rise. Wastewater treatment to remove PPCPs was investigated using an aerobic granular sludge sequencing bioreactor (GSBR). After optimization of influent organic load, hydraulic shear stress, sludge settling time, etc., aerobic granular sludge was analyzed for its physiological and biochemical characteristics and tested for its efficacy in removing PPCPs from wastewater. The granular sludge effectively removed some but not all of the PPCPs tested; removal correlated with the microbial profiles in the granules, as assessed using Solexa sequencing technology. Sequencing revealed the presence of five phylogenetic groups: Proteobacteria, Bacteroidetes, Betaproteobacteria, an unclassified genus, and Zoogloea. The results demonstrated changes in the microbial profiles with time in response to the presence of PPCPs. The effects of PPCPs on microbial communities in the granular sludge process are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Complete Genome Sequence of Pelosinus sp. Strain UFO1 Assembled Using Single-Molecule Real-Time DNA Sequencing Technology

    Energy Technology Data Exchange (ETDEWEB)

    Steven D. Brown; Sagar M. Utturkar; Timothy S. Magnuson; Allison E. Ray; Farris L. Poole; W. Andrew Lancaster; Michael P. Thorgersen; Michael W. W. Adams; Dwayne A. Elias

    2014-09-01

    Pelosinus fermentans strain R7 was isolated from Russian kaolin clays as the type strain and it can reduce Fe(III) during fermentative growth (1). Draft genome sequences for P. fermentans R7 and four strains from Hanford, Washington, USA, have been published (2–4). The P. fermentans 16S rRNA sequence dominated the lactate-based enrichment cultures from three geochemically contrasting soils from the Melton Branch Watershed, Oak Ridge, Tennessee, USA (5) and also at another stimulated, uranium-contaminated field site near Oak Ridge (6). For the current work, strain UFO1 was isolated from pristine sediments at a background field site in Oak Ridge and characterized as facilitating U(VI) reduction and precipitation with phosphate (7).

  19. The Mechanisms of the Third Generation Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    李明爽; 赵敏

    2012-01-01

    In this review, I describe the mechanisms and features of third-generation sequencing technology, a new generation of single-molecule sequencing technology, and introduce the True Single Molecule Sequencing (tSMS) of Helicos, the Single Molecule Real Time (SMRT) DNA sequencing of Pacific Biosciences, and the single-molecule nanopore DNA sequencing of Oxford Nanopore Technologies. Its advantages compared with second-generation sequencing, the problems it still faces, and its future prospects are discussed.

  20. Genomic and Functional Characteristics of Human Cytomegalovirus Revealed by Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Steven Sijmons

    2014-03-01

    Full Text Available The complete genome of human cytomegalovirus (HCMV) was elucidated almost 25 years ago using a traditional cloning and Sanger sequencing approach. Analysis of the genetic content of additional laboratory and clinical isolates has led to a better, albeit still incomplete, definition of the coding potential and diversity of wild-type HCMV strains. The introduction of a new generation of massively parallel sequencing technologies, collectively called next-generation sequencing, has profoundly increased the throughput and resolution of the genomics field. These increased possibilities are already leading to a better understanding of the circulating diversity of HCMV clinical isolates. The higher resolution of next-generation sequencing provides new opportunities in the study of intrahost viral population structures. Furthermore, deep sequencing enables novel diagnostic applications for sensitive drug resistance mutation detection. RNA-seq applications have changed the picture of the HCMV transcriptome, providing evidence of a vast number of splicing events and alternative transcripts. This review discusses the application of next-generation sequencing technologies, which has provided a clearer picture of the intricate nature of the HCMV genome. The continuing development and application of novel sequencing technologies will further augment our understanding of this ubiquitous, but elusive, herpesvirus.

  1. Complete Genome Sequence of Ornithogalum Mosaic Virus Infecting Gladiolus spp. in South Korea.

    Science.gov (United States)

    Cho, Sang-Yun; Lim, Seungmo; Kim, Hongsup; Yi, Seung-In; Moon, Jae Sun

    2016-08-11

    We report here the first complete genome sequence of Ornithogalum mosaic virus (OrMV) isolated from Taean, South Korea, in 2011, which was obtained by next-generation sequencing and Sanger sequencing. The sequence information provided here may serve as a potential reference for other OrMV isolates.

  2. Complete genome sequence of a new tobamovirus naturally infecting tomatoes in Mexico

    Science.gov (United States)

    The complete genomic sequence of a new tobamovirus in tomato was determined through deep sequencing and assembly of small RNAs, then validated through Sanger sequencing of the overlapping RT-PCR products and rapid amplification of cDNA ends (RACE). Based on the genomic sequence identity (85%) to kn...

  3. Killer Immunoglobulin-Like Receptor Allele Determination Using Next-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Bercelin Maniangou

    2017-05-01

    Full Text Available The impact of natural killer (NK) cell alloreactivity on hematopoietic stem cell transplantation (HSCT) outcome is still debated due to the complexity of graft parameters, HLA class I environment, the nature of killer cell immunoglobulin-like receptor (KIR)/KIR ligand genetic combinations studied, and KIR+ NK cell repertoire size. KIR genes are known to be polymorphic in terms of gene content, copy number variation, and number of alleles. These allelic polymorphisms may impact both the phenotype and function of KIR+ NK cells. We, therefore, speculate that polymorphisms may alter donor KIR+ NK cell phenotype/function, thus modulating post-HSCT KIR+ NK cell alloreactivity. To investigate KIR allele polymorphisms of all KIR genes, we developed a next-generation sequencing (NGS) technology on a MiSeq platform. To ensure the reliability and specificity of our method, genomic DNA from well-characterized cell lines was used; high-resolution KIR typing results obtained were then compared to those previously reported. Two different bioinformatic pipelines were used, allowing the attribution of sequencing reads to specific KIR genes and the assignment of KIR alleles for each KIR gene. Our results demonstrated successful long-range KIR gene amplifications of all reference samples using intergenic KIR primers. The alignment of reads to the human genome reference (hg19) using the BiRD pipeline or visualization of data using Profiler software demonstrated that all KIR genes were completely sequenced with a sufficient read depth (mean 317× for all loci) and a high percentage of mapping (mean 93% for all loci). Comparison of the high-resolution KIR typing obtained with published data generated using exome capture resulted in a reported concordance rate of 95% for centromeric and telomeric KIR genes. Overall, our results suggest that NGS can be used to investigate the broad KIR allelic polymorphism.
Hence, these data improve our knowledge, not only on KIR+ NK cell alloreactivity in

  4. Detecting novel genetic mutations in Chinese Usher syndrome families using next-generation sequencing technology.

    Science.gov (United States)

    Qu, Ling-Hui; Jin, Xin; Xu, Hai-Wei; Li, Shi-Ying; Yin, Zheng-Qin

    2015-02-01

    Usher syndrome (USH) is the most common cause of combined blindness and deafness inherited in an autosomal recessive mode. Molecular diagnosis is of great significance in revealing the molecular pathogenesis and aiding the clinical diagnosis of this disease. However, molecular diagnosis remains a challenge due to high phenotypic and genetic heterogeneity in USH. This study explored an approach for detecting disease-causing genetic mutations in candidate genes in five index cases from unrelated USH families based on targeted next-generation sequencing (NGS) technology. Through systematic data analysis using an established bioinformatics pipeline and segregation analysis, 10 pathogenic mutations in the USH disease genes were identified in the five USH families. Six of these mutations were novel: c.4398G > A and EX38-49del in MYO7A, c.988_989delAT in USH1C, c.15104_15105delCA and c.6875_6876insG in USH2A. All novel variations segregated with the disease phenotypes in their respective families and were absent from ethnically matched control individuals. This study expanded the mutation spectrum of USH and revealed the genotype-phenotype relationships of the novel USH mutations in Chinese patients. Moreover, this study proved that targeted NGS is an accurate and effective method for detecting genetic mutations related to USH. The identification of pathogenic mutations is of great significance for elucidating the underlying pathophysiology of USH.

  5. Transcriptome Profile Analysis of Sugarcane Responses to Sporisorium scitaminea Infection Using Solexa Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Qibin Wu

    2013-01-01

    Full Text Available To understand the molecular basis of sugarcane-smut interaction, it is important to identify sugarcane genes that respond to the pathogen attack. High-throughput tag-sequencing (tag-seq) analysis by Solexa technology was performed on sugarcane infected with Sporisorium scitaminea, which should have massively increased the amount of data available for transcriptome profile analysis. After mapping to sugarcane EST databases in NCBI, we obtained 2015 differentially expressed genes, of which 1125 were upregulated and 890 downregulated by infection. Gene ontology (GO) analysis revealed that the differentially expressed genes are involved in many cellular processes. Pathway analysis revealed that metabolic pathways and ribosome function are significantly affected, where upregulation of expression dominates over downregulation. Differential expression of three candidate genes involved in the MAP kinase signaling pathway, ScBAK1 (GenBank Accession number: KC857629), ScMapkk (GenBank Accession number: KC857627), and ScGloI (GenBank Accession number: KC857628), was confirmed by reverse transcription polymerase chain reaction (RT-PCR). Real-time quantitative PCR (qRT-PCR) analysis showed that the expression of these genes was up-regulated after infection with S. scitaminea, suggesting that they may play a role in pathogen response in sugarcane. The present study provides insights into the molecular mechanism of sugarcane defense against S. scitaminea infection, leading to a more comprehensive understanding of sugarcane-smut interaction.

  6. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    Science.gov (United States)

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  7. BRCA somatic and germline mutation detection in paraffin embedded ovarian cancers by next-generation sequencing

    Science.gov (United States)

    Mafficini, Andrea; Simbolo, Michele; Parisi, Alice; Rusev, Borislav; Luchini, Claudio; Cataldo, Ivana; Piazzola, Elena; Sperandio, Nicola; Turri, Giona; Franchi, Massimo; Tortora, Giampaolo; Bovo, Chiara; Lawlor, Rita T.; Scarpa, Aldo

    2016-01-01

    BRCA mutated ovarian cancers respond better to platinum-based therapy and to the recently approved PARP-inhibitors. There is the need for efficient and timely methods to detect both somatic and germline mutations using formalin-fixed paraffin-embedded (FFPE) tissues and commercially available technology. We used a commercial kit exploring all exons and 50bp exon-intron junctions of BRCA1 and BRCA2 genes, and semiconductor next-generation sequencing (NGS) on DNA from 47 FFPE samples of high-grade serous ovarian cancers. Pathogenic mutations were found in 13/47 (28%) cancers: eight in BRCA1 and five in BRCA2. All BRCA1 and two BRCA2 mutations were germline; three BRCA2 mutations were somatic. All mutations were confirmed by Sanger sequencing. To evaluate the performance of the NGS panel, we assessed its capability to detect the 6,953 variants described for BRCA1 and BRCA2 in ClinVar and COSMIC databases using callability analysis. 6,059 (87.1%) variants were identified automatically by the software; 829 (12.0%) required visual verification. The remaining 65 (0.9%) variants were uncallable, and would require 15 Sanger reactions to be resolved. Thus, the sensitivity of the NGS-panel was 99.1%. In conclusion, NGS performed with a commercial kit is highly efficient for detection of germline and somatic mutations in BRCA genes using routine FFPE tissue. PMID:26745875
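    The callability figures above follow from simple proportions of the 6,953 catalogued variants. The short Python sketch below reproduces that arithmetic from the counts given in the abstract; the variable and function names are illustrative, and treating sensitivity as the share of variants that are callable either automatically or after visual verification is an interpretation of the abstract's wording, not a statement of the authors' exact procedure.

```python
# Counts taken from the abstract.
TOTAL_VARIANTS = 6953
AUTOMATIC = 6059   # identified automatically by the software
VISUAL = 829       # required visual verification
UNCALLABLE = 65    # would need Sanger sequencing to resolve


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# The three categories should partition the full variant set.
categories_sum = AUTOMATIC + VISUAL + UNCALLABLE

automatic_pct = percent(AUTOMATIC, TOTAL_VARIANTS)
uncallable_pct = percent(UNCALLABLE, TOTAL_VARIANTS)
# Assumed definition: callable variants (automatic + visual) over all variants.
sensitivity_pct = percent(AUTOMATIC + VISUAL, TOTAL_VARIANTS)
```

    Computed this way, the automatic share (87.1%), the uncallable share (0.9%) and the sensitivity (99.1%) match the abstract, while the visually verified share comes to about 11.9%, marginally below the 12.0% stated.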

  8. BRCA somatic and germline mutation detection in paraffin embedded ovarian cancers by next-generation sequencing.

    Science.gov (United States)

    Mafficini, Andrea; Simbolo, Michele; Parisi, Alice; Rusev, Borislav; Luchini, Claudio; Cataldo, Ivana; Piazzola, Elena; Sperandio, Nicola; Turri, Giona; Franchi, Massimo; Tortora, Giampaolo; Bovo, Chiara; Lawlor, Rita T; Scarpa, Aldo

    2016-01-12

    BRCA mutated ovarian cancers respond better to platinum-based therapy and to the recently approved PARP-inhibitors. There is the need for efficient and timely methods to detect both somatic and germline mutations using formalin-fixed paraffin-embedded (FFPE) tissues and commercially available technology. We used a commercial kit exploring all exons and 50bp exon-intron junctions of BRCA1 and BRCA2 genes, and semiconductor next-generation sequencing (NGS) on DNA from 47 FFPE samples of high-grade serous ovarian cancers. Pathogenic mutations were found in 13/47 (28%) cancers: eight in BRCA1 and five in BRCA2. All BRCA1 and two BRCA2 mutations were germline; three BRCA2 mutations were somatic. All mutations were confirmed by Sanger sequencing. To evaluate the performance of the NGS panel, we assessed its capability to detect the 6,953 variants described for BRCA1 and BRCA2 in ClinVar and COSMIC databases using callability analysis. 6,059 (87.1%) variants were identified automatically by the software; 829 (12.0%) required visual verification. The remaining 65 (0.9%) variants were uncallable, and would require 15 Sanger reactions to be resolved. Thus, the sensitivity of the NGS-panel was 99.1%. In conclusion, NGS performed with a commercial kit is highly efficient for detection of germline and somatic mutations in BRCA genes using routine FFPE tissue.

  9. The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome.

    Science.gov (United States)

    Sakai, Hiroaki; Naito, Ken; Ogiso-Tanaka, Eri; Takahashi, Yu; Iseki, Kohtaro; Muto, Chiaki; Satou, Kazuhito; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Hirano, Takashi; Itoh, Takeshi; Kaga, Akito; Tomooka, Norihiko

    2015-11-30

    Second-generation sequencers (SGS) have been game-changing, achieving cost-effective whole genome sequencing in many non-model organisms. However, a large portion of the genomes still remains unassembled. We reconstructed the azuki bean (Vigna angularis) genome using single molecule real-time (SMRT) sequencing technology and achieved the best contiguity and coverage among currently assembled legume crops. The SMRT-based assembly produced 100 times longer contigs with a 100-fold reduction in gaps compared to the SGS-based assemblies. A detailed comparison between the assemblies revealed that the SMRT-based assembly enabled a more comprehensive gene annotation than the SGS-based assemblies, where thousands of genes were missing or fragmented. A chromosome-scale assembly was generated based on the high-density genetic map, covering 86% of the azuki bean genome. We demonstrated that SMRT technology, though it still needed the support of SGS data, achieved a near-complete assembly of a eukaryotic genome.

  10. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Directory of Open Access Journals (Sweden)

    Yu Cao

    2017-09-01

    Full Text Available The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes, which in their genomic essence are the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods.

  11. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM)

    Science.gov (United States)

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-01-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325

  12. Towards clinical molecular diagnosis of inherited cardiac conditions: a comparison of bench-top genome DNA sequencers.

    Directory of Open Access Journals (Sweden)

    Xinzhong Li

    Full Text Available Molecular genetic testing is recommended for diagnosis of inherited cardiac disease, to guide prognosis and treatment, but access is often limited by cost and availability. Recently introduced high-throughput bench-top DNA sequencing platforms have the potential to overcome these limitations. We evaluated two next-generation sequencing (NGS) platforms for molecular diagnostics. The protein-coding regions of six genes associated with inherited arrhythmia syndromes were amplified from 15 human samples using parallelised multiplex PCR (Access Array, Fluidigm), and sequenced on the MiSeq (Illumina) and Ion Torrent PGM (Life Technologies). Overall, 97.9% of the target was sequenced adequately for variant calling on the MiSeq, and 96.8% on the Ion Torrent PGM. Regions missed tended to be of high GC-content, and most were problematic for both platforms. Variant calling was assessed using 107 variants detected using Sanger sequencing: within adequately sequenced regions, variant calling on both platforms was highly accurate (Sensitivity: MiSeq 100%, PGM 99.1%. Positive predictive value: MiSeq 95.9%, PGM 95.5%). At the time of the study the Ion Torrent PGM had a lower capital cost and individual runs were cheaper and faster. The MiSeq had a higher capacity (requiring fewer runs), with reduced hands-on time and simpler laboratory workflows. Both provide significant cost and time savings over conventional methods, even allowing for adjunct Sanger sequencing to validate findings and sequence exons missed by NGS. MiSeq and Ion Torrent PGM both provide accurate variant detection as part of a PCR-based molecular diagnostic workflow, and provide alternative platforms for molecular diagnosis of inherited cardiac conditions. Though there were performance differences at this throughput, platforms differed primarily in terms of cost, scalability, protocol stability and ease of use.
Compared with current molecular genetic diagnostic tests for inherited cardiac arrhythmias
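    Sensitivity and positive predictive value, the two accuracy measures quoted in this abstract, are both ratios over true-positive calls. The Python sketch below shows the standard definitions. The abstract does not report the underlying true-positive, false-negative and false-positive counts, so the counts used here are hypothetical, chosen only so that the output lands near the figures quoted for the PGM.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real variants that the platform called."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of the platform's calls that are real variants."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only (not reported in the abstract).
tp, fn, fp = 106, 1, 5

sens_pct = round(100 * sensitivity(tp, fn), 1)
ppv_pct = round(100 * positive_predictive_value(tp, fp), 1)
```

    The two measures answer different questions: sensitivity is limited by missed variants, whereas positive predictive value is limited by false calls, which is why confirmatory Sanger sequencing of NGS findings remains relevant even when sensitivity is close to 100%.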

  13. Targeted 'Next-Generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations

    Directory of Open Access Journals (Sweden)

    Lopez Jimenez Nelson

    2011-12-01

    Full Text Available Abstract Background: Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. Methods: We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. Results: We verified three mutations - c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic - p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A > T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Conclusions: Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M, as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.

  14. Is Whole Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients with Solid Tumors

    Science.gov (United States)

    McCullough, Laurence B.; Slashinski, Melody J.; McGuire, Amy L.; Street, Richard L.; Eng, Christine M.; Gibbs, Richard A.; Parsons, D. Williams; Plon, Sharon E.

    2016-01-01

    Background: Some anticipate that physicians and parents will be ill-prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. Procedure: As part of the Baylor Advancing Sequencing in Childhood Cancer Care (BASIC3) study, we conducted semi-structured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision-making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Results: Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about the impact on parents. For parents, there is an urgency to protect their child's health and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Conclusion: Our data do not support concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, i.e., leave physicians or parents ill-prepared or unprepared to make responsible decisions about patient care. PMID:26505993

  15. Impact of Next Generation Sequencing Techniques in Food Microbiology

    Science.gov (United States)

    Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

    2014-01-01

    Taking Maxam-Gilbert and Sanger sequencing as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. It should also improve their safety. PMID:25132799

  16. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    Directory of Open Access Journals (Sweden)

    Ceiridwen J Edwards

    Full Text Available BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738+/-68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously

  17. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  18. Comparative analysis of Lactobacillus plantarum WCFS1 transcriptomes using DNA microarray and next generation sequencing technologies

    NARCIS (Netherlands)

    Leimena, M.M.; Wels, M.; Bongers, R.; Smid, E.J.; Zoetendal, E.G.; Kleerebezem, M.

    2012-01-01

    RNA sequencing is starting to compete with the use of DNA microarrays for transcription analysis in eukaryotes as well as in prokaryotes. Application of RNA sequencing in prokaryotes requires additional steps in the RNA preparation procedure to increase the relative abundance of mRNA and cannot

  19. Development of a low bias method for characterizing viral populations using next generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    Stephanie M Willerth

    Full Text Available BACKGROUND: With an estimated 38 million people worldwide currently infected with human immunodeficiency virus (HIV), and an additional 4.1 million people becoming infected each year, it is important to understand how this virus mutates and develops resistance in order to design successful therapies. METHODOLOGY/PRINCIPAL FINDINGS: We report a novel experimental method for amplifying full-length HIV genomes without the use of sequence-specific primers for high throughput DNA sequencing, followed by assembly of full length viral genome sequences from the resulting large dataset. Illumina was chosen for sequencing due to its ability to provide greater coverage of the HIV genome compared to prior methods, allowing for more comprehensive characterization of the heterogeneity present in the HIV samples analyzed. Our novel amplification method in combination with Illumina sequencing was used to analyze two HIV populations: a homogenous HIV population based on the canonical NL4-3 strain and a heterogeneous viral population obtained from an HIV patient's infected T cells. In addition, the resulting sequence was analyzed using a new computational approach to obtain a consensus sequence and several metrics of diversity. SIGNIFICANCE: This study demonstrates how a lower bias amplification method in combination with next generation DNA sequencing provides in-depth, complete coverage of the HIV genome, enabling a stronger characterization of the quasispecies present in a clinically relevant HIV population as well as future study of how HIV mutates in response to a selective pressure.

  20. Complete Genome Sequence of Rehmannia Mosaic Virus Infecting Rehmannia glutinosa in South Korea.

    Science.gov (United States)

    Lim, Seungmo; Zhao, Fumei; Yoo, Ran Hee; Igori, Davaajargal; Jeong, Jae Cheol; Lee, Haeng-Soon; Kwak, Sang-Soo; Moon, Jae Sun

    2016-01-28

    The complete genome sequence of a South Korean isolate of Rehmannia mosaic virus (ReMV) infecting Rehmannia glutinosa was determined through next-generation sequencing and Sanger sequencing. To our knowledge, this is the first report of a natural infection of R. glutinosa by ReMV in South Korea.

  1. Complete Genome Sequence of a Tomato Isolate of Parietaria Mottle Virus from Italy.

    Science.gov (United States)

    Martínez, Carolina; Aramburu, José; Rubio, Luis; Galipienso, Luis

    2015-12-17

    We report here the complete genome sequence of isolate T32 of parietaria mottle virus (PMoV) infecting tomato plants in Turin, Italy, obtained by Sanger sequencing. T32 shares 90.48 to 96.69% nucleotide identity with two other PMoV isolates, CR8 and Pe1, respectively, whose complete genome sequences are available.

  2. Haplotyping and copy number estimation of the highly polymorphic human beta-defensin locus on 8p23 by 454 amplicon sequencing

    Directory of Open Access Journals (Sweden)

    Rosenstiel Philip

    2010-04-01

    Full Text Available Abstract Background: The beta-defensin gene cluster (DEFB) at chromosome 8p23.1 is one of the most copy number (CN) variable regions of the human genome. Whereas individual DEFB CNs have been suggested as independent genetic risk factors for several diseases (e.g. psoriasis and Crohn's disease), the role of multisite sequence variations (MSV) is less well understood and to date has only been reported for prostate cancer. Simultaneous assessment of MSVs and CNs can be achieved by PCR, cloning and Sanger sequencing; however, these methods are labour and cost intensive as well as prone to methodological bias introduced by bacterial cloning. Here, we demonstrate that amplicon sequencing of pooled individual PCR products by the 454 technology allows in-depth determination of MSV haplotypes and estimation of DEFB CNs in parallel. Results: Six PCR products spread over ~87 kb of DEFB and harbouring 24 known MSVs were amplified from 11 DNA samples, pooled and sequenced on a Roche 454 GS FLX sequencer. From ~142,000 reads, ~120,000 haplotype calls (HC) were inferred that identified 22 haplotypes ranging from 2 to 7 per amplicon. In addition to the 24 known MSVs, two additional sequence variations were detected. Minimal CNs were estimated from the ratio of HCs and compared to absolute CNs determined by alternative methods. Concordance in CNs was found for 7 samples, the CNs differed by one in 2 samples and the estimated minimal CN was half of the absolute in one sample. For 7 samples and 2 amplicons, the 454 haplotyping results were compared to those by cloning/Sanger sequencing. Intrinsic problems related to chimera formation during PCR and differences between haplotyping by 454 and cloning/Sanger sequencing are discussed. Conclusion: Deep amplicon sequencing using the 454 technology yields thousands of HCs per amplicon for an affordable price and may represent an effective method for parallel haplotyping and CN estimation in small to medium-sized cohorts. The

  3. Identification of peptide sequences that selectively bind to pentaerythritol trinitrate hemisuccinate-a surrogate of PETN, via phage display technology.

    Science.gov (United States)

    Kubas, George; Rees, William; Caguiat, Jonathan; Asch, David; Fagan, Diana; Cortes, Pedro

    2017-03-01

    The present research investigates the identification of amino acid sequences that selectively bind to a pentaerythritol tetranitrate (PETN) explosive surrogate. Through the use of a phage display technique and enzyme-linked immunosorbent assays (ELISA), a peptide library was tested against pentaerythritol trinitrate hemisuccinate (PETNH), a surrogate of PETN, to screen for those with amino acids having affinity toward the explosive. The results suggest that the library contains peptides selective to PETNH. Following three rounds of panning, clones were picked and tested for specificity toward PETNH. ELISA results from these samples show that each phage clone has some level of selectivity for binding to PETNH. The peptides from these clones have been sequenced and shown to contain certain common amino acid segments among them. This work represents a technological platform for identifying amino-acid sequences selective toward any bio-chem analyte of interest.

  4. Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

    Science.gov (United States)

    Caruccio, Nicholas

    2011-01-01

    DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.

  5. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Full Text Available Abstract Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

  6. De novo 454 sequencing of barcoded BAC pools for comprehensive gene survey and genome analysis in the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2009-11-01

    Full Text Available Abstract Background: De novo sequencing the entire genome of a large complex plant genome like the one of barley (Hordeum vulgare L.) is a major challenge both in terms of experimental feasibility and costs. The emergence and breathtaking progress of next generation sequencing technologies has put this goal into focus and a clone based strategy combined with the 454/Roche technology is conceivable. Results: To test the feasibility, we sequenced 91 barcoded, pooled, gene containing barley BACs using the GS FLX platform and assembled the sequences under iterative change of parameters. The BAC assemblies were characterized by N50 of ~50 kb (N80 ~31 kb, N90 ~21 kb) and a Q40 of 94%. For ~80% of the clones, the best assemblies consisted of less than 10 contigs at 24-fold mean sequence coverage. Moreover we show that gene containing regions seem to assemble completely and uninterrupted thus making the approach suitable for detecting complete and positionally anchored genes. By comparing the assemblies of four clones to their complete reference sequences generated by the Sanger method, we evaluated the distribution, quality and representativeness of the 454 sequences as well as the consistency and reliability of the assemblies. Conclusion: The described multiplex 454 sequencing of barcoded BACs leads to sequence consensi highly representative for the clones. Assemblies are correct for the majority of contigs. Though the resolution of complex repetitive structures requires additional experimental efforts, our approach paves the way for a clone based strategy of sequencing the barley genome.
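
    The N50, N80 and N90 figures in the record above are assembly contiguity statistics. A minimal sketch of how such values are commonly computed is given below; the function name and the contig lengths are hypothetical and are not taken from the barley BAC assemblies.

```python
def nxx(contig_lengths, percent=50):
    """Return the Nxx statistic: the length L such that contigs of length >= L
    together contain at least `percent` percent of all assembled bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Made-up contig lengths (in kb) for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(nxx(example_contigs, 50), nxx(example_contigs, 80), nxx(example_contigs, 90))
```

    Because N80 and N90 require more of the assembly to be covered, they can never exceed N50, which matches the ordering of the values reported in the abstract.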

  7. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Abstract Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic regions. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype, Niederzenz (Nd-0), was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species, whose genomic sequences have been assembled.
This technology will particularly facilitate the development of high density molecular marker maps, essential for
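
    The record above describes comparing two ecotype sequences for SNPs that create or destroy a restriction site, so that digested amplicons give different fragment sizes on a gel. The sketch below illustrates that idea only. The two short sequences are invented, the function names are hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used purely as a familiar example rather than as an enzyme reported by the study.

```python
def site_positions(sequence, site):
    """Return the 0-based start positions of every occurrence of `site`."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_lengths(sequence, site, cut_offset):
    """Fragment sizes after complete digestion of a linear amplicon.
    `cut_offset` is the cut position relative to the start of the site."""
    cuts = [pos + cut_offset for pos in site_positions(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented amplicons differing by one SNP (A -> G) inside the GAATTC site.
amplicon_a = "ACGTGAATTCTTGACCA"
amplicon_b = "ACGTGAGTTCTTGACCA"

print(fragment_lengths(amplicon_a, "GAATTC", cut_offset=1))  # site present
print(fragment_lengths(amplicon_b, "GAATTC", cut_offset=1))  # site destroyed
```

    In this toy case one amplicon is cut into two fragments while the other stays intact, which is the kind of size difference that a gel can resolve.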

  8. Illumina Production Sequencing at the DOE Joint Genome Institute - Workflow and Optimizations

    Energy Technology Data Exchange (ETDEWEB)

    Tarver, Angela; Fern, Alison; Diego, Matthew San; Kennedy, Megan; Zane, Matthew; Daum, Christopher; Hack, Christopher; Tang, Eric; Deshpande, Shweta; Cheng, Jan-Fang; Roberts, Simon; Alexandre, Melanie; Harmon-Smith, Miranda; Lucas, Susan

    2010-06-18

    The U.S. Department of Energy (DOE) Joint Genome Institute's (JGI) Production Sequencing group is committed to the generation of high-quality genomic DNA sequence to support the DOE mission areas of renewable energy generation, global carbon management, and environmental characterization and clean-up. Within the JGI's Production Sequencing group, the Illumina Genome Analyzer pipeline has been established as one of three sequencing platforms, along with Roche/454 and ABI/Sanger. Optimization of the Illumina pipeline has been ongoing with the aim of continual process improvement of the laboratory workflow. These process improvement projects are being led by the JGI's Process Optimization, Sequencing Technologies, Instrumentation & Engineering, and the New Technology Production groups. Primary focus has been on improving the procedural ergonomics and the technicians' operating environment, reducing manually intensive technician operations with different tools, reducing associated production costs, and improving the overall process and generated sequence quality. The U.S. DOE JGI was established in 1997 in Walnut Creek, CA, to unite the expertise and resources of five national laboratories (Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest) along with HudsonAlpha Institute for Biotechnology. JGI is operated by the University of California for the U.S. DOE.

  9. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    Full Text Available The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.
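
    The gentle trimming thresholds mentioned in the record above (Phred score < 2 or < 5) can be illustrated with a simple 3'-end trimmer. This is a sketch under stated assumptions, not the procedure used in the study: it assumes Phred+33 encoded quality strings, removes only trailing low-quality bases, and uses a hypothetical function name and an invented read. Dedicated trimming software typically applies more elaborate rules.

```python
def trim_3prime(sequence, quality, min_phred=5, offset=33):
    """Remove trailing bases whose Phred score is below `min_phred`.
    `quality` is assumed to be a Phred+33 encoded string of equal length."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_phred:
        end -= 1
    return sequence[:end], quality[:end]


# Invented read: 'I' encodes Phred 40; '#', '$', '%' encode Phred 2, 3, 4.
read_seq = "ACGTACGT"
read_qual = "IIIII#$%"

print(trim_3prime(read_seq, read_qual, min_phred=5))
print(trim_3prime(read_seq, read_qual, min_phred=2))
```

    With the threshold at 5 the three low-quality trailing bases are removed, whereas a threshold of 2 leaves this particular read untouched, which shows how strongly the cutoff affects how much data is retained.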

  10. SSR-Patchwork: An Optimized Protocol to Obtain a Rapid and Inexpensive SSR Library Using First-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Antonietta Di Maio

    2013-01-01

    Full Text Available Premise of the study: We have optimized a version of a microsatellite loci isolation protocol for first-generation sequencing (FGS) technologies. The protocol is optimized to reduce the cost and number of steps, and it combines some procedures from previous simple sequence repeat (SSR) protocols with several key improvements that significantly affect the final yield of the SSR library. This protocol may be accessible for laboratories with a moderate budget or for which next-generation sequencing (NGS) is not readily available. Methods and Results: We drew from classic protocols for library enrichment by digestion, ligation, amplification, hybridization, cloning, and sequencing. Three different systems were chosen: two with very different genome sizes (Galdieria sulphuraria, 10 Mbp; Pancratium maritimum, 30000 Mbp), and a third with an undetermined genome size (Kochia saxicola). Moreover, we also report the optimization of the sequencing reagents. A good frequency of the obtained microsatellite loci was achieved. Conclusions: The method presented here is very detailed; comparative tests with other SSR protocols are also reported. This optimized protocol is a promising tool for low-cost genetic studies and the rapid, simple construction of homemade SSR libraries for small and large genomes.

  11. SSR-patchwork: An optimized protocol to obtain a rapid and inexpensive SSR library using first-generation sequencing technology.

    Science.gov (United States)

    Di Maio, Antonietta; De Castro, Olga

    2013-01-01

    We have optimized a version of a microsatellite loci isolation protocol for first-generation sequencing (FGS) technologies. The protocol is optimized to reduce the cost and number of steps, and it combines some procedures from previous simple sequence repeat (SSR) protocols with several key improvements that significantly affect the final yield of the SSR library. This protocol may be accessible for laboratories with a moderate budget or for which next-generation sequencing (NGS) is not readily available. • We drew from classic protocols for library enrichment by digestion, ligation, amplification, hybridization, cloning, and sequencing. Three different systems were chosen: two with very different genome sizes (Galdieria sulphuraria, 10 Mbp; Pancratium maritimum, 30 000 Mbp), and a third with an undetermined genome size (Kochia saxicola). Moreover, we also report the optimization of the sequencing reagents. A good frequency of the obtained microsatellite loci was achieved. • The method presented here is very detailed; comparative tests with other SSR protocols are also reported. This optimized protocol is a promising tool for low-cost genetic studies and the rapid, simple construction of homemade SSR libraries for small and large genomes.

  12. The next generation of target capture technologies - large DNA fragment enrichment and sequencing determines regional genomic variation of high complexity.

    Science.gov (United States)

    Dapprich, Johannes; Ferriola, Deborah; Mackiewicz, Kate; Clark, Peter M; Rappaport, Eric; D'Arcy, Monica; Sasson, Ariella; Gai, Xiaowu; Schug, Jonathan; Kaestner, Klaus H; Monos, Dimitri

    2016-07-09

    The ability to capture and sequence large contiguous DNA fragments represents a significant advancement towards the comprehensive characterization of complex genomic regions. While emerging sequencing platforms are capable of producing several kilobases-long reads, the fragment sizes generated by current DNA target enrichment technologies remain a limiting factor, producing DNA fragments generally shorter than 1 kbp. The DNA enrichment methodology described herein, Region-Specific Extraction (RSE), produces DNA segments in excess of 20 kbp in length. Coupling this enrichment method to appropriate sequencing platforms will significantly enhance the ability to generate complete and accurate sequence characterization of any genomic region without the need for reference-based assembly. RSE is a long-range DNA target capture methodology that relies on the specific hybridization of short (20-25 base) oligonucleotide primers to selected sequence motifs within the DNA target region. These capture primers are then enzymatically extended on the 3'-end, incorporating biotinylated nucleotides into the DNA. Streptavidin-coated beads are subsequently used to pull-down the original, long DNA template molecules via the newly synthesized, biotinylated DNA that is bound to them. We demonstrate the accuracy, simplicity and utility of the RSE method by capturing and sequencing a 4 Mbp stretch of the major histocompatibility complex (MHC). Our results show an average depth of coverage of 164X for the entire MHC. This depth of coverage contributes significantly to a 99.94 % total coverage of the targeted region and to an accuracy that is over 99.99 %. RSE represents a cost-effective target enrichment method capable of producing sequencing templates in excess of 20 kbp in length. The utility of our method has been proven to generate superior coverage across the MHC as compared to other commercially available methodologies, with the added advantage of producing longer sequencing

  13. Investigation of the fungal community structures of imported wheat using high-throughput sequencing technology

    Science.gov (United States)

    Wang, Ying; Zhang, Guiming; Gao, Ruifang; Xiang, Caiyu; Feng, Jianjun; Lou, Dingfeng; Liu, Ying

    2017-01-01

    This study introduced the application of high-throughput sequencing techniques to the investigation of microbial diversity in the field of plant quarantine. It examined the microbial diversity of wheat imported into China, and established a bioinformatics database of wheat pathogens based on high-throughput sequencing results. This study analyzed the nuclear ribosomal internal transcribed spacer (ITS) region of fungi through Illumina MiSeq sequencing to investigate the fungal communities of both seeds and sieve-through. A total of 758,129 fungal ITS sequences were obtained from ten samples collected from five batches of wheat imported from the USA. These sequences were classified into 2 different phyla, 15 classes, 33 orders, 41 families, or 78 genera, suggesting a high fungal diversity across samples. A pairwise analysis revealed that the diversity of the fungal community in the sieve-through is significantly higher than that in the seeds. Taxonomic analysis showed that at the class level, Dothideomycetes dominated in the seeds and Sordariomycetes dominated in the sieve-through. In all, this study revealed the fungal community composition in the seeds and sieve-through of the wheat, and identified key differences in the fungal community between the seeds and sieve-through. PMID:28241020

  14. A Solution for Establishing the Information Technology Service Management Processes Implementation Sequence

    Science.gov (United States)

    Arcilla, Magdalena; Calvo-Manzano, Jose; Cuevas, Gonzalo; Gómez, Gerzon; Ruiz, Elena; San Feliu, Tomás

    This paper addresses the implementation sequence of the Service Management processes defined in ITIL v2 from a topological perspective. Graph theory is used to represent the existing dependencies among the ITIL v2 processes in order to find clusters of strongly connected processes. These clusters help determine the implementation priority of the service management processes. To this end, OPreSSD (Organizational Procedure for Service Support and Service Delivery) is proposed to identify the implementation sequence of the processes in the Service Support (SS) and Service Delivery (SD) areas.
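    The idea of finding clusters of strongly connected processes in a dependency graph can be illustrated with a minimal sketch. It uses Kosaraju's algorithm, which is one standard way to compute strongly connected components and is not necessarily the method used in the paper. The process names are ITIL v2 process areas, but the dependency edges in `deps` are invented purely for illustration.

```python
from collections import defaultdict


def strongly_connected_components(graph):
    """Kosaraju's algorithm; graph maps a node to the nodes it points to."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # Pass 1: record nodes in order of depth-first completion.
    visited, order = set(), []

    def visit(node):
        visited.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                visit(nxt)
        order.append(node)

    for node in sorted(nodes):
        if node not in visited:
            visit(node)

    # Pass 2: traverse the reversed graph in decreasing completion order.
    reverse = defaultdict(list)
    for node, successors in graph.items():
        for nxt in successors:
            reverse[nxt].append(node)

    assigned, components = set(), []

    def collect(node, component):
        assigned.add(node)
        component.append(node)
        for prev in reverse[node]:
            if prev not in assigned:
                collect(prev, component)

    for node in reversed(order):
        if node not in assigned:
            component = []
            collect(node, component)
            components.append(sorted(component))
    return components


# Hypothetical dependencies: "incident" -> "problem" means that incident
# management is assumed to depend on problem management.
deps = {
    "incident": ["problem"],
    "problem": ["change"],
    "change": ["incident", "configuration"],
    "configuration": [],
    "release": ["change"],
}

clusters = strongly_connected_components(deps)
for cluster in clusters:
    print(cluster)
```

    In this toy graph, incident, problem and change management form one cluster because each can reach the others, so they would be candidates for joint implementation; configuration and release management each form a cluster of their own.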

  15. Improved performance of the PacBio SMRT technology for 16S rDNA sequencing.

    Science.gov (United States)

    Mosher, Jennifer J; Bowman, Brett; Bernberg, Erin L; Shevchenko, Olga; Kan, Jinjun; Korlach, Jonas; Kaplan, Louis A

    2014-09-01

    Improved sequencing accuracy was obtained with 16S amplicons from environmental samples and a known pure culture when upgraded Pacific Biosciences (PacBio) hardware and enzymes were used for the single molecule, real-time (SMRT) sequencing platform. The new PacBio RS II system with P4/C2 chemistry, when used with previously constructed libraries (Mosher et al., 2013), surpassed the accuracy of the Roche/454 pyrosequencing platform. With accurate read lengths of >1400 base pairs, the PacBio system opens up the possibility of identifying microorganisms to the species level in environmental samples. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera).

    Directory of Open Access Journals (Sweden)

    Hui Wang

    Full Text Available We sequenced small (s) RNAs from field collected honeybees (Apis mellifera) and bumblebees (Bombus pascuorum) using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroa destructor virus-1 (VDV1) and Deformed wing virus (DWV) genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt) in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences) and within-population (dataset of this study) levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi>10%) were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.

  17. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera).

    Science.gov (United States)

    Wang, Hui; Xie, Jiazheng; Shreeve, Tim G; Ma, Jinmin; Pallett, Denise W; King, Linda A; Possee, Robert D

    2013-01-01

    We sequenced small (s) RNAs from field collected honeybees (Apis mellifera) and bumblebees (Bombus pascuorum) using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroa destructor virus-1 (VDV1) and Deformed wing virus (DWV) genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt) in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences) and within-population (dataset of this study) levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi>10%) were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.

  18. KRAS, BRAF, and TP53 deep sequencing for colorectal carcinoma patient diagnostics.

    Science.gov (United States)

    Rechsteiner, Markus; von Teichman, Adriana; Rüschoff, Jan H; Fankhauser, Niklaus; Pestalozzi, Bernhard; Schraml, Peter; Weber, Achim; Wild, Peter; Zimmermann, Dieter; Moch, Holger

    2013-05-01

    In colorectal carcinoma, KRAS (alias Ki-ras) and BRAF mutations have emerged as predictors of resistance to anti-epidermal growth factor receptor antibody treatment and worse patient outcome, respectively. In this study, we aimed to establish a high-throughput deep sequencing workflow according to 454 pyrosequencing technology to cope with the increasing demand for sequence information at medical institutions. A cohort of 81 patients with known KRAS mutation status detected by Sanger sequencing was chosen for deep sequencing. The workflow allowed us to analyze seven amplicons (one BRAF, two KRAS, and four TP53 exons) of nine patients in parallel in one deep sequencing run. Target amplification and variant calling showed reproducible results with input DNA derived from FFPE tissue that ranged from 0.4 to 50 ng with the use of different targets and multiplex identifiers. Equimolar pooling of each amplicon in a deep sequencing run was necessary to counterbalance differences in patient tissue quality. Five BRAF and 49 TP53 mutations with functional consequences were detected. The lowest mutation frequency detected in a patient tumor population was 5% in TP53 exon 5. This low-frequency mutation was successfully verified in a second PCR and deep sequencing run. In summary, our workflow allows us to process 315 targets a week and provides the quality, flexibility, and speed needed to be integrated as standard procedure for mutational analysis in diagnostics.

  19. Comparison of Next-Generation Sequencing Technologies for Comprehensive Assessment of Full-Length Hepatitis C Viral Genomes

    Science.gov (United States)

    Thomson, Emma; Ip, Camilla L. C.; Badhan, Anjna; Christiansen, Mette T.; Adamson, Walt; Ansari, M. Azim; Breuer, Judith; Brown, Anthony; Bowden, Rory; Bonsall, David; Da Silva Filipe, Ana; Hinds, Chris; Hudson, Emma; Klenerman, Paul; Lythgow, Kieren; Mbisa, Jean L.; McLauchlan, John; Myers, Richard; Piazza, Paolo; Roy, Sunando; Trebes, Amy; Sreenu, Vattipally B.; Witteveldt, Jeroen; Simmonds, Peter

    2016-01-01

    Affordable next-generation sequencing (NGS) technologies for hepatitis C virus (HCV) may potentially identify both viral genotype and resistance genetic motifs in the era of directly acting antiviral (DAA) therapies. This study compared the ability of high-throughput NGS methods to generate full-length, deep, HCV sequence data sets and evaluated their utility for diagnostics and clinical assessment. NGS methods using (i) unselected HCV RNA (metagenomics), (ii) preenrichment of HCV RNA by probe capture, and (iii) HCV preamplification by PCR implemented in four United Kingdom centers were compared. Metrics of sequence coverage and depth, quasispecies diversity, and detection of DAA resistance-associated variants (RAVs), mixed HCV genotypes, and other coinfections were compared using a panel of samples with different viral loads, genotypes, and mixed HCV genotypes/subtypes [geno(sub)types]. Each NGS method generated near-complete genome sequences from more than 90% of samples. Enrichment methods and PCR preamplification generated greater sequence depth and were more effective for samples with low viral loads. All NGS methodologies accurately identified mixed HCV genotype infections. Consensus sequences generated by different NGS methods were generally concordant, and majority RAVs were consistently detected. However, methods differed in their ability to detect minor populations of RAVs. Metagenomic methods identified human pegivirus coinfections. NGS provided a rapid, inexpensive method for generating whole HCV genomes to define infecting genotypes, RAVs, comprehensive viral strain analysis, and quasispecies diversity. Enrichment methods are particularly suited for high-throughput analysis while providing the genotype and information on potential DAA resistance. PMID:27385709

  20. Scenario drafting for early technology assessment of next generation sequencing in clinical oncology

    NARCIS (Netherlands)

    Joosten, S.E.P.; Retel, V.P.; Coupé, V.M.H.; Heuvel, van den M.M.; Harten, van W.H.

    2016-01-01

    Background Next Generation Sequencing (NGS) is expected to lift molecular diagnostics in clinical oncology to the next level. It enables simultaneous identification of mutations in a patient tumor, after which targeted therapy may be assigned. This approach could improve patient survival and/or assi

  1. H-TArget model: Early technology assessment for next generation sequencing in oncology

    NARCIS (Netherlands)

    Retel, Valesca P.; Joore, Manuela A.; Ramaekers, Bram; Heuvel, van den Michel M.; Heijden, van der Michiel Simon; Harten, van W.H.

    2015-01-01

    Background: Next Generation Sequencing (NGS) promises to find mutations (targets) in individual cancer patients, to subsequently prescribe targeted therapy. Currently, NGS is in development, the effects on choice of therapy and prognosis are still unclear, and the costs for targeted therapies are hi

  2. Exome sequencing and genetic testing for MODY.

    Directory of Open Access Journals (Sweden)

    Stefan Johansson

    Full Text Available CONTEXT: Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive. OBJECTIVE: The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results. RESEARCH DESIGN AND METHODS: We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism. RESULTS: On average, we obtained 45X median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0-4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively); thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes. CONCLUSION: Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.
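    The filtering step described in the methods (drop common, non-coding and synonymous variants, then restrict to a candidate gene panel) can be sketched roughly as follows. This is only an illustration: the `Variant` record, its field names, the 1% frequency cutoff and the consequence labels are assumptions, since the abstract gives neither a cutoff nor the full 111-gene list. The three genes in `CANDIDATE_GENES` are taken from the results and stand in for the full panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str    # hypothetical labels: "missense", "nonsense", "synonymous", "intronic"
    allele_freq: float  # population allele frequency; 0.0 if never observed


# Assumed values, not taken from the study.
MAX_ALLELE_FREQ = 0.01
KEPT_CONSEQUENCES = {"missense", "nonsense"}
CANDIDATE_GENES = {"ABCC8", "HNF4A", "PPARG"}  # stand-in for the 111-gene panel


def filter_variants(variants, genes=CANDIDATE_GENES):
    """Keep rare, protein-altering variants that fall in the candidate genes."""
    return [
        v for v in variants
        if v.allele_freq <= MAX_ALLELE_FREQ      # exclude common variants
        and v.consequence in KEPT_CONSEQUENCES   # exclude non-coding and synonymous
        and v.gene in genes                      # restrict to the gene panel
    ]


# Made-up input records for demonstration only.
calls = [
    Variant("ABCC8", "missense", 0.0),
    Variant("ABCC8", "synonymous", 0.0),
    Variant("HNF4A", "nonsense", 0.002),
    Variant("PPARG", "missense", 0.15),
    Variant("TTN", "missense", 0.0),
]

for variant in filter_variants(calls):
    print(variant.gene, variant.consequence)
```

    Of the five made-up records, only the rare ABCC8 missense and HNF4A nonsense variants pass all three filters.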

  3. Technological sequence of creating components of the training system of the future officers to the management of physical training

    Directory of Open Access Journals (Sweden)

    Olkhovy O.M.

    2012-09-01

    Full Text Available The goal is to determine a constructive sequence for building the components of a system that trains future officers to manage physical training over the course of their military careers. Through the analysis of more than thirty documentary and scientific sources, the study determined a structural-logical scheme that links the stages of the optimal management cycle with the technological sequence for constructing the components of this training system. The scheme provides for: defining the requirements for typical professional tasks concerning the leadership, organization and conduct of physical training; creating a phased model of cadet training; developing the curriculum for the discipline "Physical education, special physical training and sport"; and creating a model and defining criteria for the integral evaluation of future officers' readiness to manage physical training.

  4. Effective Optimization of Antibody Affinity by Phage Display Integrated with High-Throughput DNA Synthesis and Sequencing Technologies.

    Directory of Open Access Journals (Sweden)

    Dongmei Hu

    Full Text Available Phage display technology has been widely used for antibody affinity maturation for decades. The limited library sequence diversity together with excessive redundancy and labour-consuming procedure for candidate identification are two major obstacles to widespread adoption of this technology. We hereby describe a novel library generation and screening approach to address the problems. The approach started with the targeted diversification of multiple complementarity determining regions (CDRs) of a humanized anti-ErbB2 antibody, HuA21, with a small perturbation mutagenesis strategy. A combination of three degenerate codons, NWG, NWC, and NSG, were chosen for amino acid saturation mutagenesis without introducing cysteine and stop residues. In total, 7,749 degenerate oligonucleotides were synthesized on two microchips and released to construct five single-chain antibody fragment (scFv) gene libraries with 4 x 10^6 DNA sequences. Deep sequencing of the unselected and selected phage libraries using the Illumina platform allowed for an in-depth evaluation of the enrichment landscapes in CDR sequences and amino acid substitutions. Potent candidates were identified according to their high frequencies using NGS analysis, by-passing the need for the primary screening of target-binding clones. Furthermore, a subsequent library by recombination of the 10 most abundant variants from four CDRs was constructed and screened, and a mutant with 158-fold increased affinity (Kd = 25.5 pM) was obtained. These results suggest the potential application of the developed methodology for optimizing the binding properties of other antibodies and biomolecules.
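    To see which residues the three degenerate codons NWG, NWC and NSG can encode, one can expand them with the IUPAC ambiguity codes (N = A/C/G/T, W = A/T, S = C/G) and translate each concrete codon with the standard genetic code. The sketch below does only that; the function names are illustrative and are not taken from the paper. Under the standard code, the NWG set expands to include TAG (the amber codon). The abstract does not say how that codon was handled, so the sketch simply reports it.

```python
from itertools import product

# IUPAC ambiguity codes needed for NWG, NWC and NSG.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "S": "CG", "N": "ACGT"}

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): residue
    for triplet, residue in zip(product(BASES, repeat=3), AMINO)
}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in degenerate_codon))]


def encoded_residues(degenerate_codons):
    """Map each concrete codon to the residue it encodes."""
    return {
        codon: CODON_TABLE[codon]
        for degenerate in degenerate_codons
        for codon in expand(degenerate)
    }


table = encoded_residues(["NWG", "NWC", "NSG"])
residues = set(table.values())
print("amino acids:", "".join(sorted(residues - {"*"})))
print("cysteine encoded:", "C" in residues)
print("stop codons:", [codon for codon, aa in table.items() if aa == "*"])
```

    The three sets expand to 24 distinct codons covering 19 amino acids, with cysteine absent.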

  5. De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping.

    Science.gov (United States)

    Olsen, Remi-Andre; Bunikis, Ignas; Tiukova, Ievgeniia; Holmberg, Kicki; Lötstedt, Britta; Pettersson, Olga Vinnere; Passoth, Volkmar; Käller, Max; Vezzi, Francesco

    2015-01-01

    It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data.

  6. Complete Genome Sequence of the Unclassified Iron-Oxidizing, Chemolithoautotrophic Burkholderiales Bacterium GJ-E10, Isolated from an Acidic River.

    Science.gov (United States)

    Fukushima, Jun; Tojo, Fuyumi; Asano, Ryoki; Kobayashi, Yayoi; Shimura, Yoichiro; Okano, Kunihiro; Miyata, Naoyuki

    2015-02-05

    Burkholderiales bacterium GJ-E10, isolated from the Tamagawa River in Akita Prefecture, Japan, is an unclassified, iron-oxidizing chemolithoautotrophic bacterium. Its single circular genome, consisting of 3,276,549 bp, was sequenced by using three types of next-generation sequencers and the sequences were then confirmed by PCR-based Sanger sequencing.

  7. GPU technology as a platform for accelerating local complexity analysis of protein sequences.

    Science.gov (United States)

    Papadopoulos, Agathoklis; Kirmitzoglou, Ioannis; Promponas, Vasilis J; Theocharides, Theocharis

    2013-01-01

    The use of the GPGPU programming paradigm (running CUDA-enabled algorithms on GPU cards) in bioinformatics has shown promising results [1]. A similar approach can therefore be used to speed up other algorithms such as CAST, a popular tool used for masking low-complexity regions (LCRs) in protein sequences [2] with increased sensitivity. We developed and implemented a CUDA-enabled version (GPU_CAST) of the multi-threaded version of the CAST software first presented in [3] and optimized in [4]. The proposed software implementation uses the NVIDIA CUDA libraries and the GPGPU programming paradigm to take advantage of the inherent parallel characteristics of the CAST algorithm to execute the calculations on the GPU card of the host computer system. The GPU-based implementation presented in this work was compared against the multi-threaded, multi-core optimized version of CAST [4] and yielded speedups of 5x-10x for large protein sequence datasets.
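    As a rough illustration of what masking low-complexity regions means, the sketch below scores sliding windows by Shannon entropy and masks windows that fall under a threshold. This is not the CAST algorithm and not the GPU implementation described above; it is a simpler, swapped-in technique chosen for brevity, and the window size and threshold are arbitrary assumptions. It does share one relevant property: each window is scored independently, which is the kind of work that parallelizes well.

```python
import math
from collections import Counter


def window_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def low_complexity_mask(seq, window=12, threshold=2.2):
    """Replace residues in every low-entropy window with 'X'.

    window and threshold are illustrative defaults, not tuned values.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# A made-up sequence with a glutamine-rich stretch in the middle.
example = "MKVLAEDTRS" + "Q" * 14 + "GHWPNCFIYT"
print(low_complexity_mask(example))
```

    Sequences shorter than the window are returned unchanged, because no window can be scored.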

  8. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers

    DEFF Research Database (Denmark)

    Varshney, Rajeev K.; Chen, Wenbin; Li, Yupeng;

    2012-01-01

    Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences...

  9. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Background: Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results: We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion: The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  10. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data have remained unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies...

  11. Complete Genome Sequence of Streptomyces venezuelae ATCC 15439, Producer of the Methymycin/Pikromycin Family of Macrolide Antibiotics, Using PacBio Technology.

    Science.gov (United States)

    He, Jingxuan; Sundararajan, Anitha; Devitt, Nicholas P; Schilkey, Faye D; Ramaraj, Thiruvarangan; Melançon, Charles E

    2016-05-05

    Here, we report the complete genome sequence of Streptomyces venezuelae ATCC 15439, a producer of the methymycin/pikromycin family of macrolide antibiotics and a model host for natural product studies, obtained exclusively using PacBio sequencing technology. The 9.03-Mbp genome harbors 8,775 genes and 11 polyketide and nonribosomal peptide natural product gene clusters.

  12. Advances in genetics and molecular breeding of three legume crops of semi-arid tropics using next-generation sequencing and high-throughput genotyping technologies

    Indian Academy of Sciences (India)

    Rajeev K Varshney; Himabindu Kudapa; Manish Roorkiwal; Mahendar Thudi; Manish K Pandey; Rachit K Saxena; Siva K Chamarthi; Murali Mohan S; Nalini Mallikarjuna; Hari Upadhyaya; Pooran M Gaur; L Krishnamurthy; K B Saxena; Shyam N Nigam; Suresh Pande

    2012-11-01

    Molecular markers are the most powerful genomic tools to increase the efficiency and precision of breeding practices for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics (SAT), namely, chickpea (Cicer arietinum), pigeonpea (Cajanus cajan) and groundnut (Arachis hypogaea), as compared to other crop species like cereals, has been very slow. With the advances in next-generation sequencing (NGS) and high-throughput (HTP) genotyping methods, there is a shift in development of genomic resources including molecular markers in these crops. For instance, 2,000 to 3,000 novel simple sequence repeat (SSR) markers have been developed each for chickpea, pigeonpea and groundnut. Based on Sanger, 454/FLX and Illumina transcript reads, transcriptome assemblies have been developed for chickpea (44,845 transcript assembly contigs, or TACs) and pigeonpea (21,434 TACs). Illumina sequencing of some parental genotypes of mapping populations has resulted in the development of 120 million reads for chickpea and 128.9 million reads for pigeonpea. Alignment of these Illumina reads with the respective transcriptome assemblies has provided > 10,000 SNPs each in chickpea and pigeonpea. A variety of SNP genotyping platforms including GoldenGate, VeraCode and Competitive Allele Specific PCR (KASPar) assays have been developed in chickpea and pigeonpea. Using the above resources, the first-generation or comprehensive genetic maps have been developed in the three legume species mentioned above. Analysis of phenotyping data together with genotyping data has provided candidate markers for drought-tolerance-related root traits in chickpea, resistance to foliar diseases in groundnut and sterility mosaic disease (SMD) and fertility restoration in pigeonpea. Using these trait-associated markers along with those already available, molecular breeding programmes have been initiated for enhancing drought tolerance, resistance to

  13. Advances in genetics and molecular breeding of three legume crops of semi-arid tropics using next-generation sequencing and high-throughput genotyping technologies.

    Science.gov (United States)

    Varshney, Rajeev K; Kudapa, Himabindu; Roorkiwal, Manish; Thudi, Mahendar; Pandey, Manish K; Saxena, Rachit K; Chamarthi, Siva K; Mohan, S Murali; Mallikarjuna, Nalini; Upadhyaya, Hari; Gaur, Pooran M; Krishnamurthy, L; Saxena, K B; Nigam, Shyam N; Pande, Suresh

    2012-11-01

    Molecular markers are the most powerful genomic tools to increase the efficiency and precision of breeding practices for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics (SAT), namely, chickpea (Cicer arietinum), pigeonpea (Cajanus cajan) and groundnut (Arachis hypogaea), as compared to other crop species like cereals, has been very slow. With the advances in next-generation sequencing (NGS) and high-throughput (HTP) genotyping methods, there is a shift in development of genomic resources including molecular markers in these crops. For instance, 2,000 to 3,000 novel simple sequence repeat (SSR) markers have been developed each for chickpea, pigeonpea and groundnut. Based on Sanger, 454/FLX and Illumina transcript reads, transcriptome assemblies have been developed for chickpea (44,845 transcript assembly contigs, or TACs) and pigeonpea (21,434 TACs). Illumina sequencing of some parental genotypes of mapping populations has resulted in the development of 120 million reads for chickpea and 128.9 million reads for pigeonpea. Alignment of these Illumina reads with the respective transcriptome assemblies has provided more than 10,000 SNPs each in chickpea and pigeonpea. A variety of SNP genotyping platforms including GoldenGate, VeraCode and Competitive Allele Specific PCR (KASPar) assays have been developed in chickpea and pigeonpea. Using the above resources, the first-generation or comprehensive genetic maps have been developed in the three legume species mentioned above. Analysis of phenotyping data together with genotyping data has provided candidate markers for drought-tolerance-related root traits in chickpea, resistance to foliar diseases in groundnut and sterility mosaic disease (SMD) and fertility restoration in pigeonpea. Using these trait-associated markers along with those already available, molecular breeding programmes have been initiated for enhancing drought tolerance, resistance

  14. A dated molecular phylogeny of manta and devil rays (Mobulidae) based on mitogenome and nuclear sequences

    NARCIS (Netherlands)

    Poortvliet, Marloes; Olsen, Jeanine; Croll, Donald A.; Bernardi, Giacomo; Newton, Kelly; Kollias, Spyros; O'Sullivan, John; Fernando, Daniel; Stevens, Guy; Galván Magaña, Felipe; Seret, Bernard; Wintner, Sabine; Hoarau, Galice

    2015-01-01

    Manta and devil rays are an iconic group of globally distributed pelagic filter feeders, yet their evolutionary history remains enigmatic. We employed next generation sequencing of mitogenomes for nine of the 11 recognized species and two outgroups; as well as additional Sanger sequencing of two mit

  15. Developmental genetics and new sequencing technologies: the rise of nonmodel organisms.

    Science.gov (United States)

    Rowan, Beth A; Weigel, Detlef; Koenig, Daniel

    2011-07-19

    Much of developmental biology in the past decades has been driven by forward genetic studies in a few model organisms. We review recent work with relatives of these species, motivated by a desire to understand the evolutionary and ecological context for morphological innovation. Unfortunately, despite a number of shining examples, progress in nonmodel systems has often been slow. The current revolution in DNA sequencing has, however, enormous potential in extending the reach of genetics. We discuss how developmental biology will benefit from these advances, particularly by increasing the universe of study species.

  16. Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca).

    Science.gov (United States)

    Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, RunMao; Tian, Feng; Wang, XiaoLing; Wang, Jun

    2010-01-01

    A 10-fold BAC library for the giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of the giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a joint length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the most closely related species to the giant panda. Our results provide detailed sequence and structure information for new genes and repeats of the giant panda, which will be helpful for further studies on the giant panda.

  17. Sequencing, annotation and comparative analysis of nine BACs of the giant panda (Ailuropoda melanoleuca)

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    A 10-fold BAC library for the giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of the giant panda newly generated by Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a joint length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including the giant panda, human, dog, cat and mouse, which reconfirms dog as the most closely related species to the giant panda. Our results provide detailed sequence and structure information for new genes and repeats of the giant panda, which will be helpful for further studies about the giant panda.

  18. Analysis of quality raw data of second generation sequencers with Quality Assessment Software

    Directory of Open Access Journals (Sweden)

    Schneider Maria PC

    2011-04-01

    Full Text Available Abstract Background Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. Findings We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. Conclusions Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments caused by measurement errors, which can occur during the construction process due to the size of the reads and lead to misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
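
    The idea of filtering reads by quality and then checking the remaining coverage can be illustrated with a minimal Python sketch. This is not the Quality Assessment Software itself: the mean-Phred criterion, the function names and the toy reads are all assumptions made for illustration, and the abstract does not state which filter the authors applied.

```python
def mean_quality(quals):
    """Arithmetic mean of a read's per-base Phred quality values."""
    return sum(quals) / len(quals)


def filter_reads(reads, min_mean_q):
    """Keep reads whose mean quality is at least min_mean_q.

    reads is a list of (sequence, qualities) pairs; empty reads are dropped.
    """
    return [(seq, quals) for seq, quals in reads
            if quals and mean_quality(quals) >= min_mean_q]


def estimated_coverage(reads, genome_size):
    """Total retained bases divided by the (assumed known) genome size."""
    return sum(len(seq) for seq, _ in reads) / genome_size


# Hypothetical toy data, not taken from the paper.
reads = [("ACGTACGT", [30] * 8), ("ACGTACGT", [10] * 8), ("ACGT", [25] * 4)]
kept = filter_reads(reads, min_mean_q=20)
print(len(kept), estimated_coverage(kept, genome_size=6))
```

    Raising `min_mean_q` discards more reads, so the estimated coverage can be recomputed for each candidate cutoff before one is chosen.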

  19. Understanding invasion history and predicting invasive niches using genetic sequencing technology in Australia: case studies from Cucurbitaceae and Boraginaceae

    Science.gov (United States)

    Shaik, Razia S.; Zhu, Xiaocheng; Clements, David R.; Weston, Leslie A.

    2016-01-01

    Part of the challenge in dealing with invasive plant species is that they seldom represent a uniform, static entity. Often, an accurate understanding of the history of plant introduction and knowledge of the real levels of genetic diversity present in species and populations of importance is lacking. Currently, the role of genetic diversity in promoting the successful establishment of invasive plants is not well defined. Genetic profiling of invasive plants should enhance our understanding of the dynamics of colonization in the invaded range. Recent advances in DNA sequencing technology have greatly facilitated the rapid and complete assessment of plant population genetics. Here, we apply our current understanding of the genetics and ecophysiology of plant invasions to recent work on Australian plant invaders from the Cucurbitaceae and Boraginaceae. The Cucurbitaceae study showed that both prickly paddy melon (Cucumis myriocarpus) and camel melon (Citrullus lanatus) were represented by only a single genotype in Australia, implying that each was probably introduced as a single introduction event. In contrast, a third invasive melon, Citrullus colocynthis, possessed a moderate level of genetic diversity in Australia and was potentially introduced to the continent at least twice. The Boraginaceae study demonstrated the value of comparing two similar congeneric species; one, Echium plantagineum, is highly invasive and genetically diverse, whereas the other, Echium vulgare, exhibits less genetic diversity and occupies a more limited ecological niche. Sequence analysis provided precise identification of invasive plant species, as well as information on genetic diversity and phylogeographic history. 
    Improved sequencing technologies will continue to allow greater resolution of genetic relationships among invasive plant populations, thereby potentially improving our ability to predict the impact of these relationships upon future spread and better manage invaders possessing

  20. Understanding invasion history and predicting invasive niches using genetic sequencing technology in Australia: case studies from Cucurbitaceae and Boraginaceae.

    Science.gov (United States)

    Shaik, Razia S; Zhu, Xiaocheng; Clements, David R; Weston, Leslie A

    2016-01-01

    Part of the challenge in dealing with invasive plant species is that they seldom represent a uniform, static entity. Often, an accurate understanding of the history of plant introduction and knowledge of the real levels of genetic diversity present in species and populations of importance is lacking. Currently, the role of genetic diversity in promoting the successful establishment of invasive plants is not well defined. Genetic profiling of invasive plants should enhance our understanding of the dynamics of colonization in the invaded range. Recent advances in DNA sequencing technology have greatly facilitated the rapid and complete assessment of plant population genetics. Here, we apply our current understanding of the genetics and ecophysiology of plant invasions to recent work on Australian plant invaders from the Cucurbitaceae and Boraginaceae. The Cucurbitaceae study showed that both prickly paddy melon (Cucumis myriocarpus) and camel melon (Citrullus lanatus) were represented by only a single genotype in Australia, implying that each was probably introduced as a single introduction event. In contrast, a third invasive melon, Citrullus colocynthis, possessed a moderate level of genetic diversity in Australia and was potentially introduced to the continent at least twice. The Boraginaceae study demonstrated the value of comparing two similar congeneric species; one, Echium plantagineum, is highly invasive and genetically diverse, whereas the other, Echium vulgare, exhibits less genetic diversity and occupies a more limited ecological niche. Sequence analysis provided precise identification of invasive plant species, as well as information on genetic diversity and phylogeographic history. 
    Improved sequencing technologies will continue to allow greater resolution of genetic relationships among invasive plant populations, thereby potentially improving our ability to predict the impact of these relationships upon future spread and better manage invaders possessing

  1. A more complete picture of metal hyperaccumulation through next-generation sequencing technologies

    Directory of Open Access Journals (Sweden)

    Nathalie Verbruggen

    2013-10-01

    Full Text Available The mechanistic understanding of metal hyperaccumulation has benefitted immensely from the use of molecular genetics tools developed for Arabidopsis thaliana. The revolution in DNA sequencing will enable even greater strides in the near future, this time not restricted to the family Brassicaceae. Reference genomes are within reach for many ecologically interesting species including heterozygous outbreeders. They will allow deep RNA-seq transcriptome studies and the re-sequencing of contrasting individuals to unravel the genetic basis of phenotypic variation. Cell-type specific transcriptome analyses, which will be essential for the dissection of metal translocation pathways in hyperaccumulators, can be achieved through the combination of RNA-seq and translatome approaches. Affordable high-resolution genotyping of many individuals enables the elucidation of quantitative trait loci in intra- and interspecific crosses as well as through genome-wide association mapping across large panels of accessions. Furthermore, genome-wide scans have the power to detect loci under recent selection. Together these approaches will lead to a detailed understanding of the evolutionary path towards the emergence of hyperaccumulation traits.

  2. Isolation and characterization of microsatellite markers for Axonopus compressus (Sw.) Beauv. (Poaceae) using 454 sequencing technology.

    Science.gov (United States)

    Wang, X-L; Li, Y; Liao, L; Bai, C-J; Wang, Z-Y

    2015-05-11

    Axonopus compressus (Sw.) Beauv. is a perennial herb widely used as a garden lawn grass. In this study, we used Roche 454 pyrosequencing, combined with the magnetic bead enrichment method FIASCO, to isolate simple sequence repeat markers from the A. compressus genome. A total of 1942 microsatellite loci were identified from 53,193 raw sequencing reads. One hundred microsatellite loci were selected to test the primer amplification efficiency in 24 individuals; 14 primer pairs yielded polymorphic amplification products. The number of observed alleles ranged from two to six, with an average of 3.5. Shannon's Information index values ranged from 0.169 to 0.650, with an average of 0.393. Nei's genetic diversity values ranged from 0.108 to 0.457, with an average of 0.271. This first set of microsatellite markers developed for Axonopus will assist in the development of molecular marker-assisted breeding and the assessment of genetic diversity in A. compressus.
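
    For readers unfamiliar with the two diversity statistics reported above, the following Python sketch shows their textbook per-locus definitions: Shannon's information index I = -sum(p_i ln p_i) and Nei's gene diversity H = 1 - sum(p_i^2), where p_i is the frequency of allele i. The function names and the toy allele list are hypothetical, and the abstract does not say which software or estimator variant the authors used.

```python
import math
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele in a list of allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_index(freqs):
    """Shannon's information index, using the natural logarithm."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def nei_gene_diversity(freqs):
    """Nei's gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical locus with two alleles observed 3 times and 1 time.
freqs = allele_frequencies(["a", "a", "a", "b"])
print(shannon_index(freqs), nei_gene_diversity(freqs))
```

    Both statistics are zero for a monomorphic locus and grow as alleles become more numerous and more evenly distributed.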

  3. A more complete picture of metal hyperaccumulation through next-generation sequencing technologies

    Science.gov (United States)

    Verbruggen, Nathalie; Hanikenne, Marc; Clemens, Stephan

    2013-01-01

    The mechanistic understanding of metal hyperaccumulation has benefitted immensely from the use of molecular genetics tools developed for Arabidopsis thaliana. The revolution in DNA sequencing will enable even greater strides in the near future, this time not restricted to the family Brassicaceae. Reference genomes are within reach for many ecologically interesting species including heterozygous outbreeders. They will allow deep RNA-seq transcriptome studies and the re-sequencing of contrasting individuals to unravel the genetic basis of phenotypic variation. Cell-type specific transcriptome analyses, which will be essential for the dissection of metal translocation pathways in hyperaccumulators, can be achieved through the combination of RNA-seq and translatome approaches. Affordable high-resolution genotyping of many individuals enables the elucidation of quantitative trait loci in intra- and interspecific crosses as well as through genome-wide association mapping across large panels of accessions. Furthermore, genome-wide scans have the power to detect loci under recent selection. Together these approaches will lead to a detailed understanding of the evolutionary path towards the emergence of hyperaccumulation traits. PMID:24098304

  4. The Application of Next Generation Sequencing Technology on Noninvasive Prenatal Test

    DEFF Research Database (Denmark)

    Jiang, Hui

    and diagnosis of rare diseases. Among them, genetic testing of pregnant women is the most powerful and cost-effective tool to identify and prevent birth defects related to rare diseases. However, most current routine prenatal genetic testing for rare diseases requires collecting fetal samples through...... an invasive process, which might lead to maternal anxiety, or even miscarriage. Therefore, developing an effective approach to perform noninvasive prenatal testing (NIPT) for rare diseases is the key challenge in preventing birth defects in the future. The discovery of cell-free fetal DNA, coupling with next...... a sensitivity and specificity of over 99%, which can provide accurate and reliable results and thus avoid most invasive procedures compared to standard prenatal testing. Moreover, we also designed probes for genes related to monogenic disorders and conducted target region sequencing for parents, proband...

  5. Charcot-Marie-Tooth disease: The development of a diagnostic platform using next generation sequencing

    DEFF Research Database (Denmark)

    Christensen, Rikke; Væth, Signe; Thorsen, Kasper

    Background: Charcot-Marie-Tooth Disease (CMT) is one of the most common inherited neurological diseases. Today, more than 70 CMT-related genes are known to cause inherited neuropathy. The diagnostic strategy in most laboratories is based on Sanger sequencing of a few genes. In our patient cohort...... previously analyzed using Sanger sequencing without identification of a disease-causing mutation. Materials and Methods: Libraries for 200 patient samples obtained for CMT diagnostics were prepared using Illumina TruSeq and target enrichment using SeqCap EZ Choice Library (NimbleGen). The libraries were...

  6. Characterization of microflora in Latin-style cheeses by next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Lusk Tina S

    2012-11-01

    Full Text Available Abstract Background Cheese contamination can occur at numerous stages in the manufacturing process including the use of improperly pasteurized or raw milk. Of concern is the potential contamination by Listeria monocytogenes and other pathogenic bacteria that find the high moisture levels and moderate pH of popular Latin-style cheeses like queso fresco a hospitable environment. In the investigation of a foodborne outbreak, samples typically undergo enrichment in broth for 24 hours followed by selective agar plating to isolate bacterial colonies for confirmatory testing. The broth enrichment step may also enable background microflora to proliferate, which can confound subsequent analysis if not inhibited by effective broth or agar additives. We used 16S rRNA gene sequencing to provide a preliminary survey of bacterial species associated with three brands of Latin-style cheeses after 24-hour broth enrichment. Results Brand A showed a greater diversity than the other two cheese brands (Brands B and C) at nearly every taxonomic level except phylum. Brand B showed the least diversity and was dominated by a single bacterial taxon, Exiguobacterium, not previously reported in cheese. This genus was also found in Brand C, although Lactococcus was prominent, an expected finding since this bacterium belongs to the group of lactic acid bacteria (LAB) commonly found in fermented foods. Conclusions The contrasting diversity observed in Latin-style cheese was surprising, demonstrating that despite similarity of cheese type, raw materials and cheese making conditions appear to play a critical role in the microflora composition of the final product. The high bacterial diversity associated with Brand A suggests it may have been prepared with raw materials of high bacterial diversity or influenced by the ecology of the processing environment. Additionally, the presence of Exiguobacterium in high proportions (96%) in Brand B and, to a lesser extent, Brand C (46%), may

  7. Rapid development of polymorphic microsatellite markers for the Amur sturgeon (Acipenser schrenckii) using next-generation sequencing technology.

    Science.gov (United States)

    Li, L M; Wei, L; Jiang, H Y; Zhang, Y; Zhang, X J; Yuan, L H; Chen, J P

    2015-07-14

    Anthropogenic activities have seriously impacted wild resources of the Amur sturgeon, Acipenser schrenckii, and more information on local and regional population genetic structure is required to aid the conservation of this species. In this study, we report the development of 12 novel polymorphic microsatellite loci using next-generation sequencing technology, and the genotyping of 24 individuals collected from a sturgeon farm. The results show that the mean number of observed alleles per locus is 6.6 (ranging from 2 to 17). Observed and expected heterozygosity values ranged from 0 to 0.958 and from 0.508 to 0.940, respectively. Not a single locus showed significant departure from Hardy-Weinberg equilibrium and no linkage disequilibrium was observed among any pairwise loci. These highly informative microsatellite markers will be useful for genetic diversity and population structure analyses of A. schrenckii and other species of this genus.
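
    As an illustration of the heterozygosity values quoted above, the sketch below computes observed heterozygosity (the fraction of heterozygous individuals) and a sample-size-corrected expected heterozygosity, (2n / (2n - 1)) * (1 - sum(p_i^2)), from diploid genotypes at one locus. It assumes diploid genotype calls, which is a simplification; the estimator choice, the function names and the toy genotypes are assumptions, not details taken from the study.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)


def expected_heterozygosity(genotypes):
    """Expected heterozygosity with a small-sample correction for n individuals."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    return (2 * n / (2 * n - 1)) * (1.0 - sum_p2)


# Hypothetical genotypes for four individuals at one locus.
genotypes = [(1, 2), (1, 1), (2, 2), (1, 2)]
print(observed_heterozygosity(genotypes), expected_heterozygosity(genotypes))
```

    A large gap between the two values at a locus is the kind of signal that a Hardy-Weinberg test then evaluates formally.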

  8. Whole genome and exome sequencing of monozygotic twins with trisomy 21, discordant for a congenital heart defect and epilepsy.

    Directory of Open Access Journals (Sweden)

    Pongsathorn Chaiyasap

    Full Text Available Congenital heart defects (CHD) occur in 40% of patients with trisomy 21, while the other 60% have a structurally normal heart. This suggests that the increased dosage of genes on chromosome 21 is a risk factor for abnormal heart development. Interaction of genes on chromosome 21 or their gene products with certain alleles of genes on other chromosomes could contribute to CHD. Here, we identified a pair of monozygotic twins with trisomy 21 but discordant for a ventricular septal defect and epilepsy. Twin-zygosity was confirmed by microsatellite genotyping. We hypothesized that some genetic differences from post-twinning mutations caused the discordant phenotypes. Thus, next generation sequencing (NGS) technologies were applied to sequence both the whole genome and the exome of their leukocytes. The post-analyses of the sequencing data revealed 21 putative discordant exonic variants between the twins from either genome or exome data. However, none of the 15 variants chosen for validation with conventional Sanger sequencing showed a difference between the twins. The fact that no discordant DNA variants were found suggests that sequence differences of DNA from leukocytes of monozygotic twins might be extremely rare. It also emphasizes the limitation of the current NGS technology in identifying causative genes for discordant phenotypes in monozygotic twins.

  9. [Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

    Science.gov (United States)

    Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

    2013-10-01

    Carbon dioxide capture and storage, with its unique advantages, has provided a new option for mitigating global anthropogenic CO2 emissions. However, there is a risk that the sealed CO2 will leak, posing a serious threat to the ecosystem. Soil microorganisms are widely known to be closely related to soil health, yet the impact of sequestered CO2 leakage on soil microorganisms has received little study. In this study, leakage scenarios of sealed CO2 were constructed, the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on the MiSeq platform, and related biological analysis was conducted to explore the changes in soil bacterial abundance, diversity and structure. A total of 486,645 reads representing 43,017 OTUs were obtained from 15 soil samples. The biological analysis showed that the abundance, diversity and community structure of the soil bacterial community differed under different CO2 leakage scenarios: the abundance and diversity of the bacterial community declined as the quantity and duration of CO2 leakage increased, and some bacterial species became dominant in the community. The increase of Acidobacteria species would therefore be a biological indicator of the impact of sealed CO2 leakage on the soil ecosystem.

  10. Second generation sequencing of the mesothelioma tumor genome.

    Directory of Open Access Journals (Sweden)

    Raphael Bueno

    Full Text Available The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched normal tissue by using a combination of sequencing-by-synthesis and pyrosequencing methodologies to a 9.6X depth of coverage. Read density analysis uncovered significant aneuploidy and numerous rearrangements. Method-dependent informatics rules, which combined the results of different sequencing platforms, were developed to identify and validate candidate mutations of multiple types. Many more tumor-specific rearrangements than point mutations were uncovered at this depth of sequencing, resulting in novel, large-scale, inter- and intra-chromosomal deletions, inversions, and translocations. Nearly all candidate point mutations appeared to be previously unknown SNPs. Thirty tumor-specific fusions/translocations were independently validated with PCR and Sanger sequencing. Of these, 15 represented disrupted gene-encoding regions, including kinases, transcription factors, and growth factors. One large deletion in DPP10 resulted in altered transcription and expression of DPP10 transcripts in a set of 53 additional MPM tumors correlated with survival. Additionally, three point mutations were observed in the coding regions of NKX6-2, a transcription regulator, and NFRKB, a DNA-binding protein involved in modulating NFKB1. Several regions containing genes such as PCBD2 and DHFR, which are involved in growth factor signaling and nucleotide synthesis, respectively, were selectively amplified in the tumor. Second-generation sequencing uncovered all types of mutations in this MPM tumor, with DNA rearrangements representing the dominant type.

  11. Wolbachia Sequence Typing in Butterflies Using Pyrosequencing.

    Science.gov (United States)

    Choi, Sungmi; Shin, Su-Kyoung; Jeong, Gilsang; Yi, Hana

    2015-09-01

    Wolbachia is an obligate symbiotic bacterium that is ubiquitous in arthropods, with 25-70% of insect species estimated to be infected. Wolbachia species can interact with their insect hosts in a mutualistic or parasitic manner. Sequence types (ST) of Wolbachia are determined by multilocus sequence typing (MLST) of housekeeping genes. However, there are some limitations to MLST with respect to the generation of clone libraries and the Sanger sequencing method when a host is infected with multiple STs of Wolbachia. To assess the feasibility of massively parallel sequencing, also known as next-generation sequencing, we used pyrosequencing for sequence typing of Wolbachia in butterflies. We collected three species of butterflies (Eurema hecabe, Eurema laeta, and Tongeia fischeri) common to Korea and screened them for Wolbachia STs. We found that T. fischeri was infected with a single ST of Wolbachia, ST41. In contrast, E. hecabe and E. laeta were each infected with two STs of Wolbachia, ST41 and ST40. Our results clearly demonstrate that pyrosequencing-based MLST has a higher sensitivity than cloning and Sanger sequencing methods for the detection of minor alleles. Considering the high prevalence of infection with multiple Wolbachia STs, next-generation sequencing with improved analysis would assist with scaling up approaches to Wolbachia MLST.

  12. Characterization of Lactobacillus from Algerian goat's milk based on phenotypic, 16S rDNA sequencing and their technological properties

    Directory of Open Access Journals (Sweden)

    Ahmed Marroki

    2011-03-01

    Full Text Available Nineteen strains of Lactobacillus isolated from goat's milk from farms in north-west of Algeria were characterized. Isolates were identified by phenotypic, physiological and genotypic methods and some of their important technological properties were studied. Phenotypic characterization was carried out by studying physiological, morphological characteristics and carbohydrate fermentation patterns using API 50 CHL system. Isolates were also characterized by partial 16S rDNA sequencing. Results obtained with phenotypic methods were correlated with the genotypic characterization and 13 isolates were identified as L. plantarum, two isolates as L. rhamnosus and one isolate as L. fermentum. Three isolates identified as L. plantarum by phenotypic characterization were found to be L. pentosus by the genotypic method. A large diversity in technological properties (acid production in skim milk, exopolysaccharide production, aminopeptidase activity, antibacterial activity and antibiotic susceptibility) was observed. Based on these results, two strains of L. plantarum (LbMS16 and LbMS21) and one strain of L. rhamnosus (LbMF25) have been tentatively selected for use as starter cultures in the manufacture of artisanal fermented dairy products in Algeria.

  13. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect fraud that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation of the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
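
    The Q20 figure quoted above can be checked with simple arithmetic. The Python sketch below recomputes the percentage of Q20 bases and the mean read length from the totals given in the abstract, and shows the standard Phred relation between a quality score Q and the base-call error probability, P = 10^(-Q/10). The helper names are hypothetical.

```python
def phred_to_error(q):
    """Base-call error probability implied by Phred quality score q."""
    return 10 ** (-q / 10)


def percent(part, total):
    """Share of total represented by part, as a percentage."""
    return 100.0 * part / total


# Totals as reported in the abstract.
total_bases = 33_294_511
q20_bases = 29_109_688
total_reads = 215_944

print(round(percent(q20_bases, total_bases), 2))  # share of bases at Q20
print(round(total_bases / total_reads, 1))        # mean read length in bases
print(phred_to_error(20))                         # error probability at Q20
```

    A Q20 base therefore carries a nominal error probability of about 1 in 100, and the reported totals correspond to a mean read length of roughly 154 bases.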

  14. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect fraud that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation of the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  15. Can abundance of protists be inferred from sequence data: a case study of foraminifera.

    Directory of Open Access Journals (Sweden)

    Alexandra A-T Weber

    Full Text Available Protists are key players in microbial communities, yet our understanding of their role in ecosystem functioning is seriously impeded by difficulties in identification of protistan species and their quantification. Current microscopy-based methods used for determining the abundance of protists are tedious and often show a low taxonomic resolution. Recent development of next-generation sequencing technologies offered a very powerful tool for studying the richness of protistan communities. Still, the relationship between abundance of species and number of sequences remains subject to various technical and biological biases. Here, we test the impact of some of these biological biases on sequence abundance of the SSU rRNA gene in foraminifera. First, we quantified the rDNA copy number and rRNA expression level of three species of foraminifera by qPCR. Then, we prepared five mock communities with these species, two in equal proportions and three with one species ten times more abundant. The libraries of rDNA and cDNA of the mock communities were constructed and Sanger sequenced, and the sequence abundance was calculated. The initial species proportions were compared to the raw sequence proportions as well as to the sequence abundance normalized by rDNA copy number and rRNA expression level per species. Our results showed that without normalization, all sequence data differed significantly from the initial proportions. After normalization, the congruence between the number of sequences and number of specimens was much better. We conclude that without normalization, species abundance determination based on sequence data was not possible because of the effect of biological biases. Nevertheless, by taking into account the variation of rDNA copy number and rRNA expression level we were able to infer species abundance, suggesting that our approach can be successful in controlled conditions.
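
    The normalization step described above can be sketched in a few lines of Python: each species' sequence count is divided by its per-specimen rDNA copy number (or expression level), and the corrected values are rescaled to proportions. The function name, species labels and numbers are hypothetical, and the exact normalization formula used by the authors is not given in the abstract.

```python
def normalized_proportions(sequence_counts, copies_per_specimen):
    """Convert raw sequence counts to estimated specimen proportions.

    Each count is divided by that species' copy number (assumed measured,
    e.g. by qPCR), then the corrected values are rescaled to sum to 1.
    """
    corrected = {sp: sequence_counts[sp] / copies_per_specimen[sp]
                 for sp in sequence_counts}
    total = sum(corrected.values())
    return {sp: value / total for sp, value in corrected.items()}


# Hypothetical mock community: equal sequence counts, unequal copy numbers.
counts = {"species_1": 1000, "species_2": 1000}
copies = {"species_1": 10, "species_2": 1}
print(normalized_proportions(counts, copies))
```

    In this toy case the raw sequence proportions are 50:50, but after correction the species with ten times more rDNA copies is estimated at about 9% of specimens, which illustrates why uncorrected counts can misrepresent abundance.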

  16. Comparison of microarray-predicted closest genomes to sequencing for poliovirus vaccine strain similarity and influenza A phylogeny.

    Science.gov (United States)

    Maurer-Stroh, Sebastian; Lee, Charlie W H; Patel, Champa; Lucero, Marilla; Nohynek, Hanna; Sung, Wing-Kin; Murad, Chrysanti; Ma, Jianmin; Hibberd, Martin L; Wong, Christopher W; Simões, Eric A F

    2016-03-01

    We evaluate sequence data from the PathChip high-density hybridization array for epidemiological interpretation of detected pathogens. For influenza A, we derive similar relative outbreak clustering in phylogenetic trees from PathChip-derived compared to classical Sanger-derived sequences. For a positive polio detection, recent infection could be excluded based on vaccine strain similarity.

  17. Identification and complete genome sequencing of paramyxoviruses in mallard ducks (Anas platyrhynchos) using random access amplification and next generation sequencing technologies

    Directory of Open Access Journals (Sweden)

    van den Berg Thierry

    2011-10-01

    Full Text Available Abstract Background During a wildlife screening program for avian influenza A viruses (AIV) and avian paramyxoviruses (APMV) in Belgium, we isolated two hemagglutinating agents from pools of cloacal swabs of wild mallards (Anas platyrhynchos) caught in a single sampling site at two different times. AIV and APMV1 were excluded using hemagglutination inhibition (HI) testing and specific real-time RT-PCR tests. Methods To refine the virological identification of APMV2-10 obtained by HI subtyping tests, and in the absence of validated molecular tests for APMV2-10, random access amplification was used in combination with next generation sequencing for the sequence-independent identification of the viruses and the determination of their genomes. Results Three different APMVs were identified. From one pooled sample, the complete genome sequence (15054 nucleotides) of an APMV4 was assembled from the random sequences. From the second pooled sample, the nearly complete genome sequence of an APMV6 (genome size of 16236 nucleotides) was determined, as well as a partial sequence for an APMV4. This APMV4 was closely related but not identical to the APMV4 isolated from the first sample. Although cross-reactivity with other APMV subtypes did not allow formal identification, the HI subtyping revealed APMV4 and APMV6 in the respective pooled samples but failed to identify the co-infecting APMV4 in the APMV6-infected pool. Conclusions These data further contribute to the knowledge about the genetic diversity within the serotypes APMV4 and 6, and confirm the limited sensitivity of the HI subtyping test. Moreover, this study demonstrates the value of a random access nucleic acid amplification method in combination with massively parallel sequencing. Using only a moderate and economical sequencing effort, the characterization and full genome sequencing of APMVs can be obtained, including the identification of viruses in mixed infections.

  18. Advantage of whole exome sequencing over allele-specific and targeted segment sequencing in detection of novel TULP1 mutation in leber congenital amaurosis

    DEFF Research Database (Denmark)

    Guo, Yiran; Prokudin, Ivan; Yu, Cong

    2015-01-01

    Background: Leber congenital amaurosis (LCA) is a severe form of retinal dystrophy with marked underlying genetic heterogeneity. Until recently, allele-specific assays and Sanger sequencing of targeted segments were the only available approaches for attempted genetic diagnosis in this condition. ...

  19. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing

    Science.gov (United States)

    Just, Rebecca S.; Irwin, Jodi A.; Parson, Walther

    2015-01-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10–20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256

  20. Impact of next-generation sequencing on molecular diagnosis of inherited non-syndromic hearing loss

    Institute of Scientific and Technical Information of China (English)

    Xue Gao; Pu Dai

    2014-01-01

    Hearing loss is one of the most common birth defects, with inherited genetic defects playing an important role, contributing to about 60% of deafness occurring in infants. However, hearing impairment is genetically heterogeneous, with both common and rare forms occurring due to mutations in an estimated 500 genes. Due to the large number and presumably low mutation frequencies of those genes, it would be highly expensive and time-consuming to address this issue by conventional gene-by-gene Sanger sequencing. Next-generation sequencing is a revolutionary technology that allows the simultaneous screening of mutations in a large number of genes. It is cost effective compared to classical strategies of linkage analysis and direct sequencing when the number or size of genes is large, and thus has become a highly efficient strategy for identifying novel causative genes and mutations involved in heritable disease. In this review, we describe major NGS methodologies currently used for genetic disorders and highlight applications of these technologies in studies of molecular diagnosis and the discovery of genes implicated in non-syndromic hearing loss.

  1. Win on Sunday, Sell on Monday: From the Exome Sequencing of One Boy to the Delivery of Clinical Diagnostics

    Science.gov (United States)

    Tschannen, M.R.

    2011-01-01

    For several years, there have been discussions about using both Sanger and whole genome sequencing in clinical practice. In late 2009, the Medical College of Wisconsin initiated the infrastructure to streamline the delivery of current and emerging DNA technologies into state-of-the-art molecular diagnostics. The online publication of our initial case in Genetics in Medicine in late 2010 further intensified our efforts in this endeavor. However, being relatively new to the field of NextGen sequencing, we began with the addition of Sanger diagnostic sequencing to our already successful research core, which at that point had been in operation for almost ten years. This was a great undertaking, as typically, independent research laboratories performing cutting-edge science lack the financial resources and breadth of experience to launch their custom product or application to the diagnostic industry. An independent research laboratory is able to resolve these shortages by partnering with a core laboratory staffed with diagnostic expertise. Due to our lack of diagnostic experience, we quickly aligned the research core to a consortium of individuals with clinical experience to allow us to benefit from established diagnostic facilities on campus. Difficulties faced at the onset of diagnostic startup were many, including large issues such as accreditation program (CAP vs. CLIA), SOP generation and validation, competency and proficiency testing, and reimbursement, as well as smaller problems like semiannual pipette calibration, temperature monitoring, and inventory control. The purpose of this talk is to give insight into efficient ways to resolve these problems, both large and small, and transform a decade or more of research expertise into a viable diagnostic laboratory.

  2. [A safe and easy method for building consensus HIV sequences from 454 massively parallel sequencing data].

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2016-10-03

    To show how to generate a consensus sequence from massively parallel sequencing data obtained from routine HIV antiretroviral resistance studies, one that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
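
    The thresholds in this abstract can be illustrated with a minimal sketch of threshold-based consensus calling. It assumes, for illustration only, that a "threshold" means the minimum per-position frequency at which a base is kept, and that positions with several kept bases are written as IUPAC ambiguity codes. The function name `consensus` and the toy reads are hypothetical; this is not the Mesquite procedure used by the authors.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(reads, threshold):
    """Call a consensus from equal-length, already aligned reads.

    A base is kept at a position when its frequency there is at least
    `threshold`; several kept bases are merged into one IUPAC code.
    """
    called = []
    for column in zip(*reads):
        counts = Counter(base for base in column if base in "ACGT")
        depth = sum(counts.values())
        kept = frozenset(
            base for base, n in counts.items() if depth and n / depth >= threshold
        )
        # Positions with no usable base fall back to N.
        called.append(IUPAC.get(kept, "N"))
    return "".join(called)


# Hypothetical toy data: a minority variant (T) at 15% in the second position.
reads = ["ACGT"] * 17 + ["ATGT"] * 3
print(consensus(reads, 0.10))  # minority base kept, position becomes ambiguous
print(consensus(reads, 0.20))  # minority base dropped
```

    Under these assumptions a lower threshold introduces more ambiguity codes into the consensus, which is one plausible reason a higher threshold would agree better with Sanger sequences.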

  3. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than in the related species Fusarium oxysporum, but more than in Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumonisin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  4. Use of Metagenomic Shotgun Sequencing Technology To Detect Foodborne Pathogens within the Microbiome of the Beef Production Chain.

    Science.gov (United States)

    Yang, Xiang; Noyes, Noelle R; Doster, Enrique; Martin, Jennifer N; Linke, Lyndsey M; Magnuson, Roberta J; Yang, Hua; Geornaras, Ifigenia; Woerner, Dale R; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina; Morley, Paul S; Belk, Keith E

    2016-04-01

    Foodborne illnesses associated with pathogenic bacteria are a global public health and economic challenge. The diversity of microorganisms (pathogenic and nonpathogenic) that exists within the food and meat industries complicates efforts to understand pathogen ecology. Further, little is known about the interaction of pathogens within the microbiome throughout the meat production chain. Here, a metagenomic approach and shotgun sequencing technology were used as tools to detect pathogenic bacteria in environmental samples collected from the same groups of cattle at different longitudinal processing steps of the beef production chain: cattle entry to feedlot, exit from feedlot, cattle transport trucks, abattoir holding pens, and the end of the fabrication system. The log read counts classified as pathogens per million reads for Salmonella enterica, Listeria monocytogenes, Escherichia coli, Staphylococcus aureus, Clostridium spp. (C. botulinum and C. perfringens), and Campylobacter spp. (C. jejuni, C. coli, and C. fetus) decreased over subsequent processing steps. Furthermore, the normalized read counts for S. enterica, E. coli, and C. botulinum were greater in the final product than at the feedlots, indicating that the proportion of these bacteria increased (the effect on absolute numbers was unknown) within the remaining microbiome. From an ecological perspective, data indicated that shotgun metagenomics can be used to evaluate not only the microbiome but also shifts in pathogen populations during beef production. Nonetheless, there were several challenges in this analysis approach, one of the main ones being the identification of the specific pathogen from which the sequence reads originated, which makes this approach impractical for use in pathogen identification for regulatory and confirmation purposes.
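
    The distinction the abstract draws between read counts and proportions can be sketched as follows. This is an illustrative calculation only: the base-10 logarithm, the function name `log_reads_per_million`, and all sample numbers are assumptions, not values or methods taken from the study.

```python
import math


def log_reads_per_million(pathogen_reads, total_reads):
    """Return log10 of pathogen-classified reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    rpm = pathogen_reads / total_reads * 1_000_000
    # A sample with no pathogen reads has no finite logarithm.
    return math.log10(rpm) if rpm > 0 else float("-inf")


# Hypothetical samples: fewer pathogen reads in absolute terms at the end of
# the chain, but a larger share of a much smaller sequenced community.
feedlot = log_reads_per_million(pathogen_reads=200, total_reads=20_000_000)
final_product = log_reads_per_million(pathogen_reads=50, total_reads=1_000_000)
print(feedlot, final_product)
```

    In this made-up case the raw count falls from 200 to 50 reads while the normalized value rises, which is why a normalized increase says nothing by itself about absolute bacterial numbers.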

  5. High-throughput next-generation sequencing technologies foster new cutting-edge computing techniques in bioinformatics.

    Science.gov (United States)

    Yang, Mary Qu; Athey, Brian D; Arabnia, Hamid R; Sung, Andrew H; Liu, Qingzhong; Yang, Jack Y; Mao, Jinghe; Deng, Youping

    2009-07-07

    The advent of high-throughput next generation sequencing technologies has fostered enormous potential applications of supercomputing techniques in genome sequencing, epi-genetics, metagenomics, personalized medicine, discovery of non-coding RNAs and protein-binding sites. To this end, the 2008 International Conference on Bioinformatics and Computational Biology (Biocomp) - 2008 World Congress on Computer Science, Computer Engineering and Applied Computing (Worldcomp) was designed to promote synergistic inter/multidisciplinary research and education in response to the current research trends and advances. The conference attracted more than two thousand scientists, medical doctors, engineers, professors and students, who gathered in Las Vegas, Nevada, USA during July 14-17, and was a great success. Supported by the International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design (IJCBDD), International Journal of Functional Informatics and Personalized Medicine (IJFIPM) and the leading research laboratories from Harvard, M.I.T., Purdue, UIUC, UCLA, Georgia Tech, UT Austin, U. of Minnesota, U. of Iowa etc., the conference received thousands of research papers. Each submitted paper was reviewed by at least three reviewers and accepted papers were required to satisfy reviewers' comments. Finally, the review board and the committee decided to select only 19 high-quality research papers for inclusion in this supplement to BMC Genomics based on the peer reviews only. The conference committee was very grateful for the Plenary Keynote Lectures given by: Dr. Brian D. Athey (University of Michigan Medical School), Dr. Vladimir N. Uversky (Indiana University School of Medicine), Dr. David A. Patterson (Member of United States National Academy of Sciences and National Academy of Engineering, University of California at Berkeley) and Anousheh Ansari (Prodea Systems, Space Ambassador). The theme of the conference to promote

  6. High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology

    Science.gov (United States)

    Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M

    2007-01-01

    selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion: These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442

  7. Complete genome sequence of a copper-resistant bacterium from the citrus phyllosphere, Stenotrophomonas sp. strain LM091, obtained using long-read technology

    OpenAIRE

    Richard, Damien; Boyer, Claudine; Lefeuvre, Pierre; Pruvost, Olivier

    2016-01-01

    The Stenotrophomonas genus shows great adaptive potential including resistance to multiple antimicrobials, opportunistic pathogenicity, and production of numerous secondary metabolites. Using long-read technology, we report the sequence of a plant-associated Stenotrophomonas strain originating from the citrus phyllosphere that displays a copper resistance phenotype. (Author's abstract)

  8. Complete Genome Sequence of a Copper-Resistant Bacterium from the Citrus Phyllosphere, Stenotrophomonas sp. Strain LM091, Obtained Using Long-Read Technology

    Science.gov (United States)

    Richard, Damien; Boyer, Claudine; Lefeuvre, Pierre

    2016-01-01

    The Stenotrophomonas genus shows great adaptive potential including resistance to multiple antimicrobials, opportunistic pathogenicity, and production of numerous secondary metabolites. Using long-read technology, we report the sequence of a plant-associated Stenotrophomonas strain originating from the citrus phyllosphere that displays a copper resistance phenotype. PMID:27979933

  9. Genome and exome sequencing in the clinic: unbiased genomic approaches with a high diagnostic yield

    NARCIS (Netherlands)

    Nelen, M.; Veltman, J.A.

    2012-01-01

    For the reasons discussed here, we think whole-genome- or exome-based approaches are currently most suited for diagnostic implementation in genetically heterogeneous diseases, initially to complement and later to replace Sanger sequencing, qPCR and genomic microarrays. Patients do need to be counsel

  10. Deep sequencing analysis of phage libraries using Illumina platform.

    Science.gov (United States)

    Matochko, Wadim L; Chu, Kiki; Jin, Bingjie; Lee, Sam W; Whitesides, George M; Derda, Ratmir

    2012-09-01

    This paper presents an analysis of phage-displayed libraries of peptides using Illumina. We describe steps for the preparation of short DNA fragments for deep sequencing and MATLAB software for the analysis of the results. Screening of peptide libraries displayed on the surface of bacteriophage (phage display) can be used to discover peptides that bind to any target. The key step in this discovery is the analysis of peptide sequences present in the library. This analysis is usually performed by Sanger sequencing, which is labor intensive and limited to examination of a few hundred phage clones. On the other hand, Illumina deep-sequencing technology can characterize over 10^7 reads in a single run. We applied Illumina sequencing to analyze phage libraries. Using PCR, we isolated the variable regions from M13KE phage vectors from a phage display library. The PCR primers contained (i) sequences flanking the variable region, (ii) barcodes, and (iii) variable 5'-terminal region. We used this approach to examine how diversity of peptides in phage display libraries changes as a result of amplification of libraries in bacteria. Using HiSeq single-end Illumina sequencing of these fragments, we acquired over 2×10^7 reads, 57 base pairs (bp) in length. Each read contained information about the barcode (6 bp), one complementary region (12 bp) and a variable region (36 bp). We applied this sequencing to a model library of 10^6 unique clones and observed that amplification enriches ∼150 clones, which dominate ∼20% of the library. Deep sequencing, for the first time, characterized the collapse of diversity in phage libraries. The results suggest that screens based on repeated amplification and small-scale sequencing identify a few binding clones and miss thousands of useful clones. The deep sequencing approach described here could identify under-represented clones in phage screens. It could also be instrumental in developing new screening strategies, which can preserve
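
    The read layout described in this abstract (a 6 bp barcode, a 12 bp complementary region and a 36 bp variable region) lends itself to a small sketch of how such reads might be split and tallied. The barcode-first ordering, the function names and the toy reads are assumptions made for illustration; the authors' own analysis software is not reproduced here.

```python
from collections import Counter

# Segment lengths as stated in the abstract; their order is assumed.
BARCODE_LEN, CONSTANT_LEN, VARIABLE_LEN = 6, 12, 36


def split_read(read):
    """Split a read into (barcode, constant region, variable region)."""
    b_end = BARCODE_LEN
    c_end = b_end + CONSTANT_LEN
    v_end = c_end + VARIABLE_LEN
    if len(read) < v_end:
        raise ValueError("read is shorter than the expected layout")
    return read[:b_end], read[b_end:c_end], read[c_end:v_end]


def clone_abundance(reads):
    """Count reads per (barcode, variable region) pair."""
    counts = Counter()
    for read in reads:
        barcode, _, variable = split_read(read)
        counts[(barcode, variable)] += 1
    return counts


def dominant_fraction(counts, top_n):
    """Fraction of all reads taken up by the top_n most abundant clones."""
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total if total else 0.0


# Hypothetical 57-base reads: one clone seen three times, two seen once.
prefix = "ACGTAC" + "G" * 12
reads = [prefix + "A" * 36 + "TTT"] * 3 + [
    prefix + "C" * 36 + "TTT",
    prefix + "T" * 36 + "TTT",
]
counts = clone_abundance(reads)
print(dominant_fraction(counts, top_n=1))
```

    Applied to a real library, a statistic like `dominant_fraction` is one simple way to express the kind of observation reported above, where roughly 150 clones account for about 20% of the reads.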

  11. Multiplex sequencing of pooled mitochondrial genomes-a crucial step toward biodiversity analysis using mito-metagenomics.

    Science.gov (United States)

    Tang, Min; Tan, Meihua; Meng, Guanliang; Yang, Shenzhou; Su, Xu; Liu, Shanlin; Song, Wenhui; Li, Yiyuan; Wu, Qiong; Zhang, Aibing; Zhou, Xin

    2014-12-16

    The advent of high-throughput-sequencing (HTS) technologies has revolutionized conventional biodiversity research by enabling parallel capture of DNA sequences possessing species-level diagnosis. However, polymerase chain reaction (PCR)-based implementation is biased by the efficiency of primer binding across lineages of organisms. A PCR-free HTS approach will alleviate this artefact and significantly improve upon the multi-locus method utilizing full mitogenomes. Here we developed a novel multiplex sequencing and assembly pipeline allowing for simultaneous acquisition of full mitogenomes from pooled animals without DNA enrichment or amplification. By concatenating assemblies from three de novo assemblers, we obtained high-quality mitogenomes for all 49 pooled taxa, with 36 species >15 kb and the remaining >10 kb, including 20 complete mitogenomes and nearly all protein coding genes (99.6%). The assembly quality was carefully validated with Sanger sequences, reference genomes and conservativeness of protein coding genes across taxa. The new method was effective even for closely related taxa, e.g. three Drosophila spp., demonstrating its broad utility for biodiversity research and mito-phylogenomics. Finally, the in silico simulation showed that by recruiting multiple mito-loci, taxon detection was improved at a fixed sequencing depth. Combined, these results demonstrate the plausibility of a multi-locus mito-metagenomics approach as the next phase of the current single-locus metabarcoding method.

  12. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Full Text Available Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
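
    The variant totals quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; segmental duplications and copy number variation regions are left out because the abstract gives no counts for them, so the result is a consistency check under that assumption rather than a reproduction of the authors' calculation.

```python
# Variant counts as stated in the abstract.
variant_counts = {
    "SNP": 3_213_401,
    "block substitution": 53_823,
    "heterozygous indel": 292_102,
    "homozygous indel": 559_473,
    "inversion": 90,
}

total = sum(variant_counts.values())
non_snp = total - variant_counts["SNP"]
non_snp_share = non_snp / total

print(f"total variants: {total:,}")
print(f"non-SNP variants: {non_snp:,}")
print(f"non-SNP share of events: {non_snp_share:.1%}")
```

    The listed categories sum to 4,118,889 events, consistent with "more than 4.1 million DNA variants", and the non-SNP categories make up about 22% of them, matching the figure in the abstract.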

  13. Contrast-enhanced ultrasonography using cadence-contrast pulse sequencing technology for targeted biopsy of the prostate.

    Science.gov (United States)

    Aigner, Friedrich; Pallwein, Leo; Mitterberger, Michael; Pinggera, Germar M; Mikuz, Gregor; Horninger, Wolfgang; Frauscher, Ferdinand

    2009-02-01

    To evaluate contrast-enhanced ultrasonography (US) using cadence-contrast pulse sequencing (CPS) technology, compared with systematic biopsy for detecting prostate cancer, as grey-scale US has low sensitivity and specificity for detecting prostate cancer. In all, 44 men with suspicious prostate-specific antigen (PSA) levels and CPS findings were assessed; all had CPS-targeted and systematic biopsy. Transrectal CPS images were taken with a low mechanical index (0.14). A microbubble contrast agent (SonoVue, Bracco International BV, Amsterdam, the Netherlands) was administered as a bolus, with a maximum dose of 4.8 mL. CPS was used to assess prostatic vascularity. Areas with a rapid and increased contrast enhancement within the peripheral zone were defined as suspicious for prostate cancer. Up to five CPS targeted biopsies were taken and subsequently a 10-core systematic biopsy was taken. Cancer detection rates for the two techniques were compared. Overall, cancer was detected in 35 of 44 patients (80%), with a mean PSA level of 3.8 ng/mL. Lesions suspicious on CPS showed cancer in 35 of 44 patients (80%) and systematic biopsy detected cancer in 15 of 44 patients (34%). CPS-targeted cores were positive in 105 of 220 cores (47.7%) and in 41 of 440 systematic biopsy cores (9.3%) (P [...]) [...] biopsy was 6.7 and for CPS-targeted biopsy 6.8 (P > 0.05). The sensitivity of CPS for detecting cancer was 100% (confidence interval, 95%). However, limitations in the series included that only CPS-positive cases were investigated, and CPS-targeted biopsy should be evaluated in a more extended biopsy scheme. Contrast-enhanced US using CPS enables excellent visualization of the microvasculature associated with prostate cancer, and can improve the detection of prostate cancer compared with systematic biopsy.

  14. Detection of very long antisense transcripts by whole transcriptome RNA-Seq analysis of Listeria monocytogenes by semiconductor sequencing technology.

    Science.gov (United States)

    Wehner, Stefanie; Mannala, Gopala K; Qing, Xiaoxing; Madhugiri, Ramakanth; Chakraborty, Trinad; Mraheil, Mobarak A; Hain, Torsten; Marz, Manja

    2014-01-01

    The Gram-positive bacterium Listeria monocytogenes is the causative agent of listeriosis, a severe food-borne infection characterised by abortion, septicaemia, or meningoencephalitis. L. monocytogenes causes outbreaks of febrile gastroenteritis and accounts for community-acquired bacterial meningitis in humans. Listeriosis has one of the highest mortality rates (up to 30%) of all food-borne infections. This human pathogenic bacterium is an important model organism for biomedical research to investigate cell-mediated immunity. L. monocytogenes is also one of the best characterised bacterial systems for the molecular analysis of intracellular parasitism. Recently several transcriptomic studies have also established this ubiquitously distributed bacterium as a model to understand mechanisms of gene regulation from the environment to the infected host on the level of mRNA and non-coding RNAs (ncRNAs). We have used semiconductor sequencing technology for RNA-seq to investigate the repertoire of listerial ncRNAs under extra- and intracellular growth conditions. Furthermore, we applied a new bioinformatic analysis pipeline for detection, comparative genomics and structural conservation to identify ncRNAs. With this work, in total, 741 ncRNA locations of potential ncRNA candidates are now known for L. monocytogenes, of which 611 ncRNA candidates were identified by RNA-seq. 441 transcribed ncRNAs have never been described before. Among these, we identified novel long non-coding antisense RNAs with a length of up to 5,400 nt, e.g. opposite to genes coding for internalins, methylases or a high-affinity potassium uptake system, namely the kdpABC operon, which were confirmed by qRT-PCR analysis. RNA-seq, comparative genomics and structural conservation of L. monocytogenes ncRNAs illustrate that this human pathogen uses a large number and repertoire of ncRNA including novel long antisense RNAs, which could be important for intracellular survival within the infected eukaryotic host.

  15. Detection of very long antisense transcripts by whole transcriptome RNA-Seq analysis of Listeria monocytogenes by semiconductor sequencing technology.

    Directory of Open Access Journals (Sweden)

    Stefanie Wehner

    Full Text Available The Gram-positive bacterium Listeria monocytogenes is the causative agent of listeriosis, a severe food-borne infection characterised by abortion, septicaemia, or meningoencephalitis. L. monocytogenes causes outbreaks of febrile gastroenteritis and accounts for community-acquired bacterial meningitis in humans. Listeriosis has one of the highest mortality rates (up to 30%) of all food-borne infections. This human pathogenic bacterium is an important model organism for biomedical research to investigate cell-mediated immunity. L. monocytogenes is also one of the best characterised bacterial systems for the molecular analysis of intracellular parasitism. Recently several transcriptomic studies have also established this ubiquitously distributed bacterium as a model to understand mechanisms of gene regulation from the environment to the infected host on the level of mRNA and non-coding RNAs (ncRNAs). We have used semiconductor sequencing technology for RNA-seq to investigate the repertoire of listerial ncRNAs under extra- and intracellular growth conditions. Furthermore, we applied a new bioinformatic analysis pipeline for detection, comparative genomics and structural conservation to identify ncRNAs. With this work, in total, 741 ncRNA locations of potential ncRNA candidates are now known for L. monocytogenes, of which 611 ncRNA candidates were identified by RNA-seq. 441 transcribed ncRNAs have never been described before. Among these, we identified novel long non-coding antisense RNAs with a length of up to 5,400 nt, e.g. opposite to genes coding for internalins, methylases or a high-affinity potassium uptake system, namely the kdpABC operon, which were confirmed by qRT-PCR analysis. RNA-seq, comparative genomics and structural conservation of L. monocytogenes ncRNAs illustrate that this human pathogen uses a large number and repertoire of ncRNA including novel long antisense RNAs, which could be important for intracellular survival within the infected

  16. Detection of Very Long Antisense Transcripts by Whole Transcriptome RNA-Seq Analysis of Listeria monocytogenes by Semiconductor Sequencing Technology

    Science.gov (United States)

    Wehner, Stefanie; Mannala, Gopala K.; Qing, Xiaoxing; Madhugiri, Ramakanth; Chakraborty, Trinad; Mraheil, Mobarak A.; Hain, Torsten; Marz, Manja

    2014-01-01

    The Gram-positive bacterium Listeria monocytogenes is the causative agent of listeriosis, a severe food-borne infection characterised by abortion, septicaemia, or meningoencephalitis. L. monocytogenes causes outbreaks of febrile gastroenteritis and accounts for community-acquired bacterial meningitis in humans. Listeriosis has one of the highest mortality rates (up to 30%) of all food-borne infections. This human pathogenic bacterium is an important model organism for biomedical research to investigate cell-mediated immunity. L. monocytogenes is also one of the best characterised bacterial systems for the molecular analysis of intracellular parasitism. Recently several transcriptomic studies have also established this ubiquitously distributed bacterium as a model to understand mechanisms of gene regulation from the environment to the infected host on the level of mRNA and non-coding RNAs (ncRNAs). We have used semiconductor sequencing technology for RNA-seq to investigate the repertoire of listerial ncRNAs under extra- and intracellular growth conditions. Furthermore, we applied a new bioinformatic analysis pipeline for detection, comparative genomics and structural conservation to identify ncRNAs. With this work, in total, 741 ncRNA locations of potential ncRNA candidates are now known for L. monocytogenes, of which 611 ncRNA candidates were identified by RNA-seq. 441 transcribed ncRNAs have never been described before. Among these, we identified novel long non-coding antisense RNAs with a length of up to 5,400 nt, e.g. opposite to genes coding for internalins, methylases or a high-affinity potassium uptake system, namely the kdpABC operon, which were confirmed by qRT-PCR analysis. RNA-seq, comparative genomics and structural conservation of L. monocytogenes ncRNAs illustrate that this human pathogen uses a large number and repertoire of ncRNA including novel long antisense RNAs, which could be important for intracellular survival within the infected eukaryotic host. PMID

  17. Is Whole-Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients With Solid Tumors.

    Science.gov (United States)

    McCullough, Laurence B; Slashinski, Melody J; McGuire, Amy L; Street, Richard L; Eng, Christine M; Gibbs, Richard A; Parsons, D William; Plon, Sharon E

    2016-03-01

    It has been anticipated that physicians and parents will be ill prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. As a part of the Baylor Advancing Sequencing in Childhood Cancer Care study, we conducted semistructured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about the impact on parents. For parents, there is an urgency to protect their child's health, and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Our data do not support the concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, that is, leave physicians or parents ill prepared or unprepared to make responsible decisions about patient care. © 2015 Wiley Periodicals, Inc.

  18. Authentication of Herbal Supplements Using Next-Generation Sequencing

    OpenAIRE

    Ivanova, Natalia V.; Kuzmina, Maria L.; Thomas W A Braukmann; Borisenko, Alex V.; Zakharov, Evgeny V.

    2016-01-01

    Background DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-grae...

  19. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    Directory of Open Access Journals (Sweden)

    Erik G Puffenberger

    Full Text Available The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  20. Next-generation sequencing identifies novel CACNA1A gene mutations in episodic ataxia type 2.

    Science.gov (United States)

    Maksemous, Neven; Roy, Bishakha; Smith, Robert A; Griffiths, Lyn R

    2016-03-01

    Episodic Ataxia type 2 (EA2) is a rare autosomal dominantly inherited neurological disorder characterized by recurrent disabling imbalance, vertigo, and episodes of ataxia lasting minutes to hours. EA2 is caused most often by loss of function mutations of the calcium channel gene CACNA1A. In addition to EA2, mutations in CACNA1A are responsible for two other allelic disorders: familial hemiplegic migraine type 1 (FHM1) and spinocerebellar ataxia type 6 (SCA6). Herein, we have utilized next-generation sequencing (NGS) to screen the coding sequence, exon-intron boundaries, and Untranslated Regions (UTRs) of five genes where mutation is known to produce symptoms related to EA2, including CACNA1A. We performed this screening in a group of 31 unrelated patients with EA2 symptoms. Both novel and known mutations were detected through NGS technology, and confirmed through Sanger sequencing. Genetic testing showed in total 15 mutation bearing patients (48%), of which nine were novel mutations (6 missense and 3 small frameshift deletion mutations) and six known mutations (4 missense and 2 nonsense). These results demonstrate the efficiency of our NGS-panel for detecting known and novel mutations for EA2 in the CACNA1A gene, also identifying a novel missense mutation in ATP1A2 which is not a normal target for EA2 screening.

  1. In search of pathogens: transcriptome-based identification of viral sequences from the pine processionary moth (Thaumetopoea pityocampa).

    Science.gov (United States)

    Jakubowska, Agata K; Nalcacioglu, Remziye; Millán-Leiva, Anabel; Sanz-Carbonell, Alejandro; Muratoglu, Hacer; Herrero, Salvador; Demirbag, Zihni

    2015-01-23

    Thaumetopoea pityocampa (pine processionary moth) is one of the most important pine pests in the forests of Mediterranean countries, Central Europe, the Middle East and North Africa. Apart from causing significant damage to pinewoods, T. pityocampa occurrence is also an issue for public and animal health, as it is responsible for dermatological reactions in humans and animals by contact with its irritating hairs. High throughput sequencing technologies have allowed the fast and cost-effective generation of genetic information of interest to understand different biological aspects of non-model organisms as well as the identification of potential pathogens. Using these technologies, we have obtained and characterized the transcriptome of T. pityocampa larvae collected in 12 different geographical locations in Turkey. cDNA libraries for Illumina sequencing were prepared from four larval tissues, head, gut, fat body and integument. By pooling the sequences from Illumina platform with those previously published using the Roche 454-FLX and Sanger methods we generated the largest reference transcriptome of T. pityocampa. In addition, this study has also allowed identification of possible viral pathogens with potential application in future biocontrol strategies.

  2. In Search of Pathogens: Transcriptome-Based Identification of Viral Sequences from the Pine Processionary Moth (Thaumetopoea pityocampa)

    Directory of Open Access Journals (Sweden)

    Agata K. Jakubowska

    2015-01-01

    Full Text Available Thaumetopoea pityocampa (pine processionary moth) is one of the most important pine pests in the forests of Mediterranean countries, Central Europe, the Middle East and North Africa. Apart from causing significant damage to pinewoods, T. pityocampa occurrence is also an issue for public and animal health, as it is responsible for dermatological reactions in humans and animals by contact with its irritating hairs. High throughput sequencing technologies have allowed the fast and cost-effective generation of genetic information of interest to understand different biological aspects of non-model organisms as well as the identification of potential pathogens. Using these technologies, we have obtained and characterized the transcriptome of T. pityocampa larvae collected in 12 different geographical locations in Turkey. cDNA libraries for Illumina sequencing were prepared from four larval tissues, head, gut, fat body and integument. By pooling the sequences from Illumina platform with those previously published using the Roche 454-FLX and Sanger methods we generated the largest reference transcriptome of T. pityocampa. In addition, this study has also allowed identification of possible viral pathogens with potential application in future biocontrol strategies.

  3. Toward an Integrated BAC Library Resource for Genome Sequencing and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Simon, M. I.; Kim, U.-J.

    2002-02-26

    We developed a great deal of expertise in building large BAC libraries from a variety of DNA sources including humans, mice, corn, microorganisms, worms, and Arabidopsis. We greatly improved the technology for screening these libraries rapidly and for selecting appropriate BACs and mapping BACs to develop large overlapping contigs. We became involved in supplying BACs and BAC contigs to a variety of sequencing and mapping projects and we began to collaborate with Drs. Adams and Venter at TIGR and with Dr. Leroy Hood and his group at University of Washington to provide BACs for end sequencing and for mapping and sequencing of large fragments of chromosome 16. Together with Dr. Ian Dunham and his co-workers at the Sanger Center we completed the mapping and they completed the sequencing of the first human chromosome, chromosome 22. This was published in Nature in 1999 and our BAC contigs made a major contribution to this sequencing effort. Drs. Shizuya and Ding invented an automated highly accurate BAC mapping technique. We also developed long-term collaborations with Dr. Uli Weier at UCSF in the design of BAC probes for characterization of human tumors and specific chromosome deletions and breakpoints. Finally the contribution of our work to the human genome project has been recognized in the publication both by the international consortium and the NIH of a draft sequence of the human genome in Nature last year. Dr. Shizuya was acknowledged in the authorship of that landmark paper. Dr. Simon was also an author on the Venter/Adams Celera project sequencing the human genome that was published in Science last year.

  4. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Crooijmans, R.P.M.A.; Veenendaal, A.; Dibbits, B.W.; Chin-A-Woeng, T.F.C.; Dunnen, den J.T.; Groenen, M.A.M.

    2009-01-01

    Background - The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set-up an analysis pipeline based on a

  5. Comparative analysis of Lactobacillus plantarum WCFS1 transcriptomes by using DNA microarray and next-generation sequencing technologies.

    NARCIS (Netherlands)

    Leimena, M.M.; Wels, M.W.; Bongers, R.S.; Smid, E.J.; Zoetendal, E.G.; Kleerebezem, M.

    2012-01-01

    RNA sequencing is starting to compete with the use of DNA microarrays for transcription analysis in eukaryotes as well as in prokaryotes. The application of RNA sequencing in prokaryotes requires additional steps in the RNA preparation procedure to increase the relative abundance of mRNA and cannot

  6. Preparing a re-sequencing DNA library of 2 cancer candidate genes using the ligation-by-amplification protocol by two PCR reactions

    Institute of Scientific and Technical Information of China (English)

    SU YeYang; LIN Lin; TIAN Geng; CHEN Chen; LIU Tao; XU Xingya; QI XinPeng; ZHANG XiuQing; YANG HuanMing

    2009-01-01

    To meet the needs of large-scale genomic/genetic studies, the next-generation massively parallelized sequencing technologies provide high-throughput, low-cost and less labor-intensive sequencing service, with subsequent bioinformatic software and laboratory methods developed to expand their applications in various types of research. PCR-based genomic/genetic studies, which have significant usage in association studies like cancer research, haven't benefited much from those next-generation sequencing technologies, because the shotgun re-sequencing strategy used by such sequencing machines as the Illumina/Solexa Genome Analyzer may not be applied to direct re-sequencing of short-length target regions like those in PCR-based genomic/genetic studies. Although several methods have been proposed to solve this problem, including microarray-based genomic selections and selector-based technologies, they require advanced equipment and procedures which limit their applications in many laboratories. By contrast, we overcame such potential drawbacks by utilizing a ligation by amplification (LBA) protocol, a method using a pair of Universal Adapters to randomly ligate target regions in a two-step PCR procedure, whose Long LBA products were easily fragmented and sequenced on the next-generation sequencing machine. In this proof-of-concept study, we chose the consensus coding sequences of two human cancer genes, BRCA1 and BRCA2, as target regions, and specifically designed LBA primer pairs to amplify and randomly ligate them. 70 target sequences were successfully amplified and ligated into Long LBA products, which were then fragmented to construct DNA libraries for sequencing on both a conventional Sanger sequencer, the ABI 3730xl DNA Analyzer, and the next-generation 'sequencing by synthesis' Illumina/Solexa Genome Analyzer. Bioinformatic analysis demonstrated the utility and efficiency (including the coverage and depth of each target sequence and the SNPs detection

  7. Whole exome sequencing identifies three recessive FIG4 mutations in an apparently dominant pedigree with Charcot-Marie-Tooth disease.

    Science.gov (United States)

    Menezes, Manoj P; Waddell, Leigh; Lenk, Guy M; Kaur, Simranpreet; MacArthur, Daniel G; Meisler, Miriam H; Clarke, Nigel F

    2014-08-01

    Charcot-Marie-Tooth disease (CMT) is genetically heterogeneous and classification based on motor nerve conduction velocity and inheritance is used to direct genetic testing. With the less common genetic forms of CMT, identifying the causative genetic mutation by Sanger sequencing of individual genes can be time-consuming and costly. Next-generation sequencing technologies show promise for clinical testing in diseases where a similar phenotype is caused by different genes. We report the unusual occurrence of CMT4J, caused by mutations in FIG4, in an apparently dominant pedigree. The affected proband and her mother exhibit different disease severities associated with different combinations of compound heterozygous FIG4 mutations, identified by whole exome sequencing. The proband was also shown to carry a de novo nonsense mutation in the dystrophin gene, which may contribute to her more severe phenotype. This study is a cautionary reminder that in families with two generations affected, explanations other than dominant inheritance are possible, such as recessive inheritance due to three mutations segregating in the family. It also emphasises the advantages of next-generation sequencing approaches that screen multiple CMT genes at once for patients in whom the common genes have been excluded. Crown Copyright © 2014. Published by Elsevier B.V. All rights reserved.

  8. The pots and potters of Assyria: technology and organization of production, ceramics sequence and vessel function at Late Bronze Age Tell Sabi Abyad, Syria

    OpenAIRE

    Duistermaat, Kim

    2007-01-01

    “The Pots and Potters of Assyria” is a comprehensive discussion of all evidence relating to pottery production from the Late Bronze Age site of Tell Sabi Abyad, Syria. Technological, morphological, stylistic and archaeological data are integrated into the understanding of pottery production and use. The pottery itself and its chronological sequence, the shaping and firing techniques, raw materials, wasters and unfired pottery are presented. In addition, workshops and their layout, tools, as w...

  9. Mitochondrial DNA variant discovery and evaluation in human Cardiomyopathies through next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    Full Text Available Mutations in mitochondrial DNA (mtDNA) may cause maternally-inherited cardiomyopathy and heart failure. In homoplasmy all mtDNA copies contain the mutation. In heteroplasmy there is a mixture of normal and mutant copies of mtDNA. The clinical phenotype of an affected individual depends on the type of genetic defect and the ratios of mutant and normal mtDNA in affected tissues. We aimed at determining the sensitivity of next-generation sequencing compared to Sanger sequencing for mutation detection in patients with mitochondrial cardiomyopathy. We studied 18 patients with mitochondrial cardiomyopathy and two with suspected mitochondrial disease. We "shotgun" sequenced PCR-amplified mtDNA and multiplexed using a single run on Roche's 454 Genome Sequencer. By mapping to the reference sequence, we obtained 1,300x average coverage per case and identified high-confidence variants. By comparing these to >400 mtDNA substitution variants detected by Sanger, we found 98% concordance in variant detection. Simulation studies showed that >95% of the homoplasmic variants were detected at a minimum sequence coverage of 20x while heteroplasmic variants required >200x coverage. Several Sanger "misses" were detected by 454 sequencing. These included the novel heteroplasmic 7501T>C in tRNA serine 1 in a patient with sudden cardiac death. These results support a potential role of next-generation sequencing in the discovery of novel mtDNA variants with heteroplasmy below the level reliably detected with Sanger sequencing. We hope that this will assist in the identification of mtDNA mutations and key genetic determinants for cardiomyopathy and mitochondrial disease.
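
    Why heteroplasmic variants need far more coverage than homoplasmic ones can be illustrated with a simple binomial model. The sketch below is not the simulation used in the study. It assumes that each read independently carries the variant with probability equal to the mutant fraction, ignores sequencing error, and uses a hypothetical calling rule (`min_variant_reads`) chosen only for illustration.

```python
from math import comb

def detection_probability(coverage: int, mutant_fraction: float,
                          min_variant_reads: int) -> float:
    """P(at least `min_variant_reads` of `coverage` reads carry the variant).

    Assumes reads are independent Bernoulli draws with success probability
    `mutant_fraction` (an idealisation; real data also contain base-call errors).
    """
    # Probability of seeing fewer than the required number of variant reads.
    p_too_few = sum(
        comb(coverage, i)
        * mutant_fraction ** i
        * (1.0 - mutant_fraction) ** (coverage - i)
        for i in range(min_variant_reads)
    )
    return 1.0 - p_too_few

# Hypothetical calling rule: require at least 5 variant-supporting reads.
for cov in (20, 200, 1300):
    homoplasmic = detection_probability(cov, 1.0, 5)
    low_heteroplasmy = detection_probability(cov, 0.05, 5)
    print(f"{cov:>5}x  homoplasmic={homoplasmic:.3f}  5% heteroplasmy={low_heteroplasmy:.3f}")
```

    Under these assumptions a homoplasmic variant is always detected at 20x, whereas a variant present in 5% of mtDNA copies is rarely detected at 20x and usually detected at 200x, which matches the qualitative trend reported in the abstract.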

  10. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  11. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    Directory of Open Access Journals (Sweden)

    den Dunnen Johan T

    2009-10-01

    Full Text Available Abstract Background The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set-up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. Results A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. Conclusion We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even
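
    The sequence depth quoted in this record can be reproduced from the other reported figures. The sketch below assumes the usual definition of depth as total sequenced bases divided by the size of the sampled genome fraction, and the standard Phred relation Q = -10 log10(p) for expressing the 1% error cutoff as a quality score. The function names are illustrative and do not come from the authors' pipeline.

```python
import math

def estimated_depth(n_reads: int, read_length_bp: int, target_size_bp: float) -> float:
    """Average depth = total sequenced bases / size of the sequenced target."""
    return n_reads * read_length_bp / target_size_bp

def phred_from_error(p_error: float) -> float:
    """Phred quality score corresponding to a base-call error probability."""
    return -10.0 * math.log10(p_error)

# Figures from the abstract: 100 million reads of 36 bp over ~62 Mbp.
depth = estimated_depth(100_000_000, 36, 62e6)
print(f"estimated depth: {depth:.1f}x")

# An error probability of 1% corresponds to Phred Q20.
print(f"Phred score at 1% error: {phred_from_error(0.01):.0f}")
```

    The result, roughly 58x, agrees with the depth stated in the abstract. The filter on bases "called with less than 1% error probability" is therefore equivalent to keeping bases above Q20.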

  12. Quick genetic screening using targeted next-generation sequencing in patients with tuberous sclerosis.

    Science.gov (United States)

    Liu, Qing; Huang, Yan; Zhang, Mingrong; Wang, Lian Qing; Guo, Xia Nan; Si, Nuo; Qi, Zhan; Zhou, Xiang Qin; Cui, Li-ying

    2015-04-01

    Tuberous sclerosis complex is an autosomal dominant disorder characterized by hamartomas in multiple organ systems. Mutations in the 2 large genes TSC1 and TSC2 have been demonstrated to be associated with tuberous sclerosis complex by various mutation screening methods. Targeted next-generation sequencing was used for genetic analysis in the current study and proved to be less costly, labor-intensive, and time-consuming than Sanger sequencing. Two de novo and 1 recurrent TSC2 mutations in patients with tuberous sclerosis complex were revealed. Clinical details of the patients are described and the underlying mechanisms of the 2 novel TSC2 mutations, c.245G>A (p.W82X) and c.5405_5408dupACTT (p.P1803Lfs*25), are discussed. These results add to the variability of the TSC mutation spectrum and suggest that targeted next-generation sequencing could be the primary choice over Sanger sequencing in future tuberous sclerosis complex genetic counseling.

  13. Transcriptome sequencing, and rapid development and application of SNP markers for the legume pod borer Maruca vitrata (Lepidoptera: Crambidae)

    Science.gov (United States)

    The legume pod borer, Maruca vitrata (Lepidoptera: Crambidae), is an insect pest species that is destructive to crops grown by subsistence farmers in tropical regions of West Africa. We present the de novo assembly of 3729 contigs from 454- and Sanger-derived sequencing reads for midgut, salivary, ...

  14. SSR-patchwork: An optimized protocol to obtain a rapid and inexpensive SSR library using first-generation sequencing technology

    Science.gov (United States)

    Di Maio, Antonietta; De Castro, Olga

    2013-01-01

    • Premise of the study: We have optimized a version of a microsatellite loci isolation protocol for first-generation sequencing (FGS) technologies. The protocol is optimized to reduce the cost and number of steps, and it combines some procedures from previous simple sequence repeat (SSR) protocols with several key improvements that significantly affect the final yield of the SSR library. This protocol may be accessible for laboratories with a moderate budget or for which next-generation sequencing (NGS) is not readily available. • Methods and Results: We drew from classic protocols for library enrichment by digestion, ligation, amplification, hybridization, cloning, and sequencing. Three different systems were chosen: two with very different genome sizes (Galdieria sulphuraria, 10 Mbp; Pancratium maritimum, 30 000 Mbp), and a third with an undetermined genome size (Kochia saxicola). Moreover, we also report the optimization of the sequencing reagents. A good frequency of the obtained microsatellite loci was achieved. • Conclusions: The method presented here is very detailed; comparative tests with other SSR protocols are also reported. This optimized protocol is a promising tool for low-cost genetic studies and the rapid, simple construction of homemade SSR libraries for small and large genomes. PMID:25202476

  15. SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome.

    Science.gov (United States)

    Stadermann, Kai Bernd; Weisshaar, Bernd; Holtgräwe, Daniela

    2015-09-16

    which we could demonstrate with Fosmid End Sequences (FES) generated with Sanger technology. Nevertheless, this limitation also applies to short read sequencing data but is reached in this case at a much earlier stage during finishing.

  16. Nanopore DNA sequencing using kinetic proofreading

    Science.gov (United States)

    Ling, Xinsheng

    We propose a method of DNA sequencing by combining the physical method of nanopore electrical measurements and Southern's sequencing-by-hybridization. The new key ingredient, essential to both lowering the costs and increasing the precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice, separated by a designed waiting time. Incorrect probes, which appear only once in nanopore ionic current traces, are discriminated from the correct ones, which appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio for gene transcription and translation processes. An error analysis of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger.
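
    The benefit of measuring each probe twice can be sketched with a toy calculation. This is not the error analysis from the record. It assumes that the two measurements are statistically independent and that a probe is accepted only if it is seen in both. The per-measurement probabilities are hypothetical values picked for illustration.

```python
def two_pass_rates(p_incorrect_seen: float, p_correct_seen: float):
    """Acceptance probabilities when a probe must be observed in both measurements.

    Assumes the two measurements are independent (an idealisation).
    Returns (false_accept_rate, true_accept_rate).
    """
    false_accept = p_incorrect_seen ** 2   # incorrect probe survives both passes
    true_accept = p_correct_seen ** 2      # correct probe survives both passes
    return false_accept, true_accept

# Hypothetical single-measurement probabilities.
false_rate, true_rate = two_pass_rates(p_incorrect_seen=0.05, p_correct_seen=0.99)
print(f"false accept: {false_rate:.4f}, true accept: {true_rate:.4f}")
```

    Under this independence assumption the false-acceptance rate is squared while correct probes are only slightly penalised, which is the same trade-off that kinetic proofreading exploits.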

  17. Application of next-generation sequencing technologies in molecular diagnostics

    Institute of Scientific and Technical Information of China (English)

    魏军; 赵志军

    2013-01-01

    DNA sequencing is a powerful approach for decoding human diseases, including cancers. The rapid development of next-generation sequencing (NGS) has greatly reduced the cost of sequencing and enabled high throughput, making it possible to obtain whole-genome sequences and complete genomic information for clinically diagnosed patients. However, the benefits offered by NGS technologies come with a number of challenges, chiefly how to make this technology a routine tool in clinical diagnosis. This article reviews the principles of the main NGS technology platforms, their applications in clinical diagnosis, and the challenges currently faced.

  18. Use of whole genome sequencing to determine the microevolution of Mycobacterium tuberculosis during an outbreak.

    Directory of Open Access Journals (Sweden)

    Midori Kato-Maeda

    Full Text Available RATIONALE: Current tools available to study the molecular epidemiology of tuberculosis do not provide information about the directionality and sequence of transmission for tuberculosis cases occurring over a short period of time, such as during an outbreak. Recently, whole genome sequencing has been used to study molecular epidemiology of Mycobacterium tuberculosis over short time periods. OBJECTIVE: To describe the microevolution of M. tuberculosis during an outbreak caused by one drug-susceptible strain. METHOD AND MEASUREMENTS: We included 9 patients with tuberculosis diagnosed during a period of 22 months, from a population-based study of the molecular epidemiology in San Francisco. Whole genome sequencing was performed using Illumina's sequencing by synthesis technology. A custom program written in Python was used to determine single nucleotide polymorphisms which were confirmed by PCR product Sanger sequencing. MAIN RESULTS: We obtained an average of 95.7% (94.1-96.9%) coverage for each isolate and an average fold read depth of 73 (1 to 250). We found 7 single nucleotide polymorphisms among the 9 isolates. The single nucleotide polymorphisms data confirmed all except one known epidemiological link. The outbreak strain resulted in 5 bacterial variants originating from the index case A1 with 0-2 mutations per transmission event that resulted in a secondary case. CONCLUSIONS: Whole genome sequencing analysis from a recent outbreak of tuberculosis enabled us to identify microevolutionary events observable during transmission, to determine 0-2 single nucleotide polymorphisms per transmission event that resulted in a secondary case, and to identify new epidemiologic links in the chain of transmission.

  19. Preparing a re-sequencing DNA library of 2 cancer candidate genes using the ligation-by-amplification protocol by two PCR reactions

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    To meet the needs of large-scale genomic/genetic studies, the next-generation massively parallelized sequencing technologies provide high-throughput, low-cost and less labor-intensive sequencing service, with subsequent bioinformatic software and laboratory methods developed to expand their applications in various types of research. PCR-based genomic/genetic studies, which have significant usage in association studies like cancer research, haven't benefited much from those next-generation sequencing technologies, because the shotgun re-sequencing strategy used by such sequencing machines as the Illumina/Solexa Genome Analyzer may not be applied to direct re-sequencing of short-length target regions like those in PCR-based genomic/genetic studies. Although several methods have been proposed to solve this problem, including microarray-based genomic selections and selector-based technologies, they require advanced equipment and procedures which limit their applications in many laboratories. By contrast, we overcame such potential drawbacks by utilizing a ligation by amplification (LBA) protocol, a method using a pair of Universal Adapters to randomly ligate target regions in a two-step PCR procedure, whose Long LBA products were easily fragmented and sequenced on the next-generation sequencing machine. In this proof-of-concept study, we chose the consensus coding sequences of two human cancer genes, BRCA1 and BRCA2, as target regions, and specifically designed LBA primer pairs to amplify and randomly ligate them. 70 target sequences were successfully amplified and ligated into Long LBA products, which were then fragmented to construct DNA libraries for sequencing on both a conventional Sanger sequencer, the ABI 3730xl DNA Analyzer, and the next-generation 'sequencing by synthesis' Illumina/Solexa Genome Analyzer. Bioinformatic analysis demonstrated the utility and efficiency (including the coverage and depth of each target sequence and the SNPs detection

  20. Next-generation sequencing in veterinary medicine: how can the massive amount of information arising from high-throughput technologies improve diagnosis, control, and management of infectious diseases?

    Science.gov (United States)

    Van Borm, Steven; Belák, Sándor; Freimanis, Graham; Fusaro, Alice; Granberg, Fredrik; Höper, Dirk; King, Donald P; Monne, Isabella; Orton, Richard; Rosseel, Toon

    2015-01-01

    The development of high-throughput molecular technologies and associated bioinformatics has dramatically changed the capacities of scientists to produce, handle, and analyze large amounts of genomic, transcriptomic, and proteomic data. A clear example of this step-change is represented by the amount of DNA sequence data that can be now produced using next-generation sequencing (NGS) platforms. Similarly, recent improvements in protein and peptide separation efficiencies and highly accurate mass spectrometry have promoted the identification and quantification of proteins in a given sample. These advancements in biotechnology have increasingly been applied to the study of animal infectious diseases and are beginning to revolutionize the way that biological and evolutionary processes can be studied at the molecular level. Studies have demonstrated the value of NGS technologies for molecular characterization, ranging from metagenomic characterization of unknown pathogens or microbial communities to molecular epidemiology and evolution of viral quasispecies. Moreover, high-throughput technologies now allow detailed studies of host-pathogen interactions at the level of their genomes (genomics), transcriptomes (transcriptomics), or proteomes (proteomics). Ultimately, the interaction between pathogen and host biological networks can be interrogated by analytically integrating these levels (integrative OMICS and systems biology). The application of high-throughput biotechnology platforms in these fields and their typically low cost per unit of information have revolutionized the resolution with which these processes can now be studied. The aim of this chapter is to provide a current and prospective view on the opportunities and challenges associated with the application of massively parallel sequencing technologies to veterinary medicine, with particular focus on applications that have a potential impact on disease control and management.

  1. Development and validation of a 36-gene sequencing assay for hereditary cancer risk assessment

    Science.gov (United States)

    Wang, Xin; Robertson, Alex D.; Haas, Kevin R.; Theilmann, Mark R.; Spurka, Lindsay; Grauman, Peter V.; Lai, Henry H.; Jeon, Diana; Haliburton, Genevieve; Leggett, Matt; Chu, Clement S.; Iori, Kevin; Maguire, Jared R.; Ready, Kaylene; Evans, Eric A.; Haque, Imran S.

    2017-01-01

    The past two decades have brought many important advances in our understanding of the hereditary susceptibility to cancer. Numerous studies have provided convincing evidence that identification of germline mutations associated with hereditary cancer syndromes can lead to reductions in morbidity and mortality through targeted risk management options. Additionally, advances in gene sequencing technology now permit the development of multigene hereditary cancer testing panels. Here, we describe the 2016 revision of the Counsyl Inherited Cancer Screen for detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy number variants (CNVs) in 36 genes associated with an elevated risk for breast, ovarian, colorectal, gastric, endometrial, pancreatic, thyroid, prostate, melanoma, and neuroendocrine cancers. To determine test accuracy and reproducibility, we performed a rigorous analytical validation across 341 samples, including 118 cell lines and 223 patient samples. The screen achieved 100% test sensitivity across different mutation types, with high specificity and 100% concordance with conventional Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). We also demonstrated the screen’s high intra-run and inter-run reproducibility and robust performance on blood and saliva specimens. Furthermore, we showed that pathogenic Alu element insertions can be accurately detected by our test. Overall, the validation in our clinical laboratory demonstrated the analytical performance required for collecting and reporting genetic information related to risk of developing hereditary cancers. PMID:28243543

  2. Development and validation of a 36-gene sequencing assay for hereditary cancer risk assessment

    Directory of Open Access Journals (Sweden)

    Valentina S. Vysotskaia

    2017-02-01

    The past two decades have brought many important advances in our understanding of the hereditary susceptibility to cancer. Numerous studies have provided convincing evidence that identification of germline mutations associated with hereditary cancer syndromes can lead to reductions in morbidity and mortality through targeted risk management options. Additionally, advances in gene sequencing technology now permit the development of multigene hereditary cancer testing panels. Here, we describe the 2016 revision of the Counsyl Inherited Cancer Screen for detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy number variants (CNVs) in 36 genes associated with an elevated risk for breast, ovarian, colorectal, gastric, endometrial, pancreatic, thyroid, prostate, melanoma, and neuroendocrine cancers. To determine test accuracy and reproducibility, we performed a rigorous analytical validation across 341 samples, including 118 cell lines and 223 patient samples. The screen achieved 100% test sensitivity across different mutation types, with high specificity and 100% concordance with conventional Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). We also demonstrated the screen’s high intra-run and inter-run reproducibility and robust performance on blood and saliva specimens. Furthermore, we showed that pathogenic Alu element insertions can be accurately detected by our test. Overall, the validation in our clinical laboratory demonstrated the analytical performance required for collecting and reporting genetic information related to risk of developing hereditary cancers.

  3. Subtle traps prediction using sequence stratigraphy and 3D seismic technology: A case study from Qikou depression in Huanghua basin

    Institute of Scientific and Technical Information of China (English)

    MAO Ning-bo; DAI Ta-gen; PENG Sheng-lin

    2005-01-01

    Forecasting subtle traps with sequence stratigraphy and 3D seismic data is a sensitive topic in hydrocarbon exploration, and research on subtle traps using geophysical data is both popular and difficult. Based on ample drilling, log, core and 3D seismic data, the sedimentary sequence of the Qikou depression, Huanghua basin, was partitioned using sequence stratigraphy theory. A model was built for each sedimentary sequence system, the sedimentary facies of subtle traps were identified, and the dominant factors forming subtle traps were analyzed. Sandstone seismic rock physics and its response were studied in the Tertiary System, and laws were established describing how the sandstone geophysical response and elastic modulus vary with pressure, temperature, porosity and depth. Experimental results and practice show that it is possible to forecast subtle traps using seismic information. Subtle trap objects were identified and interpreted by integrating geology, log and drilling data with special seismic processing and interpretation techniques, high-precision horizon calibration, 3D seismic visualization interpretation, seismic coherence analysis, attribute analysis, logging-constrained inversion and time-frequency analysis. Finally, the favorable subtle trap targets in this area were determined. Bottomland sand stratigraphic and lithologic reservoirs in the Qinan slope zone have been found by means of high-resolution 3D seismic field data techniques, high-resolution 3D seismic data processing and seismic wave impedance inversion.

  4. Learning with Technology: Video Modeling with Concrete-Representational-Abstract Sequencing for Students with Autism Spectrum Disorder

    Science.gov (United States)

    Yakubova, Gulnoza; Hughes, Elizabeth M.; Shinaberry, Megan

    2016-01-01

    The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the…

  5. Deciphering the microRNA signature of pathological cardiac hypertrophy by engineered heart tissue- and sequencing-technology.

    Science.gov (United States)

    Hirt, Marc N; Werner, Tessa; Indenbirken, Daniela; Alawi, Malik; Demin, Paul; Kunze, Ann-Cathrin; Stenzig, Justus; Starbatty, Jutta; Hansen, Arne; Fiedler, Jan; Thum, Thomas; Eschenhagen, Thomas

    2015-04-01

    Pathological cardiac hypertrophy and fibrosis are modulated by a set of microRNAs, most of which have been detected in biologically complex animal models of hypertrophy by arrays with moderate sensitivity and disregard of passenger strand (previously "star") microRNAs. Here, we aimed at precisely analyzing the microRNA signature of cardiac hypertrophy and fibrosis by RNA sequencing in a standardized in vitro hypertrophy model based on engineered heart tissue (EHT). Spontaneously beating, force-generating fibrin EHTs from neonatal rat heart cells were subjected to afterload enhancement for 7 days (AE-EHT), and EHTs without intervention served as controls. AE resulted in reduced contractile force and relaxation velocity, fibrotic changes and reactivation of the fetal gene program. Small RNAs were extracted from control and AE-EHTs and sequencing yielded almost 750 different mature microRNAs, many of which have never been described before in rats. The detection of both arms of the precursor stem-loop (pre-miRNA), namely -3p and -5p miRs, was frequent. 22 abundantly sequenced microRNAs were >1.3× upregulated and 15 abundantly sequenced microRNAs downregulated to hypertrophy and fibrotic response, recapitulating prior results in whole animals. Taken together, AE-induced pathological hypertrophy in EHTs is associated with 37 differentially regulated microRNAs, including many passenger strands. Antagonizing miR-21-5p ameliorates dysfunction in this model. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf

    2011-08-12

    The blind subterranean mole rat (Spalax ehrenbergi superspecies) is a model animal for survival under extreme environments due to its ability to live in underground habitats under severe hypoxic stress and darkness. Here we report the transcriptome sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly of the sequences yielded over 51,000 isotigs with homology to ~12,000 mouse, rat or human genes. Based on these results, it was possible to detect large numbers of splice variants, SNPs, and novel transcribed regions. In addition, multiple differential expression patterns were detected between tissues and treatments. The results presented here will serve as a valuable resource for future studies aimed at identifying genes and gene regions evolved during the adaptive radiation associated with underground life of the blind mole rat. 2011 Malik et al.

  7. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: utility and potential for the discovery of novel evolutionary patterns.

    Directory of Open Access Journals (Sweden)

    Assaf Malik

    The blind subterranean mole rat (Spalax ehrenbergi superspecies) is a model animal for survival under extreme environments due to its ability to live in underground habitats under severe hypoxic stress and darkness. Here we report the transcriptome sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly of the sequences yielded over 51,000 isotigs with homology to ∼12,000 mouse, rat or human genes. Based on these results, it was possible to detect large numbers of splice variants, SNPs, and novel transcribed regions. In addition, multiple differential expression patterns were detected between tissues and treatments. The results presented here will serve as a valuable resource for future studies aimed at identifying genes and gene regions evolved during the adaptive radiation associated with underground life of the blind mole rat.

  8. Technology to accelerate pangenomic scanning for unknown point mutations in exonic sequences: cycling temperature capillary electrophoresis (CTCE)

    Directory of Open Access Journals (Sweden)

    Bjørheim Jens

    2007-08-01

    Background: Rapid means to discover and enumerate unknown mutations in the exons of human genes on a pangenomic scale are needed to discover the genes carrying inherited risk for common diseases or the genes in which somatic mutations are required for clonal diseases such as atherosclerosis and cancers. The method of constant denaturing capillary electrophoresis (CDCE) permitted sensitive detection and enumeration of unknown point mutations, but labor-intensive optimization procedures for each exonic sequence made it impractical for application at a pangenomic scale. Results: A variant denaturing capillary electrophoresis protocol, cycling temperature capillary electrophoresis (CTCE), has eliminated the need for the laboratory optimization of separation conditions for each target sequence. Here we report the separation of wild-type and mutant homoduplexes from wild-type/mutant heteroduplexes for 27 randomly chosen target sequences without any laboratory optimization steps. Calculation of the equilibrium melting map of each target sequence attached to a high melting domain (clamp) was sufficient to design the analyte sequence and predict the expected degree of resolution. Conclusion: CTCE provides practical means for economical pangenomic detection and enumeration of point mutations in large-scale human case/control cohort studies. We estimate that the combined reagent, instrumentation and labor costs for scanning the ~250,000 exons and splice sites of the ~25,000 human protein-coding genes using automated CTCE instruments in 100 case cohorts of 10,000 individuals each are now less than U.S. $500 million, less than U.S. $500 per person.
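
    The cost estimate that closes this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the per-exon figure is a derived quantity that the abstract does not state, and all variable names are illustrative.

```python
# Figures quoted in the abstract (upper bounds and approximations).
cohorts = 100
individuals_per_cohort = 10_000
total_cost_usd = 500_000_000      # "less than U.S. $500 million"
exons_per_person = 250_000        # "~250,000 exons and splice sites"

# Total number of people scanned across all cohorts.
individuals = cohorts * individuals_per_cohort

# Upper bound on cost per person, matching the abstract's "$500 per person".
cost_per_person = total_cost_usd / individuals

# Derived here, not stated in the abstract: cost per exon scan per person.
cost_per_exon_scan = cost_per_person / exons_per_person

print(individuals)
print(cost_per_person)
print(cost_per_exon_scan)
```

    Under these figures the two bounds in the abstract agree with each other: one million individuals at U.S. $500 each gives U.S. $500 million.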

  9. Design of a high density SNP genotyping assay in the pig using SNPs identified and characterized by next generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    Antonio M Ramos

    BACKGROUND: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. METHODOLOGY/PRINCIPAL FINDINGS: A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. CONCLUSIONS/SIGNIFICANCE: Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs.
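
    The conversion success rate reported above follows from the counts in the abstract. A minimal sketch, assuming the rate is defined as polymorphic loci divided by reliably scored loci (which is how the sentence reads); the scorable fraction is computed for context only and is not a figure the abstract reports.

```python
# Counts quoted in the abstract.
snps_on_chip = 64_232
scorable_loci = 62_621
polymorphic_loci = 58_994

# Conversion success rate: polymorphic loci among those that could be scored.
conversion_rate = polymorphic_loci / scorable_loci

# Fraction of chip content that could be reliably scored (derived, for context).
scorable_fraction = scorable_loci / snps_on_chip

print(f"conversion success rate: {conversion_rate:.1%}")
print(f"scorable fraction of chip: {scorable_fraction:.1%}")
```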

  10. Deep sequencing: becoming a critical tool in clinical virology.

    Science.gov (United States)

    Quiñones-Mateu, Miguel E; Avila, Santiago; Reyes-Teran, Gustavo; Martinez, Miguel A

    2014-09-01

    Population (Sanger) sequencing has been the standard method in basic and clinical DNA sequencing for almost 40 years; however, next-generation (deep) sequencing methodologies are now revolutionizing the field of genomics, and clinical virology is no exception. Deep sequencing is highly efficient, producing an enormous amount of information at low cost in a relatively short period of time. High-throughput sequencing techniques have enabled significant contributions to multiple areas in virology, including virus discovery and metagenomics (viromes), molecular epidemiology, pathogenesis, and studies of how viruses escape the host immune system and antiviral pressures. In addition, new and more affordable deep sequencing-based assays are now being implemented in clinical laboratories. Here, we review the use of the current deep sequencing platforms in virology, focusing on three of the most studied viruses: human immunodeficiency virus (HIV), hepatitis C virus (HCV), and influenza virus. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. DNA Sequencing Technologies within the Chemical and Biological Defense Enterprise: How to Position the Department of Defense to Maximize the Use of These Emerging Technologies - JUPITR

    Science.gov (United States)

    2015-07-01

    computers, tablets, and smartphones have stretched the bounds of how we perceive and use communications data, the rapidly evolving science of DNA...included to ensure the single deployed platform is replaced when scientific evidence dictates the need. 3. RECOMMENDATIONS FOR THE FUTURE OF DNA...actionable information. 4.4 DNA as Archival Storage Material The evaluation of DNA technologies to support or replace modern long-term data

  12. Identification of fruit related microRNAs in cucumber (Cucumis sativus L.) using high-throughput sequencing technology.

    Science.gov (United States)

    Ye, Xueling; Song, Tiefeng; Liu, Chang; Feng, Hui; Liu, Zhiyong

    2014-12-01

    MicroRNAs (miRNAs) are approximately 21 nt noncoding RNAs that influence the phenotypes of different species through the post-transcriptional regulation of gene expression. Although many miRNAs have been identified in a few model plants, less is known about miRNAs specific to cucumber (Cucumis sativus L.). In this study, two libraries of cucumber RNA, one based on fruit samples and another based on mixed samples from leaves, stems, and roots, were prepared for deep-sequencing. A total of 110 sequences were matched to known miRNAs in 47 families, while 56 sequences in 46 families are newly identified in cucumber. Of these, 77 known and 44 new miRNAs were differentially expressed, with a fold-change of at least 2 and p-value < 0.05. In addition, we predicted the potential targets of known and new miRNAs. The identification and characterization of known and new miRNAs will enable us to better understand the role of these miRNAs in the formation of cucumber fruit.
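
    The selection rule quoted in this abstract (fold-change of at least 2 and p-value < 0.05) can be sketched as a simple filter. This is an illustration, not the authors' pipeline: the records and miRNA names are hypothetical, and treating the fold-change cutoff as symmetric (2-fold up or 2-fold down) is an assumption.

```python
import math

def is_differentially_expressed(fold_change, p_value, min_fold=2.0, alpha=0.05):
    """True if the change is at least min_fold in either direction and p < alpha."""
    # Working on the log2 scale makes up- and down-regulation symmetric.
    return abs(math.log2(fold_change)) >= math.log2(min_fold) and p_value < alpha

# Hypothetical records: (name, fold change of fruit vs. mixed tissue, p-value).
records = [
    ("miR-example-a", 4.0, 0.001),   # 4-fold up, significant
    ("miR-example-b", 1.5, 0.001),   # change too small
    ("miR-example-c", 0.25, 0.01),   # 4-fold down, significant
    ("miR-example-d", 8.0, 0.20),    # large change, not significant
]

selected = [name for name, fc, p in records if is_differentially_expressed(fc, p)]
print(selected)
```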

  13. Development of novel, cross-species microsatellite markers for Acropora corals using next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Chuya Shinzato

    2014-05-01

    The genus Acropora (Scleractinia, Acroporidae) is one of the most widespread coral genera, comprising the largest number of extant species among scleractinian (reef-building) corals. Molecular phylogenetic studies have suggested that A. tenuis belongs to the most basal clade (clade I) while A. digitifera belongs to a derived clade (clade IV). In order to develop microsatellite markers that would be useful for most Acropora species, we sequenced the genomic DNA of A. tenuis, using a next generation sequencer (Illumina MiSeq), and designed primer sets that amplify microsatellite loci. Afterward we selected primer pairs with perfectly matched nucleotide sequences from which at least one primer was uniquely mapped to the A. digitifera genome. Fourteen microsatellite markers showed non-significant departure from Hardy–Weinberg equilibrium (HWE) in both A. tenuis and A. digitifera. Thus these markers could be used for a wide range of species and may provide powerful tools for population genetics studies and conservation of Acropora corals.

  14. Investigating microbial eukaryotic diversity from a global census: insights from a comparison of pyrotag and full-length sequences of 18S rRNA genes.

    Science.gov (United States)

    Lie, Alle A Y; Liu, Zhenfeng; Hu, Sarah K; Jones, Adriane C; Kim, Diane Y; Countway, Peter D; Amaral-Zettler, Linda A; Cary, S Craig; Sherr, Evelyn B; Sherr, Barry F; Gast, Rebecca J; Caron, David A

    2014-07-01

    Next-generation DNA sequencing (NGS) approaches are rapidly surpassing Sanger sequencing for characterizing the diversity of natural microbial communities. Despite this rapid transition, few comparisons exist between Sanger sequences and the generally much shorter reads of NGS. Operational taxonomic units (OTUs) derived from full-length (Sanger sequencing) and pyrotag (454 sequencing of the V9 hypervariable region) sequences of 18S rRNA genes from 10 global samples were analyzed in order to compare the resulting protistan community structures and species richness. Pyrotag OTUs called at 98% sequence similarity yielded numbers of OTUs that were similar overall to those for full-length sequences when the latter were called at 97% similarity. Singleton OTUs strongly influenced estimates of species richness but not the higher-level taxonomic composition of the community. The pyrotag and full-length sequence data sets had slightly different taxonomic compositions of rhizarians, stramenopiles, cryptophytes, and haptophytes, but the two data sets had similarly high compositions of alveolates. Pyrotag-based OTUs were often derived from sequences that mapped to multiple full-length OTUs at 100% similarity. Thus, pyrotags sequenced from a single hypervariable region might not be appropriate for establishing protistan species-level OTUs. However, nonmetric multidimensional scaling plots constructed with the two data sets yielded similar clusters, indicating that beta diversity analysis results were similar for the Sanger and NGS sequences. Short pyrotag sequences can provide holistic assessments of protistan communities, although care must be taken in interpreting the results. The longer reads (>500 bp) that are now becoming available through NGS should provide powerful tools for assessing the diversity of microbial eukaryotic assemblages.
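
    The effect of the similarity threshold on OTU counts, which this abstract compares at 97% and 98%, can be illustrated with a toy greedy clustering. This is a minimal sketch and not the method used in the study: it assumes the sequences are already aligned to equal length, uses plain positional identity, and the function names and toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold):
    """Put each sequence in the first OTU whose centroid it matches at >= threshold."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            # No centroid was close enough: this sequence founds a new OTU.
            centroids.append(s)
            clusters.append([s])
    return clusters

# Toy 100-nt sequences: a reference plus variants with 2 and 3 mismatches.
base = "ACGT" * 25
two_mismatches = "TT" + base[2:]      # 98% identical to base
three_mismatches = "TTT" + base[3:]   # 97% identical to base
seqs = [base, two_mismatches, three_mismatches]

otus_at_98 = greedy_otus(seqs, 0.98)
otus_at_97 = greedy_otus(seqs, 0.97)
print(len(otus_at_98), len(otus_at_97))
```

    In this toy case the stricter 98% threshold splits off the three-mismatch variant as its own OTU, while the 97% threshold merges all three sequences, which is the kind of threshold sensitivity the abstract discusses.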

  15. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.
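
    To put the reported error-rate reduction (0.69% to 0.027% for the V1-V9 region) in perspective, the sketch below computes the fold reduction and the expected number of errors per read. The Phred-scale conversion Q = -10 log10(p) is a standard convention added here for illustration and is not stated in the abstract; the 1,500-nt read length is an assumed round figure for a full-length 16S rRNA gene.

```python
import math

# Error rates quoted in the abstract, converted from percent to fractions.
error_before = 0.69 / 100
error_after = 0.027 / 100

# How many times lower the curated error rate is.
fold_reduction = error_before / error_after

def phred(p):
    """Phred-scaled quality for an error probability p (standard convention)."""
    return -10 * math.log10(p)

q_before = phred(error_before)
q_after = phred(error_after)

# Assumed read length for a near full-length 16S amplicon (hypothetical).
read_length = 1500
expected_errors_before = error_before * read_length
expected_errors_after = error_after * read_length

print(f"fold reduction: {fold_reduction:.1f}")
print(f"Phred-equivalent quality: {q_before:.1f} -> {q_after:.1f}")
print(f"expected errors per read: {expected_errors_before:.2f} -> {expected_errors_after:.2f}")
```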

  16. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    Directory of Open Access Journals (Sweden)

    Patrick D. Schloss

    2016-03-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  17. Transcriptome response to pollutants and insecticides in the dengue vector Aedes aegypti using next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Riaz Muhammad

    2010-03-01

    Background: The control of mosquitoes transmitting infectious diseases relies mainly on the use of chemical insecticides. However, mosquito control programs are now threatened by the emergence of insecticide resistance. Hitherto, most research efforts have been focused on elucidating the molecular basis of inherited resistance. Less attention has been paid to the short-term response of mosquitoes to insecticides and pollutants which could have a significant impact on insecticide efficacy. Here, a combination of LongSAGE and Solexa sequencing was used to perform a deep transcriptome analysis of larvae of the dengue vector Aedes aegypti exposed for 48 h to sub-lethal doses of three chemical insecticides and three anthropogenic pollutants. Results: Thirty million 20 bp cDNA tags were sequenced, mapped to the mosquito genome and clustered, representing 6850 known genes and 4868 additional clusters not located within predicted genes. Mosquitoes exposed to insecticides or anthropogenic pollutants showed considerable modifications of their transcriptome. Genes encoding cuticular proteins, transporters, and enzymes involved in the mitochondrial respiratory chain and detoxification processes were particularly affected. Genes and molecular mechanisms potentially involved in xenobiotic response and insecticide tolerance were identified. Conclusions: The method used in the present study appears to be a powerful approach for investigating fine transcriptome variations in genome-sequenced organisms and can provide useful information for the detection of novel transcripts. At the biological level, despite low concentrations and no apparent phenotypic effects, the significant impact of these xenobiotics on mosquito transcriptomes raises important questions about the 'hidden impact' of anthropogenic pollutants on ecosystems and consequences on vector control.

  18. Existing and emerging detection technologies for DNA (Deoxyribonucleic Acid) finger printing, sequencing, bio- and analytical chips: a multidisciplinary development unifying molecular biology, chemical and electronics engineering.

    Science.gov (United States)

    Kumar Khanna, Vinod

    2007-01-01

    The current status and research trends of detection techniques for DNA-based analysis such as DNA finger printing, sequencing, biochips and allied fields are examined. An overview of main detectors is presented vis-à-vis these DNA operations. The biochip method is explained, the role of micro- and nanoelectronic technologies in biochip realization is highlighted, various optical and electrical detection principles employed in biochips are indicated, and the operational mechanisms of these detection devices are described. Although a diversity of biochips for diagnostic and therapeutic applications has been demonstrated in research laboratories worldwide, only some of these chips have entered the clinical market, and more chips are awaiting commercialization. The necessity of tagging is eliminated in refractive-index change based devices, but the basic flaw of indirect nature of most detection methodologies can only be overcome by generic and/or reagentless DNA sensors such as the conductance-based approach and the DNA-single electron transistor (DNA-SET) structure. Devices of the electrical detection-based category are expected to pave the pathway for the next-generation DNA chips. The review provides a comprehensive coverage of the detection technologies for DNA finger printing, sequencing and related techniques, encompassing a variety of methods from the primitive art to the state-of-the-art scenario as well as promising methods for the future.

  19. Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins.

    Science.gov (United States)

    Fujita, Toshitsugu; Yuno, Miyuki; Fujii, Hodaka

    2016-04-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) system is widely used for various biological applications, including genome editing. We developed engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) using CRISPR to isolate target genomic regions from cells for their biochemical characterization. In this study, we developed 'in vitro enChIP' using recombinant CRISPR ribonucleoproteins (RNPs) to isolate target genomic regions. in vitro enChIP has the great advantage over conventional enChIP of not requiring expression of CRISPR complexes in cells. We first showed that in vitro enChIP using recombinant CRISPR RNPs can be used to isolate target DNA from mixtures of purified DNA in a sequence-specific manner. In addition, we showed that this technology can be used to efficiently isolate target genomic regions, while retaining their intracellular molecular interactions, with negligible contamination from irrelevant genomic regions. Thus, in vitro enChIP technology is of potential use for sequence-specific isolation of DNA, as well as for identification of molecules interacting with genomic regions of interest in vivo in combination with downstream analysis.

  20. Sequencing bias: comparison of different protocols of MicroRNA library construction

    Directory of Open Access Journals (Sweden)

    Tian Geng

    2010-09-01

    Background: MicroRNAs (miRNAs) are 18-25 nt small RNAs playing critical roles in many biological processes. The majority of known miRNAs were discovered by conventional cloning and a Sanger sequencing approach. The next-generation sequencing (NGS) technologies enable in-depth characterization of the global repertoire of miRNAs, and different protocols for miRNA library construction have been developed. However, the possible bias between the relative expression levels and sequences introduced by different protocols of library preparation have rarely been explored. Results: We assessed three different miRNA library preparation protocols, SOLiD, Illumina versions 1 and 1.5, using cloning or SBS sequencing of total RNA samples extracted from skeletal muscles from Hu sheep and Dorper sheep, and then validated 9 miRNAs by qRT-PCR. Our results show that SBS sequencing data highly correlate with Illumina cloning data. The SOLiD data, when compared to Illumina's, indicate more dispersed distribution of length, higher frequency variation for nucleotides near the 3'- and 5'-ends, higher frequency occurrence for reads containing end secondary structure (ESS), and higher frequency for reads that do not map to known miRNAs. qRT-PCR results showed the best correlation with SOLiD cloning data. Fold difference of Hu sheep and Dorper sheep between qRT-PCR result and SBS sequencing data correlated well (r = 0.937), and fold difference of miR-1 and miR-206 among SOLiD cloning data, qRT-PCR and SBS sequencing data was similar. Conclusions: The sequencing depth can influence the quantitative measurement of miRNA abundance, but the discrepancy caused by it was not statistically significant as high correlation was observed between Illumina cloning and SBS sequencing data. Bias of length distribution, sequence variation, and ESS was observed between data obtained with the different protocols. SOLiD cloning data differ from Illumina cloning data mainly because of

  1. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  2. Technology.

    Science.gov (United States)

    Online-Offline, 1998

    1998-01-01

    Focuses on technology, on advances in such areas as aeronautics, electronics, physics, the space sciences, as well as computers and the attendant progress in medicine, robotics, and artificial intelligence. Describes educational resources for elementary and middle school students, including Web sites, CD-ROMs and software, videotapes, books,…

  3. Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies

    Directory of Open Access Journals (Sweden)

    Machado Moara

    2011-02-01

    Full Text Available Abstract Background Targeted re-sequencing is one of the most powerful and widely used strategies for population genetics studies because it allows an unbiased screening for variation that is suitable for a wide variety of organisms. Examples of studies that require re-sequencing data are evolutionary inferences, epidemiological studies designed to capture rare polymorphisms responsible for complex traits and screenings for mutations in families and small populations with high incidences of specific genetic diseases. Despite the advent of next-generation sequencing technologies, Sanger sequencing is still the most popular approach in population genetics studies because of the widespread availability of automatic sequencers based on capillary electrophoresis and because it is still less prone to sequencing errors, which is critical in population genetics studies. Two popular software applications for re-sequencing studies are Phred-Phrap-Consed-Polyphred, which performs base calling, alignment, graphical edition and genotype calling and DNAsp, which performs a set of population genetics analyses. These independent tools are the start and end points of basic analyses. In between the use of these tools, there is a set of basic but error-prone tasks to be performed with re-sequencing data. Results In order to assist with these intermediate tasks, we developed a pipeline that facilitates data handling typical of re-sequencing studies. Our pipeline: (1) consolidates different outputs produced by distinct Phred-Phrap-Consed contigs sharing a reference sequence; (2) checks for genotyping inconsistencies; (3) reformats genotyping data produced by Polyphred into a matrix of genotypes with individuals as rows and segregating sites as columns; (4) prepares input files for haplotype inferences using the popular software PHASE; and (5) handles PHASE output files that contain only polymorphic sites to reconstruct the inferred haplotypes including polymorphic and

  4. Global Analysis of Non-coding Small RNAs in Arabidopsis in Response to Jasmonate Treatment by Deep Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Bosen Zhang; Zhiping Jin; Daoxin Xie

    2012-01-01

    In plants, non-coding small RNAs play a vital role in plant development and stress responses. To explore the possible role of non-coding small RNAs in the regulation of the jasmonate (JA) pathway, we compared the non-coding small RNAs between the JA-deficient aos mutant and the JA-treated wild type Arabidopsis via high-throughput sequencing. Thirty new miRNAs and 27 new miRNA candidates were identified through a bioinformatics approach. Forty-nine known miRNAs (belonging to 24 families), 15 new miRNAs and new miRNA candidates (belonging to 11 families) and 3 tasiRNA families were induced by JA, whereas 1 new miRNA, 1 tasiRNA family and 22 known miRNAs (belonging to 9 families) were repressed by JA.

  5. Multiplexed microsatellite recovery using massively parallel sequencing.

    Science.gov (United States)

    Jennings, T N; Knaus, B J; Mullins, T D; Haig, S M; Cronn, R C

    2011-11-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356,958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5 M (USD).
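
    The cost figures quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the three constants are taken from the abstract, while the function names are hypothetical. It suggests that the "under $0.5 M" total holds only if the realised per-species cost sits somewhat below the $400 ceiling, at roughly $364 or less.

```python
# Worked check of the cost figures quoted in the abstract above.
SPECIES_LISTED = 1373      # listed 'threatened' and 'endangered' organisms (from the abstract)
COST_CEILING_USD = 400     # "less than $400 (USD) per species" (from the abstract)
BUDGET_USD = 500_000       # "under $0.5 M (USD)" (from the abstract)


def total_cost(n_species: int, cost_per_species: float) -> float:
    """Total cost if every species costs the same amount (hypothetical helper)."""
    return n_species * cost_per_species


def max_cost_per_species(budget: float, n_species: int) -> float:
    """Largest uniform per-species cost that still fits the budget (hypothetical helper)."""
    return budget / n_species


upper_bound = total_cost(SPECIES_LISTED, COST_CEILING_USD)
break_even = max_cost_per_species(BUDGET_USD, SPECIES_LISTED)

print(f"Total at the $400 ceiling: ${upper_bound:,.0f}")
print(f"Per-species cost that fits $0.5 M: ${break_even:,.2f}")
```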

  6. Next-generation phylogeography: a targeted approach for multilocus sequencing of non-model organisms.

    Directory of Open Access Journals (Sweden)

    Jonathan B Puritz

    Full Text Available The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers.

  7. Next-Generation Phylogeography: A Targeted Approach for Multilocus Sequencing of Non-Model Organisms

    Science.gov (United States)

    Puritz, Jonathan B.; Addison, Jason A.; Toonen, Robert J.

    2012-01-01

    The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers. PMID:22470543

  8. DNA sequencing with capillary electrophoresis and single cell analysis with mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Fung, N.

    1998-03-27

    Since the first demonstration of the laser in the 1960s, lasers have found numerous applications in analytical chemistry. In this work, two different applications are described, namely, DNA sequencing with capillary gel electrophoresis and single cell analysis with mass spectrometry. Two projects are described in which high-speed DNA separations with capillary gel electrophoresis were demonstrated. In the third project, flow cytometry and mass spectrometry were coupled via a laser vaporization/ionization interface and individual mammalian cells were analyzed. First, DNA Sanger fragments were separated by capillary gel electrophoresis. A separation speed of 20 basepairs per minute was demonstrated with a mixed poly(ethylene oxide) (PEO) sieving solution. In addition, a new capillary wall treatment protocol was developed in which bare (or uncoated) capillaries can be used in DNA sequencing. Second, a temperature programming scheme was used to separate DNA Sanger fragments. Third, flow cytometry and mass spectrometry were coupled with a laser vaporization/ionization interface.

  9. Technology

    Directory of Open Access Journals (Sweden)

    Xu Jing

    2016-01-01

    Full Text Available The traditional method of reading answer cards relies on an OMR (Optical Mark Reader), which requires dedicated cards, is less versatile, and is costly. To address these problems, a pattern-recognition-based method for answer card identification is proposed. The Line Segment Detector is used to detect tilt in the image; tilted images are corrected by rotation, and the answers on the answer sheet are then located and detected. Pattern recognition technology enables automatic reading with high accuracy and faster detection.

  10. Towards Decrypting Cryptobiosis—Analyzing Anhydrobiosis in the Tardigrade Milnesium tardigradum Using Transcriptome Sequencing

    OpenAIRE

    Chong Wang; Grohme, Markus A.; Brahim Mali; Schill, Ralph O.; Marcus Frohme

    2014-01-01

    BACKGROUND: Many tardigrade species are capable of anhydrobiosis; however, mechanisms underlying their extreme desiccation resistance remain elusive. This study attempts to quantify the anhydrobiotic transcriptome of the limno-terrestrial tardigrade Milnesium tardigradum. RESULTS: A prerequisite for differential gene expression analysis was the generation of a reference hybrid transcriptome atlas by assembly of Sanger, 454 and Illumina sequence data. The final assembly yielded 79,064 contigs ...

  11. Whole Exome Sequencing Identifies a Novel and a Recurrent Mutation in BBS2 Gene in a Family with Bardet-Biedl Syndrome

    Directory of Open Access Journals (Sweden)

    Yong Mong Bee

    2015-01-01

    Full Text Available Bardet-Biedl syndrome (BBS) is a rare autosomal recessive disorder known to be caused by mutations in at least 19 BBS genes. We report the genetic analysis of a patient with indisputable features of BBS including cardinal features such as postaxial polydactyly, retinitis pigmentosa, obesity, and kidney failure. Taking advantage of next-generation sequencing technology, we applied whole exome sequencing (WES) with Sanger direct sequencing to the proband and her unaffected mother. A pair of heterozygous nonsense mutations in the BBS2 gene was identified in the proband, one being novel and the other recurrent. The novel mutation, p.Y644X, resides in exon 16 and was also found in the heterozygous state in the mother. This mutation is not currently found in the dbSNP and 1000 Genome SNP databases and is predicted to be disease causing by in silico analysis. This study highlights the potential for a rapid and precise detection of disease causing gene using WES in genetically heterogeneous disorders such as BBS.

  12. Panel-based next generation sequencing as a reliable and efficient technique to detect mutations in unselected patients with retinal dystrophies

    Science.gov (United States)

    Glöckle, Nicola; Kohl, Susanne; Mohr, Julia; Scheurenbrand, Tim; Sprecher, Andrea; Weisschuh, Nicole; Bernd, Antje; Rudolph, Günther; Schubach, Max; Poloschek, Charlotte; Zrenner, Eberhart; Biskup, Saskia; Berger, Wolfgang; Wissinger, Bernd; Neidhardt, John

    2014-01-01

    Hereditary retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different forms of RD can be caused by mutations in >100 genes, including >1600 exons. Consequently, next generation sequencing (NGS) technologies are among the most promising approaches to identify mutations in RD. So far, NGS is not routinely used in gene diagnostics. We developed a diagnostic NGS pipeline to identify mutations in 170 genetically and clinically unselected RD patients. NGS was applied to 105 RD-associated genes. Underrepresented regions were examined by Sanger sequencing. The NGS approach was successfully established using cases with known sequence alterations. Depending on the initial clinical diagnosis, we identified likely causative mutations in 55% of retinitis pigmentosa and 80% of Bardet–Biedl or Usher syndrome cases. Seventy-one novel mutations in 40 genes were newly associated with RD. The genes USH2A, EYS, ABCA4, and RHO were more frequently affected than others. Occasionally, cases carried mutations in more than one RD-associated gene. In addition, we found possible dominant de-novo mutations in cases with sporadic RD, which implies consequences for counseling of patients and families. NGS-based mutation analyses are reliable and cost-efficient approaches in gene diagnostics of genetically heterogeneous diseases like RD. PMID:23591405

  13. Targeted, high-depth, next-generation sequencing of cancer genes in formalin-fixed, paraffin-embedded and fine-needle aspiration tumor specimens.

    Science.gov (United States)

    Hadd, Andrew G; Houghton, Jeff; Choudhary, Ashish; Sah, Sachin; Chen, Liangjing; Marko, Adam C; Sanford, Tiffany; Buddavarapu, Kalyan; Krosting, Julie; Garmire, Lana; Wylie, Dennis; Shinde, Rupali; Beaudenon, Sylvie; Alexander, Erik K; Mambo, Elizabeth; Adai, Alex T; Latham, Gary J

    2013-03-01

    Implementation of highly sophisticated technologies, such as next-generation sequencing (NGS), into routine clinical practice requires compatibility with common tumor biopsy types, such as formalin-fixed, paraffin-embedded (FFPE) and fine-needle aspiration specimens, and validation metrics for platforms, controls, and data analysis pipelines. In this study, a two-step PCR enrichment workflow was used to assess 540 known cancer-relevant variants in 16 oncogenes for high-depth sequencing in tumor samples on either mature (Illumina GAIIx) or emerging (Ion Torrent PGM) NGS platforms. The results revealed that the background noise of variant detection was elevated approximately twofold in FFPE compared with cell line DNA. Bioinformatic algorithms were optimized to accommodate this background. Variant calls from 38 residual clinical colorectal cancer FFPE specimens and 10 thyroid fine-needle aspiration specimens were compared across multiple cancer genes, resulting in an accuracy of 96.1% (95% CI, 96.1% to 99.3%) compared with Sanger sequencing, and 99.6% (95% CI, 97.9% to 99.9%) compared with an alternative method with an analytical sensitivity of 1% mutation detection. A total of 45 of 48 samples were concordant between NGS platforms across all matched regions, with the three discordant calls each represented at <10% of reads. Consequently, NGS of targeted oncogenes in real-life tumor specimens using distinct platforms addresses unmet needs for unbiased and highly sensitive mutation detection and can accelerate both basic and clinical cancer research.

  14. High-throughput sequencing technology to reveal the composition and function of cecal microbiota in Dagu chicken.

    Science.gov (United States)

    Xu, Yunhe; Yang, Huixin; Zhang, Lili; Su, Yuhong; Shi, Donghui; Xiao, Haidi; Tian, Yumin

    2016-11-04

    The chicken gut microbiota is an important and complicated ecosystem for the host. It plays an important role in converting food into nutrients and energy. The coding capacity of the microbiome vastly surpasses that of the host's genome, encoding biochemical pathways that the host has not developed. An optimal gut microbiota can increase agricultural productivity. This study aims to explore the composition and function of cecal microbiota in Dagu chicken under two feeding modes, free-range (outdoor, OD) and cage (indoor, ID) raising. Cecal samples were collected from 24 chickens across 4 groups (12-w OD, 12-w ID, 18-w OD, and 18-w ID). We performed high-throughput sequencing of the 16S rRNA gene V4 hypervariable regions to characterize the cecal microbiota of Dagu chicken and compare the difference of cecal microbiota between free-range and cage raising chickens. We found 34 special operational taxonomic units (OTUs) in the OD groups and 4 special OTUs in the ID groups. 24 phyla were shared by the 24 samples. Bacteroidetes was the most abundant phylum with the largest proportion, followed by Firmicutes and Proteobacteria. The OD groups showed a higher proportion of Bacteroidetes (>50 %) in cecum, but a lower Firmicutes/Bacteroidetes ratio in both 12-w old (0.42, 0.62) and 18-w old groups (0.37, 0.49) compared with the ID groups. Cecal microbiota in the OD groups have higher abundance of functions involved in amino acids and glycan metabolic pathway. The composition and function of cecal microbiota in Dagu chicken under two feeding modes, free-range and cage raising are different. The cage raising mode showed a lower proportion of Bacteroidetes in cecum, but a higher Firmicutes/Bacteroidetes ratio compared with free-range mode. Cecal microbiota in free-range mode have higher abundance of functions involved in amino acids and glycan metabolic pathway.

  15. Pooled deep sequencing of Plasmodium falciparum isolates: an efficient and scalable tool to quantify prevailing malaria drug-resistance genotypes.

    Science.gov (United States)

    Taylor, Steve M; Parobek, Christian M; Aragam, Nash; Ngasala, Billy E; Mårtensson, Andreas; Meshnick, Steven R; Juliano, Jonathan J

    2013-12-15

    Molecular surveillance for drug-resistant malaria parasites requires reliable, timely, and scalable methods. These data may be efficiently produced by genotyping parasite populations using second-generation sequencing (SGS). We designed and validated a SGS protocol to quantify mutant allele frequencies in the Plasmodium falciparum genes dhfr and dhps in mixed isolates. We applied this new protocol to field isolates from children and compared it to standard genotyping using Sanger sequencing. The SGS protocol accurately quantified dhfr and dhps allele frequencies in a mixture of parasite strains. Using SGS of DNA that was extracted and then pooled from individual isolates, we estimated mutant allele frequencies that were closely correlated to those estimated by Sanger sequencing (correlations, >0.98). The SGS protocol obviated most molecular steps in conventional methods and is cost saving for parasite populations >50. This SGS genotyping method efficiently and reproducibly estimates parasite allele frequencies within populations of P. falciparum for molecular epidemiologic studies.
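
    As a rough illustration of the comparison described in this abstract, the sketch below estimates mutant allele frequencies from pooled read counts and correlates them with frequencies from a second method. It is a minimal sketch under stated assumptions: all read counts and reference frequencies are made-up toy values, not data from the study, and the function names are hypothetical. The abstract does not describe the study's actual pipeline or statistics.

```python
import math
from typing import Sequence


def mutant_allele_frequency(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must lie between 0 and total_reads")
    return mutant_reads / total_reads


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Toy (mutant reads, total reads) pairs for four hypothetical loci in one pool.
pooled_counts = [(250, 1000), (480, 1000), (905, 1000), (30, 1000)]
pooled_freqs = [mutant_allele_frequency(m, t) for m, t in pooled_counts]

# Toy frequencies for the same loci from per-isolate genotyping (hypothetical).
reference_freqs = [0.24, 0.50, 0.92, 0.02]

print("Pooled estimates:", pooled_freqs)
print("Correlation with reference:", round(pearson_r(pooled_freqs, reference_freqs), 3))
```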

  16. Implementation of Targeted Next Generation Sequencing in Clinical Diagnostics

    DEFF Research Database (Denmark)

    Larsen, Martin Jakob; Burton, Mark; Thomassen, Mads;

    Accurate mutation detection is essential in clinical genetic diagnostics of monogenic hereditary diseases. Targeted next generation sequencing (NGS) provides a promising and cost-effective alternative to Sanger sequencing and MLPA analysis currently used in most diagnostic laboratories. One...... advantage of targeted NGS is that multiple disease-specific genes can easily be sequenced simultaneously, which is favorable in genetic heterogeneous diseases. Prior to implementation in our diagnostic setting, we aimed to assess the sensitivity and specificity of targeted NGS by sequencing a collection......, respectively. For diagnostics, the sequencing coverage is essential, wherefore a minimum coverage of 30x per nucleotide in the coding regions was used as our primary quality criterion. For the majority of the included genes, we obtained adequate gene coverage, in which we were able to detect 100% of the known...

  17. De novo Transcriptome Generation and Annotation for Two Korean Endemic Land Snails, Aegista chejuensis and Aegista quelpartensis, Using Illumina Paired-End Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Se Won Kang

    2016-03-01

    Full Text Available Aegista chejuensis and Aegista quelpartensis (Family-Bradybaenidae) are endemic to Korea, and are considered vulnerable due to declines in their population. The limited genetic resources for these species restricts the ability to prioritize conservation efforts. We sequenced the transcriptomes of these species using Illumina paired-end technology. Approximately 257 and 240 million reads were obtained and assembled into 198,531 and 230,497 unigenes for A. chejuensis and A. quelpartensis, respectively. The average and N50 unigene lengths were 735.4 and 1073 bp, respectively, for A. chejuensis, and 705.6 and 1001 bp, respectively, for A. quelpartensis. In total, 68,484 (34.5%) and 77,745 (33.73%) unigenes for A. chejuensis and A. quelpartensis, respectively, were annotated to databases. Gene Ontology terms were assigned to 23,778 (11.98%) and 26,396 (11.45%) unigenes, for A. chejuensis and A. quelpartensis, respectively, while 5050 and 5838 unigenes were mapped to 117 and 124 pathways in the Kyoto Encyclopedia of Genes and Genomes database. In addition, we identified and annotated 9542 and 10,395 putative simple sequence repeats (SSRs) in unigenes from A. chejuensis and A. quelpartensis, respectively. We designed a list of PCR primers flanking the putative SSR regions. These microsatellites may be utilized for future phylogenetics and conservation initiatives.
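
    The average and N50 lengths reported in this abstract are standard assembly summary statistics. N50 is commonly defined as the length of the shortest sequence among the longest sequences that together contain at least half of all assembled bases. The sketch below computes both statistics for a toy list of lengths; the lengths are invented for illustration, the function names are hypothetical, and the assembler used in the study may compute N50 with slightly different conventions.

```python
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered: List[int] = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for completeness


def mean_length(lengths: Iterable[int]) -> float:
    """Arithmetic mean of the sequence lengths."""
    values = list(lengths)
    if not values:
        raise ValueError("need at least one sequence length")
    return sum(values) / len(values)


# Toy unigene lengths in bp (hypothetical, not from the study).
toy_lengths = [200, 300, 400, 500, 1000, 1600]

print("N50:", n50(toy_lengths))
print("Mean length:", round(mean_length(toy_lengths), 1))
```

    As in the abstract, the N50 exceeds the mean here because a few long sequences account for most of the assembled bases.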

  18. Assessing quality of Medicago sativa silage by monitoring bacterial composition with single molecule, real-time sequencing technology and various physiological parameters

    Science.gov (United States)

    Bao, Weichen; Mi, Zhihui; Xu, Haiyan; Zheng, Yi; Kwok, Lai Yu; Zhang, Heping; Zhang, Wenyi

    2016-01-01

    The present study applied the PacBio single molecule, real-time sequencing technology (SMRT) in evaluating the quality of silage production. Specifically, we produced four types of Medicago sativa silages by using four different lactic acid bacteria-based additives (AD-I, AD-II, AD-III and AD-IV). We monitored the changes in pH, organic acids (including butyric acid, the ratio of acetic acid/lactic acid, γ-aminobutyric acid, 4-hydroxy benzoic acid and phenyl lactic acid), mycotoxins, and bacterial microbiota during silage fermentation. Our results showed that the use of the additives was beneficial to the silage fermentation by enhancing a general pH and mycotoxin reduction, while increasing the organic acids content. By SMRT analysis of the microbial composition in eight silage samples, we found that the bacterial species number and relative abundances shifted apparently after fermentation. Such changes were specific to the LAB species in the additives. Particularly, Bacillus megaterium was the initial dominant species in the raw materials; and after the fermentation process, Pediococcus acidilactici and Lactobacillus plantarum became the most prevalent species, both of which were intrinsically present in the LAB additives. Our data have demonstrated that the SMRT sequencing platform is applicable in assessing the quality of silage. PMID:27340760

  19. Assessing quality of Medicago sativa silage by monitoring bacterial composition with single molecule, real-time sequencing technology and various physiological parameters.

    Science.gov (United States)

    Bao, Weichen; Mi, Zhihui; Xu, Haiyan; Zheng, Yi; Kwok, Lai Yu; Zhang, Heping; Zhang, Wenyi

    2016-06-24

    The present study applied the PacBio single molecule, real-time sequencing technology (SMRT) in evaluating the quality of silage production. Specifically, we produced four types of Medicago sativa silages by using four different lactic acid bacteria-based additives (AD-I, AD-II, AD-III and AD-IV). We monitored the changes in pH, organic acids (including butyric acid, the ratio of acetic acid/lactic acid, γ-aminobutyric acid, 4-hydroxy benzoic acid and phenyl lactic acid), mycotoxins, and bacterial microbiota during silage fermentation. Our results showed that the use of the additives was beneficial to the silage fermentation by enhancing a general pH and mycotoxin reduction, while increasing the organic acids content. By SMRT analysis of the microbial composition in eight silage samples, we found that the bacterial species number and relative abundances shifted apparently after fermentation. Such changes were specific to the LAB species in the additives. Particularly, Bacillus megaterium was the initial dominant species in the raw materials; and after the fermentation process, Pediococcus acidilactici and Lactobacillus plantarum became the most prevalent species, both of which were intrinsically present in the LAB additives. Our data have demonstrated that the SMRT sequencing platform is applicable in assessing the quality of silage.

  20. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity

  1. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  2. Higher specificity of nucleic acid sequence-based amplification isothermal technology than of real-time PCR for quantification of HIV-1 RNA on dried blood spots.

    Science.gov (United States)

    Mercier-Delarue, Severine; Vray, Muriel; Plantier, Jean Christophe; Maillard, Theodora; Adjout, Zidan; de Olivera, Fabienne; Schnepf, Nathalie; Maylin, Sarah; Simon, Francois; Delaugerre, Constance

    2014-01-01

    Dried blood spots (DBS) are widely proposed as a plasma surrogate for monitoring antiretroviral treatment efficacy based on the HIV-1 RNA level (viral load [VL]) in resource-limited settings. Interfering coamplification of cell-associated HIV-1 DNA during reverse transcription (RT)-PCR can be avoided by using nucleic acid sequence-based amplification (NASBA) technology, which is based on an RNA template and isothermic conditions. We analyzed VL values obtained with DBS and plasma samples by comparing isothermic NASBA (NucliSENS EasyQ HIV-1 V2.0; bioMérieux) with real-time RT-PCR (Cobas TaqMan HIV-1 V2.0; Roche). Samples from 197 HIV-1-infected patients were tested (non-B subtypes in 51% of the cases). Nucleic acid extractions were performed by use of NucliSENS EasyMAG (bioMérieux) and Cobas AmpliPrep (Roche) before the NASBA and RT-PCR quantifications, respectively. Both quantification assays have lower limits of detection of 20 (1.3) and 800 (2.9) log10 copies/ml (log) in plasma and DBS, respectively. The mean (DBS minus plasma) differences were -0.39 and -0.46 log, respectively, for RT-PCR and NASBA. RT-PCR on DBS identified virological failure in 122 of 126 patients (sensitivity, 97%) and viral suppression in 58 of 70 patients (specificity, 83%), yielding 12 false-positive results (median, 3.2 log). NASBA on DBS identified virological failure in 85 of 96 patients (sensitivity, 89%) and viral suppression in 95 of 97 patients (specificity, 98%) and yielded 2 false-positive results (3.0 log for both). Both technologies detected HIV-1 RNA in DBS at a threshold of 800 copies/ml. This higher specificity of NASBA technology could avoid overestimation of poor compliance or the emergence of resistance when monitoring antiretroviral efficacy with the DBS method.
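
    The sensitivity and specificity figures quoted above follow directly from the reported patient counts. The short sketch below recomputes them; the function and variable names are illustrative assumptions, and only the counts come from the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true virological failures that the assay flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly suppressed patients that the assay reports as suppressed."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract (DBS results per assay).
# RT-PCR: 122 of 126 failures detected, 58 of 70 suppressions detected.
# NASBA:  85 of 96 failures detected, 95 of 97 suppressions detected.
assays = {
    "RT-PCR": {"tp": 122, "fn": 126 - 122, "tn": 58, "fp": 70 - 58},
    "NASBA": {"tp": 85, "fn": 96 - 85, "tn": 95, "fp": 97 - 95},
}

for name, c in assays.items():
    sens = 100 * sensitivity(c["tp"], c["fn"])
    spec = 100 * specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity {sens:.0f}%, specificity {spec:.0f}%")
```

    The false-positive counts derived here (12 for RT-PCR, 2 for NASBA) match those stated in the abstract.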

  3. Genome sequence of Vibrio cholerae G4222, a South African clinical isolate

    CSIR Research Space (South Africa)

    Le Roux, Wouter J

    2013-03-01

    Full Text Available sequences of V. cholerae MJ-1236 (7) and V. cholerae O1 biovar El Tor strain N16961 (8) with the NCBI Genomic (NG) Aligner tool of the NCBI Genome Workbench v2.5.5. A further 38 gaps were closed by PCR amplification and Sanger sequencing. This resulted.... 2013. Genome sequence of Vibrio cholerae G4222, a South African clinical isolate. Genome Announc. 1(2):e00040-13. doi:10.1128/genomeA.00040-13. Copyright © 2013 le Roux et al. This is an open-access article distributed under the terms of the Creative...

  4. Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing.

    Science.gov (United States)

    Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A

    2017-08-08

    Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Bioinformatic Challenges in Clinical Diagnostic Application of Targeted Next Generation Sequencing: Experience from Pheochromocytoma.

    Directory of Open Access Journals (Sweden)

    Joakim Crona

    Full Text Available Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS) compared to Sanger sequencing. Whereas these novel sequencing processes have a validated robust performance, the choice of enrichment method and of the available bioinformatic software as a reliable analysis tool needs to be further investigated in a diagnostic setting. DNA from 21 patients with genetic variants in SDHB, VHL, EPAS1, RET (n=17) or clinical criteria of NF1 syndrome (n=4) was included. Targeted NGS was performed using TruSeq custom amplicon enrichment sequenced on an Illumina MiSeq instrument. Results were analysed in parallel using three different bioinformatics pipelines: (1) the commercially available MiSeq Reporter, a fully automated and integrated software; (2) CLC Genomics Workbench, a graphical-interface-based software, also commercially available; and (3) ICP, an in-house scripted custom bioinformatic tool. A tenfold read coverage was achieved in 95-98% of targeted bases. All workflows had alignment of reads to SDHA and NF1 pseudogenes. Compared to Sanger sequencing, variant calling revealed a sensitivity ranging from 83 to 100% and a specificity of 99.9-100%. Only MiSeq Reporter identified all pathogenic variants in both sequencing runs. We conclude that targeted next generation sequencing has equal quality compared to Sanger sequencing. Enrichment specificity and the bioinformatic performance need to be carefully assessed in a diagnostic setting. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills utilizing already existing commercially available bioinformatics tools.

  6. Next-generation sequencing of lung cancer EGFR exons 18-21 allows effective molecular diagnosis of small routine samples (cytology and biopsy).

    Directory of Open Access Journals (Sweden)

    Dario de Biase

    Full Text Available Selection of lung cancer patients for therapy with tyrosine kinase inhibitors directed at EGFR requires the identification of specific EGFR mutations. In most patients with advanced, inoperable lung carcinoma, limited tumor samples often represent the only material available for both histologic typing and molecular analysis. We defined a next generation sequencing protocol targeted to EGFR exons 18-21 suitable for the routine diagnosis of such clinical samples. The protocol was validated in an unselected series of 80 small biopsy (n=14) and cytology (n=66) specimens representative of the material ordinarily submitted for diagnostic evaluation to three referral medical centers in Italy. Specimens were systematically evaluated for tumor cell number and proportion relative to non-neoplastic cells. They were analyzed in batches of 100-150 amplicons per run, reaching an analytical sensitivity of 1% and obtaining an adequate number of reads to cover all exons on all samples analyzed. Next generation sequencing was compared with Sanger sequencing. The latter identified 15 EGFR mutations in 14/80 cases (17.5%) but did not detect mutations when the proportion of neoplastic cells was below 40%. Next generation sequencing identified 31 EGFR mutations in 24/80 cases (30.0%). Mutations were detected with a proportion of neoplastic cells as low as 5%. All mutations identified by the Sanger method were confirmed. In 6 cases next generation sequencing identified exon 19 deletions or the L858R mutation not seen after Sanger sequencing, allowing the patient to be treated with tyrosine kinase inhibitors. In one additional case the R831H mutation associated with treatment resistance was identified in an EGFR wild type tumor after Sanger sequencing. Next generation sequencing is robust, cost-effective and greatly improves the detection of EGFR mutations. Its use should be promoted for the clinical diagnosis of mutations in specimens with unfavorable tumor cell

  7. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan;

    2012-01-01

    The finished human genome assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing, and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum...

  8. Unlocking short read sequencing for metagenomics.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    Full Text Available BACKGROUND: Different high-throughput nucleic acid sequencing platforms are currently available but a trade-off currently exists between the cost and number of reads that can be generated versus the read length that can be achieved. METHODOLOGY/PRINCIPAL FINDINGS: We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read. CONCLUSIONS/SIGNIFICANCE: This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
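
    The idea of joining overlapping paired-end reads into one longer composite read can be illustrated with a minimal sketch. This is not the SHERA algorithm, whose details are not given in the abstract; it is a simplified stand-in that assumes the second read is reported as the reverse complement of the fragment's far end, ignores base qualities, and uses hypothetical function names and thresholds.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str,
               min_overlap: int = 10,
               max_mismatch_frac: float = 0.1) -> Optional[str]:
    """Join two overlapping paired-end reads into one composite read.

    read2 is reverse-complemented so both reads lie on the same strand,
    then the longest suffix of read1 that matches a prefix of the mate
    (within the allowed mismatch fraction) is taken as the overlap.
    Returns None when no acceptable overlap is found.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            # Keep read1's bases in the overlap; a quality-aware tool
            # would instead pick the higher-quality base at each position.
            return read1 + mate[overlap:]
    return None
```

    A production tool would also score overlaps statistically and use per-base qualities to resolve disagreements; the sketch only shows why appropriately sized inserts let two short reads yield a composite read longer than either one.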

  9. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression.

    Science.gov (United States)

    Perera, Omaththage P; Shelby, Kent S; Popham, Holly J R; Gould, Fred; Adang, Michael J; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation from this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  10. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression.

    Directory of Open Access Journals (Sweden)

    Omaththage P Perera

    Full Text Available Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation from this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new

  11. Scope and Sequence.

    Science.gov (United States)

    Callison, Daniel

    2002-01-01

    Discusses scope and sequence plans for curriculum coordination in elementary and secondary education related to school libraries. Highlights include library skills; levels of learning objectives; technology skills; media literacy skills; and information inquiry skills across disciplines by grade level. (LRW)

  12. Authentication of Herbal Supplements Using Next-Generation Sequencing.

    Directory of Open Access Journals (Sweden)

    Natalia V Ivanova

    Full Text Available DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should

  13. Genetic mapping using the Diversity Arrays Technology (DArT) : application and validation using the whole-genome sequences of Arabidopsis thaliana and the fungal wheat pathogen Mycosphaerella graminicola

    NARCIS (Netherlands)

    Wittenberg, A.H.J.

    2007-01-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds- to thousands of restriction site based polymorphisms between genotypes and does not require DNA sequence informat

  16. Deep sequencing of Myxilla (Ectyomyxilla) methanophila, an epibiotic sponge on cold-seep tubeworms, reveals methylotrophic, thiotrophic, and putative hydrocarbon-degrading microbial associations.

    Science.gov (United States)

    Arellano, Shawn M; Lee, On On; Lafi, Feras F; Yang, Jiangke; Wang, Yong; Young, Craig M; Qian, Pei-Yuan

    2013-02-01

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ13C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support the previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge.

  17. Deep Sequencing of Myxilla (Ectyomyxilla) methanophila, an Epibiotic Sponge on Cold-Seep Tubeworms, Reveals Methylotrophic, Thiotrophic, and Putative Hydrocarbon-Degrading Microbial Associations

    KAUST Repository

    Arellano, Shawn M.

    2012-10-11

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ13C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support the previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge. © 2012 Springer Science+Business Media New York.

  18. Current applications of high-throughput DNA sequencing technology in antibody drug research

    Institute of Scientific and Technical Information of China (English)

    余馨; 刘启刚; 王明蓉

    2012-01-01

    Since the first report in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.

  19. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
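
    The N50, N75 and NG50 contiguity statistics mentioned above can be computed from a list of contig lengths. The sketch below uses the usual definitions (N-statistics relative to the assembly length, NG-statistics relative to an assumed genome size); the function name and the contig lengths in the usage note are hypothetical and are not data from the study.

```python
from typing import Iterable, Optional


def nx(lengths: Iterable[int], fraction: float = 0.5,
       genome_size: Optional[int] = None) -> int:
    """Return the Nx (or NGx when genome_size is given) of an assembly.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches `fraction` of the reference length. The reference
    length is the assembly size for Nx and the genome size for NGx.
    Returns 0 if the assembly is too short to reach the target (NGx only).
    """
    ordered = sorted(lengths, reverse=True)
    total = genome_size if genome_size is not None else sum(ordered)
    target = fraction * total
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return 0
```

    For a hypothetical assembly with contigs of 80, 70, 50, 40, 30 and 20 kb, `nx(contigs)` gives the N50, `nx(contigs, 0.75)` the N75, and `nx(contigs, genome_size=...)` the NG50 against a chosen genome size such as the approximately 152 kbp quoted for HHV-1. Fewer, longer contigs raise all three values, which is why a hybrid assembly that merges contigs is reported as an increase in N(G)50 and N(G)75.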

  20. Analysis of phenylalanine hydroxylase gene mutations by Ion Torrent PGM™ sequencing in phenylketonuria patients

    Institute of Scientific and Technical Information of China (English)

    周保成; 穆原; 尹婷; 汤欣欣; 许天龙; 郑安舜; 毛华芬; 顾莹

    2014-01-01

    Objective: To evaluate the feasibility of Ion Torrent PGM™ sequencing technology for the analysis of phenylalanine hydroxylase (PAH) gene mutations in children with phenylketonuria (PKU). Methods: DNA samples were extracted from the peripheral blood of 15 children with PKU and of their parents. All exons of the PAH gene, including the intron-exon boundaries, were amplified by PCR and sequenced using the Ion Torrent PGM™. Samples with PAH mutations were validated by Sanger sequencing. Results: The mean depth of coverage for the PAH gene (13 exons) across all samples sequenced on the Ion Torrent PGM™ was 1,465-fold, and the mean percentage of coverage was 99.3%. Twenty-nine mutant alleles, classified into 17 types and including one novel mutation (p.P292L), were confirmed by Sanger sequencing. Conclusion: The Ion Torrent PGM™ can detect PAH gene mutations in children with PKU quickly and easily, providing a practical platform for the clinical diagnosis of PKU.

  1. HTS-PEG: a method for high throughput sequencing of the paired-ends of genomic libraries.

    Science.gov (United States)

    Zhou, Sisi; Fu, Yonggui; Li, Jie; He, Lingyu; Cai, Xingsheng; Yan, Qingyu; Rao, Xingqiang; Huang, Shengfeng; Li, Guang; Wang, Yiquan; Xu, Anlong

    2012-01-01

    Second generation sequencing has been widely used to sequence whole genomes. Though various paired-end sequencing methods have been developed to construct long scaffolds from contigs derived from shotgun sequencing, the classical paired-end sequencing of Bacterial Artificial Chromosome (BAC) or fosmid libraries by the Sanger method still plays an important role in genome assembly. However, sequencing libraries with the Sanger method is expensive and time-consuming. Here we report a new strategy to sequence the paired-ends of genomic libraries with parallel pyrosequencing, using a Chinese amphioxus (Branchiostoma belcheri) BAC library as an example. In total, approximately 12,670 non-redundant paired-end sequences were generated. Mapping them to the primary scaffolds of Chinese amphioxus, we obtained 413 ultra-scaffolds from 1,182 primary scaffolds, and the N50 scaffold length increased by approximately 55 kb, which is about a 10% improvement. We provide a universal and cost-effective method for sequencing the ultra-long paired-ends of genomic libraries. This method can be very easily implemented in other second generation sequencing platforms.

  2. HTS-PEG: a method for high throughput sequencing of the paired-ends of genomic libraries.

    Directory of Open Access Journals (Sweden)

    Sisi Zhou

    Full Text Available Second generation sequencing has been widely used to sequence whole genomes. Though various paired-end sequencing methods have been developed to construct long scaffolds from contigs derived from shotgun sequencing, the classical paired-end sequencing of Bacterial Artificial Chromosome (BAC) or fosmid libraries by the Sanger method still plays an important role in genome assembly. However, sequencing libraries with the Sanger method is expensive and time-consuming. Here we report a new strategy to sequence the paired-ends of genomic libraries with parallel pyrosequencing, using a Chinese amphioxus (Branchiostoma belcheri) BAC library as an example. In total, approximately 12,670 non-redundant paired-end sequences were generated. Mapping them to the primary scaffolds of Chinese amphioxus, we obtained 413 ultra-scaffolds from 1,182 primary scaffolds, and the N50 scaffold length increased by approximately 55 kb, which is about a 10% improvement. We provide a universal and cost-effective method for sequencing the ultra-long paired-ends of genomic libraries. This method can be very easily implemented in other second generation sequencing platforms.

  3. Validation of an Ion Torrent Sequencing Platform for the Detection of Gene Mutations in Biopsy Specimens from Patients with Non-Small-Cell Lung Cancer.

    Directory of Open Access Journals (Sweden)

    Shiro Fujita

    Full Text Available Treatment for patients with advanced non-small cell lung cancer (NSCLC) is often determined by the presence of biomarkers that predict the response to agents targeting specific molecular pathways. Demands for multiplex analysis of the genes involved in the pathogenesis of NSCLC are increasing. We validated the Ion Torrent Personal Genome Machine (PGM) system using the Ion AmpliSeq Cancer Hotspot Panel and compared the results with those obtained using the gold standard methods, conventional PCR and Sanger sequencing. The cycleave PCR method was used to verify the results. The Ion Torrent PGM achieved a level of accuracy similar to that of conventional PCR and Sanger sequencing in identifying multiple genetic mutations in parallel; however, the Ion Torrent PGM was superior to the other sequencing methods in terms of ease of use, even when taking into account the small amount of DNA obtained from formalin-fixed paraffin-embedded (FFPE) biopsy specimens.

  4. Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

    Science.gov (United States)

    Militello, Kevin T; Lazatin, Justine C

    2017-05-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017. © 2016 The International Union of Biochemistry and Molecular Biology.
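
The step of extracting the spacers that lie between repeats can be sketched as follows. This is an illustration only: it assumes exact copies of a known repeat, whereas real CRISPR repeats may carry mismatches and would need approximate matching. The function name `extract_spacers`, the repeat string and the toy locus are all hypothetical and are not the sequences from the study.

```python
def extract_spacers(sequence, repeat):
    """Return the segments lying between consecutive exact copies of `repeat`."""
    spacers = []
    start = sequence.find(repeat)
    while start != -1:
        spacer_begin = start + len(repeat)
        next_repeat = sequence.find(repeat, spacer_begin)
        if next_repeat == -1:
            break  # no further repeat, so nothing is flanked on both sides
        spacer = sequence[spacer_begin:next_repeat]
        if spacer:
            spacers.append(spacer)
        start = next_repeat
    return spacers


# Hypothetical toy locus: leader + repeat + spacer + repeat + spacer + repeat
toy_repeat = "GGTTTATCC"
toy_locus = "AAA" + toy_repeat + "ACGTACGT" + toy_repeat + "TTGACCA" + toy_repeat + "CC"

for spacer in extract_spacers(toy_locus, toy_repeat):
    print(spacer)
```

Each extracted spacer could then be submitted to a BLASTN search, as described in the abstract, to look for matches to foreign DNA.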

  5. Complete Genome Sequence of Clavibacter michiganensis subsp. insidiosus R1-1 Using PacBio Single-Molecule Real-Time Technology.

    Science.gov (United States)

    Lu, You; Samac, Deborah A; Glazebrook, Jane; Ishimaru, Carol A

    2015-05-07

    We report here the complete genome sequence of Clavibacter michiganensis subsp. insidiosus R1-1, isolated in Minnesota, USA. The R1-1 genome, generated by a de novo assembly of PacBio sequencing data, is the first complete genome sequence available for this subspecies. Copyright © 2015 Lu et al.

  6. Next generation sequencing as a useful tool in the diagnostics of mosaicism in Alport syndrome.

    Science.gov (United States)

    Beicht, Sonja; Strobl-Wildemann, Gertrud; Rath, Sabine; Wachter, Oliver; Alberer, Martin; Kaminsky, Elke; Weber, Lutz T; Hinrichsen, Tanja; Klein, Hanns-Georg; Hoefele, Julia

    2013-09-10

    Alport syndrome (ATS) is a progressive hereditary nephropathy characterized by hematuria and/or proteinuria with structural defects of the glomerular basement membrane. It can be associated with extrarenal manifestations (high-tone sensorineural hearing loss and ocular abnormalities). Somatic mutations in COL4A5 (X-linked), COL4A3 and COL4A4 genes (both autosomal recessive and autosomal dominant) cause Alport syndrome. Somatic mosaicism in Alport patients is very rare. The reason for this may be due to the difficulty of detection. We report the case of a boy and his mother who presented with Alport syndrome. Mutational analysis showed the novel hemizygote pathogenic mutation c.2396-1G>A (IVS29-1G>A) at the splice acceptor site of the intron 29 exon 30 boundary of the COL4A5 gene in the boy. The mutation in the mother would not have been detected by Sanger sequencing without the knowledge of the mutational analysis result of her son. Further investigation of the mother using next generation sequencing showed somatic mosaicism and implied potential germ cell mosaicism. The mutation in the mother has most likely occurred during early embryogenesis. Analysis of tissue of different embryonic origin in the mother confirmed mosaicism in both mesoderm and ectoderm. Low grade mosaicism is very difficult to detect by Sanger sequencing. Next generation sequencing is increasingly used in the diagnostics and might improve the detection of mosaicism. In the case of definite clinical symptoms of ATS and missing detection of a mutation by Sanger sequencing, mutational analysis should be performed by next generation sequencing.

  7. Doing more with less: fluorescence in situ hybridization and gene sequencing assays can be reliably performed on archival stained tumor tissue sections.

    Science.gov (United States)

    Pelosi, Giuseppe; Perrone, Federica; Tamborini, Elena; Fabbri, Alessandra; Testi, Maria Adele; Busico, Adele; Settanni, Giulio; Picciani, Benedetta; Bovio, Enrica; Sonzogni, Angelica; Valeri, Barbara; Garassino, Marina; De Braud, Filippo; Pastorino, Ugo

    2016-04-01

    Little is known about molecular testing on tumor tissue retrieved from stained sections, for which there may be a clinical need. We retrospectively analyzed 112 sections from 56 tumor patients using either fluorescence in situ hybridization (FISH) with different probes (19 sections from 17 patients) or Sanger or targeted next generation sequencing for detection of BRAF, EGFR, KRAS, C-KIT, and TP53 mutations (93 sections from 39 patients). Tumor tissue sections had been stained by hematoxylin and eosin (H&E) (42 sections) or by immunohistochemistry for cytoplasmic or nuclear/nuclear-cytoplasmic markers (70 sections) with a peroxidase (P-IHC, with 3,3'-diaminobenzidine as chromogen) or alkaline phosphatase label (AP-IHC, with Warp Red™ as chromogen). For FISH analysis, the concordance rate between the original diagnosis and that obtained on H&E- or P-IHC-stained tissue sections (AP-IHC was not on record for this set of patients) was 95% (18 out of 19 tumor sections). Only one tumor sample, diffusely positive for MLH1, did not yield any nuclear hybridization signal. For sequencing analysis, the concordance rate was 100% on negative P-IHC and positive AP-IHC-stained sections, regardless of the subcellular localization of the reaction product. Mutations were detected in only 52% of cases expressing nuclear/nuclear-cytoplasmic markers, regardless of the sequencing technology used (p = 0.0002). In conclusion, stained sections may be a valuable resource for FISH or sequencing analysis, but on cases expressing nuclear markers sequencing results need to be interpreted cautiously.

  8. De Novo Transcriptome Sequencing of the Octopus vulgaris Hemocytes Using Illumina RNA-Seq Technology: Response to the Infection by the Gastrointestinal Parasite Aggregata octopiana

    Science.gov (United States)

    Castellanos-Martínez, Sheila; Arteta, David; Catarino, Susana; Gestal, Camino

    2014-01-01

    Background: Octopus vulgaris is a highly valuable species of great commercial interest and excellent candidate for aquaculture diversification; however, the octopus’ well-being is impaired by pathogens, of which the gastrointestinal coccidian parasite Aggregata octopiana is one of the most important. The knowledge of the molecular mechanisms of the immune response in cephalopods, especially in octopus is scarce. The transcriptome of the hemocytes of O. vulgaris was de novo sequenced using the high-throughput paired-end Illumina technology to identify genes involved in immune defense and to understand the molecular basis of octopus tolerance/resistance to coccidiosis. Results: A bi-directional mRNA library was constructed from hemocytes of two groups of octopus according to the infection by A. octopiana, sick octopus, suffering coccidiosis, and healthy octopus, and reads were de novo assembled together. The differential expression of transcripts was analysed using the general assembly as a reference for mapping the reads from each condition. After sequencing, a total of 75,571,280 high quality reads were obtained from the sick octopus group and 74,731,646 from the healthy group. The general transcriptome of the O. vulgaris hemocytes was assembled in 254,506 contigs. A total of 48,225 contigs were successfully identified, and 538 transcripts exhibited differential expression between groups of infection. The general transcriptome revealed genes involved in pathways like NF-kB, TLR and Complement. Differential expression of TLR-2, PGRP, C1q and PRDX genes due to infection was validated using RT-qPCR. In sick octopuses, only TLR-2 was up-regulated in hemocytes, but all of them were up-regulated in caecum and gills. Conclusion: The transcriptome reported here de novo establishes the first molecular clues to understand how the octopus immune system works and interacts with a highly pathogenic coccidian. The data provided here will contribute to identification of biomarkers

  9. De novo transcriptome sequencing of the Octopus vulgaris hemocytes using Illumina RNA-Seq technology: response to the infection by the gastrointestinal parasite Aggregata octopiana.

    Directory of Open Access Journals (Sweden)

    Sheila Castellanos-Martínez

    Full Text Available BACKGROUND: Octopus vulgaris is a highly valuable species of great commercial interest and excellent candidate for aquaculture diversification; however, the octopus' well-being is impaired by pathogens, of which the gastrointestinal coccidian parasite Aggregata octopiana is one of the most important. The knowledge of the molecular mechanisms of the immune response in cephalopods, especially in octopus is scarce. The transcriptome of the hemocytes of O. vulgaris was de novo sequenced using the high-throughput paired-end Illumina technology to identify genes involved in immune defense and to understand the molecular basis of octopus tolerance/resistance to coccidiosis. RESULTS: A bi-directional mRNA library was constructed from hemocytes of two groups of octopus according to the infection by A. octopiana, sick octopus, suffering coccidiosis, and healthy octopus, and reads were de novo assembled together. The differential expression of transcripts was analysed using the general assembly as a reference for mapping the reads from each condition. After sequencing, a total of 75,571,280 high quality reads were obtained from the sick octopus group and 74,731,646 from the healthy group. The general transcriptome of the O. vulgaris hemocytes was assembled in 254,506 contigs. A total of 48,225 contigs were successfully identified, and 538 transcripts exhibited differential expression between groups of infection. The general transcriptome revealed genes involved in pathways like NF-kB, TLR and Complement. Differential expression of TLR-2, PGRP, C1q and PRDX genes due to infection was validated using RT-qPCR. In sick octopuses, only TLR-2 was up-regulated in hemocytes, but all of them were up-regulated in caecum and gills. CONCLUSION: The transcriptome reported here de novo establishes the first molecular clues to understand how the octopus immune system works and interacts with a highly pathogenic coccidian. The data provided here will contribute to

  10. Next generation sequencing to define prokaryotic and fungal diversity in the bovine rumen.

    Directory of Open Access Journals (Sweden)

    Derrick E Fouts

    Full Text Available A combination of Sanger and 454 sequences of small subunit rRNA loci was used to interrogate microbial diversity in the bovine rumen of 12 cows consuming a forage diet. Observed bacterial species richness, based on the V1-V3 region of the 16S rRNA gene, was between 1,903 and 2,432 species-level operational taxonomic units (OTUs) when 5,520 reads were sampled per animal. Eighty percent of species-level OTUs were dominated by members of the order Clostridiales, Bacteroidales, Erysipelotrichales and unclassified TM7. Abundance of Prevotella species varied widely among the 12 animals. Archaeal species richness, also based on 16S rRNA, was between 8 and 13 OTUs, representing 5 genera. The majority of archaeal OTUs (84%) found in this study were previously observed in public databases with only two new OTUs discovered. Observed rumen fungal species richness, based on the 18S rRNA gene, was between 21 and 40 OTUs with 98.4-99.9% of OTUs represented by more than one read, using Good's coverage. Examination of the fungal community identified numerous novel groups. Prevotella and Tannerella were overrepresented in the liquid fraction of the rumen while Butyrivibrio and Blautia were significantly overrepresented in the solid fraction of the rumen. No statistical difference was observed between the liquid and solid fractions in biodiversity of archaea and fungi. The survey of microbial communities and analysis of cross-domain correlations suggested there is a far greater extent of microbial diversity in the bovine rumen than previously appreciated, and that next generation sequencing technologies promise to reveal novel species, interactions and pathways that can be studied further in order to better understand how rumen microbial community structure and function affects ruminant feed efficiency, biofuel production, and environmental impact.
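
Good's coverage, mentioned above, is commonly computed as one minus the fraction of reads that belong to singleton OTUs. The sketch below assumes that standard estimator. The function name `goods_coverage` and the toy counts are hypothetical and are not the rumen data.

```python
def goods_coverage(otu_counts):
    """Good's coverage estimate: 1 - (number of singleton OTUs / total reads)."""
    total_reads = sum(otu_counts)
    if total_reads == 0:
        raise ValueError("no reads to evaluate")
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1 - singletons / total_reads


# Hypothetical sample: six OTUs, two of them seen only once, 100 reads in total
toy_counts = [50, 30, 15, 3, 1, 1]
print(round(goods_coverage(toy_counts), 4))
```

A value close to 1 suggests that further sequencing of the same sample would mostly return OTUs that have already been observed.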

  11. Gap5—editing the billion fragment sequence assembly

    Science.gov (United States)

    Bonfield, James K.; Whitwham, Andrew

    2010-01-01

    Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments. Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this with an assembly of 1.1 billion sequence fragments and compare the performance with several other programs. We analyse the memory, CPU, I/O usage and file sizes used by Gap5. Availability and Implementation: Gap5 is part of the Staden Package and is available under an Open Source licence from http://staden.sourceforge.net. It is implemented in C and Tcl/Tk. Currently it works on Unix systems only. Contact: jkb@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20513662

  12. Identification of minority resistance mutations in the HIV-1 integrase coding region using next generation sequencing

    DEFF Research Database (Denmark)

    Fonager, Jannik; Larsson, Jonas T; Hussing, Christian

    2015-01-01

    BACKGROUND: The current widely applied standard method to screen for HIV-1 genotypic resistance is based on Sanger population sequencing (Sseq), which does not allow for the identification of minority variants (MVs) below the limit of detection for the Sseq-method in patients receiving integrase......: raltegravir (RAL), elvitegravir (EVG) and dolutegravir (DTG). STUDY DESIGN: NGS and Sseq were used to analyze RT-PCR products of the HIV-1 integrase coding region from six patients and in serial samples from two patients. NGS sequences were assembled and analyzed using the low frequency variant detection...

  13. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    Science.gov (United States)

    d’Avila-Levy, Claudia Masini; Boucinha, Carolina; Kostygov, Alexei; Santos, Helena Lúcia Carneiro; Morelli, Karina Alessandra; Grybchuk-Ieremenko, Anastasiia; Duval, Linda; Votýpka, Jan; Yurchenko, Vyacheslav; Grellier, Philippe; Lukeš, Julius

    2015-01-01

    The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmania and Trypanosoma. Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists. PMID:26602872

  14. Assessment of Genetic Diversity and Structure of Large Garlic (Allium sativum) Germplasm Bank, by Diversity Arrays Technology "Genotyping-by-Sequencing" Platform (DArTseq).

    Science.gov (United States)

    Egea, Leticia A; Mérida-García, Rosa; Kilian, Andrzej; Hernandez, Pilar; Dorado, Gabriel

    2017-01-01

    Garlic (Allium sativum) is used worldwide in cooking and industry, including pharmacology/medicine and cosmetics, for its interesting properties. Identifying redundancies in germplasm banks to generate core collections is a major concern, mostly in large stocks, in order to reduce space and maintenance costs. Yet, similar appearance and phenotypic plasticity of garlic varieties hinder their morphological classification. Molecular studies are challenging, due to the large and expected complex genome of this species, with asexual reproduction. Classical molecular markers, like isozymes, RAPD, SSR, or AFLP, are not convenient to generate germplasm core-collections for this species. The recent emergence of high-throughput genotyping-by-sequencing (GBS) approaches, like DArTseq, makes it possible to overcome such limitations to characterize and protect genetic diversity. Therefore, such technology was used in this work to: (i) assess genetic diversity and structure of a large garlic-germplasm bank (417 accessions); (ii) create a core collection; (iii) relate genotype to agronomical features; and (iv) describe a cost-effective method to manage genetic diversity in garlic-germplasm banks. Hierarchical-cluster analysis, principal-coordinates analysis and STRUCTURE showed general consistency, generating three main garlic-groups, mostly determined by variety and geographical origin. In addition, high-resolution genotyping identified 286 unique and 131 redundant accessions, used to select a reduced size germplasm-bank core collection. This demonstrates that DArTseq is a cost-effective method to analyze species with large and expected complex genomes, like garlic. To the best of our knowledge, this is the first report of high-throughput genotyping of a large garlic germplasm. This is particularly interesting for garlic adaptation and improvement, to fight biotic and abiotic stresses, in the current context of climate change and global warming.

  15. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    the feasibility of predicting the fetal KEL1 phenotype using next-generation sequencing (NGS) technology. STUDY DESIGN AND METHODS: The KEL1/2 single-nucleotide polymorphism was polymerase chain reaction (PCR) amplified with one adjoining base, and the PCR product was sequenced using a genome analyzer (GAIIx......, Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...

  16. New genes and pathomechanisms in mitochondrial disorders unraveled by NGS technologies.

    Science.gov (United States)

    Legati, Andrea; Reyes, Aurelio; Nasca, Alessia; Invernizzi, Federica; Lamantea, Eleonora; Tiranti, Valeria; Garavaglia, Barbara; Lamperti, Costanza; Ardissone, Anna; Moroni, Isabella; Robinson, Alan; Ghezzi, Daniele; Zeviani, Massimo

    2016-08-01

    Next Generation Sequencing (NGS) technologies are revolutionizing the diagnostic screening for rare disease entities, including primary mitochondrial disorders, particularly those caused by nuclear gene defects. NGS approaches are able to identify the causative gene defects in small families and even single individuals, unsuitable for investigation by traditional linkage analysis. These technologies are helping to fill the gap between mitochondrial disease cases defined on the basis of clinical, neuroimaging and biochemical readouts, which still outnumber by approximately 50% the cases for which a molecular-genetic diagnosis is attained. We have been using a combined, two-step strategy, based on a targeted gene panel as a first NGS screening, followed by whole exome sequencing (WES) in still unsolved cases, to analyze a large cohort of subjects who failed to show mutations in mtDNA and in ad hoc sets of specific nuclear genes sequenced by the Sanger method. Not only has this approach allowed us to reach molecular diagnosis in a significant fraction (20%) of these difficult cases, but it has also revealed unexpected and conceptually new findings. These include the possibility of marked variable penetrance of recessive mutations, the identification of large-scale DNA rearrangements, which explain spuriously heterozygous cases, and the association of mutations in known genes with unusual, previously unreported clinical phenotypes. Importantly, WES on selected cases has unraveled the presence of pathogenic mutations in genes encoding non-mitochondrial proteins (e.g. the transcription factor E4F1), an observation that further expands the intricate genetics of mitochondrial disease and suggests a new area of investigation in mitochondrial medicine. This article is part of a Special Issue entitled 'EBEC 2016: 19th European Bioenergetics Conference, Riva del Garda, Italy, July 2-6, 2016', edited by Prof. Paolo Bernardi.

  17. Improved Detection by Next-Generation Sequencing of Pyrazinamide Resistance in Mycobacterium tuberculosis Isolates.

    Science.gov (United States)

    Maningi, Nontuthuko E; Daum, Luke T; Rodriguez, John D; Mphahlele, Matsie; Peters, Remco P H; Fischer, Gerald W; Chambers, James P; Fourie, P Bernard

    2015-12-01

    The technical limitations of common tests used for detecting pyrazinamide (PZA) resistance in Mycobacterium tuberculosis isolates pose challenges for comprehensive and accurate descriptions of drug resistance in patients with multidrug-resistant tuberculosis (MDR-TB). In this study, a 606-bp fragment (comprising the pncA coding region plus the promoter) was sequenced using Ion Torrent next-generation sequencing (NGS) to detect associated PZA resistance mutations in 88 recultured MDR-TB isolates from an archived series collected in 2001. These 88 isolates were previously Sanger sequenced, with 55 (61%) designated as carrying the wild-type pncA gene and 33 (37%) showing mutations. PZA susceptibility of the isolates was also determined using the Bactec 460 TB system and the Wayne test. In this study, isolates were recultured and susceptibility testing was performed in Bactec 960 MGIT. Concordance between NGS and MGIT results was 93% (n = 88), and concordance values between the Bactec 460, the Wayne test, or pncA gene Sanger sequencing and NGS results were 82% (n = 88), 83% (n = 88), and 89% (n = 88), respectively. NGS confirmed the majority of pncA mutations detected by Sanger sequencing but revealed several new and mixed-strain mutations that resolved discordancy in other phenotypic results. Importantly, in 53% (18/34) of these isolates, pncA mutations were located in the 151 to 360 region and warrant further exploration. In these isolates, with their known resistance to rifampin, NGS of pncA improved PZA resistance detection sensitivity to 97% and specificity to 94% using NGS as the gold standard and helped to resolve discordant results from conventional methodologies.
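
Concordance, sensitivity and specificity figures of the kind reported above are all derived from a two-by-two comparison of a test method against a reference method. The sketch below shows that arithmetic. The function name `diagnostic_metrics` and the toy calls are hypothetical, not the study's isolates, and the code assumes the reference contains at least one resistant and one susceptible sample.

```python
def diagnostic_metrics(test_calls, reference_calls):
    """Compare boolean resistance calls from a test method against a reference.

    Returns (sensitivity, specificity, concordance).
    """
    if len(test_calls) != len(reference_calls):
        raise ValueError("both methods must report the same isolates")
    pairs = list(zip(test_calls, reference_calls))
    tp = sum(1 for t, r in pairs if t and r)          # resistant by both
    tn = sum(1 for t, r in pairs if not t and not r)  # susceptible by both
    fp = sum(1 for t, r in pairs if t and not r)
    fn = sum(1 for t, r in pairs if not t and r)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    concordance = (tp + tn) / len(pairs)
    return sensitivity, specificity, concordance


# Hypothetical calls for eight isolates (True = resistant)
test_method = [True, True, True, False, False, False, False, True]
reference = [True, True, False, False, False, False, True, True]

print(diagnostic_metrics(test_method, reference))
```

Which method is treated as the reference changes the sensitivity and specificity, so the choice of gold standard should always be stated alongside the numbers.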

  18. Efficient Generation of Myostatin Knock-Out Sheep Using CRISPR/Cas9 Technology and Microinjection into Zygotes.

    Directory of Open Access Journals (Sweden)

    M Crispo

    Full Text Available While CRISPR/Cas9 technology has proven to be a valuable system to generate gene-targeted modified animals in several species, this tool has been scarcely reported in farm animals. Myostatin, encoded by the MSTN gene, is involved in the inhibition of muscle differentiation and growth. We determined the efficiency of the CRISPR/Cas9 system to edit MSTN in sheep and generate knock-out (KO) animals with the aim to promote muscle development and body growth. We generated CRISPR/Cas9 mRNAs specific for ovine MSTN and microinjected them into the cytoplasm of ovine zygotes. When embryo development of CRISPR/Cas9 microinjected zygotes (n = 216) was compared with buffer injected embryos (n = 183) and non microinjected embryos (n = 173), cleavage rate was lower for both microinjected groups (P<0.05) and neither was affected by CRISPR/Cas9 content in the injected medium. Embryo development to blastocyst was not affected by microinjection and was similar among the experimental groups. From 20 embryos analyzed by Sanger sequencing, ten were mutant (heterozygous or mosaic; 50% efficiency). To obtain live MSTN KO lambs, 53 blastocysts produced after zygote CRISPR/Cas9 microinjection were transferred to 29 recipient females resulting in 65.5% (19/29) of pregnant ewes and 41.5% (22/53) of newborns. From 22 born lambs analyzed by T7EI and Sanger sequencing, ten showed indel mutations at the MSTN gene. Eight showed mutations in both alleles and five of them were homozygous for indels generating out-of-frame mutations that resulted in premature stop codons. Western blot analysis of homozygous KO founders confirmed the absence of myostatin, and these animals showed heavier body weight than wild-type counterparts. In conclusion, our results demonstrate that the CRISPR/Cas9 system was a very efficient tool to generate gene KO sheep. This technology is quick and easy to perform and less expensive than previous techniques, and can be applied to obtain genetically modified animal models of interest for

  19. Rare variant detection using family-based sequencing analysis.

    Science.gov (United States)

    Peng, Gang; Fan, Yu; Palculict, Timothy B; Shen, Peidong; Ruteshouser, E Cristy; Chi, Aung-Kyaw; Davis, Ronald W; Huff, Vicki; Scharfe, Curt; Wang, Wenyi

    2013-03-05

    Next-generation sequencing is revolutionizing genomic analysis, but this analysis can be compromised by high rates of missing true variants. To develop a robust statistical method capable of identifying variants that would otherwise not be called, we conducted sequence data simulations and both whole-genome and targeted sequencing data analysis of 28 families. Our method (Family-Based Sequencing Program, FamSeq) integrates Mendelian transmission information and raw sequencing reads. Sequence analysis using FamSeq reduced the number of false negative variants by 14-33% as assessed by HapMap sample genotype confirmation. In a large family affected with Wilms tumor, 84% of variants uniquely identified by FamSeq were confirmed by Sanger sequencing. In children with early-onset neurodevelopmental disorders from 26 families, de novo variant calls in disease candidate genes were corrected by FamSeq as mendelian variants, and the number of uniquely identified variants in affected individuals increased proportionally as additional family members were included in the analysis. To gain insight into maximizing variant detection, we studied factors impacting actual improvements of family-based calling, including pedigree structure, allele frequency (common vs. rare variants), prior settings of minor allele frequency, sequence signal-to-noise ratio, and coverage depth (∼20× to >200×). These data will help guide the design, analysis, and interpretation of family-based sequencing studies to improve the ability to identify new disease-associated genes.

  20. CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: Chaos game representation (CGR)-walk model for DNA sequences

    Science.gov (United States)

    Gao, Jie; Xu, Zhen-Yuan

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.
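
The iterative CGR mapping and the recoverability of the source sequence can be illustrated with a minimal sketch. It assumes the widely used unit-square convention with the corners A (0, 0), C (0, 1), G (1, 1) and T (1, 0) and a starting point at the centre; the abstract does not state which convention the authors used. The names `cgr_points` and `decode_point` are hypothetical, and the conversion to a CGR-walk time series and the ARFIMA fit are not reproduced here.

```python
# Assumed corner assignment on the unit square
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
BASE_AT = {corner: base for base, corner in CORNERS.items()}


def cgr_points(sequence):
    """Map each prefix of a DNA sequence to a point in the unit square."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence:
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def decode_point(point, length):
    """Recover the last `length` bases that led to a CGR point."""
    x, y = point
    bases = []
    for _ in range(length):
        # The quadrant containing the point identifies the latest base.
        cx = 1.0 if x > 0.5 else 0.0
        cy = 1.0 if y > 0.5 else 0.0
        bases.append(BASE_AT[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy  # undo the halving step
    return "".join(reversed(bases))


points = cgr_points("ACGT")
print(points[-1])
print(decode_point(points[-1], 4))
```

With double-precision floats the decoding is only reliable for short sequences, since each base consumes one bit of the coordinate; longer sequences would need exact rational or integer arithmetic.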

  1. DNA Sequencing Using Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid 1990s. A major interest of the sequencing community has always been read length. The longer the sequence read per run, the more efficient the process and the better the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecular weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single-stranded DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDa) was important to allow the long single-stranded DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to reproducibly prepare linear polyacrylamide polymer with an average molecular weight of 9 MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentrations of salts and dideoxynucleotides remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reducing read length. In two papers, we examined the role of

  2. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  3. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  4. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Full Text Available Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  5. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson's Disease.

    Science.gov (United States)

    Németh, Dániel; Árvai, Kristóf; Horváth, Péter; Kósa, János Pál; Tobiás, Bálint; Balla, Bernadett; Folhoffer, Anikó; Krolopp, Anna; Lakatos, Péter András; Szalay, Ferenc

    2016-01-01

    Objective. Wilson's disease (WD) is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing (NGS) provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine the ATP7B gene. Materials and Methods. We used the Ion Torrent Personal Genome Machine in four heterozygous patients for the identification of the other mutations and also in two patients with no known mutation. One patient with acute-on-chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson's disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found in addition to the eight other known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. According to our results, next-generation sequencing is a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson's disease in selected cases.

  6. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    Directory of Open Access Journals (Sweden)

    Fabio Marroni

    2012-06-01

    Full Text Available Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potential. They showed that pooled NGS can provide results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the variations in study design and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled next generation sequencing can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity and Tajima’s D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
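
    As a rough illustration of how pooled allele frequencies feed into a population genetics index, the sketch below computes per-site nucleotide diversity from read counts at biallelic sites. It is not taken from the review: the function names, the read counts and the pool size are hypothetical, and the estimator ignores the read-depth sampling corrections that dedicated pooled-sequencing methods apply.

```python
def pooled_allele_frequency(alt_reads, depth):
    """Estimate an allele frequency in a pool as the fraction of reads carrying the allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def nucleotide_diversity(frequencies, n_chromosomes, seq_length):
    """Per-site nucleotide diversity (pi) from allele frequencies at biallelic sites.

    Each site contributes 2*p*(1-p), scaled by n/(n-1) to correct for the
    finite number of chromosomes in the pool; the sum is divided by the
    number of sites surveyed (variable and invariant).
    """
    if n_chromosomes < 2 or seq_length <= 0:
        raise ValueError("need at least 2 chromosomes and a positive sequence length")
    correction = n_chromosomes / (n_chromosomes - 1)
    heterozygosity = sum(2 * p * (1 - p) for p in frequencies)
    return correction * heterozygosity / seq_length


if __name__ == "__main__":
    # Hypothetical pool: 20 diploid individuals (40 chromosomes), 1000 bp surveyed,
    # two variable sites given as (alternate-allele reads, total depth).
    site_counts = [(5, 100), (20, 80)]
    freqs = [pooled_allele_frequency(alt, depth) for alt, depth in site_counts]
    print(nucleotide_diversity(freqs, n_chromosomes=40, seq_length=1000))
```

    The n/(n-1) factor only accounts for the number of chromosomes pooled; with low read depth the frequency estimates themselves become noisy, which is one reason study design matters for pooled experiments.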

  7. Genetic testing in hereditary breast and ovarian cancer using massive parallel sequencing.

    Science.gov (United States)

    Ruiz, Anna; Llort, Gemma; Yagüe, Carmen; Baena, Neus; Viñas, Marina; Torra, Montse; Brunet, Anna; Seguí, Miquel A; Saigí, Eugeni; Guitart, Miriam

    2014-01-01

    High throughput methods such as next generation sequencing are increasingly used in molecular diagnosis. The aim of this study was to develop a workflow for the detection of BRCA1 and BRCA2 mutations using massive parallel sequencing in a 454 GS Junior bench top sequencer. Our approach was first validated in a panel of 23 patients containing 62 unique variants that had been previously Sanger sequenced. Subsequently, 101 patients with familial breast and ovarian cancer were studied. BRCA1 and BRCA2 exon enrichment was performed by PCR amplification using the BRCA MASTR kit (Multiplicom). Bioinformatic analysis of reads was performed with the AVA software v2.7 (Roche). In total, all 62 variants were detected, resulting in a sensitivity of 100%. Seventy-one false positives were called, resulting in a specificity of 97.35%. All of them corresponded to deletions located in homopolymeric stretches. The analysis of homopolymer stretches of 6 bp or longer using the BRCA HP kit (Multiplicom) increased the specificity of the detection of BRCA1 and BRCA2 mutations to 99.99%. We show here that massive parallel pyrosequencing can be used as a diagnostic strategy to test for BRCA1 and BRCA2 mutations, meeting very stringent sensitivity and specificity parameters and replacing traditional Sanger sequencing at a lower cost.
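
    The sensitivity and specificity figures in this abstract follow from the standard confusion-matrix definitions, which can be sketched as below. The abstract reports 62 of 62 variants detected and 71 false positives, but not the number of true negatives; the `implied_negatives` helper is a hypothetical back-calculation from the reported 97.35%, not a value given by the authors.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-variant positions correctly left uncalled: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


def implied_negatives(false_positives, reported_specificity):
    """Total negatives (TN + FP) implied by a reported specificity.

    Rearranged from specificity = 1 - FP / (TN + FP). This is only an
    inference from rounded published figures, not a reported count.
    """
    return false_positives / (1.0 - reported_specificity)


if __name__ == "__main__":
    # Reported in the abstract: all 62 known variants detected.
    print(sensitivity(true_positives=62, false_negatives=0))
    # Inferred, not reported: size of the negative set behind 97.35% specificity.
    print(round(implied_negatives(false_positives=71, reported_specificity=0.9735)))
```

    Because specificity is computed over a large set of non-variant positions, a few dozen false positives still leave it above 97%, which is why the homopolymer-specific follow-up assay mattered for reaching 99.99%.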

  8. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson’s Disease

    Directory of Open Access Journals (Sweden)

    Dániel Németh

    2016-01-01

    Full Text Available Objective. Wilson’s disease (WD) is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing (NGS) provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine the ATP7B gene. Materials and Methods. We used the Ion Torrent Personal Genome Machine in four heterozygous patients for the identification of the other mutations and also in two patients with no known mutation. One patient with acute-on-chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson’s disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found in addition to the eight other known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. According to our results, next-generation sequencing is a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson’s disease in selected cases.

  9. Automatic sequences

    CERN Document Server

    Haeseler, Friedrich

    2003-01-01

    Automatic sequences are sequences which are produced by a finite automaton. Although they are not random, they may look random. They are complicated in the sense of not being ultimately periodic, and they may look rather complicated in the sense that it may not be easy to name the rule by which the sequence is generated; however, there exists a rule which generates the sequence. The concept of automatic sequences has special applications in algebra, number theory, finite automata and formal languages, and combinatorics on words. The text deals with different aspects of automatic sequences, in particular: a general introduction to automatic sequences; the basic (combinatorial) properties of automatic sequences; the algebraic approach to automatic sequences; geometric objects related to automatic sequences.
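
    A small sketch of what "produced by a finite automaton" means in practice: the n-th term is obtained by feeding the digits of n to an automaton and reading an output off the final state. The example uses the Thue-Morse sequence, a standard 2-automatic sequence that is not mentioned in the abstract itself; the function and table names are illustrative assumptions.

```python
def automaton_term(n, transitions, outputs, start=0, base=2):
    """Return the n-th term of an automatic sequence.

    The automaton reads the base-`base` digits of n, most significant first,
    and the output attached to the state it ends in is the sequence term.
    """
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    state = start
    for digit in reversed(digits):
        state = transitions[state][digit]
    return outputs[state]


# Thue-Morse automaton: two states tracking the parity of 1-digits read so far.
THUE_MORSE_TRANSITIONS = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
THUE_MORSE_OUTPUTS = {0: 0, 1: 1}


def thue_morse_prefix(length):
    """First `length` terms of the Thue-Morse sequence."""
    return [
        automaton_term(n, THUE_MORSE_TRANSITIONS, THUE_MORSE_OUTPUTS)
        for n in range(length)
    ]


if __name__ == "__main__":
    print(thue_morse_prefix(8))  # [0, 1, 1, 0, 1, 0, 0, 1]
```

    The generating rule is tiny (two states), yet the resulting sequence is not ultimately periodic, which matches the abstract's point that such sequences can look complicated while still obeying a simple rule.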

  10. Next-generation sequencing technologies and the application in microbiology-A review

    Institute of Scientific and Technical Information of China (English)

    秦楠; 栗东芳; 杨瑞馥

    2011-01-01

    Since its invention in the 1970s, nucleic acid sequencing technology has contributed tremendously to advances in genomics. The next-generation sequencing technologies, represented by HiSeq 2000 from Illumina, SOLiD from Applied Biosystems and 454 from Roche, have re-energized the field of genomics. In this review, we first introduce the next-generation sequencing technologies and then discuss their applications in the field of microbiology.

  11. DNA sequencing by CE.

    Science.gov (United States)

    Karger, Barry L; Guttman, András

    2009-06-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA-sequencing methods have evolved from the labor-intensive slab gel electrophoresis, through automated multiCE systems using fluorophore labeling with multispectral imaging, to the "next-generation" technologies of cyclic-array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes were only possible with the advent of modern sequencing technologies that were a result of step-by-step advances with contributions from academics, medical personnel and instrument companies. While next-generation sequencing is moving ahead at breakneck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we overview the role of CE in DNA sequencing, based in part on several of our articles in this journal.

  12. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    Science.gov (United States)

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data.

  13. Determination of RET Sequence Variation in an MEN2 Unaffected Cohort Using Multiple-Sample Pooling and Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    R. L. Margraf

    2012-01-01

    Full Text Available Multisample, nonindexed pooling combined with next-generation sequencing (NGS) was used to discover RET proto-oncogene sequence variation within a cohort known to be unaffected by multiple endocrine neoplasia type 2 (MEN2). DNA samples (113 Caucasians, 23 persons of other ethnicities) were amplified for RET intron 9 to intron 16 and then divided into 5 pools of <30 samples each before library prep and NGS. Two controls were included in this study: a single sample and a pool of 50 samples that had been previously sequenced by the same NGS methods. All 59 variants previously detected in the 50-pool control were present. Of the 61 variants detected in the unaffected cohort, 20 variants were novel changes. Several variants were validated by high-resolution melting analysis and Sanger sequencing, and their allelic frequencies correlated well with those determined by NGS. The results from this unaffected cohort will be added to the RET MEN2 database.

  14. Comparative analysis of human mitochondrial DNA from World War I bone samples by DNA sequencing and ESI-TOF mass spectrometry.

    Science.gov (United States)

    Howard, Rebecca; Encheva, Vesela; Thomson, Jim; Bache, Katherine; Chan, Yuen-Ting; Cowen, Simon; Debenham, Paul; Dixon, Alan; Krause, Jens-Uwe; Krishan, Elaina; Moore, Daniel; Moore, Victoria; Ojo, Michael; Rodrigues, Sid; Stokes, Peter; Walker, James; Zimmermann, Wolfgang; Barallon, Rita

    2013-01-01

    Mitochondrial DNA is commonly used in identity testing for the analysis of old or degraded samples or to give evidence of familial links. The Abbott T5000 mass spectrometry platform provides an alternative to the more commonly used Sanger sequencing for the analysis of human mitochondrial DNA. The robustness of the T5000 system has previously been demonstrated using DNA extracted from volunteer buccal swabs but the system has not been tested using more challenging sample types. For mass spectrometry to be considered as a valid alternative to Sanger sequencing it must also be demonstrated to be suitable for use with more limiting sample types such as old teeth, bone fragments, and hair shafts. In 2009 the Commonwealth War Graves Commission launched a project to identify the remains of 250 World War I soldiers discovered in a mass grave in Fromelles, France. This study characterises the performance of both Sanger sequencing and the T5000 platform for the analysis of the mitochondrial DNA extracted from 225 of these remains, both in terms of the ability to amplify and characterise DNA regions of interest and the relative information content and ease-of-use associated with each method.

  15. Whole-genome sequencing overcomes pseudogene homology to diagnose autosomal dominant polycystic kidney disease.

    Science.gov (United States)

    Mallawaarachchi, Amali C; Hort, Yvonne; Cowley, Mark J; McCabe, Mark J; Minoche, André; Dinger, Marcel E; Shine, John; Furlong, Timothy J

    2016-11-01

    Autosomal dominant polycystic kidney disease (ADPKD) is the most common monogenic kidney disorder and is due to disease-causing variants in PKD1 or PKD2. Strong genotype-phenotype correlation exists although diagnostic sequencing is not part of routine clinical practice. This is because PKD1 bears 97.7% sequence similarity with six pseudogenes, requiring laborious and error-prone long-range PCR and Sanger sequencing to overcome. We hypothesised that whole-genome sequencing (WGS) would be able to overcome the problem of this sequence homology, because of 150 bp, paired-end reads and avoidance of capture bias that arises from targeted sequencing. We prospectively recruited a cohort of 28 unique pedigrees with ADPKD phenotype. Standard DNA extraction, library preparation and WGS were performed using Illumina HiSeq X and variants were classified following standard guidelines. Molecular diagnosis was made in 24 patients (86%), with 100% variant confirmation by current gold standard of long-range PCR and Sanger sequencing. We demonstrated unique alignment of sequencing reads over the pseudogene-homologous region. In addition to identifying function-affecting single-nucleotide variants and indels, we identified single- and multi-exon deletions affecting PKD1 and PKD2, which would have been challenging to identify using exome sequencing. We report the first use of WGS to diagnose ADPKD. This method overcomes pseudogene homology, provides uniform coverage, detects all variant types in a single test and is less labour-intensive than current techniques. This technique is translatable to a diagnostic setting, allows clinicians to make better-informed management decisions and has implications for other disease groups that are challenged by regions of confounding sequence homology.

  16. Whole exome sequencing identifies recessive PKHD1 mutations in a Chinese twin family with Caroli disease.

    Directory of Open Access Journals (Sweden)

    Xiwei Hao

    Full Text Available BACKGROUND: Mutations in PKHD1 cause autosomal recessive Caroli disease, which is a rare congenital disorder involving cystic dilatation of the intrahepatic bile ducts. However, the mutational spectrum of PKHD1 and the phenotype-genotype correlations have not yet been fully established. METHODS: Whole exome sequencing (WES) was performed on one twin sample with Caroli disease from a Chinese family from Shandong province. Routine Sanger sequencing was used to validate the WES and to carry out segregation studies. We also described the PKHD1 mutation associated with the genotype-phenotype of this twin. RESULTS: A combination of WES and Sanger sequencing revealed the genetic defect to be a novel compound heterozygous genotype in PKHD1, including the missense mutation c.2507T>C, predicted to cause a valine to alanine substitution at codon 836 (c.2507T>C, p.Val836Ala), and the nonsense mutation c.2341C>T, which is predicted to result in an arginine to stop codon at codon 781 (c.2341C>T, p.Arg781*). This compound heterozygous genotype co-segregates with the Caroli disease-affected pedigree members, but is absent in 200 normal chromosomes. CONCLUSIONS: Our findings indicate exome sequencing can be useful in the diagnosis of Caroli disease patients and associate a compound heterozygous genotype in PKHD1 with Caroli disease, which further increases our understanding of the mutation spectrum of PKHD1 in association with Caroli disease.

  17. Whole exome sequencing reveals a mutation in an osteogenesis imperfecta patient

    Directory of Open Access Journals (Sweden)

    Mehmet Ali Ergun

    2017-02-01

    Full Text Available Osteogenesis imperfecta (OI) is an autosomal dominant disorder characterized mainly by bone fragility and blue sclerae. OI is caused by mutations in the type I collagen genes, COL1A1 and COL1A2. Dentinogenesis imperfecta is common in OI patients; more than half of them also have dentinogenesis imperfecta. Whole exome sequencing (WES) involves exome capture, which limits sequencing to the protein-coding regions of the genome, composed of about 20,000 genes and 180,000 exons and constituting approximately 1% of the whole genome. A major indication for its use is the molecular diagnosis of patients with suspected genetic disorders or of patients with known genetic disorders with substantial genetic heterogeneity involving substantial gene complexity. In this study, we performed WES for a patient prediagnosed with osteogenesis imperfecta. He also had dentinogenesis imperfecta. The WES results, confirmed with Sanger sequencing, revealed a missense mutation at codon 560 of the COL1A1 gene: c.1678G>A p.(Gly560Cys). The mutation was in exon 25 and, according to the dbSNP database, corresponded to rs67507747. In conclusion, it is very important to perform WES according to an algorithm. This algorithm should include suspicion of a Mendelian disorder, multiple genetic conditions in the differential diagnosis, and a conventional diagnosis that, even if available, is prohibitively expensive. Finally, Sanger sequencing to confirm the results is also advised.

  18. Paired tumor and normal whole genome sequencing of metastatic olfactory neuroblastoma.

    Directory of Open Access Journals (Sweden)

    Glen J Weiss

    Full Text Available BACKGROUND: Olfactory neuroblastoma (ONB) is a rare cancer of the sinonasal tract with little molecular characterization. We performed whole genome sequencing (WGS) on paired normal and tumor DNA from a patient with metastatic ONB to identify the somatic alterations that might be drivers of tumorigenesis and/or metastatic progression. METHODOLOGY/PRINCIPAL FINDINGS: Genomic DNA was isolated from fresh frozen tissue from a metastatic lesion and whole blood, followed by WGS at >30X depth, alignment and mapping, and mutation analyses. Sanger sequencing was used to confirm selected mutations. Sixty-two somatic short nucleotide variants (SNVs) and five deletions were identified inside coding regions, each causing a non-synonymous DNA sequence change. We selected seven SNVs and validated them by Sanger sequencing. In the metastatic ONB samples collected several months prior to WGS, all seven mutations were present. However, in the original surgical resection specimen (prior to evidence of metastatic disease), mutations in KDR, MYC, SIN3B, and NLRC4 genes were not present, suggesting that these were acquired with disease progression and/or as a result of post-treatment effects. CONCLUSIONS/SIGNIFICANCE: This work provides insight into the evolution of ONB cancer cells and provides a window into the more complex factors, including tumor clonality and multiple driver mutations.

  19. CHILD: a new tool for detecting low-abundance insertions and deletions in standard sequence traces.

    Science.gov (United States)

    Zhidkov, Ilia; Cohen, Raphael; Geifman, Nophar; Mishmar, Dan; Rubin, Eitan

    2011-04-01

    Several methods have been proposed for detecting insertion/deletions (indels) from chromatograms generated by Sanger sequencing. However, most such methods are unsuitable when the mutated and normal variants occur at unequal ratios, such as is expected to be the case in cancer, with organellar DNA or with alternatively spliced RNAs. In addition, the current methods do not provide robust estimates of the statistical confidence of their results, and the sensitivity of this approach has not been rigorously evaluated. Here, we present CHILD, a tool specifically designed for indel detection in mixtures where one variant is rare. CHILD makes use of standard sequence alignment statistics to evaluate the significance of the results. The sensitivity of CHILD was tested by sequencing controlled mixtures of deleted and undeleted plasmids at various ratios. Our results indicate that CHILD can identify deleted molecules present as just 5% of the mixture. Notably, the results were plasmid/primer-specific; for some primers and/or plasmids, the deleted molecule was only detected when it comprised 10% or more of the mixture. The false positive rate was estimated to be lower than 0.4%. CHILD was implemented as a user-oriented web site, providing a sensitive and experimentally validated method for the detection of rare indel-carrying molecules in common Sanger sequence reads.

  20. Deep sequencing analysis of HBV genotype shift and correlation with antiviral efficiency during adefovir dipivoxil therapy.

    Directory of Open Access Journals (Sweden)

    Yuwei Wang

    Full Text Available Viral genotype shift in chronic hepatitis B (CHB) patients during antiviral therapy has been reported, but the underlying mechanism remains elusive. Thirty-eight CHB patients treated with adefovir dipivoxil (ADV) for one year were selected for studying genotype shift by both deep sequencing and the Sanger sequencing method. The Sanger sequencing method found that 7.9% of patients showed mixed genotype before ADV therapy. In contrast, all 38 patients showed mixed genotype before ADV treatment by deep sequencing. A 95.5% mixed genotype rate was also obtained from an additional 200 treatment-naïve CHB patients. Of the 13 patients with genotype shift, the fraction of the minor genotype in 5 patients (38%) increased gradually during the course of ADV treatment. Furthermore, responses to ADV and HBeAg seroconversion were associated with the high rate of genotype shift, suggesting drug and immune pressure may be key factors in inducing genotype shift. Interestingly, patients with genotype C had a significantly higher rate of genotype shift than those with genotype B. In the genotype shift group, ADV treatment induced a marked enhancement of the genotype B ratio accompanied by a reduction of the genotype C ratio, suggesting genotype C may be more sensitive to ADV than genotype B. Moreover, patients with dominant genotype C may have a better therapeutic effect. Finally, genotype shift was correlated with clinical improvement in terms of ALT. Our findings provide a rational explanation for genotype shift among ADV-treated CHB patients. The genotype and genotype shift might be associated with antiviral efficiency.

  1. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  2. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    Full Text Available BACKGROUND: Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. METHODS: Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. RESULTS: A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. CONCLUSIONS: This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data

  3. Development of primers for sequencing the NSP1, NSP3, and VP6 genes of the group A porcine rotavirus

    Directory of Open Access Journals (Sweden)

    Fernanda Dornelas Florentino Silva

    2014-02-01

    Full Text Available Rotavirus is the causative pathogen of diarrhea in humans and in several animal species. Eight pairs of primers were developed and used for Sanger sequencing of the coding region of the NSP1, NSP3, and VP6 genes based on the conserved regions of the genome of the group A porcine rotavirus. Three samples previously screened as positive for group A rotaviruses were subjected to gene amplification and sequencing to characterize the pathogen. The information generated from this study is crucial for the understanding of the epidemiology of the disease.

  4. Simultaneous discrimination of species and strains in Lactobacillus rhamnosus using species-specific PCR combined with multiplex mini-sequencing technology.

    Science.gov (United States)

    Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Lina; Chu, Wen-Shen

    2015-12-01

    This study described the use of species-specific PCR in combination with SNaPshot mini-sequencing to achieve species identification and strain differentiation in Lactobacillus rhamnosus. To develop species-specific PCR and strain subtyping primers, the dnaJ gene was used as a target, and its corresponding sequences were analyzed both in Lb. rhamnosus and in a subset of its phylogenetically closest species. The results indicated that the species-specific primer pair was indeed specific for Lb. rhamnosus, and the mini-sequencing assay was able to unambiguously distinguish Lb. rhamnosus strains into different haplotypes. In conclusion, we have successfully developed a rapid, accurate and cost-effective assay for inter- and intraspecies discrimination of Lb. rhamnosus, which can be applied to achieve efficient quality control of probiotic products. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Development of the Third Generation Sequencing Technologies and Related Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    杨悦; 杜欣军; 梁彬; 郭季冬; 程晓真; 王硕

    2015-01-01

    This paper summarizes the principles and current applications of third-generation DNA sequencing technology and reviews the related bioinformatics. Third-generation sequencing is characterized by single-molecule sequencing and has been widely used in many fields of food science and life science research; representative examples include SMS from Heliscope BioScience and SMRT from Pacific BioSciences. The paper also summarizes the development of genomics-related bioinformatics and the commonly used databases.

  6. Genome Sequence of Pseudomonas aeruginosa Strain LCT-PA220, Which Was Selected after Space Flight by Using Biolog's Powerful Carbon Source Utilization Technology.

    Science.gov (United States)

    Xu, Guogang; Hu, Juan; Fang, Xiangqun; Zhang, Xuelin; Wang, Junfeng; Guo, Yinghua; Li, Tianzhi; Chen, Zhenghong; Dai, Wenkui; Liu, Changting

    2014-03-13

    To explore the changes of Pseudomonas aeruginosa in space flight, we present the draft genome sequence of P. aeruginosa strain LCT-PA220, which originated from a P. aeruginosa strain, ATCC 27853, that traveled on the Shenzhou-VIII spacecraft.

  7. Improved hybrid genome assemblies of 2 strains of Bacteroides xylanisolvens SD-CC-1b and SD-CC-2a using Illumina and 454 sequencing technologies

    Science.gov (United States)

    Bacteroides xylanisolvens strains (SD_CC_1b, SD_CC_2a) isolated from human feces were able to grow on crystalline cellulose. Cellulolytic properties are not common in Bacteroides species. Here, we report improved genome sequences of both B. xylanisolvens strains....

  8. Fast clinical molecular diagnosis of hyperphenylalaninemia using next-generation sequencing-based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing.

    Science.gov (United States)

    Cao, Yan-yan; Qu, Yu-jin; Song, Fang; Zhang, Ting; Bai, Jin-li; Jin, Yu-wei; Wang, Hong

    2014-12-01

    Hyperphenylalaninemia (HPA) can be classified into phenylketonuria (PKU) and tetrahydrobiopterin deficiency (BH4D), according to the defect of enzyme activity, both of which vary substantially in severity, treatment, and prognosis of the disease. To set up a fast and comprehensive assay in order to achieve early etiological diagnosis and differential diagnosis for children with HPA, we designed a custom AmpliSeq™ panel for the sequencing of coding DNA sequence (CDS), flanking introns, 5' untranslated region (UTR) and 3' UTR from five HPA-causing genes (PAH, PTS, QDPR, GCH1, and PCBD1) using the Ion Torrent Personal Genome Machine (PGM) Sequencer. A standard group of 15 samples with previously known DNA sequences and a test group of 37 HPA patients with unknown mutations were used for assay validation and application, respectively. All variations were confirmed by Sanger sequencing. In the standard group, all the known mutations were detected and were consistent with the results of previous Sanger sequencing. In the test group, we identified mutations in 71 of 74 alleles, with a mutation detection rate of 95.9%. We also found a frame shift deletion p.Ile25Metfs*13 in PAH that was previously unreported. In addition, 1 of 37 in the test group was inconsistent with either the molecular diagnosis or clinical diagnosis by traditional differential methods. In conclusion, our comprehensive assay based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing has wider coverage, higher throughput, is much faster, and more efficient when compared with the traditional molecular detection method for HPA patients, which could meet the medical need for individualized diagnosis and treatment.

  9. The complete nucleotide sequence and genome organization of a novel betaflexivirus infecting Citrullus lanatus.

    Science.gov (United States)

    Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng

    2017-07-05

    The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.

  10. Complete genome sequence of a sapovirus from a child in Zhejiang, China.

    Science.gov (United States)

    Zhou, Xiaohong; Sun, Yi; Shang, Xiaochun; Gao, Jian; Zhao, Xueqin; Shuai, Huiqun; Zhang, Rui; Zhang, Yanjun

    2016-10-01

    Although Sapovirus (Caliciviridae) has been accepted as one of the causes of acute gastroenteritis worldwide, little is known about the genetic characteristics of the whole genome of sapoviruses in China, especially those that infect humans. Here we report the complete genome sequence of a sapovirus strain, Human/Zhejiang1/2015/China, obtained from a child with acute gastroenteritis in Hangzhou, Zhejiang Province, China. Samples were collected and delivered to the CDC laboratories and were detected by RT-PCR. Sanger sequencing was used to obtain the full genome and molecular characterization of the genome was determined. A phylogenetic analysis of the genome was also performed. The results indicated that Human/Zhejiang1/2015/China belongs to Genogroup I. No recombination events were detected. This is the first complete sequence from a child to be reported in China. The sequence information is important for surveillance of this emerging gastrointestinal infection.

  11. Exploring genetic polymorphism in innate immune genes in Indian cattle (Bos indicus) and buffalo (Bubalus bubalis) using next generation sequencing technology.

    Science.gov (United States)

    Patel, Shreya M; Koringa, Prakash G; Nathani, Neelam M; Patel, Namrata V; Shah, Tejash M; Joshi, Chaitanya G

    2015-02-01

    Activation of innate immunity initiates various cascades of reactions that largely contribute to defense against physical, microbial or chemical damage, prompt damage repair and removal of causative organisms, and restore tissue homeostasis. Genetic polymorphism in innate immune genes plays a prominent role in disease resistance capabilities in various breeds of cattle and buffalo. Here we studied single nucleotide variations (SNP/SNV) and haplotype structure in innate immune genes, viz. CHGA, CHGB, CHGC, NRAMP1, NRAMP2, DEFB1, BNBD4, BNBD5, TAP and LAP, in Gir cattle and Murrah buffalo. Targeted sequencing of exonic regions of these genes was performed on the Ion Torrent PGM sequencing platform. The sequence reads obtained corresponding to coding regions of these genes were mapped to the cattle reference genome BosTau7 by the BWA program using the genome analysis tool kit (GATK). Further variant analysis by Unified Genotyper revealed 54 and 224 SNPs in Gir and Murrah respectively, and 32 SNVs were also identified. Among these SNPs, 43, 36, 11, 32, 81, 21 and 22 variations were in the CHGA, CHGB, CHGC, NRAMP1, NRAMP2, DEFB1 and TAP genes respectively. Among the 278 SNPs identified, 24 were found to be reported in the dbSNP database. Variant analysis was followed by haplotype construction based on multiple SNPs using SAS software, which revealed a large number of haplotypes. The SNP discovery in innate immune genes in cattle and buffalo breeds of India would advance our understanding of the role of these genes in determining disease resistance/susceptibility in Indian breeds. The identified SNPs and haplotype data would also provide a wealth of sequence information for conservation studies, selective breeding and designing future strategies for identifying disease associations involving samples from distinct populations.

  12. Exploring genetic polymorphism in innate immune genes in Indian cattle (Bos indicus) and buffalo (Bubalus bubalis) using next generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Shreya M. Patel

    2015-02-01

    Full Text Available Activation of innate immunity initiates various cascades of reactions that largely contribute to defense against physical, microbial or chemical damage, prompt damage repair and removal of causative organisms, and restore tissue homeostasis. Genetic polymorphism in innate immune genes plays a prominent role in disease resistance capabilities in various breeds of cattle and buffalo. Here we studied single nucleotide variations (SNP/SNV) and haplotype structure in innate immune genes, viz. CHGA, CHGB, CHGC, NRAMP1, NRAMP2, DEFB1, BNBD4, BNBD5, TAP and LAP, in Gir cattle and Murrah buffalo. Targeted sequencing of exonic regions of these genes was performed on the Ion Torrent PGM sequencing platform. The sequence reads obtained corresponding to coding regions of these genes were mapped to the cattle reference genome BosTau7 by the BWA program using the genome analysis tool kit (GATK). Further variant analysis by Unified Genotyper revealed 54 and 224 SNPs in Gir and Murrah respectively, and 32 SNVs were also identified. Among these SNPs, 43, 36, 11, 32, 81, 21 and 22 variations were in the CHGA, CHGB, CHGC, NRAMP1, NRAMP2, DEFB1 and TAP genes respectively. Among the 278 SNPs identified, 24 were found to be reported in the dbSNP database. Variant analysis was followed by haplotype construction based on multiple SNPs using SAS software, which revealed a large number of haplotypes. The SNP discovery in innate immune genes in cattle and buffalo breeds of India would advance our understanding of the role of these genes in determining disease resistance/susceptibility in Indian breeds. The identified SNPs and haplotype data would also provide a wealth of sequence information for conservation studies, selective breeding and designing future strategies for identifying disease associations involving samples from distinct populations.

  13. Failure to Identify Somatic Mutations in Monozygotic Twins Discordant for Schizophrenia by Whole Exome Sequencing

    Institute of Scientific and Technical Information of China (English)

    Nan Lyu; Li-Li Guan; Hong Ma; Xi-Jin Wang; Bao-Ming Wu; Fan-Hong Shang; Dan Wang

    2016-01-01

    Background: Schizophrenia (SCZ) is a severe, debilitating, and complex psychiatric disorder with multiple causative factors. An increasing number of studies have determined that rare variations play an important role in its etiology. A somatic mutation is a rare form of genetic variation that occurs at an early stage of embryonic development and is thought to contribute substantially to the development of SCZ. The aim of the study was to explore the novel pathogenic somatic single nucleotide variations (SNVs) and somatic insertions and deletions (indels) of SCZ. Methods: One Chinese family with a monozygotic (MZ) twin pair discordant for SCZ was included. Whole exome sequencing was performed in the co-twin and their parents. Rigorous filtering processes were conducted to prioritize pathogenic somatic variations, and all identified SNVs and indels were further confirmed by Sanger sequencing. Results: One somatic SNV and two somatic indels were identified after rigorous selection processes. However, none was validated by Sanger sequencing. Conclusions: This study is not alone in the failure to identify pathogenic somatic variations in MZ twins, suggesting that exonic somatic variations are extremely rare. Further efforts are warranted to explore the potential genetic mechanism of SCZ.

  14. Limb body wall complex, amniotic band sequence, or new syndrome caused by mutation in IQ Motif containing K (IQCK)?

    Science.gov (United States)

    Kruszka, Paul; Uwineza, Annette; Mutesa, Leon; Martinez, Ariel F; Abe, Yu; Zackai, Elaine H; Ganetzky, Rebecca; Chung, Brian; Stevenson, Roger E; Adelstein, Robert S; Ma, Xuefei; Mullikin, James C; Hong, Sung-Kook; Muenke, Maximilian

    2015-01-01

    Limb body wall complex (LBWC) and amniotic band sequence (ABS) are multiple congenital anomaly conditions with craniofacial, limb, and ventral wall defects. LBWC and ABS are considered separate entities by some, and a continuum of severity of the same condition by others. The etiology of LBWC/ABS remains unknown and multiple hypotheses have been proposed. One individual with features of LBWC and his unaffected parents were whole exome sequenced and Sanger sequenced as confirmation of the mutation. Functional studies were conducted using morpholino knockdown studies followed by human mRNA rescue experiments. Using whole exome sequencing, a de novo heterozygous mutation was found in the gene IQCK: c.667C>G; p.Q223E and confirmed by Sanger sequencing in an individual with LBWC. Morpholino knockdown of iqck mRNA in the zebrafish showed ventral defects including failure of ventral fin to develop and cardiac edema. Human wild-type IQCK mRNA rescued the zebrafish phenotype, whereas human p.Q223E IQCK mRNA did not, but worsened the phenotype of the morpholino knockdown zebrafish. This study supports a genetic etiology for LBWC/ABS, or potentially a new syndrome. PMID:26436108

  15. The pots and potters of Assyria : technology and organization of production, ceramics sequence and vessel function at Late Bronze Age Tell Sabi Abyad, Syria

    NARCIS (Netherlands)

    Duistermaat, Kim

    2007-01-01

    “The Pots and Potters of Assyria” is a comprehensive discussion of all evidence relating to pottery production from the Late Bronze Age site of Tell Sabi Abyad, Syria. Technological, morphological, stylistic and archaeological data are integrated into the understanding of pottery production and use.

  16. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    Directory of Open Access Journals (Sweden)

    R. Lakshmi

    2016-06-01

    Full Text Available Aim: To analyze the promoter sequence of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle, maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The sequence of the promoter region of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The sequence of the promoter region of TLR4 of Vechur cattle showed 99% similarity with the B. taurus sequence but did not reveal significant variants in motif regions. However, two heterozygous loci were observed from the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and 9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  17. Identification of Genomic Insertion and Flanking Sequence of G2-EPSPS and GAT Transgenes in Soybean Using Whole Genome Sequencing Method

    Science.gov (United States)

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Qiu, Li-Juan

    2016-01-01

    Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organism (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. More than 22.4 Gb sequence data (∼21 × coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundaries of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767–50543792 and Chr17: 7980527–7980541 in these two transgenic lines. Identification of genomic insertion sites of G2-EPSPS and GAT transgenes will facilitate the utilization of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS was a cost-effective and rapid method for identifying sites of T-DNA insertions and flanking sequences in soybean. PMID:27462336
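
The junction-read idea in this abstract (reads that map partly to the host genome and partly to the transgenic vector) can be illustrated with a toy exact-match sketch. This is not the authors' pipeline, which is not described in the abstract; the function name, the `min_anchor` parameter, and the sequences are all hypothetical, and real analyses would use a read aligner rather than string search.

```python
def find_junction(read, genome, vector, min_anchor=5):
    """Return (genome_junction_pos, split) if the read is a genome segment
    followed by a vector segment, else None.

    Toy assumption: exact matches only, genome part first, vector part second.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        g = genome.find(left)   # where the left part sits in the genome
        v = vector.find(right)  # where the right part sits in the vector
        if g != -1 and v != -1:
            # 0-based genome coordinate of the first base after the junction
            return g + split, split
    return None

# Hypothetical toy sequences, not real soybean or T-DNA data.
genome = "TTGACCGTAGGCTAACGT"
vector = "ATATCGCGAA"
junction_read = genome[4:12] + vector[0:6]   # "CCGTAGGC" + "ATATCG"
genome_only_read = genome[2:16]

junction_hit = find_junction(junction_read, genome, vector)
genome_only_hit = find_junction(genome_only_read, genome, vector)
```

Reads that come entirely from the genome return `None`, so only reads spanning an insertion boundary are kept as evidence for an insertion site.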

  18. Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method

    Directory of Open Access Journals (Sweden)

    Bingfu Guo

    2016-07-01

    Full Text Available Molecular characterization of sequences flanking exogenous fragment insertions is essential for safety assessment and labeling of genetically modified organisms (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on the whole genome sequencing (WGS) method. About 21 Gb sequence data (~21× coverage) for each line was generated on the Illumina HiSeq 2500 platform. The junction reads mapped to the boundary of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with the soybean reference genome and the sequence of the transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of the genomic insertion site of the G2-EPSPS and GAT transgenes will facilitate the use of their glyphosate-tolerant traits in soybean breeding programs. These results also demonstrated that WGS is a cost-effective and rapid method of identifying sites of T-DNA insertions and flanking sequences in soybean.

  19. Current applications of next-generation sequencing technology in Mycobacterium tuberculosis research

    Institute of Scientific and Technical Information of China (English)

    徐鹏; 甘明宇; 高谦

    2015-01-01

    Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), is the second most important infectious disease in the world, and next-generation sequencing technology provides an effective method for studying the genome of M. tuberculosis. This paper reviews the current applications of next-generation sequencing technology in the study of M. tuberculosis from the aspects of epidemiology, drug resistance, evolution and related bioinformatics.

  20. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  1. A calculation of all possible oligosaccharide isomers both branched and linear yields 1.05 x 10^12 structures for a reducing hexasaccharide: the Isomer Barrier to development of single-method saccharide sequencing or synthesis systems.

    Science.gov (United States)

    Laine, R A

    1994-12-01

    The number of all possible linear and branched isomers of a hexasaccharide was calculated and found to be > 1.05 x 10^12. This large number defines the Isomer Barrier, a persistent technological barrier to the development of a single analytical method for the absolute characterization of carbohydrates, regardless of sample quantity. Because of this isomer barrier, no single method can be employed to determine complete oligosaccharide structure in 100 nmol amounts with the same assurance that can be achieved for 100 pmol amounts with single-procedure Edman peptide or Sanger DNA sequencing methods. Difficulties in the development of facile synthetic schemes for oligosaccharides are also explained by this large number. No current method of chemical or physical analysis has the resolution necessary to distinguish among 10^12 structures having the same mass. Therefore the 'characterization' of a middle-weight oligosaccharide solely by NMR or mass spectrometry necessarily contains a very large margin of error. Greater uncertainty accompanies results performed solely by sequential enzyme degradation followed by gel-permeation chromatography or electrophoresis, as touted by some commercial advertisements. Much of the literature which uses these single methods to 'characterize' complex carbohydrates is, therefore, in question, and journals should beware of publishing structural characterizations unless the authors reveal all alternate possible structures which could result from their analysis.(ABSTRACT TRUNCATED AT 250 WORDS)

  2. DNA Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  3. A general sequence processing and analysis program for protein engineering.

    Science.gov (United States)

    Stafford, Ryan L; Zimmerman, Erik S; Hallam, Trevor J; Sato, Aaron K

    2014-10-27

    Protein engineering projects often amass numerous raw DNA sequences, but no readily available software combines sequence processing and activity correlation required for efficient lead identification. XLibraryDisplay is an open source program integrated into Microsoft Excel for Windows that automates batch sequence processing via a simple step-by-step, menu-driven graphical user interface. XLibraryDisplay accepts any DNA template which is used as a basis for trimming, filtering, translating, and aligning hundreds to thousands of sequences (raw, FASTA, or Phred PHD file formats). Key steps for library characterization through lead discovery are available including library composition analysis, filtering by experimental data, graphing and correlating to experimental data, alignment to structural data extracted from PDB files, and generation of PyMOL visualization scripts. Though larger data sets can be handled, the program is best suited for analyzing approximately 10 000 or fewer leads or naïve clones which have been characterized using Sanger sequencing and other experimental approaches. XLibraryDisplay can be downloaded for free from sourceforge.net/projects/xlibrarydisplay/ .

  4. Cost-effective, species-specific microsatellite development for the endangered Dwarf Bulrush (Typha minima) using next-generation sequencing technology.

    Science.gov (United States)

    Csencsics, Daniela; Brodbeck, Sabine; Holderegger, Rolf

    2010-01-01

    The dwarf bulrush (Typha minima Funck ex Hoppe) is an endangered pioneer plant species of riparian flood plains. In Switzerland, only 3 natural populations remain, but reintroductions are planned. To identify suitable source populations for reintroductions, we developed 17 polymorphic microsatellite markers with perfect repeats using the 454 pyrosequencing technique and tested them on 20 individuals with low-cost M13 labeling. We detected 2 to 7 alleles per locus and found expected and observed heterozygosities of 0.05-0.76 and 0.07-1, respectively. The whole process was finished in less than 6 weeks and cost approximately USD 5000. Due to low costs and reduced expenditure of time, the use of next-generation sequencing techniques for microsatellite development represents a powerful tool for population genetic studies in nonmodel species, as we show in this first application of the approach to a plant species of conservation importance.
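
Observed and expected heterozygosity, as reported per locus in this abstract, can be computed from diploid genotypes as in the minimal sketch below. It assumes the simple estimator (one minus the sum of squared allele frequencies); the abstract does not say which estimator the authors used, and the genotype data here are hypothetical.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return h_obs, h_exp

# Hypothetical microsatellite fragment sizes (bp) for four individuals.
example = [(150, 150), (150, 154), (154, 158), (150, 154)]
h_obs, h_exp = heterozygosity(example)
```

For small samples such as the 20 individuals mentioned above, a sample-size-corrected estimator may be preferred; that choice is left open here.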

  5. Technologies and new approaches used by the INGV EMERGEO Working Group for real-time data sourcing and processing during the Emilia Romagna (northern Italy) 2012 earthquake sequence

    Directory of Open Access Journals (Sweden)

    Giuliana Alessio

    2012-10-01

    Full Text Available On May 20, 2012, a Ml 5.9 seismic event hit the Emilia Po Plain, triggering intense earthquake activity along a broad area of the Po Plain across the provinces of Modena, Ferrara, Rovigo and Mantova (Figure 1). Nine days later, on May 29, 2012, a Ml 5.8 event occurred roughly 10 km to the SW of the first main shock. These events caused widespread damage and resulted in 26 victims. The aftershock area extended over more than 50 km and was elongated in the WNW-ESE direction, and it included five major aftershocks with 5.1 ≤Ml ≤5.3, and more than 2000 minor events (Figure 1). In general, the seismic sequence was confined to the upper 10 km of the crust. Minor seismicity with depths ranging from 10 km to 30 km extended towards the southern sector of the epicentral area (ISIDe, http://iside.rm.ingv.it/). […

  6. Report on achievements in fiscal 1999 on the project for research and development of an intellectual base creating and utilizing technology. Development of a base sequencer for ultra-difficult-to-read DNA

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    This paper describes the achievements in fiscal 1999 in developing a device to read the base sequences of ultra-difficult-to-read DNA. When a complementary strand is synthesized on a DNA template by DNA polymerase, deoxyribonucleotide triphosphates (dNTPs, of which there are four kinds) are incorporated, the complementary strand is extended, and pyrophosphate is released at the same time. Light is emitted when this pyrophosphate is converted into adenosine triphosphate (ATP) by sulfurylase and the ATP reacts with luciferase. The progress of complementary strand synthesis can be followed by monitoring this reaction. By sending the four kinds of dNTPs into the reaction section one at a time in a fixed order, it is possible to tell which nucleotide triggered complementary strand synthesis, and thus to determine the base sequence of the target DNA. Because the reaction proceeds stepwise, the device is designed so that excess dNTP from the preceding step is degraded enzymatically and does not remain. The development problems include: achieving sequential complementary strand synthesis at a high rate and with high reaction yield, determining reaction conditions suitable for automation and miniaturization, and developing a module that can supply the dNTP reagents to the reaction section sequentially. Prospects were obtained for developing the elemental technologies. (NEDO)
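
The stepwise dNTP dispensing described in this abstract can be sketched as a small simulation. This is an idealized illustration, not the NEDO device's behavior: it assumes a fixed A, C, G, T dispensing order, a light signal exactly proportional to the number of bases incorporated, and complete removal of excess dNTP between steps. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def pyrogram(template, dispensation="ACGT", cycles=4):
    """Simulate idealized light signals for a template strand.

    template: bases in the order the polymerase reads them.
    Returns a list of (nucleotide, signal) pairs, where signal is the number
    of bases incorporated in that dispensation (homopolymer runs give > 1).
    """
    # Bases the polymerase must add, in synthesis order.
    to_add = [COMPLEMENT[b] for b in template]
    pos = 0
    signals = []
    for _ in range(cycles):
        for nt in dispensation:
            run = 0
            # Extend while the dispensed nucleotide matches the next position.
            while pos < len(to_add) and to_add[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
    return signals

def call_bases(signals):
    """Rebuild the synthesized strand from the signal heights."""
    return "".join(nt * run for nt, run in signals)

signals = pyrogram("TTGCA", cycles=1)
synthesized = call_bases(signals)
```

A dispensation with no matching base yields a zero signal, which is how the order of dispensing reveals the sequence.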

  7. A novel CRX mutation by whole-exome sequencing in an autosomal dominant cone-rod dystrophy pedigree

    Directory of Open Access Journals (Sweden)

    Qin-Kang Lu

    2015-12-01

    Full Text Available AIM: To identify the disease-causing gene mutation in a Chinese pedigree with autosomal dominant cone-rod dystrophy (adCORD). METHODS: A southern Chinese adCORD pedigree including 9 affected individuals was studied. Whole-exome sequencing (WES), coupling the Agilent whole-exome capture system to the Illumina HiSeq 2000 DNA sequencing platform, was used to search for the specific gene mutation in 3 affected family members and 1 unaffected member. After a suggested variant was found through the data analysis, the putative mutation was validated by Sanger DNA sequencing of samples from all available family members. RESULTS: The results of both WES and Sanger sequencing revealed a novel nonsense mutation c.C766T (p.Q256X) within exon 5 of the CRX gene which was pathogenic for adCORD in this family. The mutation could affect photoreceptor-specific gene expression with a dominant-negative effect and resulted in loss of the OTX tail; thus the mutant protein occupies the CRX-binding site in target promoters without establishing an interaction and, consequently, may block transactivation. CONCLUSION: All modes of Mendelian inheritance in CORD have been observed, and genetic heterogeneity is a hallmark of CORD. Therefore, conventional genetic diagnosis of CORD would be time-consuming and labor-intensive. Our study indicated the robustness and cost-effectiveness of WES in the genetic diagnosis of CORD.

  8. Determination and analysis of the complete mitochondrial genome sequence of Taoyuan chicken.

    Science.gov (United States)

    Liu, Li-Li; Xie, Hong-Bing; Yu, Qi-Fang; He, Shao-Ping; He, Jian-Hua

    2016-01-01

    The Taoyuan chicken is an excellent native breed in China. This study is the first to determine the complete mitochondrial genome sequence of the Taoyuan chicken, using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome (16,784 bp in length) were analyzed in detail, with a base composition of 30.26% A, 23.79% T, 32.44% C, and 13.50% G. It contained 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of the Taoyuan chicken will be useful for the phylogenetics of poultry and will be available as basic data for genetics and breeding.
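
Base composition figures like those quoted in this abstract are simple percentages of each nucleotide over the sequence length. A minimal sketch follows; the toy sequence is hypothetical and is not the Taoyuan chicken mitochondrial genome, and ambiguous bases such as N are assumed to be ignored.

```python
from collections import Counter

def base_composition(seq):
    """Return the percentage of A, C, G and T in seq, rounded to 2 decimals.

    Characters other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: round(100.0 * counts[b] / total, 2) for b in "ACGT"}

toy = "ACGTACGTAACC"  # hypothetical 12-base example
comp = base_composition(toy)
```

Because each value is rounded independently, the four percentages may not sum to exactly 100, as with the figures reported above.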

  9. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Full Text Available Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to being sequenced by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a precious tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  10. Whole Genome Sequencing of Enterovirus species C Isolates by High-Throughput Sequencing: Development of Generic Primers

    Science.gov (United States)

    Bessaud, Maël; Sadeuh-Mba, Serge A.; Joffret, Marie-Line; Razafindratsimandresy, Richter; Polston, Patsy; Volle, Romain; Rakoto-Andrianarivelo, Mala; Blondel, Bruno; Njouom, Richard; Delpeyroux, Francis

    2016-01-01

    Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, among which the three serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to being sequenced by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide-range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a precious tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses. PMID:27617004

  11. Detecting authorized and unauthorized genetically modified organisms containing vip3A by real-time PCR and next-generation sequencing.

    Science.gov (United States)

    Liang, Chanjuan; van Dijk, Jeroen P; Scholtens, Ingrid M J; Staats, Martijn; Prins, Theo W; Voorhuijzen, Marleen M; da Silva, Andrea M; Arisi, Ana Carolina Maisonnave; den Dunnen, Johan T; Kok, Esther J

    2014-04-01

    The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.

  12. Whole-exome sequencing reveals a novel frameshift mutation in the FAM161A gene causing autosomal recessive retinitis pigmentosa in the Indian population.

    Science.gov (United States)

    Zhou, Yu; Saikia, Bibhuti B; Jiang, Zhilin; Zhu, Xiong; Liu, Yuqing; Huang, Lulin; Kim, Ramasamy; Yang, Yin; Qu, Chao; Hao, Fang; Gong, Bo; Tai, Zhengfu; Niu, Lihong; Yang, Zhenglin; Sundaresan, Periasamy; Zhu, Xianjun

    2015-10-01

    Retinitis pigmentosa (RP) is a heterogeneous group of inherited retinal degenerations caused by mutations in at least 50 genes. To identify genetic mutations underlying autosomal recessive RP (arRP), we performed a whole-exome sequencing study on two consanguineous Indian families (RP-252 and RP-182) and 100 sporadic RP patients. Here we report a novel mutation in FAM161A in RP-252 and RP-182, with two patients affected with RP in each family. The FAM161A gene was identified as the causative gene for RP28, an autosomal recessive form of RP. By whole-exome sequencing we identified several homozygous genomic regions, one of which included the recently identified FAM161A gene mutated in RP28-linked arRP. Sequencing analysis revealed the presence of a novel homozygous frameshift mutation, p.R592FsX2, in both patients of family RP-252 and family RP-182. Among the 100 sporadic Indian RP patients, this novel homozygous frameshift mutation p.R592FsX2 was identified in one sporadic patient, ARRP-S-I-46, by whole-exome sequencing and validated by Sanger sequencing. This homozygous frameshift mutation was absent in 1000 ethnicity-matched control samples screened by direct Sanger sequencing. In conclusion, we identified a novel homozygous frameshift mutation of the RP28-linked RP gene FAM161A in the Indian population.

  13. Five simple guidelines for establishing basic authenticity and reliability of newly generated fungal ITS sequences

    Directory of Open Access Journals (Sweden)

    R. Henrik Nilsson

    2012-09-01

    Full Text Available Molecular data form an important research tool in most branches of mycology. A non-trivial proportion of the public fungal DNA sequences are, however, compromised in terms of quality and reliability, contributing noise and bias to sequence-borne inferences such as phylogenetic analysis, diversity assessment, and barcoding. In this paper we discuss various aspects and pitfalls of sequence quality assessment. Based on our observations, we provide a set of guidelines to assist in manual quality management of newly generated, near-full-length (Sanger-derived) fungal ITS sequences and to some extent also sequences of shorter read lengths, other genes or markers, and groups of organisms. The guidelines are intentionally non-technical and do not require substantial bioinformatics skills or significant computational power. Despite their simple nature, we feel they would have caught the vast majority of the severely compromised ITS sequences in the public corpus. Our guidelines are nevertheless not infallible, and common sense and intuition remain important elements in the pursuit of compromised sequence data. The guidelines focus on basic sequence authenticity and reliability of the newly generated sequences, and the user may want to consider additional resources and steps to accomplish the best possible quality control. A discussion on the technical resources for further sequence quality management is therefore provided in the supplementary material.

  14. Fragment Merger: An Online Tool to Merge Overlapping Long Sequence Fragments

    Directory of Open Access Journals (Sweden)

    Anna Kramvis

    2013-03-01

    Full Text Available While PCR amplicons extend to a few thousand bases, the length of sequences from direct Sanger sequencing is limited to 500–800 nucleotides. Therefore, several fragments may be required to cover an amplicon, a gene or an entire genome. These fragments are typically sequenced in an overlapping fashion and assembled by manually sliding and aligning the sequences visually. This is time-consuming, repetitive and error-prone, and further complicated by circular genomes. An online tool merging two to twelve long overlapping sequence fragments was developed. Either chromatograms or FASTA files are submitted to the tool, which trims poor quality ends of chromatograms according to user-specified parameters. Fragments are assembled into a single sequence by repeatedly calling the EMBOSS merger tool in a consecutive manner. Output includes the number of trimmed nucleotides, details of each merge, and an optional alignment to a reference sequence. The final merge sequence is displayed and can be downloaded in FASTA format. All output files can be downloaded as a ZIP archive. This tool allows for easy and automated assembly of overlapping sequences and is aimed at researchers without specialist computer skills. The tool is genome- and organism-agnostic and has been developed using hepatitis B virus sequence data.
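
    The consecutive merging of overlapping fragments described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tool's implementation: the tool calls the EMBOSS merger program, which aligns the fragments, whereas this sketch only joins fragments on an exact suffix/prefix match. The function names, the `min_overlap` parameter and the toy fragments are all hypothetical.

```python
def merge_pair(left: str, right: str, min_overlap: int = 5) -> str:
    """Join two fragments on their longest exact suffix/prefix overlap.

    Assumption: the overlap is error-free. A real pipeline would use an
    alignment so that mismatches and indels in the overlap are tolerated.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError(f"no exact overlap of at least {min_overlap} bases found")


def merge_fragments(fragments: list[str], min_overlap: int = 5) -> str:
    """Merge fragments consecutively, in the order given."""
    if not fragments:
        raise ValueError("at least one fragment is required")
    merged = fragments[0]
    for fragment in fragments[1:]:
        merged = merge_pair(merged, fragment, min_overlap)
    return merged


# Toy fragments for illustration only (hypothetical data).
fragments = ["ATGGCGTACGTTAG", "CGTTAGCCATGA", "CATGAGGTTTAA"]
assembled = merge_fragments(fragments)
print(assembled)
```

    Exact matching keeps the sketch short; it would fail on real chromatogram-derived reads with base-calling errors in the overlap, which is why an alignment-based merge is the more robust choice.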

  15. SNMR pulse sequence phase cycling

    Science.gov (United States)

    Walsh, David O; Grunewald, Elliot D

    2013-11-12

    Technologies applicable to SNMR pulse sequence phase cycling are disclosed, including SNMR acquisition apparatus and methods, SNMR processing apparatus and methods, and combinations thereof. SNMR acquisition may include transmitting two or more SNMR pulse sequences and applying a phase shift to a pulse in at least one of the pulse sequences, according to any of a variety of cycling techniques. SNMR processing may include combining SNMR data from a plurality of pulse sequences comprising pulses of different phases, so that desired signals are preserved and undesired signals are canceled.
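
    The idea of combining records acquired with different pulse phases, so that the desired signal survives and phase-independent interference cancels, can be illustrated with a toy numerical model. Everything in this sketch is an assumption made for illustration (the two-step 0/π cycle, the decaying signal, the interference term and all names); it is not the method disclosed in the patent.

```python
import math


def acquire(pulse_phase: float, n_samples: int = 8) -> list[float]:
    """Simulate one record for a given transmit pulse phase (toy model)."""
    record = []
    for k in range(n_samples):
        # Desired signal: its sign follows the pulse phase.
        signal = math.cos(pulse_phase) * math.exp(-0.1 * k)
        # Undesired term: identical in every record, independent of pulse phase.
        interference = 0.5 * math.sin(0.7 * k)
        record.append(signal + interference)
    return record


def combine(record_a: list[float], record_b: list[float]) -> list[float]:
    """Half-difference of two records acquired with opposite pulse phases."""
    return [(a - b) / 2.0 for a, b in zip(record_a, record_b)]


record_0 = acquire(0.0)        # pulse phase 0
record_pi = acquire(math.pi)   # pulse phase shifted by pi
recovered = combine(record_0, record_pi)
expected = [math.exp(-0.1 * k) for k in range(8)]
```

    Subtracting the two records doubles the phase-dependent signal and removes the common interference; halving restores the original amplitude.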

  16. Next-generation sequencing technology and non-invasive prenatal diagnosis

    Institute of Scientific and Technical Information of China (English)

    赵馨; 何天文; 尹爱华

    2014-01-01

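    The discovery of cell-free fetal DNA (cffDNA) in the peripheral blood of pregnant women opened a new chapter in non-invasive prenatal diagnosis, and the use of various molecular diagnostic techniques on cffDNA to study fetal chromosomal disorders, hereditary diseases and pregnancy-related diseases quickly became a research focus. The emergence of next-generation sequencing (NGS) technology, characterized by high throughput and automation, has greatly accelerated laboratory research on cffDNA. Prenatal genetic diagnosis of fetal trisomy 21/18/13 based on NGS platforms is already in clinical use, and research on other conditions, such as sex chromosome aneuploidy, chromosomal aneuploidy in twin pregnancies, fetal structural chromosomal abnormalities and Mendelian monogenic diseases, has also advanced markedly with the advent of NGS. This article reviews the basic principles of NGS, the physiological characteristics of cffDNA, and progress in NGS-based non-invasive prenatal diagnosis research.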

  17. Inhibition of expression in Escherichia coli of a virulence regulator MglB of Francisella tularensis using external guide sequence technology.

    Directory of Open Access Journals (Sweden)

    Gaoping Xiao

    Full Text Available External guide sequences (EGSs) have successfully been used to inhibit expression of target genes at the post-transcriptional level in both prokaryotes and eukaryotes. We previously reported that EGS-accessible and cleavable sites in the target RNAs can rapidly be identified by screening random EGS (rEGS) libraries. Here the method of screening rEGS libraries and a partial RNase T1 digestion assay were used to identify sites accessible to EGSs in the mRNA of a global virulence regulator MglB from Francisella tularensis, a Gram-negative pathogenic bacterium. Specific EGSs were subsequently designed and their activities in terms of the cleavage of mglB mRNA by RNase P were tested in vitro and in vivo. EGS73, EGS148, and EGS155 in both stem and M1 EGS constructs induced mglB mRNA cleavage in vitro. Expression of stem EGS73 and EGS155 in Escherichia coli resulted in a significant reduction of the level of mglB mRNA encoded by the F. tularensis mglB gene inserted in those cells.

  18. DNA Sequencing Using Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run, the more efficient the process, as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecular weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single-strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDa) was important to allow the long single-strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to reproducibly prepare linear polyacrylamide polymer with an average molecular weight of 9 MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentrations of salts and dideoxynucleotides remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reducing read length. In two papers, we examined the role of

  19. FAST: FAST Analysis of Sequences Toolbox.

    Science.gov (United States)

    Lawrence, Travis J; Kauffman, Kyle T; Amrine, Katherine C H; Carper, Dana L; Lee, Raymond S; Becich, Peter J; Canales, Claudia J; Ardell, David H

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
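
    The Sanger and Illumina 1.8+ FastQ variants mentioned above both encode each base quality as a single ASCII character with an offset of 33 (Phred+33). The sketch below decodes such a quality string; it is a generic illustration that does not use or reproduce any FAST command, and the function names and the toy record are hypothetical.

```python
def phred33_scores(quality: str) -> list[int]:
    """Decode a Phred+33 quality string into integer Phred scores."""
    return [ord(char) - 33 for char in quality]


def error_probability(phred: int) -> float:
    """Convert a Phred score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-phred / 10)


# Toy four-line FASTQ record for illustration only (hypothetical read).
record = ["@read1", "ACGT", "+", "II5!"]
scores = phred33_scores(record[3])
for base, score in zip(record[1], scores):
    print(base, score, f"{error_probability(score):.4f}")
```

    Older Illumina pipelines (before version 1.8) used an offset of 64 instead, so the offset should be confirmed before decoding data of unknown origin.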

  20. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    Full Text Available FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU’s Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.

  1. Functionalized nanopore-embedded electrodes for rapid DNA sequencing

    CERN Document Server

    He, Haiying; Pandey, Ravindra; Rocha, Alexandre Reily; Sanvito, Stefano; Grigoriev, Anton; Ahuja, Rajeev; Karna, Shashi P

    2007-01-01

    The determination of a patient's DNA sequence can, in principle, reveal an increased risk to fall ill with particular diseases [1,2] and help to design "personalized medicine" [3]. Moreover, statistical studies and comparison of genomes [4] of a large number of individuals are crucial for the analysis of mutations [5] and hereditary diseases, paving the way to preventive medicine [6]. DNA sequencing is, however, currently still a vastly time-consuming and very expensive task [4], consisting of pre-processing steps, the actual sequencing using the Sanger method, and post-processing in the form of data analysis [7]. Here we propose a new approach that relies on functionalized nanopore-embedded electrodes to achieve an unambiguous distinction of the four nucleic acid bases in the DNA sequencing process. This represents a significant improvement over previously studied designs [8,9] which cannot reliably distinguish all four bases of DNA. The transport properties of the setup investigated by us, employing state-o...

  2. Probing the SELEX process with next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Tatjana Schütze

    Full Text Available BACKGROUND: SELEX is an iterative process in which highly diverse synthetic nucleic acid libraries are selected over many rounds to finally identify aptamers with desired properties. However, little is understood about how binders are enriched during the course of selection. Next-generation sequencing offers the opportunity to open the black box and observe a large part of the population dynamics during the selection process. METHODOLOGY: We have performed a semi-automated SELEX procedure on the model target streptavidin starting with a synthetic DNA oligonucleotide library and compared results obtained by the conventional analysis via cloning and Sanger sequencing with next-generation sequencing. In order to follow the population dynamics during the selection, pools from all selection rounds were barcoded and sequenced in parallel. CONCLUSIONS: High affinity aptamers can be readily identified simply by copy number enrichment in the first selection rounds. Based on our results, we suggest a new selection scheme that avoids a high number of iterative selection rounds while reducing time, PCR bias, and artifacts.
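
    Identifying candidates by copy number enrichment across selection rounds can be sketched as a comparison of read fractions between an early and a later round. This is a minimal illustration under stated assumptions, not the authors' analysis: reads are assumed to be already demultiplexed by round barcode, the add-one pseudocount is a choice made here, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def fold_enrichment(early_reads: list[str], late_reads: list[str]) -> dict[str, float]:
    """Ratio of each sequence's read fraction in a late round to an early round.

    An add-one pseudocount (an assumption of this sketch) avoids division by
    zero for sequences that were not observed in the early round.
    """
    early = Counter(early_reads)
    late = Counter(late_reads)
    result = {}
    for seq, late_count in late.items():
        early_fraction = (early[seq] + 1) / (len(early_reads) + 1)
        late_fraction = (late_count + 1) / (len(late_reads) + 1)
        result[seq] = late_fraction / early_fraction
    return result


# Toy reads for illustration only (hypothetical data).
round_1 = ["AAA", "CCC", "GGG", "TTT"]
round_3 = ["AAA", "AAA", "AAA", "CCC"]
enrichment = fold_enrichment(round_1, round_3)
ranked = sorted(enrichment.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

    Sequences at the top of the ranking are the ones whose share of the pool grew most between the two rounds.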

  3. Phylogenetic and Functional Analysis of Metagenome Sequence from High-Temperature Archaeal Habitats Demonstrate Linkages between Metabolic Potential and Geochemistry

    DEFF Research Database (Denmark)

    Inskeep, William P; Jay, Zackary J; Herrgard, Markus

    2013-01-01

    from the sequence data. Analysis of protein family occurrence, particularly of those involved in energy conservation, electron transport, and autotrophic metabolism, revealed significant differences in metabolic strategies across sites consistent with differences in major geochemical attributes (e.......4 and to discuss specific examples where the metabolic potential correlated with measured environmental parameters and geochemical processes occurring in situ. Random shotgun metagenome sequence (∼40-45 Mb Sanger sequencing per site) was obtained from environmental DNA extracted from high-temperature sediments and....../or microbial mats and subjected to numerous phylogenetic and functional analyses. Analysis of individual sequences (e.g., MEGAN and G + C content) and assemblies from each habitat type revealed the presence of dominant archaeal populations in all environments, 10 of whose genomes were largely reconstructed...

  4. Screening of Candidate Leaf Morphology Genes by Integration of QTL Mapping and RNA Sequencing Technologies in Oilseed Rape (Brassica napus L.)

    Science.gov (United States)

    Jian, Hongju; Yang, Bo; Zhang, Aoxiang; Zhang, Li; Xu, Xinfu; Li, Jiana; Liu, Liezhao

    2017-01-01

    Leaf size and shape play important roles in agronomic traits, such as yield, quality and stress responses. Wide variations in leaf morphological traits exist in cultivated varieties of many plant species. To date, the genetics of leaf shape and size have not been characterized in Brassica napus. In this study, a population of 172 recombinant inbred lines (RILs) was used for quantitative trait locus (QTL) analysis of leaf morphology traits. Furthermore, fresh young leaves of extreme lines with more leaf lobes (referred to as ‘A’) and extreme lines with fewer lobes (referred to as ‘B’) selected from the RIL population and leaves of dissected lines (referred to as ‘P’) were used for transcriptional analysis. A total of 31 QTLs for the leaf morphological traits tested in this study were identified on 12 chromosomes, explaining 5.32–39.34% of the phenotypic variation. There were 8, 6, 2, 5, 8, and 2 QTLs for PL (petiole length), PN (lobe number), LW (lamina width), LL (lamina length), LL/LTL (the lamina size ratio) and LTL (leaf total length), respectively. In addition, 74, 1,166 and 1,272 differentially expressed genes (DEGs) were identified in ‘A vs B’, ‘A vs P’ and ‘B vs P’ comparisons, respectively. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were used to predict the functions of these DEGs. Gene regulators of leaf shape and size, such as ASYMMETRIC LEAVES 2, gibberellin 20-oxidase 3, genes encoding gibberellin-regulated family protein, genes encoding growth-regulating factor and KNOTTED1-like homeobox were also detected in DEGs. After integrating the QTL mapping and RNA sequencing data, 33 genes, including a gene encoding auxin-responsive GH3 family protein and a gene encoding sphere organelles protein-related gene, were selected as candidates that may control leaf shape. Our findings should be valuable for studies of the genetic control of leaf morphological trait regulation in B. napus. PMID

  5. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Full Text Available Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  6. Main: Sequences [KOME

    Lifescience Database Archive (English)

    Full Text Available Sequences Nucleotide Sequence Nucleotide sequence of full length cDNA (trimmed sequence) kome_ine_full_seq...uence_db.fasta.zip kome_ine_full_sequence_db.zip kome_ine_full_sequence_db ...

  7. New Mutations in NEB Gene Discovered by Targeted Next-Generation Sequencing in Nemaline Myopathy Italian Patients.

    Science.gov (United States)

    Piga, Daniela; Magri, Francesca; Ronchi, Dario; Corti, Stefania; Cassandrini, Denise; Mercuri, Eugenio; Tasca, Giorgio; Bertini, Enrico; Fattori, Fabiana; Toscano, Antonio; Messina, Sonia; Moroni, Isabella; Mora, Marina; Moggio, Maurizio; Colombo, Irene; Giugliano, Teresa; Pane, Marika; Fiorillo, Chiara; D'Amico, Adele; Bruno, Claudio; Nigro, Vincenzo; Bresolin, Nereo; Comi, Giacomo Pietro

    2016-07-01

    Nemaline myopathy represents a group of clinically and genetically heterogeneous neuromuscular disorders. Different clinical-genetic entities have been characterized in the last few years, with implications for diagnostics and genetic counseling. Fifty percent of nemaline myopathy forms are due to NEB mutations, but genetic analysis of this large and complex gene by Sanger sequencing is time consuming and expensive. We selected 10 Italian patients with clinical and biopsy features suggestive of nemaline myopathy and negative for ACTA1, TPM2 and TPM3 mutations. We applied a targeted next-generation sequencing strategy designed to analyse NEB coding regions, the relative full introns and the promoter. We also evaluated copy number variations (by CGH array) and transcriptional changes by RNA Sanger sequencing, whenever possible. This combined strategy revealed 11 likely pathogenic variants in 8 of 10 patients. The molecular diagnosis was fully achieved in 3 of 8 patients, while only one heterozygous mutation was observed in 5 subjects. This approach proved to be a fast and cost-effective way to analyse the large NEB gene in a small group of patients and might be promising for the detection of pathological variants of other genes featuring large coding regions and lacking mutational hotspots.

  8. RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

    Science.gov (United States)

    Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

    2000-01-01

    The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can

  9. RIKEN integrated sequence analysis (RISA) system--384-format sequencing pipeline with 384 multicapillary sequencer.

    Science.gov (United States)

    Shibata, K; Itoh, M; Aizawa, K; Nagaoka, S; Sasaki, N; Carninci, P; Konno, H; Akiyama, J; Nishi, K; Kitsunai, T; Tashiro, H; Itoh, M; Sumi, N; Ishii, Y; Nakamura, S; Hazama, M; Nishine, T; Harada, A; Yamamoto, R; Matsumoto, H; Sakaguchi, S; Ikegami, T; Kashiwagi, K; Fujiwake, S; Inoue, K; Togawa, Y

    2000-11-01

    The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3' end and 5' end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be

  10. Genealogy analysis of Duchenne muscular dystrophy by multiplex ligation-dependent probe amplification and sequencing technology

    Institute of Scientific and Technical Information of China (English)

    古艳; 谢建生; 都莉; 韩春锡; 万琼

    2012-01-01

    Objective: To analyze DMD/BMD pedigrees by MLPA combined with DNA and cDNA sequencing, to provide gene diagnosis for patients and possible carriers, and to assess the clinical feasibility of the diagnostic workflow. Methods: Thirty-one individuals underwent DMD gene diagnosis: 6 DMD/BMD patients, 13 possible female carriers and 6 male family members from 2 DMD/BMD families, plus 6 healthy female and male controls selected from people undergoing health examinations. Genomic DNA was extracted from peripheral blood, and all 79 exons of the DMD gene were analyzed by MLPA. RNA was extracted from 10-30 mg of biopsied muscle (right gastrocnemius) from the patients and reverse transcribed to cDNA. Mutations were then analyzed by DNA and/or cDNA sequencing, and the sequencing results were compared with the MLPA results for the detection of DMD gene deletions. Results: All 4 patients in the first family carried a deletion of exon 50, and the fetus was confirmed to have no DMD exon deletion. Both patients in the second family carried a deletion of exon 43. MLPA, DNA sequencing and cDNA sequencing gave the same results, and muscle cDNA sequencing confirmed the corresponding exon deletions. Conclusion: MLPA combined with DNA and cDNA sequencing is reliable for DMD pedigree analysis and can be applied to clinical gene diagnosis of DMD.

  11. Sanger method for detecting tissue EGFR in patients with non-small cell lung cancer

    Institute of Scientific and Technical Information of China (English)

    王金龙; 李宝锋; 罗凯

    2011-01-01

    patients with EGFR mutation. Conclusions: Sanger method for detecting gene mutations in NSCLC can provide technical support for targeted therapy.

  12. Qualitative de novo analysis of full length cDNA and quantitative analysis of gene expression for common marmoset (Callithrix jacchus) transcriptomes using parallel long-read technology and short-read sequencing.

    Science.gov (United States)

    Shimizu, Makiko; Iwano, Shunsuke; Uno, Yasuhiro; Uehara, Shotaro; Inoue, Takashi; Murayama, Norie; Onodera, Jun; Sasaki, Erika; Yamazaki, Hiroshi

    2014-01-01

    The common marmoset (Callithrix jacchus) is a non-human primate that could prove useful as a model for human pharmacokinetic and biomedical research. The cytochromes P450 (P450s) are a superfamily of enzymes that have critical roles in drug metabolism and disposition via monooxygenation of a broad range of xenobiotics; however, information on some marmoset P450s is currently limited. Therefore, identification and quantitative analysis of tissue-specific mRNA transcripts, including those of P450s and flavin-containing monooxygenases (FMO, another monooxygenase family), need to be carried out in detail before the marmoset can be used as an animal model in drug development. De novo assembly and expression analysis of marmoset transcripts were conducted with pooled liver, intestine, kidney, and brain samples from three male and three female marmosets. After unique sequences were automatically aligned by assembly software, the mean contig length was 718 bp (with a standard deviation of 457 bp) among a total of 47,883 transcripts. Approximately 30% of the total transcripts were matched to known marmoset sequences. Expression of 18 marmoset P450-like and 4 FMO-like genes displayed some tissue-specific patterns. Of these, the three most highly expressed in marmoset liver were P450 2D-, 2E-, and 3A-like genes. In extrahepatic tissues, including brain, expression of these monooxygenases was lower than in liver, although P450 3A4 (previously P450 3A21) in intestine and P450 4A11- and FMO1-like genes in kidney were relatively highly expressed. By means of massively parallel long-read and short-read sequencing applied to marmoset liver, intestine, kidney, and brain, the combined next-generation sequencing analyses reported here were able to identify novel marmoset drug-metabolizing P450 transcripts that have until now been little reported. These results provide a foundation for mechanistic studies and pave the way for the use of marmosets as model animals

  13. Validation of Next-Generation Sequencing of Entire Mitochondrial Genomes and the Diversity of Mitochondrial DNA Mutations in Oral Squamous Cell Carcinoma.

    Directory of Open Access Journals (Sweden)

    Anita Kloss-Brandstätter

    Full Text Available Oral squamous cell carcinoma (OSCC) is mainly caused by smoking and alcohol abuse and shows a five-year survival rate of ~50%. We aimed to explore the variation of somatic mitochondrial DNA (mtDNA) mutations in primary oral tumors, recurrences and metastases. We performed an in-depth validation of mtDNA next-generation sequencing (NGS) on an Illumina HiSeq 2500 platform for its application to cancer tissues, with the goal of detecting low-level heteroplasmies and avoiding artifacts. Therefore we genotyped the mitochondrial genome (16.6 kb) from 85 tissue samples (tumors, recurrences, resection edges, metastases and blood) collected from 28 prospectively recruited OSCC patients applying both Sanger sequencing and high-coverage NGS (~35,000 reads per base). We observed a strong correlation between Sanger sequencing and NGS in estimating the mixture ratio of heteroplasmies (r = 0.99; p10% were predominant. Four out of six patients who developed a local tumor recurrence showed mutations in the recurrence that had also been observed in the primary tumor. Three out of five patients who had tumor metastases in the lymph nodes of their necks shared mtDNA mutations between primary tumors and lymph node metastases. The percentage of mutation heteroplasmy increased from the primary tumor to lymph node metastases. We conclude that Sanger sequencing is valid for heteroplasmy quantification for heteroplasmies ≥10% and that NGS is capable of reliably detecting and quantifying heteroplasmies down to the 1%-level. The finding of shared mutations between primary tumors, recurrences and metastasis indicates a clonal origin of malignant cells in oral cancer.
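
    To make the heteroplasmy thresholds above concrete, here is a minimal sketch of how a heteroplasmy level might be estimated from per-position allele read counts and compared against the 1% and 10% levels mentioned in the abstract. The function names, the dictionary input format and the simple read-fraction estimator are illustrative assumptions; the study's actual analysis pipeline is not described in this record.

```python
def heteroplasmy_level(allele_counts):
    """Return (minor_allele, fraction) for one mtDNA position.

    allele_counts maps base -> read count, e.g. {"A": 34300, "G": 700}.
    The level is taken as the share of reads carrying the second most
    frequent allele (a simplifying assumption for illustration).
    """
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no reads at this position")
    ranked = sorted(allele_counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return None, 0.0
    minor_base, minor_count = ranked[1]
    return minor_base, minor_count / total


def classify(fraction, ngs_limit=0.01, sanger_limit=0.10):
    """Label a heteroplasmy level using the 1% (NGS) and 10% (Sanger) levels."""
    if fraction >= sanger_limit:
        return "quantifiable by Sanger and NGS"
    if fraction >= ngs_limit:
        return "quantifiable by NGS only"
    return "below the 1% level"


# Hypothetical position covered by 35,000 reads, 700 of them carrying G.
base, level = heteroplasmy_level({"A": 34300, "G": 700})
print(base, f"{level:.1%}", classify(level))
```

    In this hypothetical example the variant allele is carried by 2% of reads, so it falls in the range the abstract attributes to NGS but not to Sanger sequencing.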

  14. Transcriptome sequencing for SNP discovery across Cucumis melo

    Directory of Open Access Journals (Sweden)

    Blanca José

    2012-06-01

    Full Text Available Abstract Background: Melon (Cucumis melo L.) is a highly diverse species that is cultivated worldwide. Recent advances in massively parallel sequencing have begun to allow the study of nucleotide diversity in this species. The Sanger method combined with medium-throughput 454 technology was used in a previous study to analyze the genetic diversity of germplasm representing 3 botanical varieties, yielding a collection of about 40,000 SNPs distributed in 14,000 unigenes. However, the usefulness of this resource is limited as the sequenced genotypes do not represent the whole diversity of the species, which is divided into two subspecies with many botanical varieties variable in plant, flowering, and fruit traits, as well as in stress response. As a first step to extensively document levels and patterns of nucleotide variability across the species, we used the high-throughput SOLiD™ system to resequence the transcriptomes of a set of 67 genotypes that had previously been selected from a core collection representing the extant variation of the entire species. Results: The deep transcriptome resequencing of all of the genotypes, grouped into 8 pools (wild African agrestis, Asian agrestis and acidulus, exotic Far Eastern conomon, Indian momordica and Asian dudaim and flexuosus, commercial cantalupensis, subsp. melo Asian and European landraces, Spanish inodorus landraces, and Piel de Sapo breeding lines), yielded about 300 M reads. Short reads were mapped to the recently generated draft genome assembly of the DHL line Piel de Sapo (inodorus) x Songwhan Charmi (conomon) and to a new version of the melon transcriptome. Regions with at least 6X coverage were used in SNV calling, generating a melon collection with 303,883 variants. These SNVs were dispersed across the entire C. melo genome, and distributed in 15,064 annotated genes. The number and variability of in silico SNVs differed considerably between pools. Our finding of higher genomic diversity in wild
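
    The 6X coverage requirement mentioned in the abstract can be illustrated with a small sketch that scans a per-base depth array and reports the contiguous regions meeting a minimum depth, which is where SNV calling would be attempted. The function name, the list-based depth input and the half-open interval convention are assumptions made for illustration, not the tools used in the study.

```python
def covered_regions(depths, min_depth=6):
    """Return half-open (start, end) intervals where depth >= min_depth."""
    regions = []
    start = None
    for pos, depth in enumerate(depths):
        if depth >= min_depth:
            if start is None:
                start = pos  # open a new covered region
        elif start is not None:
            regions.append((start, pos))  # close the current region
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # region runs to the end
    return regions


# Hypothetical per-base read depths along a short stretch of sequence.
depths = [0, 3, 6, 8, 7, 2, 9, 10]
regions = covered_regions(depths)
print(regions)
```

    Positions 2 to 4 and 6 to 7 (0-based) pass the threshold in this toy input, so only those two intervals would be passed on to variant calling.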

  15. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    Directory of Open Access Journals (Sweden)

    Claudia Masini d’Avila-Levy

    2015-01-01

    Full Text Available The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmania and Trypanosoma. Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists.

  16. Screening of BRCA1 sequence variants within exon 11 by heteroduplex analysis

    Directory of Open Access Journals (Sweden)

    Lucian Negura

    2013-03-01

    Full Text Available Germ-line mutations of either BRCA1 or BRCA2 represent the major hereditary risk for breast and ovarian cancer. Screening for mutations in these genes is now standard practice in molecular diagnosis, opening the way to oncogenetic counselling and follow-up. Because mutations in both BRCA1 and BRCA2 are distributed throughout the loci, accepted clinical protocols involve screening their entire coding regions. Systematic Sanger sequencing is time- and money-consuming. Therefore, many pre-screening techniques have evolved over time to identify anomalous amplicons prior to sequencing. Because BRCA mutations are always heterozygous, heteroduplex analysis proved to be a suitable pre-screening step. We previously implemented mismatch-specific endonuclease heteroduplex analysis for BRCA1 exon 7. Here we show the utility of the same method for mutations and SNPs found in BRCA1 exon 11

  17. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and referenc